Bug 153907 - Apply an add-or-remove spaces heuristic when matching font family name
Status: RESOLVED DUPLICATE of bug 143095
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Blocks: PDF-Import-Draw PDF-Import-Writer
Reported: 2023-03-01 20:59 UTC by Eyal Rozenberg
Modified: 2023-03-02 08:38 UTC



Attachments
A PDF with Minion Pro and Myriad Pro fonts (43.61 KB, application/pdf)
2023-03-01 21:07 UTC, Eyal Rozenberg

Description Eyal Rozenberg 2023-03-01 20:59:21 UTC
(This is an issue I'm experiencing in the context of importation of PDFs, but I actually think it's more generally relevant.)

Some font families have names which are sequences of multiple words, e.g. Adobe's "Minion Pro" (an OpenType version of their Minion PostScript font). But such sequences often somehow get mangled into no-spacing character sequences, e.g. "MinionPro". There are probably multiple reasons why this can happen; quite possibly something to do with input filters for file formats, but maybe the mangling is already present in the file being opened. Anyway, this happens.

So, it is not uncommon for LO to open a document which has a font of the "FooBar" family, while the system has a font of the "Foo Bar" family - or even vice-versa: The document has "Foo Bar", your system has "FooBar".

I believe that in these cases, when an exact match of the family name is not found, LO should fall back on the spaces-added or spaces-removed alternative, respectively.
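
To make the proposed fallback concrete, here is a minimal Python sketch. It is not LibreOffice code: `match_family` and `squash` are hypothetical names, and the list of installed families is assumed to be available as plain strings. It tries an exact match first and only then compares names with all spaces removed, which covers both directions ("FooBar" vs. "Foo Bar") in one pass.

```python
from typing import Iterable, Optional


def squash(name: str) -> str:
    """Return the family name with all whitespace removed."""
    return "".join(name.split())


def match_family(requested: str, installed: Iterable[str]) -> Optional[str]:
    """Hypothetical helper: pick an installed family for a requested name.

    An exact match always wins; the space-insensitive comparison is only
    a fallback, as suggested in this report.
    """
    families = list(installed)
    if requested in families:
        return requested
    key = squash(requested)
    for family in families:
        if squash(family) == key:
            return family
    return None


print(match_family("MinionPro", ["Myriad Pro", "Minion Pro"]))
print(match_family("Foo Bar", ["FooBar"]))
```

Comparing the space-stripped forms avoids generating candidate names at all, at the cost of also matching names whose spaces fall at arbitrary positions rather than only before capital letters.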

There are probably multiple places in the code where this heuristic could be applied. Obviously within input filter code, but also when you "just" open an ODT, in choosing which font to display with - so something like an implicit font substitution rule? Let developers decide. 

PS - It's not even a search over a lot of font family name combinations: Just one extra combination for removing spaces, plus one combination for each way of inserting spaces at the capital-letter word boundaries, i.e. FooBarMS would make us look for:

FooBarMS,

FooBar MS, Foo BarMS, FooBarM S,

Foo Bar MS, FooBar M S, Foo BarM S

Foo Bar M S

although maybe we could be more conservative with that.
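
The list above can be generated mechanically. The following Python sketch is only an illustration (`space_variants` is a hypothetical name, and treating every uppercase letter after the first character as a possible word boundary is an assumption). A name with k such boundaries yields 2^k candidates, which gives the eight names listed for FooBarMS (k = 3).

```python
from itertools import product
from typing import List


def space_variants(name: str) -> List[str]:
    """Enumerate every way of inserting spaces before capital letters.

    Assumption: each uppercase letter after the first character is a
    possible word boundary. Existing spaces are removed first.
    """
    squashed = "".join(name.split())
    cuts = [i for i in range(1, len(squashed)) if squashed[i].isupper()]
    variants = []
    for choice in product((False, True), repeat=len(cuts)):
        chosen = {i for i, use in zip(cuts, choice) if use}
        variants.append(
            "".join((" " if i in chosen else "") + ch
                    for i, ch in enumerate(squashed))
        )
    return variants


for candidate in space_variants("FooBarMS"):
    print(candidate)
```

Because the count doubles with every boundary, a more conservative variant could cap the number of boundaries considered, or skip runs of consecutive capitals such as "MS".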
Comment 1 Eyal Rozenberg 2023-03-01 21:07:21 UTC
Created attachment 185677
A PDF with Minion Pro and Myriad Pro fonts

An example PDF lifted from bug 153888 - has a lot of text in Minion Pro and Myriad Pro, which inside the PDF look like this:

/BaseFont/PMBACO+MinionPro-Bold

or 

/BaseFont/PMBABN+MyriadPro-SemiboldCond
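
For reference, a small Python sketch of how the family part could be pulled out of such a BaseFont value. In PDF, a subset font's name starts with a tag of six uppercase letters followed by "+". Splitting the remainder at the first hyphen into family and style is a common PostScript naming convention, but it is an assumption here, and `split_base_font` is a hypothetical helper, not part of any PDF library.

```python
import re
from typing import Tuple

# Subset tag: six uppercase letters and a plus sign, e.g. "PMBACO+".
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def split_base_font(base_font: str) -> Tuple[str, str]:
    """Split a BaseFont name value into (family, style).

    Assumption: the style, if any, follows the first hyphen.
    """
    name = base_font.lstrip("/")
    name = SUBSET_TAG.sub("", name)
    family, _, style = name.partition("-")
    return family, style


print(split_base_font("/PMBACO+MinionPro-Bold"))
print(split_base_font("/PMBABN+MyriadPro-SemiboldCond"))
```

The resulting family strings ("MinionPro", "MyriadPro") are the space-less names that would then need to be matched against "Minion Pro" and "Myriad Pro" on the system.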
Comment 2 V Stuart Foote 2023-03-01 21:09:56 UTC
@Eyal, isn't this already the issue of bug 143095, and dupes?
Comment 3 Eyal Rozenberg 2023-03-01 21:32:39 UTC
(In reply to V Stuart Foote from comment #2)
> @Eyal, isn't this already the issue of bug 143095, and dupes?

Well... this is not about only-PDFs, it's general. If you open an HTML or ODT document with "MyriadPro", and your system has a "Myriad Pro" font - you should get that. And if the PDF import filter "gave" LO MyriadPro - other code in LO should be able to figure it out.

Also, this bug is not about the appearance of variants in the font family name.

So they're definitely related. If you're thinking about a different way to structure the bug pages, or limit this bug to not overlap with 143095, I could be ok with that.
Comment 4 ⁨خالد حسني⁩ 2023-03-01 22:40:24 UTC
I don’t see why this is framed as a general issue. Are there any examples of this happening other than when importing PDF?
Comment 5 Eyal Rozenberg 2023-03-02 08:38:19 UTC
(In reply to خالد حسني from comment #4)
> I don’t see why this framed as a general issue, are there any example of
> this happening other than when importing PDF?

While such examples probably exist somewhere, and are certainly possible in principle, I think I'll agree with Stuart's and your reservations and close this for now. If I see concrete non-PDF examples I'll reopen.

*** This bug has been marked as a duplicate of bug 143095 ***