Created attachment 191302 [details]
Original smaller document: old.odt
I have a strange finding. After saving an ODT file from 2021 ("File old.odt") without any changes in LibreOffice 7.6.2.1, the file size multiplies ("File new.odt").
"File old.odt": 993.2 kB
"File new.odt": 4.1 MB
After investigating the extracted content of the odt/zip files I found the reason: when saving without any changes in a recent LibreOffice build, additional fonts are embedded into the odt file. Why? (see "embedded fonts.png")
Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 60(Build:1)
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: de-DE (de_DE.UTF-8); UI: de-DE
Ubuntu package version: 4:7.6.2~rc1-0ubuntu0.18.04.1~lo1
Calc: threaded
Created attachment 191303 [details] After saving with Writer: new.odt
Created attachment 191304 [details] embedded fonts.png
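The comparison described in the report (unzipping both files and looking for font files) can be sketched in a few lines. This is only an illustration: the helper names `embedded_fonts` and `added_fonts` are hypothetical, and it assumes embedded fonts can be recognised by a `.ttf` or `.otf` extension inside the ODT package. Matching by member name is also naive, since a re-save may number the font files differently.

```python
import zipfile

# Assumption: embedded font files keep one of these extensions in the package.
FONT_SUFFIXES = (".ttf", ".otf")


def embedded_fonts(odt_path):
    """Return {member name: uncompressed size in bytes} for font files in the package."""
    with zipfile.ZipFile(odt_path) as package:
        return {
            info.filename: info.file_size
            for info in package.infolist()
            if info.filename.lower().endswith(FONT_SUFFIXES)
        }


def added_fonts(old_path, new_path):
    """Font members present in the re-saved file but not in the original (by name)."""
    old = embedded_fonts(old_path)
    new = embedded_fonts(new_path)
    return {name: size for name, size in new.items() if name not in old}
```

Calling `added_fonts("old.odt", "new.odt")` on the two attachments would list the font members that only exist in the re-saved file, together with their sizes.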
In addition I found: there are two checkboxes to set which fonts to embed. It looks like "Only embed used fonts" has no effect on its own: it produces the same file size as when neither of the two checkboxes is checked. This is why I ask to replace these two checkboxes with three radio buttons. That would lead to a clear user experience.
I assume that all LibreOffice applications are affected by these issues. But I only tested Writer. Please also have a look at: https://www.reddit.com/r/libreoffice/comments/136b8ni/fonts_embedding_what_those_options_actually_do/
To sum up the situation, we have two things here:
1. Clarification is needed on why additional fonts are embedded when saving the document with its default settings and without any changes. Is it a regression? Does it have to do with font versions on newer OS environments?
2. It looks like the idea of the embed font settings is to allow the user to select between three settings. If this can be confirmed, two checkboxes (that together provide 4 states) are a bad solution from a UX perspective. A better solution would be radio buttons or a drop-down with three options.
I think you are experiencing the issue in bug 65353 - fonts that belong to default styles are embedded, even if those styles are not used in the document. *** This bug has been marked as a duplicate of bug 65353 ***
If the second checkbox "Only used fonts" relates to the first, it should be indented and enabled only if the first is checked. I don't get the three-settings idea. Shall we keep this ticket for the UI?
(In reply to OfficeUser from comment #5)
> 2. It looks like the idea of the embedd font settings is to allow the user to select between three settings.
I agree to keep open to re-focus on the UI. See also bug 130185, which tries to clarify what exactly those settings do.
(In reply to Heiko Tietze from comment #7) > If the second checkbox "Only used fonts" relates to the first and should be > indented and enabled only if the first is checked. Tomaz, is this correct?
(as a side note, if anyone is wondering why saving the "File old" sample with LO 7.6 adds an extra Font_FreeSans_3.ttf: it was by design, from cc631df53da60b486ececd620064a65c6683a20c for bug 155486.)
(In reply to Heiko Tietze from comment #9)
> (In reply to Heiko Tietze from comment #7)
> > If the second checkbox "Only used fonts" relates to the first and should be indented and enabled only if the first is checked.
>
> Tomaz, is this correct?
All of these options are related to font embedding and have no effect if font embedding is disabled. "Only embed fonts that are used in the document" checks which fonts are used in the document and embeds just those, even if styles use other fonts. The three "font scripts to embed" options embed or skip (depending on what you select) the font subsets for the different scripts when font embedding is enabled. So if you are sure you will never use CTL or CJK scripts in your document, you can disable those; they will then be skipped when embedding the fonts, which saves a good amount of space (depending on how script-rich the font is).
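The relationship between the two checkboxes and the three effective settings discussed above can be written down as a small mapping. This is a sketch of the behaviour as described in this thread, not LibreOffice code; `EmbedMode` and `mode_from_checkboxes` are hypothetical names, and `ALL` simply stands for "whatever is embedded when the 'only used' restriction is off".

```python
from enum import Enum


class EmbedMode(Enum):
    NONE = "none"            # font embedding disabled
    USED_ONLY = "used_only"  # embed only fonts used in the document
    ALL = "all"              # embed without the "only used" restriction


def mode_from_checkboxes(embed_fonts: bool, only_used: bool) -> EmbedMode:
    """Collapse the four checkbox states into the three effective settings."""
    if not embed_fonts:
        # "Only used" has no effect while embedding itself is disabled.
        return EmbedMode.NONE
    return EmbedMode.USED_ONLY if only_used else EmbedMode.ALL
```

Both `(False, False)` and `(False, True)` collapse to `EmbedMode.NONE`, which is the redundant fourth state that radio buttons, or a dependent and disabled second checkbox, would make visible in the UI.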
This is not at all merely a dialog design problem! It's way more than that!
First, I don't see why it's legitimate that an editor whose native format is ODT would make changes to an opened ODT which is saved with no user actions. The input ODT is just fine as an output ODT.
Second - the Save/Save As... dialog didn't ask me whether I wanted to embed fonts. Who told LO I was willing to embed _any_ fonts? I sure didn't! At the very least it should let me get to another dialog where that setting is present - if not outright have it in the Save dialog. I get asked about PGP encryption but not about font embedding; why?
Third, how come the choice of whether or not to embed fonts is a property of the document? I mean, it can tell me whether it _has_ embedded fonts, but how can another person's document hold a preference of mine? This is weird. Does the ODF mandate this?
If you believe these should be separate issues, please say which belong here and which don't.
We discussed the topic in the design meeting. A simple English dummy text saves to 28k; with all font embedding options on, to 22000k; with "only used", to 927k; and with all fonts but limited to Latin, to 1800k. It remains unclear what exactly embedding all fonts means; the help could be more informative here. To tackle the problem we suggest checking the "only used fonts" option by default and disabling it until the first "do embed fonts" option is active. Furthermore, it could make sense to check the script types depending on what is set under Tools > Options > Language. So far, unchecking CTL/CJK there has no effect on the font embedding options (they remain available and checked). However, if the document contains no CJK text it does not make much sense to embed such fonts - and we wonder whether those options make sense.
(In reply to Eyal Rozenberg from comment #12)
> First, I don't see why it's legitimate, that an editor whose native format is ODT would make changes to an opened ODT which is saved with no user actions. The input ODT is just fine as an output ODT.
No idea what you're trying to say here.
> Second - the Save/Save As... dialog didn't ask me whether I wanted to embed fonts. Who told LO I was willing to embed _any_ fonts? I sure didn't! At the very least it should let me get to another dialog where that setting is present - if not outright have it in the Save dialog. I get asked about PGP encryption but not about font embedding; why?
True, I agree you should be able to review and change the settings from the "Save As" dialog: not just font embedding, but various other properties as well, like document description, author, purging statistics, document signing, ...
> Third, how come the choice of whether or not to embed fonts is a property of the document? I mean, it can tell me whether it _has_ embedded fonts, but how can another person's document hold a preference of mine? This is weird. Does the ODF mandate this?
Because embedding fonts is a decision that needs to be made for each document, depending on the use case and the font copyright situation; and removing the embedded fonts choice also needs to be explicit, because it can have implications for the document structure. For example:
- The document will only ever be read and edited by you - no need to embed fonts.
- The next document you want to share, so you want it to preserve its structure for others, because there is a good chance they don't have the fonts you're using - you choose to embed fonts.
- You get a document with embedded fonts, but you don't have the fonts installed on your computer. When you open the document, its structure will be correct. Then you change the document and save it - the document will still have the embedded fonts - and the next time you open it, it will still display correctly. If this weren't a per-document choice, this use case would fail silently.
(In reply to Heiko Tietze from comment #13)
> To tackle the problem we suggest to check the "only used fonts" option by default and disable it until the first "do embed fonts" option is active.
I wouldn't check it by default. It should be checked only when it is known that the document will not get any major edits in the future that would use styles not yet used in the document.
> Furthermore it could make sense to check the script types depending on what is set under tools > options > language. So far unchecking CTL/CJK has no effect on the font embedding options (remains available and checked).
That probably makes sense, but I can still write Japanese in the document even when I don't have "Asian" enabled in the settings.
> However, if the document contains no CJK text it makes not much sense to embed such fonts - and we wonder if those options make sense.
How would you know what kind of text is written in the document? Currently we don't know this. I'd rather the user make an explicit decision about this.
BTW, CJK and Complex will not have much effect unless you actually use a font that extensively supports CJK and Complex scripts. For example, with Noto Sans the numbers I get are (and I have no CJK or Complex language support enabled):
Default: 9.4kB
Embedded: 52.4MB
Embedded, "only used": 31.7MB
Embedded, "only used", only Latin: 1.5MB
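To put the Noto Sans figures above in relative terms, the savings of each option can be computed directly. This is plain arithmetic on the numbers quoted in the comment; the helper name `reduction_percent` is made up for the illustration.

```python
# File sizes in MB, taken from the Noto Sans measurements quoted above.
EMBEDDED = 52.4
EMBEDDED_ONLY_USED = 31.7
EMBEDDED_ONLY_USED_LATIN = 1.5


def reduction_percent(before, after):
    """Relative size reduction from `before` to `after`, in percent (one decimal)."""
    return round(100 * (before - after) / before, 1)


print(reduction_percent(EMBEDDED, EMBEDDED_ONLY_USED))                  # 39.5
print(reduction_percent(EMBEDDED_ONLY_USED, EMBEDDED_ONLY_USED_LATIN))  # 95.3
```

With these numbers, "only used" trims the file by roughly 40%, while additionally restricting the embedded subsets to Latin removes about 95% of what remains.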
(In reply to Tomaz Vajngerl from comment #15) > I wouldn't check it by default. This should be checked when it is known the > document will not have any major edits in the future, which would use styles > that are not yet used in the document. So you save 50MB Noto fonts just in case it might be used later? Quite unlikely that you share a document with people around the world who want to edit your document but don't have the proper font installed. The default should be off. > How do you know what kind of text is written in the document - currently we > don't know this. I rather the user makes a explicit decision about this. Why do we need the CTL/CJK/Latin options at all? Embedded should be enough. (In reply to Eyal Rozenberg from comment #12) > the Save/Save As... dialog didn't ask me whether I wanted to embed fonts. I suggest to discuss the consequences of shared documents in an extra ticket.
(In reply to Heiko Tietze from comment #16) > (In reply to Eyal Rozenberg from comment #12) > > the Save/Save As... dialog didn't ask me whether I wanted to embed fonts. > I suggest to discuss the consequences of shared documents in an extra ticket. I'm not sure what you mean by shared documents, I was talking about plain ODT documents like the ones we have in the attachment. But perhaps I'm misunderstanding. (In reply to Tomaz Vajngerl from comment #14) > (In reply to Eyal Rozenberg from comment #12) > > First, I don't see why it's legitimate, that an editor whose native format > > is ODT would make changes to an opened ODT which is saved with no user > > actions. The input ODT is just fine as an output ODT. > > No idea what you're trying to say here. Let me try to clarify with an example. Suppose you had a text editor, which, for some text files, when opening them, then saving them without any keypress or editing action - would change the text in the saved file relative to the original. Wouldn't that be a rather weird thing? The user does not expect something like this to happen. The same (one can argue) goes for the native editor of any format: It can de-serialize it from a file, then re-serialize to the exact same document. And this would be unlike importing from a "foreign" format like DOCX, where, when saving to a DOCX, you can expect some changes due to the conversion.
(In reply to Heiko Tietze from comment #16)
> So you save 50MB Noto fonts just in case it might be used later? Quite unlikely that you share a document with people around the world who want to edit your document but don't have the proper font installed. The default should be off.
Read again what I wrote. I'm not saying embedding should be the default. I'm saying "only used fonts" shouldn't be the default, because you don't know whether the user will use the currently unused styles in the future or not.
> Why do we need the CTL/CJK/Latin options at all? Embedded should be enough.
Because when you know you won't need embedded fonts for other scripts, you can exclude those from embedding and save space. This is why I made the example with "Noto Sans", which explicitly shows the benefit: 31.7MB -> 1.5MB.
(In reply to Tomaz Vajngerl from comment #18)
> Read again what I wrote. I'm not saying embedded should be default. I'm saying "only used fonts" shouldn't be default, because you don't know if the user will use the unused styles in the future or not.
That's my point. Since the flag is set on the document, you should automatically store the used fonts in later editions if they are used. No need to plan ahead.
Btw, what happens if I uncheck the option? Are fonts that were previously embedded purged? If not, how about turning the whole functionality into a one-click interaction: some command that embeds the fonts (and another to remove them), thinking of it as a final step in processing. And if someone else runs the same command again, new fonts would be added, but only if manually initiated.
(In reply to Eyal Rozenberg from comment #17)
> Let me try to clarify with an example. Suppose you had a text editor, which, for some text files, when opening them, then saving them without any keypress or editing action - would change the text in the saved file relative to the original. Wouldn't that be a rather weird thing? The user does not expect something like this to happen.
>
> The same (one can argue) goes for the native editor of any format: It can de-serialize it from a file, then re-serialize to the exact same document. And this would be unlike importing from a "foreign" format like DOCX, where, when saving to a DOCX, you can expect some changes due to the conversion.
For a text file, if you open and save it without changing anything, the result may not necessarily be identical to the original in all circumstances. You can have an editor configured to always save in UTF-8, and if you open an old ISO-8859-1 file, it would save it as UTF-8. The content would still be the same, however.
For ODF it's similar, but we constantly evolve the format, so things like this happen more often, especially as text files don't really have a structure other than the way the text is encoded. So re-saving an old ODF file in a new LibreOffice will in many cases change the structure (mainly additional elements in the XML), but the document should still be rendered the same. The ODF structure also isn't 1:1 with the internal model (which is stored in structures that are more optimized for interactive editing), so there are consequences due to this (IIRC saving and loading can sometimes move elements around).
So no, we don't currently have this invariant requirement in LO for ODF files. Still not sure how this is related to font embedding.
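The text-editor analogy above is easy to reproduce. The following sketch, using only Python's built-in codecs, shows a "no-change save" that alters the bytes on disk while leaving the text content intact; `resave_as_utf8` is a hypothetical stand-in for an editor configured to always save in UTF-8.

```python
def resave_as_utf8(raw: bytes) -> bytes:
    """Load ISO-8859-1 bytes and write them back as UTF-8, without editing the text."""
    text = raw.decode("iso-8859-1")
    return text.encode("utf-8")


original = "Grüße".encode("iso-8859-1")  # 5 bytes: one byte per character
resaved = resave_as_utf8(original)       # 7 bytes: "ü" and "ß" take two bytes each

print(original == resaved)                                            # False
print(original.decode("iso-8859-1") == resaved.decode("utf-8"))      # True
```

The file differs byte for byte, yet a reader sees the same text, which is the sense in which "content would still be the same".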
(In reply to Heiko Tietze from comment #19) > That's my point. > > Since the flag is set on the document, you should automatically store the > used fonts in later editions if used. No need to plan ahead. Well, this is not the only use case. What if I create a template document and there is actually no real content, so most styles would be set as unused and most fonts not embedded. I don't think we can assume the reason for embedding the fonts is just so the shared document is only read or lightly edited. The option that is less likely to cause issues is the one that should be the default IMHO. Any other option should be made consciously by the user. > Btw, what happens if I uncheck the option? Become fonts that were previously > embedded purged? Well "save" action always creates a document from scratch so there is nothing purged, so the fonts would just not be embedded. The document and fonts currently open would still be there. > If not, how about switching the whole functionality into a > one-click interaction, some command that embeds the fonts (and another to > remove) thinking of it as a final step in processing. And if someone else > runs the same command again new fonts would be added, but only if manually > initiated. Font embedding is implemented in the filter - you have no idea which fonts have previously been embedded and which fonts haven't. Not sure what issue you want to solve with this..
(In reply to Tomaz Vajngerl from comment #20) > For a text if you open and save it without changing anything it may not > necessary be equivalent to the original in all circumstances. You can have > an editor configured to always save in UTF8 and if you open an old > ISO-8859-1 file, it would save that in UTF-8. Content would still be the > same however. Well, 1. In our case, the content is not exactly the same. 2. "Always force UTF-8" is a configuration setting which is not, I would think, the default in most/all text editors. So, yes, you could have your editor auto-correct certain things or canonicalize them - if you specifically asked it to. > So no, we don't have this invariant requirement currently in LO for ODF > files. Still not sure how this is related to font embedding. If such a requirement exists, the same fonts should be embedded in the new document as in the old one.
(In reply to Eyal Rozenberg from comment #22)
> Well,
>
> 1. In our case, the content is not exactly the same.
What changed? Just some fonts were added, for an unknown reason.
> 2. "Always force UTF-8" is a configuration setting which is not, I would think, the default in most/all text editors. So, yes, you could have your editor auto-correct certain things or canonicalize them - if you specifically asked it to.
Sure - I just wanted to show that even with the most banal example of a format possible, it is not guaranteed that a no-change save will result in the exact same file.
> If such a requirement exists, the same fonts should be embedded in the new document as in the old one.
Even if the fonts were not available the first time the document was saved and are now? In any case, we currently don't track which fonts were embedded on load and carry that list over to save. Every time, we search the styles for the fonts that need to be embedded and embed them.
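The save-time behaviour described above (the font list is recomputed from the styles on every save instead of being carried over from load) can be modelled with a toy function. This is a deliberately simplified, hypothetical model and not LibreOffice's filter code: `fonts_to_embed`, the style-to-font dictionary, and the font names are all invented, and it assumes that only fonts currently installed can be embedded.

```python
def fonts_to_embed(style_fonts, used_styles, installed_fonts, only_used=False):
    """Recompute the set of fonts to embed at save time.

    style_fonts:     {style name: font name} for all styles in the document
    used_styles:     names of styles actually applied to content
    installed_fonts: fonts available on the machine doing the save (assumption)
    """
    styles = used_styles if only_used else style_fonts.keys()
    wanted = {style_fonts[name] for name in styles if name in style_fonts}
    return wanted & set(installed_fonts)


style_fonts = {"Default": "FontA", "Heading": "FontB"}
used_styles = {"Default"}

first_save = fonts_to_embed(style_fonts, used_styles, {"FontA"})
later_save = fonts_to_embed(style_fonts, used_styles, {"FontA", "FontB"})
print(sorted(first_save))  # ['FontA']
print(sorted(later_save))  # ['FontA', 'FontB']
```

In this model, an unchanged document gains an embedded font simply because the environment changed between saves; with `only_used=True` the second save would still yield only `FontA`.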
(In reply to Tomaz Vajngerl from comment #23)
> Even if the fonts were not available the first time the document was saved and are now?
Why, does ODF have syntax for requesting that future editors of the file embed more fonts? If that were the case, then we could talk about it...
> In any way, currently we don't track what fonts were embedded on load and carry the list over to save. Every time we search the styles for what fonts need to be embedded and do so.
... and that is a bug; it should at least be configurable, and I would say that the default should be not to embed additional fonts when those were already in use in the document.