Bug 108350 - DOCX files must use C-fonts on IMPORT by default
Summary: DOCX files must use C-fonts on IMPORT by default
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: unspecified (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Mike Kaganski
URL:
Whiteboard: target:5.5.0 target:5.4.0.1
Keywords: filter:docx
Depends on:
Blocks: Fonts DOCX-Styles
Reported: 2017-06-05 20:45 UTC by Mike Kaganski
Modified: 2017-06-25 16:49 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
A file without font information that uses Calibri font in Word (1.24 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2017-06-05 20:45 UTC, Mike Kaganski

Description Mike Kaganski 2017-06-05 20:45:35 UTC
Created attachment 133865 [details]
A file without font information that uses Calibri font in Word

Attached is a sample file *without* any font information, with a single paragraph. It is shown with Calibri 11 pt font by MS Word (since 2007, i.e. the first OOXML implementation).

To allow good layout match, we should use our bundled metric-compatible font Carlito in these cases. Currently, the default font is used instead.
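One rough way to check whether a DOCX file carries any explicit font information is sketched below. This is an illustration only, not part of the submitted patch: the function name `docx_declares_fonts` is hypothetical, and the check assumes that looking for `w:rFonts` elements in `word/document.xml` and `word/styles.xml` is enough. It ignores other places where fonts may be named, such as theme or font table parts.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w" prefix in DOCX parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_declares_fonts(docx_bytes: bytes) -> bool:
    """Return True if a w:rFonts element exists in the document or styles part.

    Hypothetical helper: it inspects only two parts of the package.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        names = set(zf.namelist())
        for part in ("word/document.xml", "word/styles.xml"):
            if part not in names:
                continue  # a part may legitimately be absent
            root = ET.fromstring(zf.read(part))
            if root.find(f".//{{{W_NS}}}rFonts") is not None:
                return True
    return False
```

A file like the attached reproducer would be expected to yield `False` under these assumptions.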
Comment 1 Mike Kaganski 2017-06-05 20:49:52 UTC
A patch is submitted: https://gerrit.libreoffice.org/38421
Comment 2 Johnny_M 2017-06-05 21:09:34 UTC
Do you know how MS Word behaves if a font different than Calibri is configured as the default font? Does it still use Calibri when opening this file then?

Maybe it's a question of the default font configuration (should it be changed to Carlito in LO)? For me, on LO 5.3.3 on Ubuntu the font on opening the file is Carlito 11, which is what I had configured as the default font.

How was the attached file created? Using some sort of a script? (Its file properties don't show a Word version it was created with.) Does it even follow the DOCX standard when omitting the font information?
Comment 3 Mike Kaganski 2017-06-06 02:55:35 UTC
(In reply to Johnny_M from comment #2)
> Do you know how MS Word behaves if a font different than Calibri is
> configured as the default font? Does it still use Calibri when opening this
> file then?
> 
> Maybe it's a question of the default font configuration (should it be
> changed to Carlito in LO)? For me, on LO 5.3.3 on Ubuntu the font on opening
> the file is Carlito 11, which is what I had configured as the default font.

Yes, MS Word (both 2007 and 2016) uses Calibri 11 by default (i.e. when the file doesn't include font info) when opening existing DOCX files, regardless of the default font set in its default template (normal.dotx). That template is used for creating new documents, not for supplying options missing from existing ones. This isn't a preference setting; it's a matter of consistent layout that relies on a hardcoded default.

Tested with default font set to Times New Roman 12.
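The behaviour described above can be sketched as a small decision function. This is a simplified model of the observation, not Word's or LibreOffice's actual code; `FontSpec` and `resolve_import_font` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Optional


class FontSpec(NamedTuple):
    name: str
    size_pt: float


# Hardcoded fallback observed in Word when a DOCX carries no font info.
HARDCODED_DOCX_DEFAULT = FontSpec("Calibri", 11.0)


def resolve_import_font(document_font: Optional[FontSpec],
                        user_default: FontSpec) -> FontSpec:
    """Pick the font for an existing document being opened.

    user_default (e.g. from normal.dotx) is intentionally ignored:
    it applies to newly created documents only.
    """
    if document_font is not None:
        return document_font
    return HARDCODED_DOCX_DEFAULT
```

With the user default set to Times New Roman 12 and no font in the file, this model returns Calibri 11, matching the test described above.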

> How was the attached file created? Using some sort of a script? (Its file
> properties don't show a Word version it was created with.)

This file was originally generated by a third-party application and came in as a customer's bugdoc. Its XML was manually sanitized to exclude any sensitive information and to be the minimal possible reproducer.

> Does it even
> follow the DOCX standard when omitting the font information?

Yes. It's been validated using MS Open XML SDK 2.5 Productivity Tool for Microsoft Office against all three targets (2007, 2010 and 2013 formats).
Comment 4 Johnny_M 2017-06-06 05:52:22 UTC
The proposed change looks plausible then, thanks for the explanation!
Comment 5 V Stuart Foote 2017-06-06 06:56:38 UTC
Confirmed on Windows 10 Pro 64-bit en-US with
Version: 5.3.3.2 (x64)
Build ID: 3d9a8b4b4e538a85e0782bd6c2d430bafe583448
CPU Threads: 8; OS Version: Windows 6.19; UI Render: GL; Layout Engine: new; 
Locale: en-US (en_US); Calc: group

Word 2007 opens the test doc assigning Calibri 11pt; LO Writer opens it with Liberation Serif 12pt.
Comment 6 Commit Notification 2017-06-06 13:17:57 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=5471a5585cba925bb0dcb2dc41e03ad563998166

tdf#108350: Use Carlito for DOCX import by default

It will be available in 5.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Cor Nouws 2017-06-06 13:51:34 UTC
Hi Mike,
Are you planning the same for Calc and Impress? Or don't Excel/PowerPoint produce files without font information?
Comment 8 Mike Kaganski 2017-06-06 14:03:04 UTC
Cor,

That was a real-life file, so I had data to test against. I have no corresponding files in other formats. If you happen to come across one, please share it and I'll try to fix them, too.

Btw: MS Office itself doesn't produce such files AFAIK; the bugdoc was created by some third-party report-generating software.
Comment 9 Yousuf Philips (jay) (retired) 2017-06-06 14:12:03 UTC
Hi Mike,

Thanks for the patch, but we shouldn't set our metric-compatible font when importing such a document. We should set Calibri like Word does; if the user doesn't have it, rendering falls back on Carlito while Calibri is retained as the set font. Otherwise, when the document is saved and reopened in Word, it won't look similar to the original.
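The distinction between the font that is set and the font that is used for rendering can be sketched as follows. This is a toy model, not LibreOffice's replacement-table implementation; `METRIC_COMPATIBLE` and `font_for_rendering` are hypothetical names, and the table holds only the one pair mentioned here.

```python
from typing import Set

# Declared font -> metric-compatible substitute (only the pair discussed here).
METRIC_COMPATIBLE = {"Calibri": "Carlito"}


def font_for_rendering(declared: str, installed: Set[str]) -> str:
    """Choose the font used for layout without changing the declared name."""
    if declared in installed:
        return declared
    substitute = METRIC_COMPATIBLE.get(declared)
    if substitute is not None and substitute in installed:
        return substitute
    # Neither is available: leave the choice to generic system fallback.
    return declared


# The document model keeps "Calibri"; only rendering switches to Carlito.
declared_font = "Calibri"
render_font = font_for_rendering(declared_font, {"Carlito", "Liberation Serif"})
```

Because `declared_font` is never overwritten, saving the document would still write Calibri, which is the point of the suggestion above.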
Comment 10 Mike Kaganski 2017-06-06 14:15:40 UTC
(In reply to Yousuf Philips (jay) from comment #9)

Yes, that's reasonable.
Comment 11 Mike Kaganski 2017-06-06 14:24:48 UTC
https://gerrit.libreoffice.org/38456/ uses Calibri
Comment 12 Yousuf Philips (jay) (retired) 2017-06-06 14:50:40 UTC
In the same spirit of this fix, the 'Heading' style should also be set to use the Cambria font.

Alternatively, if we want to be as accurate as possible, we should import the default styles.xml file found in Word, though there are some differences between Word 2010 and earlier and Word 2013 and later. If you are interested in doing this, I'll provide the two styles.xml files that can be used.
Comment 13 Mike Kaganski 2017-06-06 14:58:49 UTC
(In reply to Yousuf Philips (jay) from comment #12)
> In the same spirit of this fix, the 'Heading' style should also be set to
> use the Cambria font.

This is not consistent with current import of docx with default font information. Please note that this fix is about default document font, not about re-creation of styles.
Comment 14 Commit Notification 2017-06-06 15:04:55 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-5-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6f2ad89b33d972f9642bb53eeb91f41df3b6b0e6&h=libreoffice-5-4

tdf#108350: Use Carlito for DOCX import by default

It will be available in 5.4.0.1.

Comment 15 Johnny_M 2017-06-21 22:14:22 UTC
(In reply to Mike Kaganski from comment #11)
> https://gerrit.libreoffice.org/38456/ uses Calibri

This looks fixed then. :) (There are no commit messages here due to a mistake in their TDF bug numbers.)
Comment 16 Yousuf Philips (jay) (retired) 2017-06-25 12:25:38 UTC
(In reply to Mike Kaganski from comment #13)
> This is not consistent with current import of docx with default font
> information.

MS documents have two default document C-fonts, one defined in 'Normal' (Calibri) and one defined in 'Heading' (Cambria). Your fix only solves the C-font in 'Normal', so I was suggesting you also fix the one for 'Heading'.

To see the issue
1. Open attachment 133865 [details] in Writer
2. Press ctrl+1 to set line to 'Heading 1'
3. Font is set to Liberation Sans
4. Open attachment 133865 [details] in Word
5. Press ctrl+alt+1 to set line to 'Heading 1'
6. Font is set to Cambria

> Please note that this fix is about default document font, not
> about re-creation of styles.

I'll open a separate bug report for the re-creation of styles.
Comment 17 Mike Kaganski 2017-06-25 16:21:49 UTC
(In reply to Yousuf Philips (jay) from comment #16)

Thanks, I'll take a look.
Comment 18 Mike Kaganski 2017-06-25 16:49:38 UTC
Well, I must close this one once again.

I must first say that I would be happy to make the change if there were a test document showing the issue. Having said that, here is what makes me believe that comment 16 does not relate to this issue and is also arguable.

This issue is not about styles. It is about the default choice of settings used in the absence of style (specifically, font) information. The fix does not touch any styles: rather, it controls what is set under Tools-Options-LibreOffice Writer-Basic Fonts (which incidentally controls styles as a side effect). There is an entry there specifically for headings, so it would be trivial to extend the fix to that entry.

What stops me is that I have no document which would be opened differently depending on the default font for headings. I suppose that in MS Word, you just cannot declare a paragraph a heading, skip the font information and have it shown with Cambria (chosen by Word by default).

What you describe in steps 1-6 is a totally different thing. The file opens identically; what differs is not the existing data but the program's choice of how to create a previously absent style, and I am unsure that we should definitely try to mimic Word here (I might be wrong).

Yet, I repeat: if there is a way in DOCX to define a heading, skip fonts, and get Cambria, that would be a legitimate reason to reopen this and extend the fix appropriately.
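A candidate test document of the kind asked for above could be generated along these lines. This is a sketch under assumptions: the package layout follows the usual minimal DOCX structure, `build_heading_docx` is a hypothetical helper, and the output has not been validated with the Open XML SDK or opened in Word, so whether Word shows the paragraph in Cambria remains an open question.

```python
import io
import zipfile

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
WML = "application/vnd.openxmlformats-officedocument.wordprocessingml"

PARTS = {
    "[Content_Types].xml": (
        '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
        '<Default Extension="rels" '
        'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Default Extension="xml" ContentType="application/xml"/>'
        f'<Override PartName="/word/document.xml" ContentType="{WML}.document.main+xml"/>'
        f'<Override PartName="/word/styles.xml" ContentType="{WML}.styles+xml"/>'
        "</Types>"
    ),
    "_rels/.rels": (
        f'<Relationships xmlns="{PKG_REL}">'
        f'<Relationship Id="rId1" Type="{DOC_REL}/officeDocument" '
        'Target="word/document.xml"/></Relationships>'
    ),
    "word/_rels/document.xml.rels": (
        f'<Relationships xmlns="{PKG_REL}">'
        f'<Relationship Id="rId1" Type="{DOC_REL}/styles" '
        'Target="styles.xml"/></Relationships>'
    ),
    # Heading 1 is defined with an outline level but no font information.
    "word/styles.xml": (
        f'<w:styles xmlns:w="{W}">'
        '<w:style w:type="paragraph" w:styleId="Heading1">'
        '<w:name w:val="heading 1"/>'
        '<w:pPr><w:outlineLvl w:val="0"/></w:pPr>'
        "</w:style></w:styles>"
    ),
    # One paragraph that references the Heading1 style.
    "word/document.xml": (
        f'<w:document xmlns:w="{W}"><w:body><w:p>'
        '<w:pPr><w:pStyle w:val="Heading1"/></w:pPr>'
        "<w:r><w:t>Heading without font information</w:t></w:r>"
        "</w:p></w:body></w:document>"
    ),
}


def build_heading_docx() -> bytes:
    """Assemble the parts into an in-memory DOCX package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, xml in PARTS.items():
            zf.writestr(name, xml)
    return buf.getvalue()
```

Writing the returned bytes to a `.docx` file and opening it in both Word and Writer would show whether the two programs disagree on the heading font for existing data.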