Bug 66401 - FILESAVE: DOCX exported document shows 'Combined Characters' with wrong font in MSWord
Summary: FILESAVE: DOCX exported document shows 'Combined Characters' with wrong font ...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.2.0.0.alpha0+ Master (earliest affected)
Hardware: Other All
Importance: high major
Assignee: Andreas Brandner
URL:
Whiteboard: BSA target:6.0.0
Keywords: filter:docx, notBibisectable
Depends on:
Blocks: DOCX
Reported: 2013-06-30 14:35 UTC by Adam CloudOn
Modified: 2017-11-09 03:14 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
DOCX containing 'Combined Characters' (15.99 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-06-30 14:35 UTC, Adam CloudOn
DOCX exported by LO with combined characters lost (4.82 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-09-21 09:48 UTC, Adam CloudOn
Screenshot comparison between original DOCX and exported DOCX (59.25 KB, image/png)
2013-09-21 09:49 UTC, Adam CloudOn
Comparison between Original File and Roundtrip File (237.33 KB, image/png)
2013-12-13 10:47 UTC, surbhi.tongia
File in LibreOffice (180.80 KB, image/png)
2013-12-13 10:49 UTC, surbhi.tongia
After Exporting file (9.05 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-12-13 10:54 UTC, surbhi.tongia

Description Adam CloudOn 2013-06-30 14:35:43 UTC
Created attachment 81747 [details]
DOCX containing 'Combined Characters'

Problem description: 
When a DOCX that contains 'Combined Characters' is loaded in LO, they are not shown (the import part is tracked in an open bug: https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66400).
When the file is saved back to DOCX, the 'Combined Characters' are lost.

Steps to reproduce:
1. Load the attached DOCX in LO
2. Save as a NEW.DOCX
3. Open the NEW.DOCX in Word - the 'Combined Characters' are lost.

Current behavior:
The 'Combined Characters' are not exported back to the DOCX

Expected behavior:
The 'Combined Characters' should be exported correctly back to the DOCX
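
The round trip can also be checked without opening Word. The following is a minimal sketch, assuming that Word stores 'Combined Characters' as a complex field whose instruction starts with EQ (as described in comment 12 below). The function names are made up for this example, and nested fields and w:fldSimple are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def field_instructions(document_xml):
    """Return the instruction text of each complex field in a document.xml payload."""
    root = ET.fromstring(document_xml)
    fields, current = [], None
    for el in root.iter():
        if el.tag == f"{{{W_NS}}}fldChar":
            kind = el.get(f"{{{W_NS}}}fldCharType")
            if kind == "begin":
                # Start collecting instruction text for a new field.
                current = []
            elif kind in ("separate", "end") and current is not None:
                # The instruction part of the field is complete.
                fields.append("".join(current).strip())
                current = None
        elif el.tag == f"{{{W_NS}}}instrText" and current is not None:
            current.append(el.text or "")
    return fields


def has_combined_characters(docx):
    """True if the DOCX (path or file object) has a field whose instruction starts with EQ.

    Assumption: an EQ field is taken as a sign of 'Combined Characters';
    other equation fields would match as well.
    """
    with zipfile.ZipFile(docx) as archive:
        xml = archive.read("word/document.xml")
    return any(instr.upper().startswith("EQ") for instr in field_instructions(xml))
```

Running has_combined_characters() on the attached original and on NEW.DOCX would then show whether the field survived the export.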

              
Operating System: All
Version: 4.2.0.0.alpha0+ Master
Comment 1 Jorendc 2013-07-01 09:24:14 UTC
I can confirm this behavior using Mac OSX 10.8.4 with LibreOffice Version: 4.2.0.0.alpha0+ Build ID: 9ab800829b8a0e44824dc11276b54b1870bc5b2b in combination with Word for Mac 2011.

'data loss' -> major high

Kind regards,
Joren
Comment 2 Adam CloudOn 2013-09-21 09:48:46 UTC
Created attachment 86232 [details]
DOCX exported by LO with combined characters lost
Comment 3 Adam CloudOn 2013-09-21 09:49:18 UTC
Created attachment 86233 [details]
Screenshot comparison between original DOCX and exported DOCX
Comment 4 surbhi.tongia 2013-12-13 10:45:14 UTC
Combined characters are exported correctly, but the font size is not preserved.

Verified on Build:libo-master~2013-12-11_02.11.28_LibreOfficeDev_4.3.0.0.alpha0_Win_x86.

Operating System:Windows 7.
Comment 5 surbhi.tongia 2013-12-13 10:47:36 UTC
Created attachment 90708 [details]
Comparison between Original File and Roundtrip File
Comment 6 surbhi.tongia 2013-12-13 10:49:15 UTC
Created attachment 90709 [details]
File in LibreOffice
Comment 7 surbhi.tongia 2013-12-13 10:54:07 UTC
Created attachment 90710 [details]
After Exporting file
Comment 8 QA Administrators 2015-04-19 03:20:07 UTC Comment hidden (obsolete)
Comment 9 Buovjaga 2015-06-15 10:08:38 UTC
Checking my exported docx with Word viewer, the characters are not lost, but they are not bold and there are commas between them (E,X m,p).

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: 01a189abcd9a4ca472a74b3b2c000c9338fc2c91
TinderBox: Win-x86@39, Branch:master, Time: 2015-06-14_07:46:28
Locale: fi-FI (fi_FI)
Comment 10 Buovjaga 2016-09-15 13:49:54 UTC
WFM is incorrect status, back to NEW.

I still get the same incorrect result as in comment 9.

Win 8.1 32-bit
MSO 2013
LibO Version: 5.3.0.0.alpha0+
Build ID: 8697d18f717c75ddeedfe08161091da71007b859
CPU Threads: 4; OS Version: Windows 6.29; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2016-09-15_02:55:18
Locale: fi-FI (fi_FI); Calc: group
Comment 11 Justin L 2016-12-09 06:38:13 UTC
This bug is confusing because the title hasn't been changed.  Comment 4 and comment 9 indicate that combined characters are no longer lost.  I can confirm that since I cannot reproduce in Linux.  Bibisect and bug 66400 indicate that combined characters were first supported for import in LO4.2.  As of last42onmaster, round-tripping always showed combined characters in Linux.  Spot checked in 4.4, 5.2 and 5.3beta1.  Confirmed that Word2007 could open the round-tripped file properly (without noticing the commas mentioned in comment 9).

So the outstanding problem is that *in Word* the round-tripped combined characters are now smaller/thinner. A non-printing item behind the combined characters indicates font Calibri size 11 instead of Arial Black size 24. (In LibreOffice the round-trip looks identical and shows no reference to Calibri 11.)
Comment 12 Andreas Brandner 2017-10-20 11:20:56 UTC
Combined Characters in Word are represented as a Field of the type Equation. When importing a docx, Writer doesn't preserve the individual run-properties of all the runs of the Field, but only keeps the properties of the very first run.
This works fine in Writer regarding the formatting. However, Word interprets the missing run-properties as the run not having properties at all, rather than using the properties of the first run for all the runs inside the Field. It therefore falls back to a default style, as noted in Comment 11.
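
To illustrate the point about missing run-properties, here is a rough sketch that lists the runs of a field that carry no w:rFonts. The sample XML is hand-written for this example and is not taken from the attachments; the EQ instruction in it is only illustrative, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Qualified WordprocessingML tag name in ElementTree's {namespace}tag form."""
    return f"{{{W_NS}}}{tag}"


def field_runs_without_fonts(document_xml):
    """Return the indices (document order) of runs inside a complex field that lack w:rFonts."""
    root = ET.fromstring(document_xml)
    missing, inside = [], False
    for index, run in enumerate(root.iter(w("r"))):
        fld = run.find(w("fldChar"))
        kind = fld.get(w("fldCharType")) if fld is not None else None
        if kind == "begin":
            inside = True
        if inside and run.find(f"{w('rPr')}/{w('rFonts')}") is None:
            missing.append(index)
        if kind == "end":
            inside = False
    return missing


# Hand-written sample: only the first run of the field has run-properties,
# which is the situation described above for the exported file.
SAMPLE = (
    '<w:document xmlns:w="' + W_NS + '"><w:body><w:p>'
    '<w:r><w:rPr><w:rFonts w:ascii="Arial Black"/><w:sz w:val="24"/></w:rPr>'
    '<w:fldChar w:fldCharType="begin"/></w:r>'
    r'<w:r><w:instrText>EQ \o(\s\up 11(EX),\s\do 4(mp))</w:instrText></w:r>'
    '<w:r><w:fldChar w:fldCharType="end"/></w:r>'
    '</w:p></w:body></w:document>'
)

print(field_runs_without_fonts(SAMPLE))  # prints [1, 2]
```

Runs 1 and 2 are the ones for which Word would presumably fall back to the default style.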
Comment 13 Andreas Brandner 2017-10-20 11:21:57 UTC

*** This bug has been marked as a duplicate of bug 38778 ***
Comment 14 Andreas Brandner 2017-11-08 14:50:36 UTC
After the fix for bug 38778, there was some manual work necessary to make Combined Characters work properly.

- The font-size has to be set to half of the normal value.
- The font has to be exported explicitly after all other properties, to make sure that all runs receive a font-property. This may seem redundant, but is the way MS Word handles this.
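
As a rough illustration of the two points above (not the actual patch, which lives in the LibreOffice C++ export code), the sketch below builds a field run that always carries its own w:rFonts and a halved size. It reads "half of the normal value" as half the size of the surrounding text, which is an assumption. The function names are hypothetical, and the sketch models only the resulting XML, not the order in which the exporter writes properties. Note that w:sz is expressed in half-points.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Qualified WordprocessingML tag name in ElementTree's {namespace}tag form."""
    return f"{{{W_NS}}}{tag}"


def halved_sz(normal_pt):
    """Return the w:sz value (in half-points) for half of a normal size given in points.

    Example: 24 pt is w:sz 48, so half of it (12 pt) is w:sz 24.
    """
    normal_half_points = round(normal_pt * 2)
    return normal_half_points // 2


def field_run(font, normal_pt, instr_text=None):
    """Build one w:r element that explicitly carries the font and the halved size."""
    run = ET.Element(w("r"))
    rpr = ET.SubElement(run, w("rPr"))
    fonts = ET.SubElement(rpr, w("rFonts"))
    fonts.set(w("ascii"), font)
    fonts.set(w("hAnsi"), font)
    ET.SubElement(rpr, w("sz")).set(w("val"), str(halved_sz(normal_pt)))
    if instr_text is not None:
        ET.SubElement(run, w("instrText")).text = instr_text
    return run


# Every run of the field gets the same explicit font, even if that looks redundant.
runs = [field_run("Arial Black", 24, text) for text in (None, "EQ", None)]
```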
Comment 15 Commit Notification 2017-11-08 23:48:54 UTC
Andreas Brandner committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=e4ccf5f597d84f5745d73d306e83594f665024bb

tdf#66401 don't lose docx-combined-characters' font props on roundtrip

It will be available in 6.0.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 16 Commit Notification 2017-11-09 03:14:11 UTC
Andreas Brandner committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=ee57d2f8a57ac851c1250f2962ecd1fa987ee3d9

related tdf#66401 docx Combined Characters roundtrip unit test

It will be available in 6.0.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.