Bug 143557 - docx 16-color char highlight lost in round trip through odt format
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 7.0.0.3 release (earliest affected)
Hardware: All All
Importance: lowest minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on: 131920
Blocks: Highlight-Color
Reported: 2021-07-27 01:35 UTC by Luke Deller
Modified: 2022-09-20 14:10 UTC

See Also:
Crash report or crash signature:


Attachments
highlighting.docx (22.31 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2021-07-27 01:35 UTC, Luke Deller
highlighting-via-odt.docx produced with LO 7.1.5.2 (linux x64 deb) (14.92 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2021-07-27 10:00 UTC, Luke Deller

Description Luke Deller 2021-07-27 01:35:02 UTC
Description:
LibreOffice preserves text highlighting when it opens a doc/docx file and saves it again in the same format.  However, when the file is saved as odt and the odt is then converted back to doc/docx, the highlighting is converted to shading.  The result looks the same, but it surprises users, who can no longer remove the highlighting in Microsoft Word as expected.

Steps to Reproduce:
1. Open the attached highlighting.docx file (or any doc/docx produced by Microsoft Word containing highlighting will do)
2. Save As highlighting-output.docx
3. Save As highlighting-output.odt
4. Close document
5. Open highlighting-output.odt
6. Save As highlighting-via-odt.docx
7. For each of the two output docx files (highlighting-output.docx and highlighting-via-odt.docx), check whether it still contains highlighting

If you have access to Microsoft Word:
7A. Open the docx file in Word and try to remove the yellow highlighting according to the documentation at https://support.microsoft.com/en-us/office/apply-or-remove-highlighting-1747d808-6db7-4d49-86ac-1f0c3cc87e2e

If you do not have access to Microsoft Word:
7B. Unzip the docx file, open the contained word/document.xml, and search for an element starting "<w:highlight"
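
The check in step 7B can also be scripted. The following is a minimal sketch, assuming Python 3 is available; the function name has_highlight is made up for illustration. It performs the same plain substring search that step 7B describes and does not parse the XML.

```python
import zipfile


def has_highlight(docx_path: str) -> bool:
    """Return True if word/document.xml contains an element starting "<w:highlight"."""
    # A docx file is a zip archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(docx_path) as archive:
        document_xml = archive.read("word/document.xml").decode("utf-8")
    # Same textual search as the manual step 7B.
    return "<w:highlight" in document_xml
```

Calling has_highlight("highlighting-output.docx") and has_highlight("highlighting-via-odt.docx") on the two output files corresponds to the two rows reported under 7B.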

Actual Results:
7A:
✅ highlighting-output.docx: highlighting can be removed in Word
❌ highlighting-via-odt.docx: highlighting cannot be removed in Word

7B: 
✅ highlighting-output.docx: "<w:highlight" element is present
❌ highlighting-via-odt.docx: "<w:highlight" element is missing

Expected Results:
7A: removing the highlighting should work in both files
7B: both files should contain an element starting "<w:highlight"


Reproducible: Always


User Profile Reset: Yes



Additional Info:
Related item:
bug 131920: Full text highlight + shading support
(that one is a bigger item involving UI changes, more than what is required to solve this issue here)

Related setting:
Tools -> Options -> Load/Save -> Microsoft Office -> Character Highlighting
"Export As" defaults to "Shading".  Setting this to "Highlighting" avoids this issue here but it breaks export of shading, effectively reintroducing bug 125268.
Comment 1 Luke Deller 2021-07-27 01:35:55 UTC
Created attachment 173871
highlighting.docx
Comment 2 BogdanB 2021-07-27 09:06:31 UTC
I tested with version
Version: 7.1.5.2 (x64) / LibreOffice Community
Build ID: 85f04e9f809797b8199d13c421bd8a2b025d52b5
CPU threads: 4; OS: Windows 10.0 Build 19043; UI render: Skia/Raster; VCL: win
Locale: ro-RO (ro_RO); UI: en-US
Calc: threaded

No problem with any of the files. All of the versions have highlighting.
Comment 3 BogdanB 2021-07-27 09:10:06 UTC
<w:p>
  <w:pPr>
    <w:pStyle w:val="Normal"/>
    <w:rPr>
      <w:lang w:val="en-US"/>
    </w:rPr>
  </w:pPr>
  <w:r>
    <w:rPr>
      <w:lang w:val="en-US"/>
    </w:rPr>
    <w:t xml:space="preserve">Example of highlighting: </w:t>
  </w:r>
  <w:r>
    <w:rPr>
      <w:highlight w:val="yellow"/>
      <w:lang w:val="en-US"/>
    </w:rPr>
    <w:t>this text is highlighted yellow</w:t>
  </w:r>
</w:p>
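
To make it easier to compare snippets like the one above across the output files, the runs can be classified programmatically. This is only a sketch: classify_runs is a hypothetical helper, and it assumes it is given the complete word/document.xml (with its namespace declarations) rather than a pasted fragment. It reports highlighting (w:highlight) and character shading (w:shd) separately for each run.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, as declared in word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def classify_runs(document_xml):
    """Return a list of (text, kind, value) tuples, one per run.

    kind is "highlight", "shading" or None; value is the w:val of
    w:highlight or the w:fill of w:shd respectively.
    """
    root = ET.fromstring(document_xml)
    results = []
    for run in root.iter(f"{{{W}}}r"):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        highlight = run.find("w:rPr/w:highlight", NS)
        shading = run.find("w:rPr/w:shd", NS)
        if highlight is not None:
            results.append((text, "highlight", highlight.get(f"{{{W}}}val")))
        elif shading is not None:
            results.append((text, "shading", shading.get(f"{{{W}}}fill")))
        else:
            results.append((text, None, None))
    return results
```

A run that reports "shading" where the original document reported "highlight" is the symptom described in this bug.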
Comment 4 Luke Deller 2021-07-27 10:00:55 UTC
Created attachment 173875
highlighting-via-odt.docx produced with LO 7.1.5.2 (linux x64 deb)

Thanks for testing BogdanB!

I just retried the "steps to reproduce" from scratch to double check, and can still reproduce the issue (see new attachment highlighting-via-odt.docx)

The build I am using is the Linux deb downloaded from libreoffice.org:

Version: 7.1.5.2 / LibreOffice Community
Build ID: 85f04e9f809797b8199d13c421bd8a2b025d52b5
CPU threads: 16; OS: Linux 5.8; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

Some things to check to narrow down the difference:
* This XML snippet in your comment is from highlighting-via-odt.docx produced at step 6, right?  Are you sure the document was fully closed at step 4?

* do you have the following setting:
Tools -> Options -> Load/Save -> Microsoft Office -> Character Highlighting
set to "Shading"? (This is the default when testing with a new empty profile)

* My sample document highlighting.docx has an example of shading following the example of highlighting.  Did you see that one come through correctly *not* as highlighting in the output?
Comment 5 BogdanB 2021-07-27 21:23:40 UTC
Confirm with
Version: 7.1.5.2 / LibreOffice Community
Build ID: 85f04e9f809797b8199d13c421bd8a2b025d52b5
CPU threads: 4; OS: Linux 5.8; UI render: default; VCL: gtk3
Locale: ro-RO (ro_RO.UTF-8); UI: en-US
Calc: threaded

How did I test? Very easily, in 7.1.5 with the Style Inspector.

In the original document for the highlighted text we had:
-------------------------
Char Highlight 16776960
Char Locale en-US

And for the shading:
Char Back Color 0xddddff
Char Back Transparent false
Char Locale en-US
Char Shading Value 0
-------------------------

After step 7 we have:
-------------------------
Char Back Color 0xffff00
Char Back Transparent false
Char Locale en-US
Char Shading Value 0

And for the shading:
Char Back Color 0xddddff
Char Back Transparent false
Char Locale en-US
Char Shading Value 0
-------------------------

So this confirms that the highlight is transformed into shading.
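
The Style Inspector shows the highlight as a decimal number and the background colour in hex, so it may not be obvious that both are the same yellow. A small worked check (the variable names and the to_rgb helper are illustrative only):

```python
char_highlight = 16776960    # "Char Highlight" in the original document (decimal)
char_back_color = 0xFFFF00   # "Char Back Color" after the round trip (hex)


def to_rgb(color: int) -> tuple:
    """Split a 24-bit colour value into (red, green, blue) components."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF


print(hex(char_highlight))                # 0xffff00
print(to_rgb(char_highlight))             # (255, 255, 0)
print(char_highlight == char_back_color)  # True
```

So the colour value survives the round trip unchanged; only the property that carries it differs.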
Comment 6 Justin L 2021-07-28 05:05:31 UTC
This is what the ODT file looks like:
<style:style style:name="T2" style:family="text">
  <style:text-properties fo:background-color="#ffff00"/>
</style:style>
<style:style style:name="T3" style:family="text">
  <style:text-properties fo:background-color="#ddddff" loext:char-shading-value="0"/>
</style:style>
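
For anyone who wants to inspect their own ODT the same way, here is a minimal sketch that lists fo:background-color for the text-family styles in content.xml. The function names are hypothetical; the namespace URIs are the standard ODF ones for style: and fo:.

```python
import zipfile
import xml.etree.ElementTree as ET

STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
FO = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"


def text_background_colors(content_xml):
    """Map text-family style names to their fo:background-color value."""
    root = ET.fromstring(content_xml)
    colors = {}
    for style in root.iter(f"{{{STYLE}}}style"):
        if style.get(f"{{{STYLE}}}family") != "text":
            continue
        props = style.find(f"{{{STYLE}}}text-properties")
        if props is None:
            continue
        color = props.get(f"{{{FO}}}background-color")
        if color is not None:
            colors[style.get(f"{{{STYLE}}}name")] = color
    return colors


def odt_text_background_colors(odt_path):
    """Read content.xml from an ODT (a zip archive) and list the colours."""
    with zipfile.ZipFile(odt_path) as archive:
        return text_background_colors(archive.read("content.xml"))
```

In the ODT quoted above, both the former highlight (T2) and the shading (T3) show up only as a background colour, which is why the distinction is gone once the file is reloaded.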
Comment 7 Justin L 2022-09-06 18:34:59 UTC
ODF doesn't (and hopefully never will) have a concept of a 16-color highlight character property. So not much can be done here unless someone does that evil thing.
Comment 8 Timur 2022-09-06 19:10:26 UTC Comment hidden (obsolete)
Comment 9 Justin L 2022-09-06 20:32:17 UTC
(In reply to Timur from comment #8)
> So, here should be given an idea how to achieve this or it's NotOurBug or
> WontFix.
It probably comes automatically when depends-on bug 131920 is "fixed".
Comment 10 Timur 2022-09-07 09:16:15 UTC
So let's keep as Lowest.
Comment 11 Luke Deller 2022-09-15 01:17:04 UTC
(In reply to Justin L from comment #7)
> ODF doesn't (and hopefully never will) have a concept of a 16-color
> highlight character property. So not much can be done here unless someone
> does that evil thing.

Which element did you feel was evil: supporting the concept of character highlighting as distinct from background shading, or the fact that Word only offers 16 choices for the highlight colour?

I agree the limitation of colour choice is quite evil, probably a relic of the past.  The same limitation used to exist elsewhere, eg table border colour was limited to 16 colours until MS Word 2000.

However the concept of highlighting might have merit: I have seen it employed by Word users differently than background shading, with semantics matching a physical highlighter pen.  Indeed the online Word documentation says:
> Word contains many highlighters to make your text pop off the screen just
> as if you were highlighting paper with a fluorescent marker

( from https://support.microsoft.com/en-us/office/apply-or-remove-highlighting-1747d808-6db7-4d49-86ac-1f0c3cc87e2e )

What do you think of adding a loext: (extension) attribute to persist the highlighting information that LibreOffice currently holds in memory in this case?  I could maybe look at a patch for that if the concept sounds reasonable.
Comment 12 Justin L 2022-09-20 14:10:15 UTC
(In reply to Luke Deller from comment #11)
> Which element did you feel was evil: supporting the concept of character
> highlighting as distinct from background shading
Having two properties that do the exact same thing is atrocious.

> However the concept of highlighting might have merit:
Highlighting is exactly the same as character background shading - the only difference is that it is a separate property (with a separate tool in MSO) and is limited to 16-ish colours. Otherwise they are identical (other than implementation quirks like being handled differently for number formatting and not being eligible for a character style).

> What do you think of adding a loext: (extension) attribute to persist the
> highlighting information that LibreOffice currently holds in memory in this
> case?
You already have the situation where a LO highlight edit will save as shading (although yes, there is a user setting for that). So MSWord users just need to learn that "paragraph shading" has nothing to do with paragraphs at all. The whole idea of maintaining compatibility MSO->ODF->MSO is rather too far-fetched to merit any effort in order to support a horrible MSO "feature".