Bug 153082 - FILEOPEN DOCX List separator not considered with TOC using custom styles
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Michael Stahl (allotropia)
URL:
Whiteboard: target:7.6.0 target:7.5.1
Keywords: filter:docx
Depends on:
Blocks: DOCX-TableofContents
Reported: 2023-01-18 07:53 UTC by Gabor Kelemen (allotropia)
Modified: 2023-02-15 23:49 UTC (History)

See Also:
Crash report or crash signature:


Attachments
TOC with custom styles, using comma as list separator (14.39 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-01-18 07:53 UTC, Gabor Kelemen (allotropia)
TOC with custom styles, using semicolon as list separator (14.55 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-01-18 07:54 UTC, Gabor Kelemen (allotropia)
The first example file in Word and Writer with TOC settings open (156.39 KB, image/png)
2023-01-18 07:55 UTC, Gabor Kelemen (allotropia)
The second example file in Word and Writer with TOC settings open (156.34 KB, image/png)
2023-01-18 07:56 UTC, Gabor Kelemen (allotropia)
The TOC in the second example file, after updating the TOC it becomes empty (52.62 KB, image/png)
2023-01-18 07:56 UTC, Gabor Kelemen (allotropia)

Description Gabor Kelemen (allotropia) 2023-01-18 07:53:58 UTC
Created attachment 184730 [details]
TOC with custom styles, using comma as list separator

Attached example documents were made in Word with different list separator settings set in Windows control panel - Locale - More settings.
The TOC in these files uses custom styles: Intense Quote, Custom1 and _MyStyle0.
Writer uses the list separator to separate these and their level settings.

When the first file using comma as list separator is opened in Writer, the TOC is imported and saved correctly: Update Index does not change the index significantly.
When the second file using semicolon as list separator is opened in Writer, the separator is not considered and the 3 styles are listed as one in the TOC Additional styles settings. Consequently the Update Index command empties the index.

1, Open first example file
2, Right click the TOC, Edit Index
3, Press Assign styles button, observe that the _MyStyle0 has the heading level 1, Custom1 has the heading level 2, Intensives Zitat (file was made in a German UI Word - should be Intense Quote, this is a separate issue) has the heading level 3.
4, Open the second example file
5, Right click the TOC, Edit Index
6, Observe there is an entry in the Styles list "Intensives Zitat;3;Custom1;3;_MyStyle0;3"
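
The merged entry from step 6 can be reproduced with a minimal sketch. It assumes the style list is a flat sequence of alternating style names and levels joined by the list separator; the helper name split_toc_styles is hypothetical and is not Writer's import code.

```python
def split_toc_styles(arg: str, sep: str) -> list[tuple[str, int]]:
    """Split 'Style<sep>Level<sep>Style<sep>Level...' into (style, level) pairs.

    Hypothetical helper for illustration only. Returns an empty list when the
    string cannot be split into name/level pairs with the given separator.
    """
    parts = arg.split(sep)
    if len(parts) % 2 != 0:
        # An odd token count means the separator does not match the string.
        return []
    pairs = []
    for name, level in zip(parts[0::2], parts[1::2]):
        if not level.strip().isdigit():
            return []
        pairs.append((name.strip(), int(level)))
    return pairs


arg = "Intensives Zitat;3;Custom1;3;_MyStyle0;3"
print(split_toc_styles(arg, ";"))  # three (style, level) pairs
print(split_toc_styles(arg, ","))  # [] because the whole string is one token
```

With the wrong separator the whole string stays a single token, which matches the single merged entry seen in the Styles list.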

For the latter to work, Writer should look into word/settings.xml, extract the <w:listSeparator w:val=";"/> element and use its value as the list separator when interpreting the TOC.
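
As a rough sketch of that proposal (not the fix that was eventually committed), the separator could be read from a DOCX package like this. The function names read_list_separator and make_demo_docx are hypothetical; the demo builds a tiny in-memory archive that contains only word/settings.xml, so it is not a complete DOCX file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w" prefix in settings.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_list_separator(docx, default=None):
    """Return the w:val of w:listSeparator from word/settings.xml, or default."""
    with zipfile.ZipFile(docx) as zf:
        try:
            data = zf.read("word/settings.xml")
        except KeyError:
            # The package has no settings part.
            return default
    root = ET.fromstring(data)
    elem = root.find(f"{{{W_NS}}}listSeparator")
    if elem is None:
        return default
    return elem.get(f"{{{W_NS}}}val", default)


def make_demo_docx(sep):
    """Build an in-memory ZIP with only a settings part (demo data, not a real DOCX)."""
    buf = io.BytesIO()
    xml = (
        f'<w:settings xmlns:w="{W_NS}">'
        f'<w:listSeparator w:val="{sep}"/>'
        "</w:settings>"
    )
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/settings.xml", xml)
    buf.seek(0)
    return buf


print(read_list_separator(make_demo_docx(";")))
```

A caller would pass a file path or file-like object and fall back to some default when the element is missing.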

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: f1830bff71847a9c17715cff52383956719847fe
CPU threads: 14; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: threaded
Comment 1 Gabor Kelemen (allotropia) 2023-01-18 07:54:49 UTC
Created attachment 184731 [details]
TOC with custom styles, using semicolon as list separator
Comment 2 Gabor Kelemen (allotropia) 2023-01-18 07:55:38 UTC
Created attachment 184732 [details]
The first example file in Word and Writer with TOC settings open
Comment 3 Gabor Kelemen (allotropia) 2023-01-18 07:56:09 UTC
Created attachment 184733 [details]
The second example file in Word and Writer with TOC settings open
Comment 4 Gabor Kelemen (allotropia) 2023-01-18 07:56:49 UTC
Created attachment 184734 [details]
The TOC in the second example file, after updating the TOC it becomes empty
Comment 5 Stéphane Guillou (stragu) 2023-01-18 08:21:35 UTC
Reproduced in:

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: f1830bff71847a9c17715cff52383956719847fe
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

Styles merged into one when using the semicolon separator, index emptied on update.
Comment 6 Commit Notification 2023-01-18 18:41:39 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ecbad22fdf81c6f072b6c9f9c16dbba47fe4748c

tdf#153082 writerfilter: import locale-dependent TOC \t style names

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Commit Notification 2023-01-18 18:42:41 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/7b62d09090e5172e26141694fb97bc27562a81ce

tdf#153082 writerfilter,sw: import/export locale-dependent TOC ...

It will be available in 7.6.0.
Comment 8 Michael Stahl (allotropia) 2023-01-18 18:47:00 UTC
there was an additional complication because "Intense Quote" is actually a built-in style.

fixed on master
Comment 9 Michael Stahl (allotropia) 2023-01-18 19:18:54 UTC
oops, totally missed bug 153083 - the commit from comment #6 belongs there
Comment 10 Commit Notification 2023-01-19 12:44:48 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-7-5":

https://git.libreoffice.org/core/commit/5b8de6f5e19b4b85e9d13d86aa71ee6b7adae5f3

tdf#153082 writerfilter,sw: import/export locale-dependent TOC ...

It will be available in 7.5.1.
Comment 11 Michael Stahl (allotropia) 2023-02-04 09:09:03 UTC
i've just accidentally discovered that there is an element in word/settings.xml:

 17.15.1.56 listSeparator (List Separator for Field Code Evaluation)

which is only mentioned for function fields - does it have an effect on TOC?

... a quick test reveals that Word 2013 ignores it for TOC field.
Comment 12 Gabor Kelemen (allotropia) 2023-02-15 23:49:36 UTC
Verified in own build of 7.5-branch:

Version: 7.5.2.0.0+ (X86_64) / LibreOffice Community
Build ID: 1e7ee035992a0b29f42eac56ad82e2a1b0fe8ccd
CPU threads: 14; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: threaded

and current master:
Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 14a36ad49518bcb5b606b0f1640e3ca56b636e89
CPU threads: 14; OS: Windows 10.0 Build 19045; UI render: default; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: threaded

After opening both files, updating the TOC now keeps all the contents as expected. This even works after saving the example files as DOCX and reloading them.

Word seems to be trickier: it expects the list separator in the field to be the same as set in the Control Panel - Region - Customize date, time and number formats - More settings - List separator.
If I set # here for fun, then neither of the Word-made example files here finds any entries for the TOC when updating.
Setting , here makes the TOC update work for both exported files, and for the first original example file as well.

So when in ISO/IEC 29500-1:2016(E) you read:
17.15.1.56
listSeparator (List Separator for Field Code Evaluation)

If this element is omitted, the application shall use the default list separator of its current locale setting to
evaluate field instructions.

don't rely on that being followed by Word.