Bug 140235 - Plenty of list-related character styles created on DOCX export which clearly not needed
Summary: Plenty of list-related character styles created on DOCX export which clearly ...
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Duplicates: 143237
Depends on:
Blocks: DOCX-Character
Reported: 2021-02-07 11:11 UTC by Telesto
Modified: 2023-01-26 20:44 UTC

See Also:
Crash report or crash signature:
Regression By:


Attachments
Example file (783.71 KB, application/vnd.oasis.opendocument.text)
2021-02-07 11:11 UTC, Telesto
Simple reproducer file (9.08 KB, application/vnd.oasis.opendocument.text)
2021-03-26 10:12 UTC, NISZ LibreOffice Team
The reproducer document saved to docx (5.25 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2021-03-26 10:14 UTC, NISZ LibreOffice Team
The docx version of the reproducer saved back to odt (9.59 KB, application/vnd.oasis.opendocument.text)
2021-03-26 10:16 UTC, NISZ LibreOffice Team
The original reproducer and its docx version (98.44 KB, image/png)
2021-03-26 10:19 UTC, NISZ LibreOffice Team
The docx version and the odt saved from it (89.53 KB, image/png)
2021-03-26 10:20 UTC, NISZ LibreOffice Team

Description Telesto 2021-02-07 11:11:14 UTC
Description:
Plenty of list-related character styles are created on DOCX export which are clearly not needed

Steps to Reproduce:
1. Open the attached file
2. Save as DOCX
3. File reload
4. Go to character styles.. notice a whole list of additional styles
5. CTRL+A
6. CTRL+C
7. CTRL+V
8. Save
9. File reload -> Most of the crap gone
10. Save to ODT -> Still no crap


1. Open the attached file
2. Save as DOCX
3. File reload
4. Go to character styles.. notice a whole list of additional styles
5. Save to ODT -> ODT filled with junk

Actual Results:
Plenty of junk styles, and those styles make opening/saving files pretty slow

Expected Results:
Fewer junk styles, which makes managing styles easier and speeds up opening/saving


Reproducible: Always


User Profile Reset: No



Additional Info:
Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 3ed9bba283a6a67864c0928186e277240be0d9ba
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: nl-NL (nl_NL); UI: en-US
Calc: CL
Comment 1 Telesto 2021-02-07 11:11:34 UTC
Created attachment 169550 [details]
Example file
Comment 2 Telesto 2021-02-07 11:14:08 UTC
@Justin
We had a bug about this somewhere, and if I recall correctly the styles code was too horribly broken (and too complex). But this might give an approach to get rid of most of the junk styles without a big refactoring.
Comment 3 Justin L 2021-03-04 07:37:56 UTC
Specifically, these are character styles ListLabel1 - ListLabel171. A second round-trip brings it to ListLabel225.

A similar report for DOC is bug 133410 which says,
see LO 5.0.6's tdf#95213 DOCX import: don't reuse list label styles.

which was mollified somewhat by LO 6.3's tdf#92335 DOCX: fix multiplying of "ListLabel" styles.

Especially important is "However, making a change here would be fraught with danger."
Comment 4 Telesto 2021-03-04 08:58:01 UTC Comment hidden (no-value)
Comment 5 Miklos Vajna 2021-03-04 09:01:38 UTC Comment hidden (no-value)
Comment 6 NISZ LibreOffice Team 2021-03-26 10:11:52 UTC
I think the problem here is at import time. Doing this:

Steps to Reproduce:
1. Open the attached file
2. Save as DOCX
3. File reload
4. Go to character styles.. notice a whole list of additional styles

does not result in extra character styles being saved to the docx file's styles.xml (since bug #92335 was fixed).

These extra styles appear only at docx import time, then these can be saved to odt (unfortunately).
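
One way to check the export side independently is to read the character styles straight out of the exported package. The following is only a rough sketch, not anything from the LibreOffice code base: the function names are hypothetical, and the "ListLabel" prefix is assumed from the style names quoted in comment 3.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/styles.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_character_styles(docx_bytes):
    """Return the styleId of every character style in word/styles.xml.

    Hypothetical helper: takes the raw bytes of a .docx package.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/styles.xml"))
    ids = []
    for style in root.iter(f"{{{W_NS}}}style"):
        if style.get(f"{{{W_NS}}}type") == "character":
            ids.append(style.get(f"{{{W_NS}}}styleId"))
    return ids


def list_label_styles(style_ids, prefix="ListLabel"):
    """Filter style ids by prefix (prefix is an assumption from comment 3)."""
    return [s for s in style_ids if s and s.startswith(prefix)]
```

Calling `list_label_styles(docx_character_styles(open("reproducer.docx", "rb").read()))` on the exported file (the file name is a placeholder) should return an empty list if the export really does not write the extra styles.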
Comment 7 NISZ LibreOffice Team 2021-03-26 10:12:36 UTC
Created attachment 170749 [details]
Simple reproducer file
Comment 8 NISZ LibreOffice Team 2021-03-26 10:14:16 UTC
Created attachment 170750 [details]
The reproducer document saved to docx
Comment 9 NISZ LibreOffice Team 2021-03-26 10:16:46 UTC
Created attachment 170751 [details]
The docx version of the reproducer saved back to odt

This has the extra character styles saved to styles.xml
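
For anyone who wants to confirm this without opening the style list in the UI, the styles can be counted directly in the ODT package. This is a minimal sketch under assumptions: the helper name is made up, character styles are taken to be `style:style` elements with `style:family="text"`, and the "ListLabel" name prefix comes from comment 3.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# ODF "style" namespace used for style:style elements and their attributes.
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def odt_list_label_styles(odt_bytes, prefix="ListLabel"):
    """Return names of character styles in styles.xml that start with prefix.

    Hypothetical helper: takes the raw bytes of an .odt package.
    """
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as zf:
        root = ET.fromstring(zf.read("styles.xml"))
    names = []
    for style in root.iter(f"{{{STYLE_NS}}}style"):
        # Character styles use the "text" family in ODF.
        if style.get(f"{{{STYLE_NS}}}family") != "text":
            continue
        name = style.get(f"{{{STYLE_NS}}}name", "")
        if name.startswith(prefix):
            names.append(name)
    return names
```

Comparing `len(odt_list_label_styles(...))` for the original reproducer and for the ODT saved after the DOCX round-trip should show the difference in numbers.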
Comment 10 NISZ LibreOffice Team 2021-03-26 10:19:10 UTC
Created attachment 170752 [details]
The original reproducer and its docx version

The problem starts at the docx import.
Comment 11 NISZ LibreOffice Team 2021-03-26 10:20:43 UTC
Created attachment 170753 [details]
The docx version and the odt saved from it

Once these fake ListLabel character styles are created they are saved to odt.

Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 2fb274950e5207ca55f4f52325fb522bd44024e1
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: CL
Comment 12 Justin L 2021-03-31 18:07:18 UTC
(In reply to NISZ LibreOffice Team from comment #11)
> Once these fake ListLabel character styles are created they are saved to odt.

I don't think we would want to do anything to try and stop this though.
However, one possibility would be to create a "clean up document" tool that goes through and removes excess styles, obsolete direct formatting, etc.  (This, of course, is a HUGE enhancement idea, and so far we can't even get someone to write a compress-all-pictures tool...)
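
To make the "removes excess styles" part of that idea concrete, here is a very rough sketch of how candidate styles might be detected in an ODT. It is not how LibreOffice does or would do it. The function names are hypothetical, and the rule that any attribute ending in `style-name` counts as a reference is a simplifying assumption.

```python
import xml.etree.ElementTree as ET

# ODF "style" namespace used for style:style elements and their attributes.
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def referenced_style_names(root):
    """Collect values of all attributes whose local name ends in 'style-name'.

    Assumption: this covers text:style-name, style:parent-style-name, etc.
    """
    refs = set()
    for elem in root.iter():
        for attr, value in elem.attrib.items():
            local = attr.rsplit("}", 1)[-1]  # strip the {namespace} part
            if local.endswith("style-name"):
                refs.add(value)
    return refs


def unused_text_styles(styles_xml, content_xml):
    """Return character styles defined in styles.xml but referenced nowhere."""
    styles_root = ET.fromstring(styles_xml)
    content_root = ET.fromstring(content_xml)
    refs = referenced_style_names(styles_root) | referenced_style_names(content_root)
    unused = []
    for style in styles_root.iter(f"{{{STYLE_NS}}}style"):
        if style.get(f"{{{STYLE_NS}}}family") != "text":
            continue
        name = style.get(f"{{{STYLE_NS}}}name")
        if name and name not in refs:
            unused.append(name)
    return unused
```

A real tool would also have to decide what to do with user-defined styles that merely happen to be unused, which is part of why this is a large enhancement rather than a quick fix.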
Comment 13 NISZ LibreOffice Team 2021-07-13 07:05:50 UTC
*** Bug 143237 has been marked as a duplicate of this bug. ***
Comment 14 Eyal Rozenberg 2023-01-26 20:31:42 UTC
(In reply to Justin L from comment #12)
> (In reply to NISZ LibreOffice Team from comment #11)
> > Once these fake ListLabel character styles are created they are saved to odt.
> 
> I don't think we would want to do anything to try and stop this though.

Why? I mean, if styles are generated artificially, why not prevent that from happening? Or at least, use more stringent criteria to decide when a style is generated?

> However, one possibility would be to create a "clean up document" tool that
> goes through and removes excess styles, obsolete direct formatting, etc. 
> (This, of course, is a HUGE enhancement idea, and so far we can't even get
> someone to write a compress-all-pictures tool...)

and it would also be a separate bug IMHO.