Bug 146052 - orphaned objects and styles in content.xml after editing
Status: RESOLVED DUPLICATE of bug 136434
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 7.1.5.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-12-04 22:33 UTC by achim
Modified: 2022-11-28 12:05 UTC

See Also:
Crash report or crash signature:


Attachments
Example with orphaned objects (11.59 KB, application/vnd.oasis.opendocument.text)
2021-12-04 22:33 UTC, achim

Description achim 2021-12-04 22:33:51 UTC
Created attachment 176703
Example with orphaned objects

Most text modifications (adding or changing text, spell-check, auto-correct, ...)
add additional XML objects to content.xml. These objects are not removed when
the changes are reverted. Most of them are also not removed by
"Clear Direct Formatting", although that function might not be a proper
workaround anyway, as it also removes intended formatting.

This results in hundreds or more useless objects in heavily used files.

For an example please see attached .odt.

Please contact me by PM if you are interested in a small tool that detects
some of these objects.
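
A minimal sketch of what such a detection tool might look like (this is not the reporter's actual tool, which is not shown in this report). It assumes that an "orphaned" style is an entry under office:automatic-styles in content.xml whose style:name is never used as the value of any attribute ending in "style-name". That is a heuristic, and the function names and the file name in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile

# Standard ODF namespace URIs for the office: and style: prefixes.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def read_content_xml(odt_path):
    """Return the raw bytes of content.xml from an .odt file (a ZIP archive)."""
    with zipfile.ZipFile(odt_path) as odt:
        return odt.read("content.xml")


def unused_automatic_styles(content_xml):
    """List automatic style names that nothing in content.xml refers to.

    Heuristic (an assumption of this sketch): a style counts as referenced
    if its name is the value of any attribute whose name ends in
    "style-name", e.g. text:style-name or style:parent-style-name.
    """
    root = ET.fromstring(content_xml)
    auto = root.find(f"{{{OFFICE_NS}}}automatic-styles")
    if auto is None:
        return []

    # Names of all styles defined under office:automatic-styles.
    defined = set()
    for style in auto:
        name = style.get(f"{{{STYLE_NS}}}name")
        if name:
            defined.add(name)

    # Every value of a "...style-name" attribute anywhere in the document.
    referenced = set()
    for elem in root.iter():
        for attr, value in elem.attrib.items():
            if attr.endswith("style-name"):
                referenced.add(value)

    return sorted(defined - referenced)
```

Possible usage, with a hypothetical file name: `print(unused_automatic_styles(read_content_xml("example.odt")))`. The sketch only reports style names; it does not modify the document, and it does not look for other leftovers such as empty text:span elements.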
Comment 1 Dieter 2021-12-29 08:15:17 UTC
I confirm the observation, but I'm not able to assess whether this is a bug or an enhancement request. Bug 136434 deals with a similar topic and might be worth a look. Perhaps the hints in bug 136434 comment 3 might help.
Comment 2 Mike Kaganski 2021-12-29 08:52:14 UTC
(In reply to achim from comment #0)
> These objects are not removed, when the changes are reverted.

This would only be a bug *if* making a change and then *using UNDO* still produced an rsid for the changed-then-undone piece in the saved document.

Otherwise, this is a correctly working, useful, *configurable* and documented feature [1]. I'd only say that the help page would benefit from mentioning the 'rsid' element used in ODT to store the random number, so that the term becomes searchable in the help.

[1] https://help.libreoffice.org/latest/en-US/text/shared/optionen/01040800.html
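
To see how many rsid markers a given document carries, a short sketch like the following could be used. It assumes that the values are stored as officeooo:rsid and officeooo:paragraph-rsid attributes in the namespace http://openoffice.org/2009/office; this thread only mentions 'rsid', so the exact attribute names and namespace are assumptions to check against your own content.xml. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed namespace and attribute names for the rsid markers.
OFFICEOOO_NS = "http://openoffice.org/2009/office"
RSID_ATTRS = (
    f"{{{OFFICEOOO_NS}}}rsid",
    f"{{{OFFICEOOO_NS}}}paragraph-rsid",
)


def count_rsid_values(content_xml):
    """Count how often each rsid value occurs in a content.xml string."""
    counts = Counter()
    for elem in ET.fromstring(content_xml).iter():
        for attr in RSID_ATTRS:
            value = elem.get(attr)
            if value is not None:
                counts[value] += 1
    return counts
```

Calling `len(count_rsid_values(xml))` gives the number of distinct values, which is a rough indication of how many separate edits left a marker in the file.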
Comment 3 achim 2021-12-29 09:49:25 UTC
First of all, please excuse my lack of knowledge of LO internals; I'm a simple user.

In general, my observations are very similar to those made in bug 136434, but I did not dig that deep. So I'd also expect some cleanup function such as "Tools->Purge" or whatever. Such a cleanup should remove any and all empty, unused objects.

I'm not familiar with LO's definition of how "Compare Document" should work in detail (see the reference to [1] in the previous comment), but if "Edit->Track Changes->Record" is disabled (as it was in all my tests), I'd not expect anything to be recorded in the document. But something is recorded, and that sounds like a bug to me.

As I have documents with countless changes over the years, they grow and grow, even as the visible content shrinks. Also, it's a real pain to parse the content (using some API), as unexpected objects appear everywhere. For that reason too, some kind of cleanup would be nice.

Another problem I observed is that exports (for example to PDF) also export unused objects, in particular fonts in my case. Up to now, I've not seen any functionality in LO to identify unused styles (which seem to be the culprit).

Also, exports consume a lot of resources (memory, time) with such documents. Sorry, I'm not able to provide a proper example document for that.


I'm not in a position to judge whether these are bugs, feature requests, or the result of user inexperience, but I'd encourage you to provide a cleanup.
Comment 4 Mike Kaganski 2021-12-29 10:33:03 UTC
(In reply to achim from comment #3)
> I'm not familiar with LO's definition of how "Compare Document" should work
> in detail (see the reference to [1] in the previous comment), but if
> "Edit->Track Changes->Record" is disabled (as it was in all my tests), I'd
> not expect anything to be recorded in the document. But something is
> recorded, and that sounds like a bug to me.

No, it is not a bug. Change tracking is a different feature that records the state of the document before and after every change, with its author, date, etc.; the rsid feature is just an indication of "something changed here at some point", which does not keep any record that could be used to identify who made the change or when, or what was there before. They are therefore configured separately, but *probably* it would make sense to combine their configurations into a single options page, to make it obvious to users what is disabled when. Feel free to create a separate issue specifically for that, if you consider that useful.

> As I have documents with countless changes over the years, they grow and
> grow, even as the visible content shrinks. Also, it's a real pain to parse
> the content (using some API), as unexpected objects appear everywhere. For
> that reason too, some kind of cleanup would be nice.

The cleanup tool is suggested in bug 136434, and supported there by Regina. If you feel it reasonable (as indicated by your closing clause), please feel free to mark this one as a duplicate of that bug, and suggest there that it focus on the cleanup tool enhancement.

> Another problem I observed is that exports (for example to PDF) also export
> unused objects, in particular fonts in my case. Up to now, I've not seen any
> functionality in LO to identify unused styles (which seem to be the culprit).

Absolutely unrelated issue; please limit a single issue to a single problem. But actually I'd doubt that *unused* fonts could be exported to PDF, because the export to PDF does not export styles at all (however, bugs are always possible; please file a bug for that, and provide a sample ODF document that exports unused fonts to PDF).

> Also, exports consume a lot of resources (memory, time) with such
> documents. Sorry, I'm not able to provide a proper example document for that.

Also an unrelated issue, and impossible to handle without a reproducer (which would likely help, if posted to a separate dedicated issue).
Comment 5 Buovjaga 2022-11-28 12:05:07 UTC
No reply, but closing as a duplicate of bug 136434 seems appropriate.

*** This bug has been marked as a duplicate of bug 136434 ***