Bug 148956 - DOCX Cleared direct formatting in styled paragraph lingers
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:7.6.0
Keywords: bibisected, bisected, regression
Duplicates: 81575
Depends on:
Blocks: Clear-Formatting DOCX-Styles
Reported: 2022-05-06 02:19 UTC by Aron Budea
Modified: 2023-05-13 14:40 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
Sample DOCX (5.00 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2022-05-06 02:19 UTC, Aron Budea
Original sample ODT (9.85 KB, application/vnd.oasis.opendocument.text)
2022-05-06 02:19 UTC, Aron Budea
Sample ODT #2 (see comment 5) (10.13 KB, application/vnd.oasis.opendocument.text)
2022-06-24 20:05 UTC, Aron Budea
direct_formatted_body2.docx: another scenario (6.00 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-05-03 13:09 UTC, Justin L

Description Aron Budea 2022-05-06 02:19:04 UTC
Created attachment 179955 [details]
Sample DOCX

The attached sample, first created as ODT, then saved as DOCX, 
- has a line (paragraph) formatted with Heading 2 style, and some extra direct formatting (24 pt size and Italic)
- and another (paragraph) formatted with Text Body style.

If you press enter at the end of the heading, the new paragraph:
- is formatted with Text Body style (which is the "Next style" for Heading 2),
- keeps the direct formatting.

That is all expected so far. Not sure why the direct formatting is kept, but that seems to be on purpose.

Let's say you want to get rid of the direct formatting in the heading:
select the first line, and press Ctrl + M, then unselect, press Enter at the end of the line, and type a few characters.

=> In the ODT file, the newly added characters have no direct formatting, as expected.
In the DOCX file, the newly added characters carry the same direct formatting that was cleared a step earlier, which seems to be a bug.

Note that in the DOCX case, after clearing the direct formatting, the toolbar still shows the direct formatting (font size, italic) when the caret is at the end of the line.

When reproducing, start with saving the ODT as DOCX and reloading, since the bug could have a DOCX export aspect.
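
The save-as-DOCX step can also be scripted. A minimal sketch, assuming `soffice` is on the PATH and that headless conversion goes through the same DOCX export filter as saving from the UI. The helper names and the file name `sample.odt` are hypothetical; only the `--headless`, `--convert-to` and `--outdir` options are LibreOffice's own.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(odt_path, out_dir, soffice="soffice"):
    """Build the headless LibreOffice command line that converts an ODT to DOCX."""
    return [
        soffice,
        "--headless",
        "--convert-to", "docx",
        "--outdir", str(out_dir),
        str(odt_path),
    ]


def convert_to_docx(odt_path, out_dir, soffice="soffice"):
    """Run the conversion and return the path where the DOCX is expected to land."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice} not found on PATH")
    subprocess.run(build_convert_command(odt_path, out_dir, soffice), check=True)
    return Path(out_dir) / (Path(odt_path).stem + ".docx")


if __name__ == "__main__":
    # Hypothetical file name; use the attached sample ODT here.
    print(" ".join(build_convert_command("sample.odt", "out")))
```

The remaining steps (Ctrl + M, Enter, typing) still have to be done by hand in the reloaded DOCX.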

Observed using LO 7.4.0.0.alpha0+ (75fe4051320ef9b1f4323fa958e8df3db2066882), 5.0.0.5 / Ubuntu.
In 4.4.0.3, Heading 2 gets a "1" numbering after saving to DOCX and reloading, and when pressing enter after clearing formatting, the Next style setting isn't honored, and Heading 2 style is kept.
In 4.1.0.4 and 4.3.0.4, when pressing enter after clearing formatting, the Next style setting isn't honored, and Heading 2 style is kept, including the direct formatting.
In 4.0.0.3, when pressing enter after clearing formatting, the Next style setting isn't honored, and Heading 2 style is kept, but direct formatting is gone.
In 3.6.0.4 it behaves as expected. => regression
Comment 1 Aron Budea 2022-05-06 02:19:28 UTC
Created attachment 179956 [details]
Original sample ODT
Comment 2 Aron Budea 2022-05-11 02:47:19 UTC
(In reply to Aron Budea from comment #0)
> In 4.0.0.3, when pressing enter after clearing formatting, the Next style
> setting isn't honored, and Heading 2 style is kept, but direct formatting is
> gone.
> In 3.6.0.4 it behaves as expected. => regression
Bibisected the first change to the following range using repo bibisect-43all:
https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=37b9e290d9e3d20652df0abe1a1458412f3cfe2c..c3aa1cefdc6521d34a2a32c20bae1593e1edb5ba

Possibly one of Cedric Bosdonnat's changes from bug 53175.
Comment 3 Aron Budea 2022-05-11 03:46:01 UTC
(In reply to Aron Budea from comment #0)
> In 4.1.0.4 and 4.3.0.4, when pressing enter after clearing formatting, the
> Next style setting isn't honored, and Heading 2 style is kept, including the
> direct formatting.
> In 4.0.0.3, when pressing enter after clearing formatting, the Next style
> setting isn't honored, and Heading 2 style is kept, but direct formatting is
> gone.
Bibisected the second change between 4.0 and 4.1 to the following two consecutive commits (the earlier failed to build) using repo bibisect-41max.

> Let's say you want to get rid of the direct formatting in the heading:
> select the first line, and press Ctrl + M, then unselect, press Enter at the
> end of the line, and type a few characters.
A significant change with these commits: when performing the steps listed above (note: start by saving the ODT as DOCX and reloading), the direct formatting is still present at the end of the line once you move the caret there after clearing direct formatting.

https://cgit.freedesktop.org/libreoffice/core/commit/?id=8c178a50334109b34ef456ca6aa51cd3d98699ae
author		Pierre-Eric Pelloux-Prayer <pierre-eric@lanedo.com>	2013-01-11 14:38:12 +0100
committer	Noel Power <noel.power@suse.com>	2013-01-14 15:50:07 +0000

"docx export: also export rPr in <pPr> (paragraph mark styling)"

https://cgit.freedesktop.org/libreoffice/core/commit/?id=1f2c079dd2bc9a2f5aa3597a8222bde3073a04da
author		Pierre-Eric Pelloux-Prayer <pierre-eric@lanedo.com>	2013-01-11 14:34:04 +0100
committer	Noel Power <noel.power@suse.com>	2013-01-14 15:51:14 +0000

"sax: add methods to duplicate current top marker and reapply it later"

I'd assume this bug mainly has to do with the way these paragraph mark stylings are handled.
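
For anyone who wants to check whether a given DOCX carries paragraph mark styling, here is a rough sketch that lists the run properties stored in `<w:rPr>` inside `<w:pPr>` for each paragraph. The `SAMPLE` XML is hand-written for illustration (it is not taken from the attachment), and the function name is hypothetical. With a real file, the input would be the contents of `word/document.xml` from the DOCX zip.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def paragraph_mark_props(document_xml):
    """Return, for each paragraph, the names of run properties set on its paragraph mark."""
    root = ET.fromstring(document_xml)
    result = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Paragraph mark formatting lives in <w:pPr><w:rPr>, not in the runs.
        rpr = para.find("w:pPr/w:rPr", NS)
        names = [] if rpr is None else [child.tag.split("}")[1] for child in rpr]
        result.append(names)
    return result


# Hand-written example: a heading whose paragraph mark is italic, 24 pt
# (w:sz is in half-points), followed by a plain body paragraph.
SAMPLE = """<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p>
      <w:pPr><w:pStyle w:val="Heading2"/><w:rPr><w:i/><w:sz w:val="48"/></w:rPr></w:pPr>
      <w:r><w:rPr><w:i/><w:sz w:val="48"/></w:rPr><w:t>Heading</w:t></w:r>
    </w:p>
    <w:p>
      <w:pPr><w:pStyle w:val="TextBody"/></w:pPr>
      <w:r><w:t>Body</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""

print(paragraph_mark_props(SAMPLE))  # [['i', 'sz'], []]
```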
Comment 4 Justin L 2022-06-24 18:46:06 UTC
No big surprise here. Heading formats are (almost by definition) numbering formats, and exporting numbering spams direct formatting all over the place. There are various places in DOCX export where style formatting needs to be spammed as direct formatting in order to emulate things.

To make this a real bug, can you reproduce with a "normal" style that is not part of chapter numbering and does not contain any numbering aspect?
Comment 5 Aron Budea 2022-06-24 20:05:23 UTC
Created attachment 180952 [details]
Sample ODT #2 (see comment 5)

(In reply to Justin L from comment #4)
> No big surprise here. Heading formats are (almost by definition) numbering
> formats, and exporting numbering spams direct formatting all over the place.
> There are various places in DOCX export where style formatting needs to be
> spammed as direct formatting in order to emulate things.
If that was causing the problem here, my expectation would be that unexpected direct formatting gets removed as well, not that it persists regardless, which is kind of the opposite.

But anyway, here's a sample, see the second paragraph (line), formatted with a new style derived from Text Body, and having additional direct formatting.

Steps:
- Save ODT as DOCX, and reload DOCX,
- Select the second line, and press Ctrl + M to clear direct formatting,
- Unselect, and press Enter at the end of the line,
- Type a few characters.

=> The characters have direct formatting.
Comment 6 Justin L 2023-05-02 18:44:50 UTC
Using comment 5's example, and round-tripping it in 7.6, I bibisected with the resulting file, and observed the same behaviour all the way back to OOo 3.3.0.
Comment 7 Commit Notification 2023-05-03 12:18:53 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/92fe70072d16922af41ab0fd3746172f3a58e489

tdf#148956 sw: clear nAttrStart == paraEnd hints in DOCX

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
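
One way to read the commit title, as a toy model only: if the paragraph mark formatting sits in an empty attribute span that starts exactly at the paragraph end, a clear that only drops spans overlapping the selection would leave it behind, and text typed at the end of the line would pick it up again. Everything below (the `Hint` class, function names, and the `include_para_end` switch) is hypothetical and is not Writer's actual code.

```python
from dataclasses import dataclass


@dataclass
class Hint:
    start: int
    end: int
    attrs: dict


def clear_direct_formatting(hints, sel_start, sel_end, para_len, include_para_end=False):
    """Drop hints overlapping the selection; optionally also those starting at paragraph end."""
    kept = []
    for h in hints:
        overlaps = h.start < sel_end and h.end > sel_start
        at_para_end = h.start == para_len and sel_end == para_len
        if overlaps or (include_para_end and at_para_end):
            continue
        kept.append(h)
    return kept


def attrs_at_caret(hints, pos):
    """Merge the attributes of every hint that touches the caret position."""
    merged = {}
    for h in hints:
        if h.start <= pos <= h.end:
            merged.update(h.attrs)
    return merged


fmt = {"size_pt": 24, "italic": True}
para_len = len("Heading")
hints = [Hint(0, para_len, dict(fmt)), Hint(para_len, para_len, dict(fmt))]

before = clear_direct_formatting(hints, 0, para_len, para_len)
after = clear_direct_formatting(hints, 0, para_len, para_len, include_para_end=True)
print(attrs_at_caret(before, para_len))  # formatting lingers at the end of the line
print(attrs_at_caret(after, para_len))   # nothing left once the end hint is cleared too
```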
Comment 8 Justin L 2023-05-03 13:09:06 UTC
Created attachment 187071 [details]
direct_formatted_body2.docx: another scenario

Same problem, different cause. (Same steps to reproduce)
Comment 9 Commit Notification 2023-05-04 08:35:21 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/0f4a3823a6bab01723f2a958d44159d39d137b97

tdf#148956 sw: clear RES_TXTATR_LIST_AUTOFMT in FN_FORMAT_RESET

It will be available in 7.6.0.

Comment 10 Justin L 2023-05-13 14:40:55 UTC
*** Bug 81575 has been marked as a duplicate of this bug. ***