Bug 117994 - Font size is too large in attached DOC in Word after roundtrip
Summary: Font size is too large in attached DOC in Word after roundtrip
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage (show other bugs)
Version:
(earliest affected)
5.2 all versions
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.5.0
Keywords: bibisected, bisected, regression
Depends on:
Blocks: DOC-Lists-Direct-Formatting 150613
  Show dependency treegraph
 
Reported: 2018-06-04 13:03 UTC by Aron Budea
Modified: 2022-09-06 01:21 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:
Regression By:


Attachments
Sample DOC (23.50 KB, application/msword)
2018-06-04 13:03 UTC, Aron Budea
Details
Before-after comparison screenshot (33.54 KB, image/png)
2018-06-04 13:04 UTC, Aron Budea
Details
tdf117994_fixAttempt.patch - a keep-changing-things-until-somehow-it-all-works patch (2.16 KB, patch)
2018-08-03 10:59 UTC, Justin L
Details
analyze117994.doc : bugs tested and the results - table list (16.50 KB, application/msword)
2018-08-04 16:41 UTC, Justin L
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Aron Budea 2018-06-04 13:03:20 UTC
Created attachment 142514 [details]
Sample DOC

Open and save attached DOC in Writer, then open it in Word again.

The font of the numbering, and one of the empty lines are too large, see screenshot of before-after comparison in Word. I think it is somehow related to the style settings.

The sample is the same as in bug 117988, only saved in DOC (in Word) this time.

Observed using LO 6.1 beta1 / Windows 7.
Comment 1 Aron Budea 2018-06-04 13:04:49 UTC
Created attachment 142515 [details]
Before-after comparison screenshot
Comment 2 Xisco Faulí 2018-06-07 11:21:12 UTC
Regression introduced by:

author	Luke Deller <luke@deller.id.au>	2016-04-27 02:09:48 +1000
committer	Miklos Vajna <vmiklos@collabora.co.uk>	2016-04-27 15:17:22 +0000
commit 817c3b17fa57a7c4c4f80569bc00a21001fb6807 (patch)
tree e0f3a1604313624c4688df2626be897f9bbff577
parent 99ea437f7570ab59f186583f368398919db67c3a (diff)
tdf#99474 close direct char fmt at end of para
When exporting to doc, ensure that the FKP entry for direct character
formatting is closed at the end of a paragraph, so that any direct
character formatting in the next paragraph does not apply to the
end-of-paragraph marker (CR).

Also revert the changes for i#119650 and tdf#87437 which targetted
more specific examples of this problem, as those issues should now be
covered by this fix.

Add a test case for the example from tdf#99474

Bisected with: bibisect-linux-64-5.2

Adding Cc: to Luke Deller
Comment 3 Justin L 2018-08-02 13:23:56 UTC
attachment 124593 [details] from the regression-causing bug 99474 is very interesting because it is a very similar document (numbering) and yet it requires the complete opposite of this bug in terms of OutputFKP(true/false).

From what I can see, "large.doc" wants NO FKP at all, while "testcase.odt" needs to have a forced FKP somewhere/anywhere. Best guess at the moment is because up until this point, there has been no character formatting at all until the red attribute?
Comment 4 Justin L 2018-08-03 10:59:52 UTC
Created attachment 143951 [details]
tdf117994_fixAttempt.patch - a keep-changing-things-until-somehow-it-all-works patch

So, Luke's idea that after the CR, you "force" the CR to be included in the previous character attributes seems to be right. Before, I think large.doc "just happened" to look OK, because the following sentence used the same formatting.

The reason Luke's patch worked only with his example is because he was working with an IsEmptySprm, which just got extended like he wanted. In this bug's case, there is an existing character format SPRM which triggers a new, empty, character format being created just for the CR.

So, my patch changes the logic of "force", so that it doesn't just ensure that a mark is made, but that it specifically extends the existing character format to include this position. (I hope that still works as intended for the other existing use of "force" in redlining.)

There really are two problems in this bug report.  The one is import related - since LO doesn't show the LARGE spacing that Word does. And unfortunately, bug 102334's unit test likely hits that same problem because ww8export complains about hidden text showing up after this patch, but when opening the file in Word, the text is still hidden.

So, first things first I guess, and that is to investigate the import bug. But anyone is welcome to check this export patch and tell me what regressions it will cause :-).
Comment 5 Justin L 2018-08-04 16:41:13 UTC
Created attachment 143965 [details]
analyze117994.doc : bugs tested and the results - table list

Other than one unit test (which looks better in Word, but worse in LO), the patch seems to be helpful, but not destructive. I'm planning to temporarily merge into master to see if any regressions are noticed, and then remove it again within a month.

patch in gerrit: https://gerrit.libreoffice.org/58594
Comment 6 Commit Notification 2018-08-13 04:08:53 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=091aedc63de2f6c8f0f4c60dd1fa93fe4c6ddde4

tdf#117994 ww8export: extend Chp over CR

It will be available in 6.2.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Justin L 2018-08-13 04:32:58 UTC
(In reply to Commit Notification from comment #6)
Temporary patch - this issue is not yet fixed because that patch will be reverted in a month.
Comment 8 Xisco Faulí 2018-08-29 16:15:06 UTC
(In reply to Justin L from comment #7)
> (In reply to Commit Notification from comment #6)
> Temporary patch - this issue is not yet fixed because that patch will be
> reverted in a month.

Hi Justin,
Using the office-interoperability-tools, I found some regressions introduced by 091aedc63de2f6c8f0f4c60dd1fa93fe4c6ddde4. What should I do with them? What is the reason behind pushing a patch that will be reverted one month later? IMHO, this shouldn't have been done unless there is a good reason to do so...
Comment 9 Justin L 2018-08-29 16:42:32 UTC
(In reply to Xisco Faulí from comment #8)
> I found some regressions introduced
> by 091aedc63de2f6c8f0f4c60dd1fa93fe4c6ddde4. What should I do with them?
Add a few examples to this document, especially if the problem manifests itself in a different way.

> What is the reason behind pushing a patch that will be reverted one month
> later? IMHO, this shouldn't have been done unless there is a good reason to
> do so...
This area is extremely dangerous to work in (as evidenced by Caolan ignoring an old regression that accumulated tons of varied examples), so although the fix seemed good on all the documents I tested, and passed existing unit tests, I don't trust it. Plus, once it is reverted, it will provide examples of things it fixed, but are now "broken" again.
Comment 10 Xisco Faulí 2018-08-29 17:11:32 UTC
(In reply to Justin L from comment #9)
> (In reply to Xisco Faulí from comment #8)
> > I found some regressions introduced
> > by 091aedc63de2f6c8f0f4c60dd1fa93fe4c6ddde4. What should I do with them?
> Add a few examples to this document, especially if the problem manifests
> itself in a different way.
> 
> > What is the reason behind pushing a patch that will be reverted one month
> > later? IMHO, this shouldn't have been done unless there is a good reason to
> > do so...
> This area is extremely dangerous to work in (as evidenced by Caolan ignoring
> an old regression that accumulated tons of varied examples), so although the
> fix seemed good on all the documents I tested, and passed existing unit
> tests, I don't trust it. Plus, once it is reverted, it will provide examples
> of things it fixed, but are now "broken" again.

Ok, fair enough.
What about not reverting it and fixing the regression instead? In any other case, it's the usual workflow.

So far I've found 2 different issues but I haven't checked all the results yet.I would prefer to report them separately, otherwise this bug would end up being a mess...
Comment 11 Commit Notification 2018-09-01 04:21:47 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=21f52dc70e0f74adc559375f560dff969b9498de

Revert "tdf#117994 ww8export: extend Chp over CR"

It will be available in 6.2.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Xisco Faulí 2018-09-01 09:32:22 UTC Comment hidden (obsolete)
Comment 13 Justin L 2018-09-01 13:08:59 UTC Comment hidden (obsolete)
Comment 14 Xisco Faulí 2018-09-04 18:53:08 UTC
(In reply to Justin L from comment #13)
> (In reply to Xisco Faulí from comment #12)
> > So, what's the plan now that the patch has been reverted ?
> Please attach a sample document for each of the different issues that you
> find.

Hi Justin,
Sorry for the delay,
I've identified two problems:

* Hidden fields are displayed on MSO Word after RT ( doc )
    - attachment 40533 [details] from bug 31884
    - https://bugzilla.abisource.com/attachment.cgi?id=4671
    - attachment 95012 [details] from bug 75559
    - https://bugzilla.abisource.com/attachment.cgi?id=2376

* Content is displayed as deleted by the tracking changes after RT ( doc )
    - attachment 137302 [details] from bug 103967
    - https://bugzilla.mozilla.org/attachment.cgi?id=413410
    - https://bugzilla.mozilla.org/attachment.cgi?id=521402

Hope it helps...
Regards
Comment 15 QA Administrators 2019-09-05 09:34:45 UTC Comment hidden (obsolete)
Comment 16 QA Administrators 2022-07-17 03:30:08 UTC Comment hidden (obsolete)
Comment 17 Commit Notification 2022-09-04 01:17:07 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/449317473740d98338a61aca1e4ffb8821681c8e

tdf#117994 doc: Revert "tdf#99474 close direct char fmt at end of para"

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 18 Justin L 2022-09-06 01:00:41 UTC
This reverted 5.1.4 commit 817c3b17fa57a7c4c4f80569bc00a21001fb6807

Another reason for reverting it is bug 150613
whose fix depends on getting this fixed first.

Making any export changes based on the formatting of numbering
is extremely dangerous, especially in this time period after LO 4.1
when a wonky patch made wild assumptions about what was best.
Well, that patch was reverted in LO 7.3 for bug 108518,
and apparently obsoletes the usefulness this patch provided...

This patch that I reverted caused some visible regressions
SEEN ONLY IN WORD which further confirms that we
still have lots of wrong assumptions in regards
to numbering formatting.

I'm happy to see it reverts without causing any unit test failures
even before I make any changes.

I confirmed that reverting the 4.1 patch with
7.3 commit 343d4d32f00053bd72cfe240125835fe25ce264f
was what allowed testTdf99474 to still pass.

Yeah - no backporting.