Bug 107035 - FILEOPEN Missing numbering in inserted caption when document is saved in DOCX format and reopened
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version (earliest affected): 4.0.0.3 release
Hardware: All, OS: All
Importance: medium normal
Assignee: Luke Deller
URL:
Whiteboard: interoperability target:6.1.0 target:...
Keywords: bibisected, bisected, filter:docx, regression
Duplicates: 115774
Depends on:
Blocks: DOCX
Reported: 2017-04-08 20:04 UTC by Aron Budea
Modified: 2018-03-17 07:32 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Screenshot (51.19 KB, image/png)
2017-04-08 20:06 UTC, Aron Budea
DOCX exhibiting the underlying issue (4.14 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2017-10-22 12:33 UTC, Luke Deller

Description Aron Budea 2017-04-08 20:04:55 UTC
- Add an image to an empty document.
- Add a caption to it via right click -> Insert Caption.... In Options..., make sure Caption order is set to Numbering first (the bug does not occur with Category first).
- Save as DOCX, then reopen.

=> Numbering is not shown. See attached screenshot, caption should start with a 1.

This is a regression, as the numbering appears after those steps in 5.2.0.4, but not in 5.3.0.3 / Windows 7.
Note that the frame is also wrong; that is due to bug 106132.
Comment 1 Aron Budea 2017-04-08 20:06:14 UTC
Created attachment 132407
Screenshot
Comment 2 Aron Budea 2017-04-08 20:07:55 UTC
Bibisecting points to the commit referenced below. Adding Cc: to Caolán McNamara, please take a look.

https://cgit.freedesktop.org/libreoffice/core/commit/?id=feedd45ba2dd308af2d3a1b2f64681b9467535b6
author		Caolán McNamara <caolanm@redhat.com>	2016-10-27 13:37:03 (GMT)
committer	Caolán McNamara <caolanm@redhat.com>	2016-10-27 13:37:03 (GMT)

"in msword the hard-break between image and caption has a width"
Comment 3 Luke Deller 2017-10-22 12:24:14 UTC
I think the commit identified in bibisection has merely triggered a pre-existing bug.

The underlying issue seems to be in the docx import: when there is direct character style formatting on the text immediately preceding a field, the field is wrongly given that formatting.

The commit identified in bibisection happens to set up the above situation: it adds character formatting to the line break which precedes the caption, to mark the line break as "hidden".  The caption begins with a field for the numbering.  So this gives us direct character formatting immediately preceding a field, which triggers the underlying bug described above.  The consequence is that the field for the numbering is also hidden.

I will attach a simpler test case of the underlying issue (not sure if that should be described in a separate bug item or just addressed here?)

Steps to reproduce:
1. Create a new Writer document
2. Enter some text on the first line, followed by a field such as page number ("Insert" menu -> "Page number") still on the same line
3. Highlight all the text up to but not including the field, right click on it to bring up the context menu and select "Character..."
4. Change some visibly obvious character style (e.g. set the font colour to be green)
5. Save as docx, close document, reopen docx file

Observed output: the character style set at step 4 is now applied to the field

Expected output: the character style set at step 4 should not be applied to the field
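
For illustration only, here is a minimal Python sketch of what the saved markup looks like in this situation and what an importer should conclude from it. The paragraph XML is hand-written (it is not taken from any attachment), the colour value and the helper name run_colors are made up for the example, and it assumes the field is written as a complex field with w:fldChar runs. The point it shows: the direct colour sits only on the text run, so the field result should come out with no direct colour.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written example paragraph (an assumption, not real exported output):
# a green text run followed by a PAGE field that has no direct formatting.
PARAGRAPH = f"""
<w:p xmlns:w="{W}">
  <w:r>
    <w:rPr><w:color w:val="00A933"/></w:rPr>
    <w:t xml:space="preserve">Some text </w:t>
  </w:r>
  <w:r><w:fldChar w:fldCharType="begin"/></w:r>
  <w:r><w:instrText xml:space="preserve"> PAGE </w:instrText></w:r>
  <w:r><w:fldChar w:fldCharType="separate"/></w:r>
  <w:r><w:t>1</w:t></w:r>
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p>
"""


def run_colors(paragraph_xml):
    """Return (kind, text, color) for every run that carries visible text.

    kind is "field" for runs between a field's begin and end markers,
    otherwise "text". color is None when the run has no direct w:color.
    """
    root = ET.fromstring(paragraph_xml)
    result = []
    in_field = False
    for run in root.findall("w:r", NS):
        marker = run.find("w:fldChar", NS)
        if marker is not None:
            # "begin" and "separate" mean we are inside the field; "end" leaves it.
            in_field = marker.get(f"{{{W}}}fldCharType") != "end"
            continue
        text_el = run.find("w:t", NS)
        if text_el is None:
            continue  # e.g. the w:instrText run, which has no displayed text
        color_el = run.find("w:rPr/w:color", NS)
        color = color_el.get(f"{{{W}}}val") if color_el is not None else None
        result.append(("field" if in_field else "text", text_el.text, color))
    return result


for kind, text, color in run_colors(PARAGRAPH):
    print(kind, repr(text), color)
```

With markup like this, the bug corresponds to the importer treating the field run as if it also carried the colour of the preceding text run.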
Comment 4 Luke Deller 2017-10-22 12:33:53 UTC
Created attachment 137201
DOCX exhibiting the underlying issue

Here is a docx file produced by LO master (6.0.0alpha) using the steps in comment 3.

The page number field should be shown as black (I have verified this by opening in Microsoft Word 2016), but LO shows it as green.
Comment 5 Aron Budea 2018-02-17 00:46:06 UTC
*** Bug 115774 has been marked as a duplicate of this bug. ***
Comment 6 Aron Budea 2018-02-17 03:18:41 UTC
(In reply to Luke Deller from comment #3)
> The underlying issue seems to be in the docx import, when there is direct
> character style formatting on the text immediately preceding a field, then
> the field is wrongly given that formatting.
Luke, excellent analysis! And it is an import bug, since the color is fine in Word (and the numbering is there in the original example when opened in Word).

This is also a regression, from the following range:
https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=c33019b36d613f951787ce9836e34d74bfbd6a1b..b67a51b40a4876f4bd97a2917103112006710b0c

The bug report is exactly about this issue, so let's keep it here.
Comment 7 Aron Budea 2018-02-17 04:18:51 UTC
Miklos, could it be one of your commits in that range?

I'm cautiously suspecting this one (just speculating):
https://cgit.freedesktop.org/libreoffice/core/commit/?id=232ad2f2588beff50cb5c1f3b689c581ba317583
author		Miklos Vajna <vmiklos@suse.cz>	2012-11-28 11:59:00 +0100
committer	Michael Stahl <mstahl@redhat.com>	2012-11-28 21:33:54 +0100

API CHANGE: add a "position" parameter to XParagraph/TextPortionAppend methods
Comment 8 Luke Deller 2018-02-17 11:13:39 UTC
Nice one Aron.  I did some testing to confirm that the issue was introduced in this change:

 - building from source at commit 232ad2f2588beff50cb5c1f3b689c581ba317583 I could reproduce the issue (using the steps in comment 3)
 - building at the preceding commit 85693bffad5c863e5cd4d4b3664856a9fec607d5 I could no longer reproduce the issue

Incidentally the old source at these commits did not build cleanly for me on Ubuntu 16.04 with gcc-5.4.  I got it going by switching to gcc-4.7, and making an adjustment to sd/source/ui/toolpanel/TaskPaneFocusManager.cxx (replace uintptr_t on line 35 with size_t).
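
The good/bad check on two adjacent commits above is the last step of a bisection. As a rough sketch of the general idea only (this is not the bibisect tooling; the function name first_bad, the placeholder commit labels, and the stand-in predicate are all hypothetical), a binary search over an ordered commit list looks like this in Python. It assumes the newest commit is bad and that the bug, once introduced, stays present.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    commits must be ordered oldest to newest. Assumes the last commit is
    bad and that every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the first bad commit is at mid or earlier
        else:
            lo = mid + 1    # the first bad commit is after mid
    return commits[lo]


# Placeholder history: only the two abbreviated hashes from this report are
# real; "older" and "newer" are made-up labels for surrounding commits.
history = ["older", "85693bf", "232ad2f", "newer"]

# Stand-in for "build this commit and follow the steps in comment 3".
known_bad = {"232ad2f", "newer"}

print(first_bad(history, lambda commit: commit in known_bad))
```

In practice the predicate is the expensive part, since each call means building (or fetching a prebuilt binary of) that commit and running the reproduction steps by hand.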
Comment 9 Aron Budea 2018-02-17 18:24:34 UTC
Thanks for the confirmation, let's (re)add keyword bisected, then.

Also, the magic words: adding Cc: to Miklos Vajna.
Comment 10 Commit Notification 2018-03-05 09:27:20 UTC
Luke Deller committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=18cbb8fe699131a234355e1d00fa917fede6ac46

tdf#107035 Fix field character style DOCX import

It will be available in 6.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 11 Aron Budea 2018-03-16 22:16:11 UTC
Verified fixed with 6.1 daily build (2537d6897ae516d3b4d50f0e2885dc24949841bf, 2018-03-16_02:34:17). Thanks, Luke!
Comment 12 Commit Notification 2018-03-17 07:32:00 UTC
Luke Deller committed a patch related to this issue.
It has been pushed to "libreoffice-6-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=afe0aa7fde7e4d4f9a928235e41953bf73e2ea6c&h=libreoffice-6-0

tdf#107035 Fix field character style DOCX import

It will be available in 6.0.4.
