Bug 156223 - FILEOPEN DOCX: Wrong automatic numbering for a heading after a page break
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 24.2.0.0 alpha0+
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Paragraph Character
Reported: 2023-07-10 11:29 UTC by Hossein
Modified: 2023-07-11 19:34 UTC (History)
0 users

See Also:
Crash report or crash signature:


Attachments
DOCX with 2 chapters (13.88 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-07-10 12:16 UTC, Hossein
Description Hossein 2023-07-10 11:29:22 UTC
Description:
In a Writer/Word document, sometimes a page is dedicated to the title of the new chapter. Then, after a page break, the title of the chapter is repeated. In this case, LibreOffice increases the chapter number, which is incorrect.

Steps to Reproduce:
1. Open the attachment in LibreOffice Writer
2. See pages 2 and 3
3. Open the attachment in MS Word
4. See pages 2 and 3
5. Compare the results

Actual Results:
On page 2, the heading number is incorrectly shown as 2. This problem affects later chapters: on page 3, the heading number for chapter 2 is incorrectly displayed as 3.

Side note: The paragraph is shown as LTR, which is wrong; it should be RTL. This is a separate bug.

Expected Results:
On page 2, the heading should be shown without a number. On page 3, the heading number for chapter 2 should be displayed as 2.

Reproducible: Always


User Profile Reset: No


Additional Info:
Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 9a5329a266bd74abc4794f1fcbae3db07582dbde
CPU threads: 20; OS: Windows 10.0 Build 22621; UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_DE); UI: en-US
Calc: CL threaded
Comment 1 Mike Kaganski 2023-07-10 12:02:39 UTC Comment hidden (obsolete)
Comment 2 Hossein 2023-07-10 12:16:45 UTC
Created attachment 188295
DOCX with 2 chapters

(In reply to Mike Kaganski from comment #1)
> (In reply to Hossein from comment #0)
> > 1. Open the attachment in LibreOffice Writer
> 
> An attachment is missing.
Sorry, this is the attachment.
Comment 3 Mike Kaganski 2023-07-10 12:35:04 UTC
So this is about the unsupported hard page break inside a paragraph.
The first *paragraph* of the document (having style "Heading1") has this structure:

  - Line break (w:br element)
  - "فصل "
  - "اول ("
  - "Chapter one"
  - ")"
  - "نگارش صحيح"
  - Page break (w:br element with "page" type)
  - "نگارش صحيح"
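
The structure listed above can be checked with a short script. This is a minimal sketch: the XML in SAMPLE is a hand-written approximation of that structure, not the contents of the attachment, and the names SAMPLE and describe_paragraphs are made up for illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written approximation of the first paragraph (an assumption,
# not copied from the attached DOCX).
SAMPLE = f"""
<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:pPr><w:pStyle w:val="Heading1"/></w:pPr>
      <w:r><w:br/></w:r>
      <w:r><w:t xml:space="preserve">فصل </w:t></w:r>
      <w:r><w:t>اول (</w:t></w:r>
      <w:r><w:t>Chapter one</w:t></w:r>
      <w:r><w:t>)</w:t></w:r>
      <w:r><w:t>نگارش صحيح</w:t></w:r>
      <w:r><w:br w:type="page"/></w:r>
      <w:r><w:t>نگارش صحيح</w:t></w:r>
    </w:p>
  </w:body>
</w:document>
"""


def describe_paragraphs(xml_text):
    """Return one dict per w:p with its style and its texts/breaks in order."""
    root = ET.fromstring(xml_text)
    result = []
    for p in root.iter(f"{{{W}}}p"):
        style = p.find("w:pPr/w:pStyle", NS)
        items = []
        for node in p.iter():
            if node.tag == f"{{{W}}}t":
                items.append(("text", node.text or ""))
            elif node.tag == f"{{{W}}}br":
                # A w:br without w:type is a plain line break (textWrapping).
                items.append(("break", node.get(f"{{{W}}}type", "textWrapping")))
        result.append({
            "style": style.get(f"{{{W}}}val") if style is not None else None,
            "items": items,
        })
    return result


paragraphs = describe_paragraphs(SAMPLE)
print(len(paragraphs), paragraphs[0]["style"])
for item in paragraphs[0]["items"]:
    print(item)
```

To run the same check against a real file, the XML can be read with zipfile.ZipFile(path).read("word/document.xml") and passed to describe_paragraphs.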

This is really a mess, and all of it produces an awful result in a ToC; it would indeed be better done with a field on the second page that references the heading. But for this issue, that is not important. The important thing is that there is only one paragraph in Word (which includes a page break, with two identical texts on both sides of the break), so there is only one numbering item there. In Writer, the hard page break splits the paragraph into two with the same properties (including the same level and the same list), which creates two headings and two numbers.
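
The effect on numbering can be illustrated with a toy model. This is only a sketch of the behaviour described here, not LibreOffice's import code; Paragraph, PAGE_BREAK, split_at_page_breaks and heading_numbers are hypothetical names, and the second chapter heading is assumed for the example.

```python
from dataclasses import dataclass, field

PAGE_BREAK = "<page-break>"  # sentinel standing in for w:br w:type="page"


@dataclass
class Paragraph:
    style: str
    runs: list = field(default_factory=list)


def split_at_page_breaks(par):
    """Split a paragraph at every page break; each part keeps the style."""
    parts, current = [], []
    for run in par.runs:
        if run == PAGE_BREAK:
            parts.append(Paragraph(par.style, current))
            current = []
        else:
            current.append(run)
    parts.append(Paragraph(par.style, current))
    return parts


def heading_numbers(paragraphs):
    """Give every Heading1 paragraph the next chapter number."""
    numbers, counter = [], 0
    for par in paragraphs:
        if par.style == "Heading1":
            counter += 1
            numbers.append(counter)
    return numbers


# Chapter 1 heading with an inline page break, then an assumed chapter 2 heading.
document = [
    Paragraph("Heading1", ["Chapter one title", PAGE_BREAK, "Chapter one title"]),
    Paragraph("Heading1", ["Chapter two title"]),
]

# One paragraph per heading: the page break stays inside the paragraph.
single_paragraph_numbers = heading_numbers(document)

# Splitting at the page break first: the extra part gets its own number.
split_document = [part for par in document for part in split_at_page_breaks(par)]
split_numbers = heading_numbers(split_document)

print(single_paragraph_numbers)
print(split_numbers)
```

In this model the split variant numbers three headings instead of two, which corresponds to the repeated title on page 2 getting number 2 and the real chapter 2 getting number 3.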

I bet there is already an issue for this.
Comment 4 Mike Kaganski 2023-07-10 13:00:01 UTC
Basically the same as bug 138139 (but there, the bad effect from split appeared on the first page: because the paragraph had bottom border, and the first page got the full empty paragraph with the border). That bug could have a workaround though, with a hackery analyzing if the first/last part is empty, and tweaking it somehow. This situation doesn't provide such luxury :)
Comment 5 Hossein 2023-07-10 13:31:23 UTC
(In reply to Mike Kaganski from comment #4)
> Basically the same as bug 138139 (but there, the bad effect from split
> appeared on the first page: because the paragraph had bottom border, and the
> first page got the full empty paragraph with the border). That bug could
> have a workaround though, with a hackery analyzing if the first/last part is
> empty, and tweaking it somehow. This situation doesn't provide such luxury :)
What do you suggest? Closing both, and adding a third one that covers the root cause? :-)
Comment 6 Justin L 2023-07-11 19:26:36 UTC
I suggest bug 138139 comment 6

The only way to "fix" it properly is to introduce a character-property page break into Writer - which would be a very stupid thing to do. This is a really stupid MSO feature that causes all kinds of odd situations.
Comment 7 Mike Kaganski 2023-07-11 19:34:06 UTC
(In reply to Justin L from comment #6)
> The only way to "fix" it properly is to introduce a character-property page
> break into Writer

I agree

> which would be a very stupid thing to do. This is a
> really stupid MSO feature that causes all kinds of odd situations.

I disagree ;)
Forcing a page break strictly between paragraphs is a very strange limitation. While it might be (and definitely is) very useful for paragraph styles (say, headings), when used outside of styles, it is just a "formatting" tool, similar to a line break. But a line break (expressing the author's intention to bypass the automatic line breaking algorithm and to break the line here) doesn't break the paragraph, while a page break does. That prevents e.g. useful grammar checking in the split paragraph and proper automatic paragraph formatting such as indents: one would need to edit the split part manually to remove the indent, etc.