Bug 156223 - FILEOPEN DOCX: Wrong automatic numbering for a heading after a page break
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 24.2.0.0 alpha0+ (earliest affected)
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Paragraph Character
Reported: 2023-07-10 11:29 UTC by Hossein
Modified: 2024-07-25 18:39 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
DOCX with 2 chapters (13.88 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-07-10 12:16 UTC, Hossein

Description Hossein 2023-07-10 11:29:22 UTC
Description:
In a Writer/Word document, sometimes a page is dedicated to the title of the new chapter. Then, after a page break, the title of the chapter is repeated. In this case, LibreOffice increases the chapter number, which is incorrect.

Steps to Reproduce:
1. Open the attachment in LibreOffice Writer
2. See pages 2 and 3
3. Open the attachment in MS Word
4. See pages 2 and 3
5. Compare the results

Actual Results:
On page 2, the heading number is incorrectly shown as 2. This problem affects later chapters: on page 3, the heading number for chapter 2 is incorrectly displayed as 3.

Side note: The paragraph is incorrectly shown as LTR; it should be RTL. This is a separate bug.

Expected Results:
On page 2, the heading should be shown without a number. On page 3, the heading number for chapter 2 should be displayed as 2.

Reproducible: Always


User Profile Reset: No


Additional Info:
Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 9a5329a266bd74abc4794f1fcbae3db07582dbde
CPU threads: 20; OS: Windows 10.0 Build 22621; UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_DE); UI: en-US
Calc: CL threaded
Comment 1 Mike Kaganski 2023-07-10 12:02:39 UTC Comment hidden (obsolete)
Comment 2 Hossein 2023-07-10 12:16:45 UTC
Created attachment 188295 [details]
DOCX with 2 chapters

(In reply to Mike Kaganski from comment #1)
> (In reply to Hossein from comment #0)
> > 1. Open the attachment in LibreOffice Writer
> 
> An attachment is missing.
Sorry, this is the attachment.
Comment 3 Mike Kaganski 2023-07-10 12:35:04 UTC
So this is about an unsupported hard page break inside a paragraph.
The first *paragraph* of the document (having style "Heading1") has this structure:

  - Line break (w:br element)
  - "فصل "
  - "اول ("
  - "Chapter one"
  - ")"
  - "نگارش صحيح"
  - Page break (w:br element with "page" type)
  - "نگارش صحيح"

This is really a mess, and it produces an awful result in a ToC; it would indeed be better done using a field on the second page that references the heading. But for this issue, that is not important. The important thing is that there is only one paragraph in Word (which includes a page break, and two identical texts on both sides of the break), so there is only one numbering item there; in Writer, the hard page break splits the paragraph into two with the same properties (including the same level and the same list), which creates two headings and two numbers.

I bet there is already an issue for this.
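
For illustration, the structure above can be checked mechanically. The following is a minimal sketch, assuming a synthetic `word/document.xml` fragment modeled on the list above (it is not the real attachment); the names `SAMPLE`, `split_at_page_breaks` and `paragraphs_with_inner_page_break` are hypothetical helpers made up for this example. It reports paragraphs (`w:p`) that contain a `w:br` with `w:type="page"`, together with the text on each side of the break.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Synthetic fragment modeled on the structure listed above (assumption:
# this is NOT the XML of the real attachment, only the same shape).
SAMPLE = """<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p>
      <w:pPr><w:pStyle w:val="Heading1"/></w:pPr>
      <w:r><w:br/></w:r>
      <w:r><w:t xml:space="preserve">فصل </w:t></w:r>
      <w:r><w:t xml:space="preserve">اول (</w:t></w:r>
      <w:r><w:t>Chapter one</w:t></w:r>
      <w:r><w:t>)</w:t></w:r>
      <w:r><w:t>نگارش صحيح</w:t></w:r>
      <w:r><w:br w:type="page"/></w:r>
      <w:r><w:t>نگارش صحيح</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def split_at_page_breaks(paragraph):
    """Return the text of one w:p as segments, split at each page-type w:br."""
    segments = [""]
    for node in paragraph.iter():
        if node.tag == f"{{{W_NS}}}t":
            segments[-1] += node.text or ""
        elif node.tag == f"{{{W_NS}}}br" and node.get(f"{{{W_NS}}}type") == "page":
            # A plain w:br (line break) has no w:type and is ignored here.
            segments.append("")
    return segments


def paragraphs_with_inner_page_break(root):
    """Return the segment lists of all paragraphs that contain a page break."""
    hits = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        segments = split_at_page_breaks(paragraph)
        if len(segments) > 1:
            hits.append(segments)
    return hits


root = ET.fromstring(SAMPLE)
hits = paragraphs_with_inner_page_break(root)
print(len(hits), "paragraph(s) contain an in-paragraph page break")
```

In this sketch the single `w:p` yields two non-empty segments, which corresponds to the one numbering item in Word versus the two paragraphs that result from the split in Writer.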
Comment 4 Mike Kaganski 2023-07-10 13:00:01 UTC
Basically the same as bug 138139 (but there, the bad effect from split appeared on the first page: because the paragraph had bottom border, and the first page got the full empty paragraph with the border). That bug could have a workaround though, with a hackery analyzing if the first/last part is empty, and tweaking it somehow. This situation doesn't provide such luxury :)
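
A toy model of the difference, as a hedged sketch only: this is not LibreOffice code. `PAGE_BREAK` is a stand-in marker, and `number_keeping_paragraph`, `number_after_split` and `has_empty_edge` are hypothetical names invented here. It shows why splitting at the break shifts every later number, and why an "is the first/last part empty" check would not help in this document.

```python
# Stand-in marker for a page break inside a paragraph (illustrative only).
PAGE_BREAK = "\f"


def number_keeping_paragraph(headings):
    """Word-like model: the break stays inside the paragraph, one number each."""
    return [(n, h.replace(PAGE_BREAK, " / ")) for n, h in enumerate(headings, start=1)]


def number_after_split(headings):
    """Writer-like model: each break starts a new paragraph with the same
    list properties, so every part gets its own number."""
    parts = [part for h in headings for part in h.split(PAGE_BREAK)]
    return list(enumerate(parts, start=1))


def has_empty_edge(heading):
    """True if the text before the first or after the last break is empty,
    i.e. the kind of case where a tweak-the-empty-part workaround could apply."""
    parts = heading.split(PAGE_BREAK)
    return len(parts) > 1 and (parts[0].strip() == "" or parts[-1].strip() == "")


headings = [
    "Chapter one title" + PAGE_BREAK + "Chapter one title",
    "Chapter two title",
]

kept = number_keeping_paragraph(headings)
split = number_after_split(headings)
print([n for n, _ in kept])
print([n for n, _ in split])
print(has_empty_edge(headings[0]))
```

With text on both sides of the break, neither part is empty, so a heuristic of that kind has nothing to act on.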
Comment 5 Hossein 2023-07-10 13:31:23 UTC
(In reply to Mike Kaganski from comment #4)
> Basically the same as bug 138139 (but there, the bad effect from split
> appeared on the first page: because the paragraph had bottom border, and the
> first page got the full empty paragraph with the border). That bug could
> have a workaround though, with a hackery analyzing if the first/last part is
> empty, and tweaking it somehow. This situation doesn't provide such luxury :)
What do you suggest? Closing both, and adding a third one that covers the root cause? :-)
Comment 6 Justin L 2023-07-11 19:26:36 UTC
I suggest bug 138139 comment 6

The only way to "fix" it properly is to introduce a character-property page break into Writer - which would be a very stupid thing to do. This is a really stupid MSO feature that causes all kinds of odd situations.
Comment 7 Mike Kaganski 2023-07-11 19:34:06 UTC
(In reply to Justin L from comment #6)
> The only way to "fix" it properly is to introduce a character-property page
> break into Writer

I agree

> which would be a very stupid thing to do. This is a
> really stupid MSO feature that causes all kinds of odd situations.

I disagree ;)
Forcing a page break strictly between paragraphs is a very strange limitation. While it might be (and definitely is) very useful for paragraph styles (say, headings), when used outside of styles, it is just a "formatting" tool, similar to line break. But a line break (expressing the author's intention to override the automatic line break algorithm, and to break the line here) doesn't break the paragraph, while a page break does (and so disallows e.g. having useful grammar checking in the split paragraph, proper automatic paragraph formatting like when using indents - one would need to edit the split part manually to remove the indent, etc.).
Comment 8 Dieter 2024-07-20 13:18:14 UTC
Mike, so you vote for a new feature?

So let's add design-team.
Comment 9 Heiko Tietze 2024-07-22 09:46:40 UTC
I cannot wrap my mind around "page break within a line". What intention should that be? The limitation makes sense to me.

In any case, the automatic numbering is not wrong => NAB.
Comment 10 Hossein 2024-07-22 10:27:37 UTC
(In reply to Heiko Tietze from comment #9)
> I cannot wrap my mind around "page break within a line". What intention
> should that be? The limitation makes sense to me.
> 
> In any case, the automatic numbering is not wrong => NAB.
You can open the same file in MS Office and see that it is correctly rendered there. This is at least a compatibility bug, but also an unwanted limitation, as Mike described in comment 7.

It has an important use case, as visible in the attachment: having an extra page for the beginning of the chapter, while keeping the numbering, the formatting, and the table of contents working just as they would without the extra page. If we want a clean TOC, we should have this issue fixed.
Comment 11 Mike Kaganski 2024-07-22 10:50:15 UTC
(In reply to Heiko Tietze from comment #9)

Are there legitimate reasons to use the already existing line break feature? I can answer: there are. People might want to override the *automatic* line breaking algorithm, even with all these hyphenation controls, smart justifications and so on. There are cases when the user needs to say: "you, Writer, think that this word fits into this line; but I tell you to move it to the next line within the same paragraph", because line breaks (both automatic and manual) are the norm within a paragraph.

The "page break within the paragraph" has 100% the same rationale. Automatic page breaks happen within paragraphs. And people might want to say: "you, Writer, think that this part of the paragraph still fits into this page; but I have my reasons to tell you to move this part of the paragraph to the next page, without breaking the paragraph".
Comment 12 Heiko Tietze 2024-07-23 12:23:28 UTC
The UI would be easy to do: the Insert Break has a Restart Location option, which could provide another item, Next Page. I could imagine that some users want this to be achievable via shift+ctrl+return, since shift/ctrl+return add a line break/page break (the shortcut is currently assigned to Column Break; maybe "enhance" this function?).

The remaining question is whether a character page break (CPB) can trigger a different page style. I don't see a good use case beyond "everything should be possible" (actually I would still limit page breaks to paragraphs; admittedly we need to solve the compatibility issue).

Might be a good idea to change the summary.