Bug 131728 - FILEOPEN DOCX Support style separators
Summary: FILEOPEN DOCX Support style separators
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 6.4.2.2 release
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium enhancement
Assignee: Not Assigned
URL: https://c-rex.net/projects/samples/oo...
Whiteboard: target:25.2.0 inReleaseNotes:25.2
Keywords: filter:docx
Depends on:
Blocks: DOCX-Paragraph
Reported: 2020-03-31 00:21 UTC by brucehatfield
Modified: 2025-01-28 01:13 UTC
CC List: 8 users

Attachments
Word document to be saved in ODT format with equal no. lines (17.85 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-06-19 03:55 UTC, brucehatfield
PDF exported from MSO 2013 (5.27 KB, application/pdf)
2020-06-19 11:33 UTC, Buovjaga
Frame-inserted paragraph approach destroys document (1.07 MB, image/png)
2024-10-19 21:15 UTC, Piotr Osada
Word format separator.docx (66.23 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2024-10-19 21:49 UTC, Piotr Osada

Description brucehatfield 2020-03-31 00:21:54 UTC
Description:
Unable to import Word lines containing a style separator. Each such line breaks into two in Writer, making paragraphs incompatible and their meaning incomprehensible.

Steps to Reproduce:
1. Type, say, 3 lines in Word, e.g. (Heading 1) Apple (style separator, Normal style) shall mean a piece of fruit
2. Repeat for, say, 3 lines
3. Open the saved Word document in Writer

Actual Results:
3 lines in Word DOCX format break up into 6 lines in ODF format, turning 3 paragraphs into 6 (as would happen in contracts etc.)

Expected Results:
ODF should return 3 lines, the same as DOCX


Reproducible: Always


User Profile Reset: No



Additional Info:
An equivalent to Word's same-line style separator is required, so that DOCX documents can be saved to ODF without sentences or paragraphs breaking apart.
Comment 1 Dieter 2020-03-31 12:51:00 UTC
Thank you for reporting the bug. It seems you're using an old version of LibreOffice. Could you please try to reproduce it with the latest version of LibreOffice from https://www.libreoffice.org/download/libreoffice-fresh/ ? I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' if the bug is still present in the latest version. Change to RESOLVED WORKSFORME, if the problem went away.
Comment 2 brucehatfield 2020-03-31 15:29:22 UTC
Tested on LO v6.4.3.1 (x64) and still unable to import Word lines containing a style separator. Each such line breaks into two in Writer, making paragraphs incompatible and their meaning incomprehensible.
Comment 3 Buovjaga 2020-06-18 18:15:25 UTC
Please attach an example document (coming from MS Word, not touched by LibreOffice).
Set to NEEDINFO.
Change back to UNCONFIRMED after you have provided the document.

Please also mention the version of MS Office you used.
Comment 4 brucehatfield 2020-06-19 03:55:19 UTC
Created attachment 162196 [details]
Word document to be saved in ODT format with equal no. lines

Attached is the original Word document that cannot be replicated in Writer.
What is needed is an equivalent of Word's style separator.
Article 1 is typical of legal documents, where paragraphs are normally 1-3 lines long for, say, 15 to 30 separate definitions. In contracts, the style separator lets lines be held together tightly. Further, the abbreviated definition headings are retained in PDF versions, making full use of the PDF bookmark index without cluttering each PDF bookmark with whole lines of text.
Article 2 is for usage reference (multiple style separators are used, but only the first entry in a given paragraph appears in the navigation pane).
Comment 5 Buovjaga 2020-06-19 11:33:21 UTC
Created attachment 162207 [details]
PDF exported from MSO 2013
Comment 6 Buovjaga 2020-06-19 11:44:01 UTC
Thanks for the reference.
I see you already reported this 2 years ago in bug 122159

Looks like this is a bit of an obscure feature as this article says MS doesn't include any help content for it: https://office-watch.com/2016/style-separators-in-word/
Comment 7 brucehatfield 2020-07-02 17:47:45 UTC
Word style separators are perhaps the key feature lacking in LibreOffice that prevents widespread law office/legislator usage of LO Writer. I translate patents and other documents for US Federal court (+ other jurisdictions) and first became aware of the use of style separators to reduce navigation pane content to the essentials (instructions from a judge's clerk). Basically, it means that only the essential part of a heading is indexed in the navigation pane/PDF bookmarks (converted from Word).
There is extensive discussion of style separators in the Microsoft community, demonstrating the important (not obscure) role of this function:
e.g. 
•Formatting issue with style separator when combined with equations and automatic numbering (Office 2013) Nov 19, 2015
https://answers.microsoft.com/en-us/msoffice/forum/all/formatting-issue-with-style-separator-when/e32e9e4e-5edd-4103-88a8-9cdf65a94458
•Updating TOC breaks style separators (Ctrl+Alt+Enter) Jun 17, 2020
https://answers.microsoft.com/en-us/msoffice/forum/all/updating-toc-breaks-style-separators-ctrlaltenter/a248dd2e-31e0-47de-8874-22bf1473849e
•Run-in sidehead in TOC Nov 10, 2013
https://answers.microsoft.com/en-us/msoffice/forum/all/run-in-sidehead-in-toc/0c1b77e3-4c3c-4324-8277-e8f4f51d37d7
• An example discussing the use of style separators in university thesis writing:
"Creating an Inline Heading", which mentions that the APA style defines a level-three heading as indented, in italics, on the same line as the first sentence of the paragraph.
https://wordribbon.tips.net/T012723_Creating_an_Inline_Heading.html
So the issue is VERY IMPORTANT, as it is a programming omission that renders Writer unusable for perhaps all complex document preparation.
If the task is too difficult, please advise, so we know whether this key feature can finally be included or not.
Comment 8 larrybradley 2022-01-23 21:00:21 UTC
This is an ongoing issue that has been reported in at least two separate bug reports, and the developers have responded once with "ways to accomplish this without developing an actual style separator" (my description) and, in this bug report, "Looks like this is a bit of an obscure feature as this article says MS doesn't include any help content for it."

In the case of both bug reports with which I am familiar, the users were trying to compose legal documents that have very specific formatting requirements. I have also had this requirement when I was composing contract documents, and I can assure the developers that the suggested workarounds (they hate that word) are not acceptable to the attorneys. That is one reason why we were forced to use MS Word: it has a built-in Style Separator. And I can assure the developers that even if this is an "obscure" feature for most users, it is not a trivial or obsolete feature, and it is not obscure for those whose work requires it. If it were trivial and unnecessary, MS would not have included it in Microsoft 365 Word, and they would by now have removed it from the desktop version of Word.

If the devs simply don't want to be bothered with something that is critical to some small (but significant) portion of users and potential users, just say so; however, the tone of this and other discussions on this topic seems to me to be a bit too incredulous and dismissive on the dev side. If there is no appetite for ever addressing this issue with the one acceptable solution, namely a built-in Style Separator that mimics the outcome of the MS Style Separator, please spare us any more effort making the request, providing the rationale, and sending example files.

Thanks for all that you do, and thanks for listening. LO is a great product, much better than MS Word in most respects, but not an option for those people who work with legal documents, unfortunately.
Comment 9 Regina Henschel 2024-09-10 22:29:19 UTC
This is similar to the request for "inline heading", see bug 46023, bug 48459, bug 153904 and bug 160087.
Comment 10 Commit Notification 2024-09-15 10:57:52 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/56588663a0fddc005c12afaa7d3f8874d036875f

tdf#131728 sw inline heading: fix DOCX paragraph layout interoperability

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 11 László Németh 2024-09-15 11:15:57 UTC
Fixed layout for the reported test document. New issues will be filed for the clean-up (e.g. for different paragraph margins of the heading and the following paragraph) and follow-up (adding DOCX export).

@brucehatfield@gmail.com and all: thanks for the bug report and feedback!
Comment 12 László Németh 2024-09-15 11:16:26 UTC
Commit description:

tdf#131728 sw inline heading: fix DOCX paragraph layout interoperability

Fix layout of paragraph – which contains two paragraph styles –
by importing OOXML style separator using a text frame.

Inline headings – where there is no paragraph break after
the heading, i.e. it's followed by the normal paragraph content –
are specified by w:specVanish in OOXML, i.e. a special paragraph with
a hidden paragraph mark. These headings were loaded as normal, separated
paragraphs, breaking the paragraph layout with their paragraph breaks.
Map inline headings to inline ODF text frames to keep paragraph layout.
The frame contains the original inline paragraph, still keeping
ODF ToC and PDF bookmark support.
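
As a rough illustration of the markup this commit message describes, the Python sketch below scans WordprocessingML for paragraphs whose paragraph mark carries w:specVanish and joins each such paragraph with the one that follows it, which is how the text is meant to appear on a single line. This is not LibreOffice's import code. The function names and the sample XML are assumptions made for the example, and the element path w:p/w:pPr/w:rPr/w:specVanish is assumed to be where the marker sits.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical sample: an inline heading ("Apple") followed by its body
# text, then one ordinary paragraph.
SAMPLE = f"""<w:document xmlns:w="{W_NS}"><w:body>
<w:p><w:pPr><w:pStyle w:val="Heading1"/>
<w:rPr><w:vanish/><w:specVanish/></w:rPr></w:pPr>
<w:r><w:t>Apple</w:t></w:r></w:p>
<w:p><w:r><w:t xml:space="preserve"> shall mean a piece of fruit</w:t></w:r></w:p>
<w:p><w:r><w:t>An ordinary paragraph</w:t></w:r></w:p>
</w:body></w:document>"""


def paragraph_text(p):
    """Concatenate the text of all w:t elements inside one w:p."""
    return "".join(t.text or "" for t in p.iterfind(".//w:t", NS))


def has_style_separator(p):
    """True if the paragraph mark of this w:p is flagged with w:specVanish."""
    return p.find("w:pPr/w:rPr/w:specVanish", NS) is not None


def visual_lines(document_xml):
    """Join each style-separator paragraph with the paragraph after it."""
    root = ET.fromstring(document_xml)
    lines, pending = [], ""
    for p in root.iterfind("w:body/w:p", NS):
        pending += paragraph_text(p)
        if not has_style_separator(p):
            # A real paragraph break ends the current visual line.
            lines.append(pending)
            pending = ""
    if pending:
        lines.append(pending)
    return lines
```

Calling `visual_lines(SAMPLE)` on the sample yields two visual lines from three w:p elements; an importer that ignores w:specVanish would instead show three separate paragraphs, which is the behavior originally reported.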
Comment 13 Piotr Osada 2024-10-19 21:15:07 UTC
Created attachment 197156 [details]
Frame-inserted paragraph approach destroys document

Placing the paragraph in a frame, when the paragraph style should change after it, is only a workaround for document viewing. The same or a similar concept as the style separator in MS Office Word should be implemented:

The CTRL+ALT+Enter function in Word creates a pilcrow with a dotted box around it.
https://answers.microsoft.com/en-us/msoffice/forum/all/formatting-issue-with-style-separator-when/e32e9e4e-5edd-4103-88a8-9cdf65a94458
https://answers.microsoft.com/en-us/msoffice/forum/all/ctrlaltenter-function-in-word-2010/3d3f2583-f56a-4aaf-8e38-267c5723eedc
https://answers.microsoft.com/en-us/msoffice/forum/all/updating-toc-breaks-style-separators-ctrlaltenter/a248dd2e-31e0-47de-8874-22bf1473849e

commit 56588663a0fddc005c12afaa7d3f8874d036875f
This commit is not yet a solution to the interoperability issue with the style separator. Inserting an improper "basic-paragraph" instead of an actual style separator would be less harmful than the present behavior, which destroys formatting after reopening.

This behavior should not be released, or a real style-separator should be implemented.



Used software

(1), (4):
Microsoft® Word for Microsoft 365 MSO (Version 2409 Build 16.0.18025.20160) 64-bit


(2), (5):
Version: 7.6.5.2 (X86_64) / LibreOffice Community
Build ID: 38d5f62f85355c192ef5f1dd47c5c0c0c6d6598b
CPU threads: 4; OS: Windows 10.0 Build 22621; UI render: Skia/Vulkan; VCL: win
Locale: pl-PL (pl_PL); UI: en-US
Calc: CL threaded

(3), (6):
Version: 25.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 224fae69b224d28a1664c48117e77265ed67a136
CPU threads: 4; OS: Windows 11 X86_64 (10.0 build 22621); UI render: Skia/Vulkan; VCL: win
Locale: pl-PL (pl_PL); UI: pl-PL
Calc: CL threaded
Comment 14 Piotr Osada 2024-10-19 21:49:03 UTC
Created attachment 197157 [details]
Word format separator.docx

Result of commit 56588663a0fddc005c12afaa7d3f8874d036875f

1) In LO save "Word format separator.docx" as DOCX.
2) Reopen.

Result:
→ An empty frame appears, as wide as the paragraph, and the text of the separated paragraph after this frame overlaps the preceding paragraph.
→ Each subsequent frame shifts the text further to the right (as if a tab had been inserted).
Comment 15 Piotr Osada 2024-10-19 21:51:27 UTC
(In reply to Piotr Osada from comment #14)
> Created attachment 197157 [details]
> Word format separator.docx
> 
> Result of commit 56588663a0fddc005c12afaa7d3f8874d036875f
> 
> 1) In LO save "Word format separator.docx" as DOCX.
> 2) Reopen.
> 
> Result:
> → Appears an empty frame with width of paragraph length, then after this
> frame text of a separated paragraph overlaps the preceding paragraph.
> → Each next occurrence of frame shifts text to the right (as if using a
> tabulator).

ODT does not show such "tabulated result".
Comment 16 Buovjaga 2024-10-20 04:58:05 UTC
(In reply to László Németh from comment #11)
> Fixed layout for the reported test document. New issues will be filed for
> the clean-up (e.g. for different paragraph margins of the heading and the
> following paragraph) and follow-up (adding DOCX export).
> 
> @brucehatfield@gmail.com and all: thanks for the bug report and feedback!

Piotr: please be more careful and read what László wrote.
Comment 17 Commit Notification 2024-10-23 16:26:39 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/d87cf67f8f3346a1e380383917a3a4552fd9248e

tdf#131728 sw inline heading: fix missing/broken DOCX export

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 18 László Németh 2024-10-23 16:53:56 UTC
Full commit description of the export fix:

tdf#131728 sw inline heading: fix missing/broken DOCX export

Fix layout interoperability during DOCX round-trip by grab-
bagging w:p/w:pPr/w:rPr/w:specVanish, i.e. the style separators.

Note: use FrameInteropGrabBag to select the text frames, which
are inline headings, exporting only their text content (a single
paragraph), and use also ParaInteropGrabBag to export w:specVanish.

Note: originally, specVanish was lost completely, converting inline
headings to normal paragraphs.
After commit 56588663a0fddc005c12afaa7d3f8874d036875f,
text frames (the workaround for inline heading/ToC/bookmark
support) were exported instead of plain paragraphs, which were
broken at least in LibreOffice.

Follow-up to commit 56588663a0fddc005c12afaa7d3f8874d036875f
"tdf#131728 sw inline heading: fix DOCX paragraph layout
interoperability".
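
One way to check the round-trip behavior described above from outside LibreOffice is to count the w:specVanish markers in a DOCX file before and after re-saving it. The Python sketch below is only an assumption about how such a check could look: `count_style_separators`, `survives_round_trip` and `make_docx` are hypothetical helper names, and `make_docx` builds a stripped-down archive containing only word/document.xml, which is enough for the counting code but is not a complete DOCX file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_style_separators(docx):
    """Count paragraphs whose paragraph mark carries w:specVanish.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    w = f"{{{W_NS}}}"
    return len(root.findall(f".//{w}p/{w}pPr/{w}rPr/{w}specVanish"))


def survives_round_trip(original, resaved):
    """True if both files contain the same number of style separators."""
    return count_style_separators(original) == count_style_separators(resaved)


def make_docx(separators):
    """Hypothetical stand-in for a real file: a zip with only document.xml."""
    marked = (
        "<w:p><w:pPr><w:rPr><w:vanish/><w:specVanish/></w:rPr></w:pPr>"
        "<w:r><w:t>Heading</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Body</w:t></w:r></w:p>"
    )
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        + marked * separators
        + "<w:p><w:r><w:t>Plain</w:t></w:r></w:p>"
        + "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer
```

With real files, the two arguments to `survives_round_trip` would be the path of the document saved by Word and the path of the copy re-saved from Writer; a lower count in the second file would suggest that separators were dropped on export.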
Comment 19 László Németh 2024-10-24 07:47:54 UTC
@Piotr Osada: the DOCX export is already fixed, but there are a few other problems yet: 1) the order of the PDF bookmarks doesn't follow the ToC order, 2) no easy access to create inline headings in Writer. These are the next topics to add to the recent implementation. Thanks for your feedback!
Comment 20 Commit Notification 2024-11-01 09:24:35 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/49765a9e7be41d4908729ff7d838755276b244cb

tdf#48459 tdf#131728 sw inline heading: new frame style: fix DOCX export

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 21 László Németh 2024-11-06 17:01:25 UTC
Note: see fixed PDF bookmark generation in Bug 95239 for the DOCX import.
Comment 22 László Németh 2024-11-13 09:44:51 UTC
Note: see fixed XHTML export in Bug 163874 for the DOCX import.
Comment 23 László Németh 2024-11-13 12:09:16 UTC
Note: see fixed HTML export in Bug 163873 for the DOCX import. (The previous fix was about the XSLT XHTML filter.)