Bug 137335 - FILEOPEN DOCX Whitespace should not define paragraph height (CR formatting)
Status: REOPENED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:25.2.0
Keywords:
Duplicates: 155268
Depends on:
Blocks: DOCX-Paragraph
Reported: 2020-10-08 12:27 UTC by NISZ LibreOffice Team
Modified: 2024-09-23 14:40 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Example file from Word (14.13 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-10-08 12:27 UTC, NISZ LibreOffice Team
Screenshot of the original document side by side in Word and Writer (204.06 KB, image/png)
2020-10-08 12:27 UTC, NISZ LibreOffice Team
Another example file with a single paragraph inside table cell (54.90 KB, application/msword)
2023-01-04 10:35 UTC, Gabor Kelemen (allotropia)
Screenshot of the other example in Word 2016 and Writer (109.70 KB, image/png)
2023-01-04 10:37 UTC, Gabor Kelemen (allotropia)

Description NISZ LibreOffice Team 2020-10-08 12:27:09 UTC
Created attachment 166186
Example file from Word

The attached DOCX file was minimized from attachment #166180 of bug #38575.
It has some paragraphs containing TAB or space characters whose font size differs from that of the paragraph-ending pilcrow.
In Word the pilcrow has a 5 pt character size and the TAB character 12 pt. In Word the paragraph height matches the pilcrow's font size, while in Writer it matches the TAB character's font size.
The same happens with a paragraph containing 12 pt spaces and an 8 pt pilcrow.
This does not happen if the paragraph is completely empty.
In the original document this causes the last empty paragraph of the first page to 

Steps to reproduce:
    1. Open attached document.
    2. Compare the empty paragraph heights in Word and Writer

Actual results:
The paragraph after “Herrn / Frau” is taller than in Word because it is matched to the 12 pt formatting of the TAB character.
The paragraph after Telefon-Nr is also taller because it is matched to the 12 pt formatting of the spaces.
The empty paragraphs before “Herrn / Frau” and after “wohnhaft” are the same height as in Word, similarly to the non-empty ones.

Expected results:
If a paragraph contains only whitespace characters, its height should match the font size of the ending pilcrow.
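
The expected rule can be written down as a small model. This is only a sketch of the behaviour described in this report, not LibreOffice or Word code: the function name and the (text, size) run representation are made up for illustration, font size is used as a stand-in for the real line height, and the handling of paragraphs with visible text (largest visible run wins, pilcrow ignored) is an assumption.

```python
def expected_paragraph_size_pt(runs, mark_size_pt):
    """Return the font size (pt) that should drive the paragraph height.

    runs: list of (text, font_size_pt) tuples, a hypothetical simplified
          view of the character runs in one paragraph.
    mark_size_pt: font size of the paragraph-ending pilcrow.
    """
    # Tabs and spaces are skipped: str.strip() returns "" for them.
    visible_sizes = [size for text, size in runs if text.strip()]
    if not visible_sizes:
        # Empty or whitespace-only paragraph: the pilcrow decides.
        return mark_size_pt
    # Assumption: with visible text, the largest visible run decides.
    return max(visible_sizes)


# The two problem paragraphs from the attached document:
tab_paragraph = expected_paragraph_size_pt([("\t", 12.0)], 5.0)
space_paragraph = expected_paragraph_size_pt([("   ", 12.0)], 8.0)
print(tab_paragraph, space_paragraph)
```

With this model the paragraph after “Herrn / Frau” resolves to 5 pt and the one after Telefon-Nr to 8 pt, which is what Word shows; Writer currently behaves as if the 12 pt whitespace run were visible text.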

LibreOffice details:
Version: 7.1.0.0.alpha0+ (x64)
Build ID: a883002d8e2fd77f80c43b7b2e6ac329d83d929d
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: default; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: CL

Also happens in:
Version: 6.0.0.3
Build ID: 64a0f66915f38c6217de274f0aa8e15618924765
CPU threads: 4; OS: Windows 6.3; UI render: default; 
Locale: hu-HU (hu_HU); Calc: CL

Version: 5.0.0.5
Build ID: 1b1a90865e348b492231e1c451437d7a15bb262b
Locale: hu-HU (hu_HU)

Version: 4.0.0.3 (Build ID: 7545bee9c2a0782548772a21bc84a9dcc583b89)

LibreOffice 3.5.0rc3 
Build ID: 7e68ba2-a744ebf-1f241b7-c506db1-7d53735
Comment 1 NISZ LibreOffice Team 2020-10-08 12:27:23 UTC
Created attachment 166187
Screenshot of the original document side by side in Word and Writer
Comment 2 Xisco Faulí 2020-10-12 17:49:19 UTC
Reproduced in

Version: 7.1.0.0.alpha0+
Build ID: a9976a958b2857e308c6598532151878615bfd9f
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 3 Justin L 2021-09-17 15:37:47 UTC
I'd say this is a duplicate of bug 127368.

It seems the same as bug 117988, where we have some proof that the compat flag IgnoreTabsAndBlanksForLineCalculation already handles whitespace not defining the paragraph height. So the issue here really is that the pilcrow settings aren't able to define the paragraph height in LO.
Comment 4 Gabor Kelemen (allotropia) 2023-01-04 10:35:24 UTC
Created attachment 184481
Another example file with a single paragraph inside table cell

This is another example from another customer.
Here the first part of the paragraph inside a table cell, FFF, is formatted as 12 pt up to the tab that precedes the second part, which is formatted as 5 pt.
The 12 pt tab defines the line height, unlike in Word.
This causes the text not to fit inside the cell in Writer.
Comment 5 Gabor Kelemen (allotropia) 2023-01-04 10:37:54 UTC
Created attachment 184482
Screenshot of the other example in Word 2016 and Writer

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 44355a90b3450111ad87ad4b6607a564e41d7b54
CPU threads: 14; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: de-DE (hu_HU); UI: en-US
Calc: threaded

Not RTF-specific: the same document saved as DOCX by Word looks the same in both.
Comment 6 Justin L 2023-05-25 11:22:07 UTC
*** Bug 155268 has been marked as a duplicate of this bug. ***
Comment 7 Gabor Kelemen (allotropia) 2024-07-22 15:37:02 UTC
Proposed patch in https://gerrit.libreoffice.org/c/core/+/168788
Comment 8 Commit Notification 2024-07-30 20:42:43 UTC
Oliver Specht committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/f806fc136b3410ec9a1e09320d100c78b33c867b

tdf#137335 calculate paragraph height in RTF/DOCX

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 9 Commit Notification 2024-09-09 05:13:42 UTC
Oliver Specht committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/b59df057122f5819140d844bd395cff95fbabfcc

tdf#137335 follow-up to f806fc136b3410ec9a1e09320d100c78b33c867b

It will be available in 25.2.0.

Comment 10 Gabor Kelemen (allotropia) 2024-09-17 13:12:32 UTC
Reopening, the commits above fixed a slightly different case. Original documents still look bad compared to Word. Status from internal tracker:


The mentioned bugdoc shows a different problem than our customer document.
This document has tabs plus text, where the text height determines the line height and the tab stop height is ignored.

The second document has paragraphs with 5 pt height that contain a 12 pt tab stop; the resulting line height is calculated from the paragraph height (5 pt). If normal text is added (12 pt in Word, 5 pt in Writer), the line height increases or decreases accordingly.
The current import creates 12 pt paragraphs with two portions: a 12 pt tab stop and an empty 5 pt text portion.
A different import is required, one that applies the paragraph run properties

<w:pPr>
  <w:rPr>
    <w:sz w:val="10"/>
  </w:rPr>
</w:pPr>

to the paragraph. The layout then also needs to change how it calculates the tab stop height.
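
For reference, w:sz values are given in half-points, so w:val="10" in the fragment above is the 5 pt pilcrow size. A minimal Python sketch for reading that value follows; it is not the LibreOffice import code, the helper name paragraph_mark_size_pt is made up for illustration, and the xmlns:w declaration is added to the fragment only so that it parses standalone.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, normally bound to the "w" prefix.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_mark_size_pt(p_pr_xml):
    """Return the paragraph mark font size in points, or None if unset.

    p_pr_xml: a <w:pPr> element as a string, with the w namespace declared.
    """
    p_pr = ET.fromstring(p_pr_xml)
    # The paragraph mark's run properties live in w:pPr/w:rPr.
    sz = p_pr.find(f"{{{W_NS}}}rPr/{{{W_NS}}}sz")
    if sz is None:
        return None
    # w:sz is expressed in half-points.
    return int(sz.get(f"{{{W_NS}}}val")) / 2


FRAGMENT = f"""<w:pPr xmlns:w="{W_NS}">
  <w:rPr>
    <w:sz w:val="10"/>
  </w:rPr>
</w:pPr>"""

print(paragraph_mark_size_pt(FRAGMENT))  # 5.0
```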