Bug 132447 - Words break according to <span> elements in exported PDF
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: Inherited From OOo (earliest affected)
Hardware: x86-64 (AMD64) All
Importance: medium minor
Assignee: Not Assigned
Depends on:
Blocks: Hyphenation
Reported: 2020-04-27 04:44 UTC by Jing Yuan Zhou
Modified: 2023-10-04 10:42 UTC

Which is I am editing (77.59 KB, application/vnd.oasis.opendocument.text)
2020-04-27 04:44 UTC, Jing Yuan Zhou
Shows the bugs in red (217.73 KB, application/pdf)
2020-04-27 04:48 UTC, Jing Yuan Zhou

Description Jing Yuan Zhou 2020-04-27 04:44:30 UTC
Created attachment 159982
Which is I am editing

I have attached 2 files. One is the ODT file that I am editing; the other is a PDF export that reproduces the bugs. The affected words are marked in red.
First bug: the word “toothed-track” (in red), at line 27 of page 5, is broken abnormally.
Second bug: the word “system” (in red), at line 13 of page 6, is broken abnormally.
Third bug: at line 14 of page 7 (in red), the PDF shows the formatting I want. But every time I save the ODT file and reopen it, line 14 of page 7 and everything after it moves to page 8 unexpectedly. The paragraph formatting can be edited, but the change is not saved.
Comment 1 Jing Yuan Zhou 2020-04-27 04:48:11 UTC
Created attachment 159983
Shows the bugs in red
Comment 2 Buovjaga 2020-08-29 11:00:39 UTC
I can reproduce the weird word breaking as far back as version 3.3.0.

If you unzip the file and look at the content.xml (it helps to use http://xmlbeautifier.com/ for beautifying it), you can see that the weirdly-breaking "toothed-track" is split into many span elements:

<text:p text:style-name="P30">
    <text:span text:style-name="T1">A toothed-roller array (3F) is that a cage (3L) restricts and synchronizes many toothed-rollers (3I). A toothed-roller has 1 bearing surface (8A) and many teeth (8B). Or rather, a toothed-roller (3I) is a roller, but with teeth (8B). While they are working, </text:span>
    <text:span text:style-name="T6">the synchronized toothed-rollers (3I) roll between the </text:span>
    <text:span text:style-name="T17">tooth</text:span>
    <text:span text:style-name="T18">ed</text:span>
    <text:span text:style-name="T17">-track</text:span>
    <text:span text:style-name="T6"> of the shell (3R) and the toothed-track of the piston (3Q), and mesh th</text:span>
    <text:span text:style-name="T1">eir teeth (8B, 3K, 3N)</text:span>
    <text:span text:style-name="T6">.</text:span>

I'm not sure whether this is normal or incorrect behaviour. Span elements are inline, so it seems strange that they would affect word breaking.
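
For anyone who wants to check other paragraphs without reading content.xml by hand, here is a minimal Python sketch that lists span boundaries falling inside a word. The function names (read_content_xml, find_split_words) and the "word character" heuristic (letters, digits, hyphen, apostrophe) are my own assumptions for illustration and are not part of LibreOffice. It only locates such boundaries; it does not prove that they cause the bad break.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF text namespace, as declared in content.xml.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
P_TAG = f"{{{TEXT_NS}}}p"
SPAN_TAG = f"{{{TEXT_NS}}}span"


def read_content_xml(odt_path):
    """Read content.xml out of an ODT file (an ODT is a zip archive)."""
    with zipfile.ZipFile(odt_path) as odt:
        return odt.read("content.xml")


def is_word_char(ch):
    # Assumed heuristic: what counts as "inside a word" for this check.
    return ch.isalnum() or ch in "-'"


def find_split_words(content_xml):
    """Return (left_fragment, right_fragment) for each pair of adjacent
    spans whose shared boundary falls inside a word."""
    root = ET.fromstring(content_xml)
    hits = []
    for para in root.iter(P_TAG):
        spans = [el for el in para if el.tag == SPAN_TAG]
        for a, b in zip(spans, spans[1:]):
            # Real text between the spans means they do not touch.
            # Whitespace-only tails are ignored so that beautified XML
            # can be used as input too.
            if (a.tail or "").strip():
                continue
            left = "".join(a.itertext())
            right = "".join(b.itertext())
            if left and right and is_word_char(left[-1]) and is_word_char(right[0]):
                hits.append((left.split()[-1], right.split()[0]))
    return hits


# Shortened version of the paragraph quoted above.
SAMPLE = (
    f'<text:p xmlns:text="{TEXT_NS}" text:style-name="P30">'
    '<text:span text:style-name="T6">roll between the </text:span>'
    '<text:span text:style-name="T17">tooth</text:span>'
    '<text:span text:style-name="T18">ed</text:span>'
    '<text:span text:style-name="T17">-track</text:span>'
    '<text:span text:style-name="T6"> of the shell</text:span>'
    '</text:p>'
)

if __name__ == "__main__":
    for left, right in find_split_words(SAMPLE):
        print(f"{left}|{right}")
```

To run it on a real document, pass the result of read_content_xml("yourfile.odt") to find_split_words; the file name here is a placeholder.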

One issue per report, so your "third bug" should be reported separately.
Comment 3 BogdanB 2023-01-24 14:28:48 UTC
Buovjaga, I tried replacing the old word "system" with a newly typed "system", and the problem remains. So it is not related to PDF export; it is related to hyphenation.