Bug 166148 - FILEOPEN DOCX: URLs break differently now than before / now than in MS Word
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 24.8.0.0 alpha0+ (earliest affected)
Hardware: All All
Importance: low minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected
Depends on:
Blocks: ICU Word-Line-Break
Reported: 2025-04-11 19:41 UTC by Justin L
Modified: 2025-04-12 14:41 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
moz1098664-1.docx: example document (14.38 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2025-04-11 19:41 UTC, Justin L
moz1098664-1.pdf: how it looks in MS Word 2010 (40.21 KB, application/pdf)
2025-04-11 19:59 UTC, Justin L
moz1445364-8.docx: similar differences with content like public keys (55.20 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2025-04-11 20:01 UTC, Justin L
moz1445364-8.docx_import-1.png overlay: RED=Word 2019, black = LO24.8.6 (147.98 KB, image/png)
2025-04-12 14:11 UTC, Justin L
tdf89506-1.docx_import-5.png: RED=Word 2019. first line in first paragraph should end with "(EL-" (94.41 KB, image/png)
2025-04-12 14:41 UTC, Justin L

Description Justin L 2025-04-11 19:41:24 UTC
Created attachment 200308
moz1098664-1.docx: example document

In this example, the "manifestURL": entry now fills 5 lines in the cell, whereas previously (and in MS Word 2010/2019) it filled only 4 lines.

This started in 24.8 with commit 44699b3de37f07090ac6fee1cd97aa76036e9700
Author: Jonathan Clark on Wed Apr 17 09:09:50 2024 -0600
    tdf#49885 BreakIterator rule upgrades

Steps to reproduce:
- Open moz1098664-1.docx

Notice that there is enough space to fit -82eb- on the line, but it gets bumped to the next line for some reason.
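
To illustrate how a single lost break opportunity can add a line, here is a toy sketch in Python. It is not LibreOffice's or ICU's code: the sample string, the 12-character width, and both rule functions are made up for illustration. Rule B is loosely modeled on the UAX #14 rule that keeps a hyphen together with a following digit; whether that rule is what causes this regression is an assumption, not something established in this report.

```python
def break_after_any_hyphen(text, i):
    """Toy rule A (hypothetical): a line may end after a space or after any hyphen."""
    return text[i] in " -"


def no_break_before_digit(text, i):
    """Toy rule B (hypothetical): like rule A, but never separate a hyphen from a following digit."""
    if text[i] == " ":
        return True
    return text[i] == "-" and not text[i + 1].isdigit()


def wrap(text, width, may_end_line_after):
    """Greedy wrap: end each line at the last allowed break that still fits in `width` characters."""
    lines = []
    start = 0
    while len(text) - start > width:
        limit = start + width  # exclusive end of the longest line that fits
        allowed = [p for p in range(start + 1, limit + 1)
                   if may_end_line_after(text, p - 1)]
        # No break opportunity in range: fall back to a hard break at the width limit.
        end = allowed[-1] if allowed else limit
        lines.append(text[start:end])
        start = end
    lines.append(text[start:])
    return lines


sample = "ab12-cd34-82eb-ef56"  # made-up identifier, not taken from the attachment
print(wrap(sample, 12, break_after_any_hyphen))  # ['ab12-cd34-', '82eb-ef56']
print(wrap(sample, 12, no_break_before_digit))   # ['ab12-', 'cd34-82eb-', 'ef56']
```

With these made-up inputs, rule A yields two lines and rule B yields three: under rule B, "cd34-" would still fit on the first line, but it cannot be separated from "82eb", so the whole group moves down.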

Found by Collabora's mso-test
Comment 1 Justin L 2025-04-11 19:59:33 UTC
Created attachment 200312
moz1098664-1.pdf: how it looks in MS Word 2010
Comment 2 Justin L 2025-04-11 20:01:25 UTC
Created attachment 200313
moz1445364-8.docx: similar differences with content like public keys
Comment 3 raal 2025-04-12 07:05:17 UTC
Confirmed with Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: dcd3427149c33852428b4198c22f6f858125c294
CPU threads: 4; OS: Linux 6.8; UI render: default; VCL: gtk3
Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
Calc: threaded
Comment 4 Justin L 2025-04-12 14:11:48 UTC
Created attachment 200323
moz1445364-8.docx_import-1.png overlay: RED=Word 2019, black = LO24.8.6

While of course I am not concerned about how "public keys" are formatted, this example might help in developing a "generic" fix. The "very red" parts of this image highlight where LO has departed from its previous layout, which was basically identical to Word's.
Comment 5 Justin L 2025-04-12 14:41:02 UTC
Created attachment 200324
tdf89506-1.docx_import-5.png: RED=Word 2019. first line in first paragraph should end with "(EL-"

On page 5 of bug 89506's attachment 113553 (Niinepuu magistritöö.docx), LO does not split "(EL-27)", but MS Word does (the line ends with "(EL-").