Bug 151985 - FILEOPEN: DOCX ZWNJ is lost when loading a file containing Arabic/Persian text
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 7.5.0.0 alpha0+
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: DOCX ZWNJ-ZWJ DOCX-RTL
Reported: 2022-11-09 22:45 UTC by Hossein
Modified: 2023-05-25 14:43 UTC
CC List: 0 users

See Also:
Crash report or crash signature:


Attachments
PDF output from MS Word 2019 (603.56 KB, application/pdf)
2022-11-09 22:45 UTC, Hossein
PDF output from LO 7.5 dev master (283.96 KB, application/pdf)
2022-11-09 22:46 UTC, Hossein

Description Hossein 2022-11-09 22:45:38 UTC
Created attachment 183511 [details]
PDF output from MS Word 2019

Description:
Open attachment 182846 from tdf#151364, which is a DOCX file containing many fields. After loading, ZWNJ (U+200C, zero-width non-joiner) is lost in many places in the document. The problem with the fields is reported in tdf#151983, but the ZWNJ loss is not limited to fields.

It should be noted that in some places, ZWNJ is actually preserved. For example, you can see both می‌شود (ZWNJ preserved) and میشود (ZWNJ is lost) in LibreOffice while in both cases the original text is می‌شود (with ZWNJ).
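To make the difference between the two spellings concrete, here is a minimal Python sketch that lists the code points of each form. It assumes the word is written with Persian yeh (U+06CC); the helper name `code_points` is only illustrative and is not part of LibreOffice.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def code_points(text: str) -> list[str]:
    """Return the code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Assumption: meem, Persian yeh, ZWNJ, sheen, waw, dal.
with_zwnj = "\u0645\u06cc\u200c\u0634\u0648\u062f"
# The lossy form seen after loading: same letters, ZWNJ dropped.
without_zwnj = with_zwnj.replace(ZWNJ, "")

print(code_points(with_zwnj))
print(code_points(without_zwnj))
```

The two strings differ only by the single U+200C code point, which is invisible by itself but controls whether the neighbouring letters join.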

Steps to Reproduce:
1- Open the DOCX attachment (attachment 182846 from tdf#151364) in LibreOffice
2- Compare the document with what MS Word displays (the PDF output from MS Word 2019 is attached)
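Before comparing the rendered output, it can help to confirm that the ZWNJ characters are really present in the source DOCX. The following is a rough sketch, not a LibreOffice tool: it treats the DOCX as a ZIP archive, parses each XML part with the Python standard library, and counts U+200C in the text nodes. The function name `count_zwnj_in_docx` is hypothetical, and the sketch assumes the text sits in element content rather than in attributes.

```python
import zipfile
import xml.etree.ElementTree as ET

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def count_zwnj_in_docx(path) -> dict[str, int]:
    """Count ZWNJ characters in the text nodes of each XML part of a DOCX.

    `path` may be a file name or a binary file-like object.
    Parts that contain no ZWNJ are left out of the result.
    """
    counts: dict[str, int] = {}
    with zipfile.ZipFile(path) as docx:
        for name in docx.namelist():
            if not name.endswith(".xml"):
                continue
            try:
                root = ET.fromstring(docx.read(name))
            except ET.ParseError:
                continue  # skip parts that are not well-formed XML
            # Parsing resolves character references such as &#x200C;.
            n = "".join(root.itertext()).count(ZWNJ)
            if n:
                counts[name] = n
    return counts
```

Running `count_zwnj_in_docx("input.docx")` on the original file (the file name here is a placeholder) gives a per-part count, for example for `word/document.xml`, that can be compared against what survives after loading in LibreOffice.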

Actual Results:
ZWNJ is lost in many places. The loss is most visible in the TOC, headings and bold text, but it also occurs in normal text.

Expected Results:
ZWNJ should be preserved when loading the DOCX file.

Reproducible: Always


User Profile Reset: No


Additional Info:

Version: 7.5.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: a0dec4bc9a48b263be182ad7bbe4ba3f8cbb27e1
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 1 Hossein 2022-11-09 22:46:45 UTC
Created attachment 183512 [details]
PDF output from LO 7.5 dev master

Created with the latest LO dev master

Version: 7.4.0.3 / LibreOffice Community
Build ID: f85e47c08ddd19c015c0114a68350214f7066f5a
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 2 افشین 2022-11-10 08:46:44 UTC
I can reproduce this bug in LO 7.4.2.3.

Version: 7.4.2.3 / LibreOffice Community
Build ID: 40(Build:3)
CPU threads: 2; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: fa-IR (en_US.UTF-8); UI: en-US
Ubuntu package version: 1:7.4.2~rc3-0ubuntu0.22.04.1~lo1
Calc: threaded