Bug 149089 - FILEOPEN: docx: extra mini space between Chinese characters when opening docx file
Status: ASSIGNED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.4.0.0 alpha0+
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.4.0 target:7.5.0 target:7.4....
Keywords:
Depends on:
Blocks: CJK DOCX-Opening Text-Grid
Reported: 2022-05-14 12:45 UTC by Zhang Qide
Modified: 2024-05-17 10:18 UTC (History)
CC list: 5 users

See Also:
Crash report or crash signature:


Attachments
screenshot1 (27.46 KB, image/png), 2022-05-14 12:45 UTC, Zhang Qide
screenshot2 (17.45 KB, image/png), 2022-05-14 12:46 UTC, Zhang Qide
test document (98.36 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document), 2022-05-14 12:47 UTC, Zhang Qide
pdf file export from ms office 2013 (272.83 KB, application/pdf), 2022-05-14 12:47 UTC, Zhang Qide
compare same doc with ms office 2013 side by side (124.10 KB, image/png), 2022-06-18 13:08 UTC, Zhang Qide
long doc (15.28 KB, image/png), 2022-06-18 13:17 UTC, Zhang Qide
another doc compare side by side (168.91 KB, image/png), 2022-06-18 13:40 UTC, Zhang Qide
before bug 148940 (77.00 KB, image/png), 2022-06-18 13:46 UTC, Zhang Qide
test document archive (588.91 KB, application/zip), 2022-07-12 01:49 UTC, Zhang Qide
example PDF (142.74 KB, application/pdf), 2023-11-08 05:48 UTC, Chris Sherlock
further test document (120.00 KB, application/msword), 2023-11-08 05:48 UTC, Chris Sherlock
First screenshot showing LO vs Word (127.99 KB, image/png), 2023-11-08 05:49 UTC, Chris Sherlock
Second screenshot showing LO vs Word (115.56 KB, image/png), 2023-11-08 05:50 UTC, Chris Sherlock
reduced test case (78.50 KB, application/msword), 2023-11-09 16:33 UTC, Chris Sherlock

Description Zhang Qide 2022-05-14 12:45:58 UTC
Created attachment 180114
screenshot1

After the fix for bug 148940, LibreOffice Writer shows extra mini spaces between Chinese characters when opening a docx file, which makes the text display incorrectly. See the attachment.


Version: 7.4.0.0.alpha1+ (x64) / LibreOffice Community
Build ID: 99e10099b5d63c30b9a960fc94fc438ae7ab63dd
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: CL
Comment 1 Zhang Qide 2022-05-14 12:46:42 UTC
Created attachment 180115
screenshot2
Comment 2 Zhang Qide 2022-05-14 12:47:07 UTC
Created attachment 180116
test document
Comment 3 Zhang Qide 2022-05-14 12:47:41 UTC
Created attachment 180117
pdf file export from ms office 2013
Comment 4 Zhang Qide 2022-05-14 12:49:59 UTC
The date alignment on the first page is also incorrect. See screenshot1.
Comment 5 raal 2022-05-27 18:15:04 UTC
(In reply to Zhang Qide from comment #0)
> 
> After fix bug 148940, LibreOffice Writer will get extra mini space between
> Chinese character that cause wrong text display when open docx file. see the
> attachment.
> 
Adding CC to Mark Hung
Comment 6 Mark Hung 2022-05-29 07:57:42 UTC
I confirm the case.
Comment 7 Commit Notification 2022-06-08 06:15:19 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/3e754c07fabd1f74d57f42f273ea46e03dbdc094

tdf#149089 fix extra mini space in text grid.

It will be available in 7.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 8 Zhang Qide 2022-06-09 13:17:01 UTC
The bug is almost the same as before.

Version: 7.4.0.0.alpha1+ (x64) / LibreOffice Community
Build ID: 66b1ebd4ddc7127a923bf81eb569e7f99dd52022
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: CL
Comment 9 Commit Notification 2022-06-16 12:40:28 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/47eff9bf12abf963907b4d3dcb90b73e0ccc646d

tdf#149089 snap to grid if IsSnapToChars() is false

It will be available in 7.5.0.

Comment 10 Commit Notification 2022-06-16 19:16:10 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/b345858e92351ceb997cf8e77024d7fe573a99c6

tdf#149089 snap to grid if IsSnapToChars() is false

It will be available in 7.4.0.0.beta2.

Comment 11 Zhang Qide 2022-06-18 09:34:17 UTC
The bug is almost the same as before.

Version: 7.5.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: f804b8d0b1fc0c215c8883c76344b2d256d5c003
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: CL
Comment 12 Mark Hung 2022-06-18 09:51:49 UTC
@Zhang Qide: Would you please take another screenshot (with the display grid on) and highlight where it is still problematic?
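
For anyone reproducing this, the text grid that a DOCX file declares can also be read straight from the file, independent of what is shown on screen. The following is a minimal sketch and not part of the LibreOffice code: it assumes the file is a regular DOCX (a zip archive containing word/document.xml) and collects the attributes of every `w:docGrid` element. The helper name `doc_grid_settings` is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def doc_grid_settings(docx):
    """Return one dict per w:docGrid element in word/document.xml.

    `docx` is a path or a binary file-like object (hypothetical helper).
    Keys are attribute names without the namespace, for example "type",
    "linePitch" or "charSpace"; values are the raw attribute strings.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    grids = []
    for grid in root.iter(f"{{{W}}}docGrid"):
        # Attribute keys look like "{namespace}type"; keep only the local name.
        grids.append({k.split("}")[-1]: v for k, v in grid.attrib.items()})
    return grids
```

Calling `doc_grid_settings("test.docx")` on a local copy of an attachment (the file name here is only an example) returns a list such as one dict per section. In the OOXML schema the `type` attribute can be `default`, `lines`, `linesAndChars`, or `snapToChars`; how each value maps onto Writer's grid modes is not covered by this sketch.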
Comment 13 Zhang Qide 2022-06-18 13:08:31 UTC
Created attachment 180820
compare same doc with ms office 2013 side by side

See the attachment.
Comment 14 Zhang Qide 2022-06-18 13:17:40 UTC
Created attachment 180821
long doc

To make things worse, when LibreOffice opens a docx file that is 51 pages long, it ends up with 4 more pages than MS Office because of the extra mini space.
Comment 15 Zhang Qide 2022-06-18 13:27:53 UTC
Maybe the right term is "kerning", or in Chinese "字符间距" (character spacing) or "间隙" (gap).
Comment 16 Zhang Qide 2022-06-18 13:32:08 UTC
The date alignment on the first page is correct after the commit.
Comment 17 Zhang Qide 2022-06-18 13:40:44 UTC
Created attachment 180822
another doc compare side by side
Comment 18 Zhang Qide 2022-06-18 13:46:42 UTC
Created attachment 180823
before bug 148940

The kerning between Chinese characters was more acceptable before the fix for bug 148940.
Comment 19 Commit Notification 2022-07-07 13:19:08 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/23f80b26098bcf9a8ae870e8ded878cca6e0c541

tdf#149089 fallback GridMode.

It will be available in 7.5.0.

Comment 20 Commit Notification 2022-07-07 14:59:54 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/ac0ab772d93bcf3197c1c6e2191cba74eb39718a

tdf#149089 fallback GridMode.

It will be available in 7.4.0.2.

Comment 21 Zhang Qide 2022-07-12 00:43:58 UTC
The original issue has been fixed, but I still see some similar problems. Please see the new attachment.



Version: 7.5.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 1f201d76d6e2fcc9d8af6504c38bd98c46e0798e
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: threaded
Comment 22 Zhang Qide 2022-07-12 01:49:21 UTC
Created attachment 181234
test document archive

See the zip archive.
Comment 23 Chris Sherlock 2023-11-08 05:48:18 UTC
Created attachment 190710
example PDF
Comment 24 Chris Sherlock 2023-11-08 05:48:54 UTC
Created attachment 190711
further test document
Comment 25 Chris Sherlock 2023-11-08 05:49:35 UTC
Created attachment 190712
First screenshot showing LO vs Word
Comment 26 Chris Sherlock 2023-11-08 05:50:05 UTC
Created attachment 190713
Second screenshot showing LO vs Word
Comment 27 Chris Sherlock 2023-11-08 05:51:28 UTC
Zhang Qide, it looks like the spacing between the characters is too much. Is this the case?
Comment 28 Chris Sherlock 2023-11-08 10:47:04 UTC
There is a definite problem here.

Paste in 

根据10.1(37BA) Eng TEST-TEST.doc,

You'll see the characters in brackets move on top of each other.
Comment 29 Chris Sherlock 2023-11-08 10:49:10 UTC
Please ignore my last comment.
Comment 30 Chris Sherlock 2023-11-08 15:05:15 UTC
I think I can't see the issue because I'm using macOS and I don't have the font Microsoft YaHei installed, so it is falling back to font substitution.

What OS are you using, Zhang?
Comment 31 Chris Sherlock 2023-11-09 16:21:08 UTC
OK, I just found a copy of the font on a Windows machine and installed it on macOS. There is a big gap at "keepalivd" and "," in that document.
Comment 32 Chris Sherlock 2023-11-09 16:33:09 UTC
Created attachment 190770
reduced test case

A reduced test case showing the document with three characters, and clearly showing that there is too much space after the comma. File has Arial embedded in the document.
Comment 33 Chris Sherlock 2023-11-09 16:48:20 UTC
So the fonts go like this:

First character is a "增" - font is Microsoft YaHei
Second character is an "e" - font is MS Arial
Third character is a "," BUT this character is in Microsoft YaHei.

This is actually working as intended! Change the comma to Arial and it fixes the issue.

I think we can actually mark this as closed, as the last issue is just the font being used.
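
A per-character font breakdown like the one above can be checked without opening the file in an office suite, at least for DOCX files. The following is a minimal sketch under stated assumptions: the document has been saved as DOCX (the reduced test case is attached as application/msword, so it would need converting first), and fonts are set directly on the runs. Fonts inherited from styles or the theme are not resolved. `run_fonts` is a hypothetical helper name.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def run_fonts(docx):
    """Return a list of (text, fonts) pairs, one per run (w:r).

    `docx` is a path or a binary file-like object. `fonts` maps the
    w:rFonts attribute names ("ascii", "hAnsi", "eastAsia", "cs", ...)
    to font names, and is empty when the run has no direct w:rFonts.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    result = []
    for run in root.iter(f"{{{W}}}r"):
        # Concatenate the text pieces that are direct children of the run.
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        rfonts = run.find("w:rPr/w:rFonts", NS)
        fonts = {}
        if rfonts is not None:
            # Attribute keys look like "{namespace}ascii"; keep the local name.
            fonts = {k.split("}")[-1]: v for k, v in rfonts.attrib.items()}
        result.append((text, fonts))
    return result
```

Printing the pairs for a document makes it easy to spot a punctuation mark whose run names a different font than its neighbours. Runs that come back with an empty dict take their font from the paragraph style or the document defaults, which this sketch does not look up.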