Description:
When importing Greek text from WordPerfect 5.x documents, the WP characters 8,38 and 8,39 produce the wrong Unicode characters. This problem does not occur when importing from WP 6.x and later.

The wrong characters are:

WP character 8,38, SIGMA terminal (i.e. upper-case terminal Sigma), which should be Unicode 0x03a3 (Greek Capital Letter Sigma). In WP 5 import, Unicode 0x03f9 is used instead; this is a different character, the Greek Capital Lunate Sigma, which is NOT a terminal Sigma. The correct character is present in the WP6 set.

WP character 8,39, sigma terminal, which should be Unicode 0x03c2 (Greek Small Letter Final Sigma). In WP 5 import, the wrong character is used: 0x03db, which is not a terminal sigma (Unicode names it Greek Small Letter Stigma). The correct character is imported from WP 6+ documents.

The error seems to derive from a mistaken commit in libwpd, as documented here:
https://sourceforge.net/p/libwpd/tickets/22/

It seems that the changes were made in order to accommodate a program called Printer Polyglott. But changes made to accommodate that obsolete program should not be carried through to LibreOffice. It seems that the developers of libwpd will not fix this error, so perhaps LibreOffice can fix it?

Steps to Reproduce:
1. Either use the WordPerfect 5.1 file CHARACTR.DOC, or create a file in WordPerfect 5.x containing WP characters 8,38 and 8,39.
2. Open the WP 5.x file in LibreOffice.

Actual Results:
The converted WP 5 document contains the wrong Unicode characters for WP characters 8,38 and 8,39.

Expected Results:
The correct characters appear (as they do when converting from WP 6.x+).

Reproducible: Always

User Profile Reset: No

Additional Info:
Version: 7.4.4.2 / LibreOffice Community
Build ID: 85569322deea74ec9134968a29af2df5663baa21
CPU threads: 8; OS: Mac OS X 13.1; UI render: default; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
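To make the two reported mismatches easier to check, here is a minimal Python sketch that prints the official Unicode names of the expected and the observed code points. The `REPORTED` table and the helper names are illustrative only; the values are copied from the description above, not from the libwpd sources.

```python
import unicodedata

# (WP character set, index) -> (expected code point, code point seen in WP 5 import).
# Values are taken from the report above, not from the libwpd mapping tables.
REPORTED = {
    (8, 38): (0x03A3, 0x03F9),
    (8, 39): (0x03C2, 0x03DB),
}


def describe(cp: int) -> str:
    """Return 'U+XXXX NAME' for a code point, using the Unicode character database."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


def report_lines() -> list:
    """Build one human-readable line per reported WP character."""
    lines = []
    for (charset, index), (expected, actual) in sorted(REPORTED.items()):
        lines.append(
            f"WP {charset},{index}: expected {describe(expected)}, "
            f"got {describe(actual)}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report_lines()))
```

Running it shows by name that neither observed code point is the character the report expects.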
David/Fridrich: do you think the patch https://sourceforge.net/p/libwpd/code/ci/0bacfbb3e035174308cb7dd87acfca320dda3912 can be reverted in libwpd or should we just add a patch on LO to revert it only in LO? (or perhaps you got another idea?)
There are three lines with changes in the original commit. I think the first and third lines need to be reverted, but I think the second line MAY correct a real error. I would have to take another look at the WP6 code to be certain.
Created attachment 187360 [details] WPDOS 5.1 file with Greek characters affected by issue
Created attachment 187361 [details] The GREEKWP5.WP file opened in WPWin and saved to DOCX format
Created attachment 187362 [details] GREEKWP5.WP opened in LibreOffice and saved as ODT
I'd like to revive this bug. I've attached three files:

GREEKWP5.WP - a WPDOS 5.1 document containing the four Greek characters relevant to this issue.

GREEKWP5fromWPWin.docx - the same WPDOS 5.1 file, opened in WordPerfect for Windows 2021 and saved from WPWin in DOCX format, showing the correct Unicode mappings of the characters.

GREEKWP5.WP.odt - the same WPDOS 5.1 file opened in LibreOffice and saved in ODT format, showing the three wrong character mappings in libwpd as used by LibreOffice.

The wrong mappings were introduced many years ago by someone who wanted to print WP files in obsolete software. That is no reason to continue using the wrong mappings today.
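One way to check a converted file such as GREEKWP5.WP.odt without opening it is to scan its `content.xml` for the suspect code points. The sketch below assumes the ODT stores its text as UTF-8 in `content.xml` and that the characters are written literally rather than as numeric character references. The `SUSPECT` set holds only the two code points named in the original description; the third wrong mapping is not identified in this thread, so it is not included. The function name is illustrative.

```python
import io
import zipfile

# Code points the original description identifies as wrong results of WP 5 import.
SUSPECT = {0x03F9, 0x03DB}


def count_suspects(odt_bytes: bytes) -> dict:
    """Count occurrences of each suspect code point in an ODT's content.xml."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as zf:
        text = zf.read("content.xml").decode("utf-8")
    counts = {}
    for ch in text:
        cp = ord(ch)
        if cp in SUSPECT:
            counts[cp] = counts.get(cp, 0) + 1
    return counts
```

For example, `count_suspects(open("GREEKWP5.WP.odt", "rb").read())` returns a dictionary keyed by code point. A non-empty result is only a hint, since either character could also appear legitimately in a document.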
(In reply to em36 from comment #6)
> The wrong mappings were introduced many years ago by someone who wanted to
> print WP files in obsolete software. That is no reason to continue using the
> wrong mappings today.

I reverted the whole change from 2010. Now, can you cross-check and tell me whether my commit changed too many things? I have no way to generate the documents now. If you find that my change was too zealous, please indicate which Unicode code point I should replace by which one.

The original commit made these changes:
- replace lunate small sigma with small stigma
- replace one occurrence of capital Sigma by lunate capital sigma
- replace one occurrence of capital Upsilon by small eta with tonos
- replace variant of small rho by rho with tonos
Thank you! I commented on this in the libwpd SourceForge site. The reversion makes three characters correct, but restores an error that was evidently fixed after the original bad commit. I've specified exactly which hex string to change in my comment on SourceForge. And thank you for this quick response!
Just to repeat what I wrote on SourceForge: the latest commit leaves one character incorrect. I've posted the details in libwpd on SourceForge.
Thanks to Fridrich, this is now fixed. I hope the fix can be incorporated in the LibreOffice code before long.
I've submitted this patch: https://gerrit.libreoffice.org/c/core/+/156768 which picks up Fridrich's commits concerning this part. No idea when libwpd 0.10.4 will be released, but let's not wait any longer here for just 2 changed lines.
This is already fixed in 7.6.0.3. No need to do anything more about it.
Created attachment 189459 [details]
screenshot with master sources

Here's the result I got with master sources updated today.
Created attachment 189460 [details]
screenshot with master sources + patch

Here's the same export with master sources + the patch.
I'm sorry - you're right and I was wrong. (I used the wrong test file.) The patch is needed. Apologies for wasting bandwidth!
(In reply to em36 from comment #15)
> I'm sorry - you're right and I was wrong. (I used the wrong test file.) The
> patch is needed. Apologies for wasting bandwidth!

No pb :-)
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5424bcd28f89b3622f85783633c725be643a0595

tdf#153034: Three wrong Greek characters in WordPerfect 5 import

It will be available in 24.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Cherry-pick on 7.6 waiting for review here: https://gerrit.libreoffice.org/c/core/+/156736
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-7-6":

https://git.libreoffice.org/core/commit/ffbbc643fdac9ef23387f59373437a06a669fea7

tdf#153034: Three wrong Greek characters in WordPerfect 5 import

It will be available in 7.6.2.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.