In LibreOffice Writer, when building an index in Romanian, I used a UTF-8 encoded concordance file; this encoding correctly preserves all letters with diacritics (ș, ț, â, î, ă, etc.). After the file is loaded into Writer, the characters are converted from UTF-8 to ASCII, so ș, ț, â, î, ă, etc. are replaced with ?. As a result, the index cannot be built automatically for Romanian (and probably for other languages whose letters carry diacritics); it can only be built manually, which is very time-consuming.
I noticed that this feature works correctly on Linux, but not on Windows (I use Windows 7 SP1).
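The symptom described above (diacritics turning into ?) can be reproduced outside LibreOffice. The following is only a minimal Python sketch of the general effect, not of Writer's actual code path: the helper name `lossy_roundtrip` is hypothetical, and cp1252 is assumed here as a typical Western European Windows ANSI code page.

```python
# Illustration only: shows how text that is valid in UTF-8 degrades when it is
# forced through a narrower encoding. This is NOT LibreOffice's implementation.

# The Romanian letters from the report: ș ț â î ă
ROMANIAN = "\u0219 \u021b \u00e2 \u00ee \u0103"


def lossy_roundtrip(text: str, codec: str) -> str:
    """Encode text with the given codec, replacing every character the codec
    cannot represent with '?', then decode the bytes back to a string."""
    return text.encode(codec, errors="replace").decode(codec)


# UTF-8 can represent every character, so nothing is lost.
print(lossy_roundtrip(ROMANIAN, "utf-8"))
# ASCII has no accented letters at all, so every letter becomes '?'.
print(lossy_roundtrip(ROMANIAN, "ascii"))
# cp1252 (assumed ANSI code page) contains â and î, but not ș, ț or ă.
print(lossy_roundtrip(ROMANIAN, "cp1252"))
```

Under these assumptions, only the UTF-8 round trip returns the original text; the other two lose some or all of the letters, which matches the altered characters reported here.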
Please attach an example concordance file and any other files needed to reproduce this. Set to NEEDINFO. Change back to UNCONFIRMED after you have provided the document(s).
Created attachment 134513 [details] Concordance UTF-8 file containing characters with diacritics Here it is
(In reply to George Acu from comment #3)
> Created attachment 134513 [details]
> Concordance UTF-8 file containing characters with diacritics
>
> Here it is

Ok, now what to do with this?
Created attachment 134514 [details] Concordance UTF-8 file containing characters with diacritics
Created attachment 134515 [details] Sample .fodt file that uses that .sdi attachment
Please ignore comment #3. I have attached a sample .fodt file that uses the second attached .sdi file (which contains terms with diacritics). The language used in the .fodt file is Romanian.
Open the .fodt document, go to the Alphabetical index on page 12, right-click it and choose Edit Index; the Concordance file setting is on the first tab. Open the file and you will see some altered characters (on Windows 7, LibreOffice Writer converts the concordance file from UTF-8 to ASCII).
Thanks. I found a duplicate. I tested on Linux and this seems to be Windows-only. Curiously, on Windows it produces absolutely nothing in the index: no entries. *** This bug has been marked as a duplicate of bug 81409 ***
Ah, sorry, I made a mistake; it is not a duplicate. Well, I guess this can be set to NEW, as I could not produce anything.

Version: 6.0.0.0.alpha0+ (x64)
Build ID: e0f67add2ec56706ce06a03572535266f21c0303
CPU threads: 4; OS: Windows 6.19; UI render: default;
TinderBox: Win-x86_64@42, Branch:master, Time: 2017-06-27_23:04:56
Locale: fi-FI (fi_FI); Calc: group

Version: 4.4.7.2
Build ID: f3153a8b245191196a4b6b9abd1d0da16eead600
Locale: fi_FI

Arch Linux 64-bit, KDE Plasma 5
Version: 6.0.0.0.alpha0+
Build ID: 9eed346b0b745f0598eefc572c789d58353b5e31
CPU threads: 8; OS: Linux 4.11; UI render: default; VCL: kde4;
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on July 5th 2017
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT
- Update the version field
- Reply via email (please reply directly on the bug tracker)
- Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install the oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword

Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug
Version: 6.4.0.3 (x64)
Build ID: b0a288ab3d2d4774cb44b62f04d5d28733ac6df8
CPU threads: 16; OS: Windows 10.0 Build 18363; UI render: GL; VCL: win;
Locale: it-IT (it_IT); UI-Language: it-IT
Calc: threaded

I confirm the existence of this bug under Windows 10: all diacritics are lost because Writer imports the index.sdi file as ANSI-encoded instead of UTF-8. Using the attached files results in the same experience.

Version: 6.1.5.2
Build ID: 1:6.1.5-3+deb10u5
CPU threads: 16; OS: Linux 4.4; UI render: default; VCL: x11;
Locale: en-US (en_US.UTF-8); Calc: group threaded

In the Debian Linux subsystem at the very same time (I opened the same .odm document twice, once in Windows and once in Linux), the index works flawlessly. Using the attached files results in the same experience.

Notepad++ confirms the UTF-8 encoding.

How to replicate:
1) Create a new Writer document
2) Type some words with diacritics (e.g. "Paweł", "Lukáš")
3) Insert an Analytical Index, select "use concordance file" - new file
4) Paste your words into the first column of the table
5) Confirm
5.1) Click on edit file; your diacritics are now mere "?"
6) Save and enjoy your empty index.
7) Open the very same file in any Linux distro
8) Copy your words with diacritics
8.1) Edit index - edit concordance file - paste your words again - save - OK
9) Enjoy your index.

Could it be a problem related to Calc? It is involved in the creation of the table.
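The encoding check done with Notepad++ above can also be scripted. What follows is a minimal Python sketch under stated assumptions, not part of LibreOffice: the function name `describe_encoding` is hypothetical, and the sample bytes simply stand in for the contents of a concordance file such as index.sdi.

```python
def describe_encoding(data: bytes) -> dict:
    """Report whether the bytes decode as UTF-8 and, if so, which
    non-ASCII characters they contain (sorted by code point)."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8 (e.g. saved in a legacy code page).
        return {"valid_utf8": False, "non_ascii": []}
    non_ascii = sorted({ch for ch in text if ord(ch) > 127})
    return {"valid_utf8": True, "non_ascii": non_ascii}


# Stand-in for a concordance file holding the two test words, saved as UTF-8.
sample = "Pawe\u0142\nLuk\u00e1\u0161\n".encode("utf-8")
print(describe_encoding(sample))

# The same first word with a single legacy byte for the "ł" is not valid UTF-8.
print(describe_encoding(b"Pawe\xb3\n"))
```

To check a real file, the bytes could be read with `pathlib.Path("index.sdi").read_bytes()` (the path is a placeholder). If the file is reported as valid UTF-8 with the expected letters and Writer still shows "?", the loss is happening on import rather than in the file itself.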
I'm still trying to update my index file, and the very same problem happens with the latest Writer 7.0.1.2 on Windows 10 2004 v19041.508.
Version: 7.1.0.3 (x64) / LibreOffice Community
Build ID: f6099ecf3d29644b5008cc8f48f42f4a40986e4c
CPU threads: 16; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: it-IT (it_IT); UI: it-IT
Calc: CL

I tried again with no luck.
*** Bug 125496 has been marked as a duplicate of this bug. ***
Andreas Heinisch committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/7e6e0fd63eac57de0f76ab1efdb1283c22ad6e6c tdf#108910, tdf#125496 - read/write index entries using utf8 It will be available in 7.4.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Unfortunately, the widget does not yet support UI testing, but the test files are in the previous patch sets.
Andreas Heinisch committed a patch related to this issue. It has been pushed to "libreoffice-7-3": https://git.libreoffice.org/core/commit/4dc4dfe0f249f454291a2d57e28f11342421bb00 tdf#108910, tdf#125496 - read/write index entries using utf8 It will be available in 7.3.1. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.