Description:
In the OOXML standard, bookmark names and field references (the value of <w:instrText> tags) are limited to a maximum of 40 characters. LibreOffice has an encode/decode mechanism for non-ASCII characters in bookmark names and field references, which expands each non-ASCII character into several characters, for example ő becomes %C5%91. If the truncation is applied to the encoded form, before decoding, each non-ASCII character counts as more than one character, so bookmark names and field references that contain non-ASCII characters can be truncated even though they are within the limit.

Steps to Reproduce:
1. Create some text.
2. Select a section of the text.
3. Click on the Insert menu and select Bookmark.
4. Give it a name that contains non-ASCII characters and is long enough (for example árvíztűrő tükörfúrógép, or 1é2á3ű4ő5ú6ö7ü8ó9í).
5. Go somewhere else in the document, for example to the end of the document, and create a new paragraph.
6. Click on the Insert menu and select Cross-reference.
7. Select the value "Bookmark" in the "Type" listbox, then select the value "Reference" in the "Insert reference to..." listbox.
8. In the "Selection" listbox, double-click the previously named bookmark.
9. Save the file as docx and reload it.
10. Rename the file to .zip instead of .docx, unzip it, and open document.xml in the word folder.
11. Look at these tags:
<w:bookmarkStart w:name="something" w:id="0"/>
<w:instrText> REF something \h </w:instrText>

Actual Results:
Some bookmark names that are not longer than 40 characters are truncated if they contain non-ASCII characters. For example:
1é2á3ű4ő5ú6ö7ü8ó9í as a bookmark name is truncated to 1é2á3ű4ő5ú6%C3%
and
árvíztűrő tükörfúrógép as a bookmark name is truncated to árvíztűrő_tük%C
The cross-references still work despite the truncation, so this is only a cosmetic problem.

Expected Results:
In MS Word, a bookmark name that contains non-ASCII characters and is at most 40 characters long is not truncated. We should emulate this behavior.

Reproducible: Always
User Profile Reset: No
Additional Info: See also: 113483
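The length arithmetic behind the report can be checked with a short Python sketch. It is only an illustration under stated assumptions, not LibreOffice's code or the committed patch: `urllib.parse.quote` stands in for the LibreOffice encoder, the space-to-underscore replacement is inferred from the reported output, and the function names `encode_name`, `truncate_after_encoding` and `truncate_before_encoding` are hypothetical.

```python
from urllib.parse import quote

MAX_LEN = 40  # limit described in the report


def encode_name(name: str) -> str:
    """Percent-encode non-ASCII characters (assumed stand-in for the LibreOffice encoder)."""
    # Assumption: spaces become underscores, as the reported output suggests.
    return quote(name.replace(" ", "_"), safe="")


def truncate_after_encoding(name: str) -> str:
    """Behaviour described in the report: the limit is applied to the encoded form."""
    return encode_name(name)[:MAX_LEN]


def truncate_before_encoding(name: str) -> str:
    """Hypothetical alternative: count the user-visible characters, then encode."""
    return encode_name(name[:MAX_LEN])


for name in ("1é2á3ű4ő5ú6ö7ü8ó9í", "árvíztűrő tükörfúrógép"):
    # Prints the visible length, the encoded length and the cut-off encoded name.
    print(len(name), len(encode_name(name)), truncate_after_encoding(name))
```

Each accented letter here takes two bytes in UTF-8 and therefore six characters once percent-encoded, so both example names exceed 40 characters in encoded form although they are well under the limit as typed. The cut then lands in the middle of an escape sequence, which is where the dangling %C3% and %C in the actual results come from.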
Created attachment 151419 Example file with a truncated bookmark name.
Created attachment 151420 Screenshot of a truncated bookmark name before saving and after saving and reloading.
Created attachment 151421 The same as example1, but with a different value.
Created attachment 151422 Screenshot of the unzipped document.xml file with the truncated bookmark name and field reference.
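The manual inspection of document.xml can also be scripted. The following Python sketch reads word/document.xml straight from the .docx file (a .docx is a zip container, so no renaming is needed) and lists the bookmark names and field instructions. The function name `bookmark_names_and_fields` is hypothetical, and the sketch assumes the bookmarks and fields of interest are in the main document part.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by the w: prefix in document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def bookmark_names_and_fields(docx):
    """Return (bookmark names, field instruction texts) found in word/document.xml.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    # w:name attribute of every <w:bookmarkStart> element.
    names = [el.get(f"{{{W_NS}}}name") for el in root.iter(f"{{{W_NS}}}bookmarkStart")]
    # Text of every <w:instrText> element, e.g. "REF something \h".
    fields = [(el.text or "").strip() for el in root.iter(f"{{{W_NS}}}instrText")]
    return names, fields
```

Calling it on a file exported from the steps above (the file name is up to the tester) should show the same truncated name in both lists if the bug is present.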
I confirm it's truncated. In my case, kghčdžšđćškghčdžšđćškghčdžšđćš was truncated to kghčdžšđćš.
Adam Kovacs committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/+/1cbf0ee54519bf81d934609352e8a1a641d8a534%5E%21 tdf#125298 DOCX export: fix bookmark name truncation It will be available in 6.3.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Tünde Tóth committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/+/d137a6944e42f5a59d6c318999edbf97d05cb9fd%5E%21 clean up "tdf#125298 DOCX export: fix bookmark name truncation" It will be available in 6.4.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.