Created attachment 82294 [details] Screenshot

If I have an East-Asian character in my (predominantly English) document, followed by a quotation mark (opening or closing), the quotation mark takes the font settings from the "Asian text font" section of the style definition. This results in very ugly copy.

Steps to reproduce:
1. Type some western text into LO Writer, surrounded by quotation marks (e.g. "sun").
2. Move the cursor to before the opening quotation mark, and type (or paste -- the IME is not relevant) an East-Asian character (e.g. 日).

Current behaviour: The initial quotation mark takes the settings from "Asian text font" instead of "Western text font". The behaviour is the same if a (normal-width, western) space comes between the East-Asian character and the opening quotation mark.

Expected behaviour: The opening quotation mark, being surrounded by a normal-width space on one side, and a Latin letter ("s" in this case) on the other, should take the "Western text font" settings.

The only workaround is to select the characters that have been rendered incorrectly and manually force the application of the "Western text font" settings. Of course, this breaks if "Clear Direct Formatting" is used.

It's not clear to me why typing an opening quotation mark immediately after an East-Asian character results in the insertion of Asian punctuation (e.g. 「 or 『). If I wanted Asian punctuation, I would, of course, type Asian punctuation. I don't know if this is connected.

ask.libreoffice.org link: http://ask.libreoffice.org/en/question/19750/problem-with-full-width-asian-punctuation/

May perhaps be linked to this bug: https://bugs.freedesktop.org/show_bug.cgi?id=60106

I'm currently using LO Version 4.0.4.2 (Build ID: 400m0(Build:2)) on Linux Mint 14 amd64, but the problem has been around as long as I can remember and on every platform I've tried.
Created attachment 85096 [details] test cases of English and Chinese quotes

I confirm this bug in LibreOffice 4.0.5.2 and 4.1.1.2. I ran some tests in the attached file; see the highlighted parts. Quotes are incorrect when they are in the first line or follow text in a different language. When "double quotes replacement" is disabled in the AutoCorrect options, everything is OK, so it's a replacement problem.
Today I tested attachment 85096 [details] in 4.3.0.1, and it seems to be getting worse. All opening quotes at the beginning of a paragraph are always shown as "half-width", regardless of whether the following characters are Western or Asian.
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present on a currently supported version of LibreOffice (4.4.1 or later): https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System.

Please DO NOT:
- Update the version field
- Reply via email (please reply directly on the bug tracker)
- Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) http://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to "inherited from OOo";
4b. If the bug was not present in 3.3 - add "regression" to keyword

Feel free to come ask questions or to say hello in our QA chat: http://webchat.freenode.net/?channels=libreoffice-qa

Thank you for your help!

-- The LibreOffice QA Team

This NEW Message was generated on: 2015-07-18
I can confirm this bug is still present:

Version: 4.4.4.3
Build ID: 40m0(Build:3)
Locale: en_GB.UTF-8

(LO from the "LibreOffice Fresh" PPA, on Linux Mint 17.2; package base == Trusty.)
I think using “East Asian text font” is more suitable.
(In reply to Volga from comment #6)
> I think using “East Asian text font” is more suitable.

@simon does this help?
Four years after the initial report, this bug still exists in LibreOffice 5.3.4 (running on Windows) with a mix of East Asian (CJK) and non-CJK fonts and text.
Created attachment 137355 [details] screenshot_including_complex_text_layout

This also happens with languages like Thai (complex text layout); please see the attached PNG file. It's actually quite annoying ...

--
Version: 6.0.0.0.alpha1+
Build ID: 81d50fd137fdf712a0f37988217c43278cf24c26
CPU threads: 4; OS: Linux 4.4; UI render: default; VCL: gtk2;
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2017-10-28_00:31:27
Locale: zh-TW (zh_TW.UTF-8); Calc: group
--
I confirm that this bug is still present in: Version: 6.1.3.2 Build ID: 86daf60bf00efa86ad547e59e09d6bb77c699acb CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk2; Locale: en-US (en_US.UTF-8); Calc: group threaded
(In reply to tommy27 from comment #7)
> (In reply to Volga from comment #6)
> > I think using “East Asian text font” is more suitable.
>
> @simon
> does this help?

Oh, I misunderstood; I just thought that would be a more appropriate name.
My suggestion is to first set the default language at the document level. For example, in content.xml we could save an attribute such as:

<office:document-content lang="zh-CN">

Alternatively, we could save the locale language in settings.xml. If LibreOffice recognizes that setting, it could render such punctuation with CJK fonts by default, unless another default language is set on a paragraph.
*** Bug 124657 has been marked as a duplicate of this bug. ***
*** Bug 126387 has been marked as a duplicate of this bug. ***
Does anyone have an idea for this?
The core issue of this bug, IMHO, is that curly double quotation marks (U+201C and U+201D) are widely used in both English and (simplified) Chinese, so LO has no way to know which style (western or Asian) it should apply to these quotation marks, and has to rely on context.

There are potentially more characters that cause such problems, the most obvious being single quotation marks. But I've also seen the middle dot (U+00B7) and em dash (U+2014) with similar problems. The quotation marks are especially visible because the current bug makes them asymmetrical, which causes considerable visual discomfort.

So the obvious brute-force solution is that instead of determining their style according to context, LO can just make sure the quotation marks consistently use the same style, either through some language/locale setting as comment 14 mentioned, or through a special setting that can be changed by the user. In other words, treat quotation marks differently from other characters.
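One way to see why these characters need context: Unicode's East Asian Width property classifies them as "A" (ambiguous), while a Latin letter is "Na" (narrow) and a Han ideograph is "W" (wide). The sketch below only queries that property through Python's standard `unicodedata` module. It is an illustration of the ambiguity, and it is an assumption that this property is relevant to how LibreOffice itself classifies characters; the helper name `east_asian_widths` is made up for this example.

```python
import unicodedata

# Characters mentioned in the comment above, plus one clearly Western letter
# and one clearly East Asian character for comparison.
SAMPLES = "\u201c\u201d\u2018\u2019\u00b7\u2014s\u65e5"


def east_asian_widths(chars):
    """Map 'U+XXXX' labels to the East Asian Width property of each character."""
    return {f"U+{ord(ch):04X}": unicodedata.east_asian_width(ch) for ch in chars}


widths = east_asian_widths(SAMPLES)
for label, width in widths.items():
    print(label, width)
```

The quotation marks, the middle dot and the em dash all report "A", so the code point alone does not say whether Western or Asian rendering is intended.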
*** Bug 101751 has been marked as a duplicate of this bug. ***
Mr. Khaled, what do you think?
(In reply to Volga from comment #20)
> Mr. Khaled, what do you think?

I checked MS Word, and it seems to treat the quotation marks as western text unless their language is set to Chinese, in which case it treats them as Asian text regardless of the context. This seems simpler and more reliable than what we currently do. I wonder if it does this for all punctuation characters?

It feels less smart, though. The smart, and more Unicode-compliant, way is to try to resolve common characters based on context like we do now, except that our implementation is buggy.

I’m not sure which is the better way, to be honest, as either option has compatibility considerations (either with older LO versions if we go the MS way, or both if we fix our current way). I’m not sure who should decide this.
*** Bug 134350 has been marked as a duplicate of this bug. ***
Someone raised a complaint about this a long time ago: https://yongweiwu.wordpress.com/2014/12/18/a-complaint-of-odfs-asian-language-support/

Although MS Word sets a good example here, I believe implementing smart assignment rules would be the better choice. That way LibreOffice could assign a font face to such punctuation so that it matches the most-used language/locale, without breaking the text style or file structure.
Created attachment 188133 [details] Screenshot on WordPad

Via the blog post in the last comment, I found this test file by the blog author: https://yongweiwu.files.wordpress.com/2014/12/odf_test.odt

Then I remembered WordPad, a word processor built into Windows, so let's see what happens in WordPad.
Created attachment 188134 [details] The same file opened with LibreOffice Writer

This screenshot was taken after opening the same file in LibreOffice Writer. Note that both apps were running under the zh-CN locale when I looked at them. So Khaled, what happens if you open this ODT in WordPad or MS Word?
Have you checked Windows WordPad so far?
Judging from commit d6efe8c302b81886706e18640148c51cf7883bbf, I think there is a way to fix this bug: assign the font face for such punctuation depending on the surrounding text.

For characters that could be affected by this bug, see:
https://www.w3.org/International/clreq/#tables_of_chinese_punctuation_marks
https://www.w3.org/International/jlreq/#cl-01
https://www.w3.org/International/klreq/#chars-grouping
See these related articles:

中西文混合排版中标点符号的渲染 (Rendering of punctuation in mixed Chinese-Western typesetting)
https://blog.1a23.com/2020/06/28/zhong-xi-wen-hunhe-paiban-zhong-biaodian-fuhao-de-xuanran/

中英混排中的标点符号问题 (Punctuation problems in mixed Chinese-English typesetting)
https://www.hutrua.com/blog/2018/07/22/punctuation.html
Unicode 16.0 added new definitions for four quotation marks encoded in the General Punctuation block. To my eyes, if they are followed by U+FE01, they should always be rendered with CJK fonts.

https://www.unicode.org/charts/PDF/Unicode-16.0/U160-2000.pdf
This bug is due to the greedy algorithm we use to assign script types to weakly-associated characters. It does not properly handle punctuation.

The current algorithm works something like this:

- First, any weak characters at the start of a paragraph are assigned to the same script as the first strong character in the paragraph.
- Then, the paragraph is scanned in reading order. Weak characters are assigned to the previously-seen script, with a few hard-coded exceptions (e.g. bug 112594).
- Finally, we run the Unicode bidi algorithm, and reassign all right-to-left text to the complex script type.

The last step hides the depth of the problem. The Unicode bidi algorithm accounts for nested punctuation, so the output seems correct-but-buggy for RTL languages (while not working at all for other language pairs).

In my opinion, we should replace the current algorithm with one that extends the RTL behavior to all languages. Existing RTL documents depend on the current behavior, and impacted CJK documents likely already include manual formatting to achieve the same effect, so this seems like the least-disruptive option.
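The first two steps listed above can be sketched roughly as follows. This is a simplified illustration, not LibreOffice's code: the `strong_script` classifier (based on Unicode character names) and the function name `assign_scripts_greedy` are assumptions made for this example, the hard-coded exceptions are left out, and the final bidi step is omitted entirely.

```python
import unicodedata

# Hypothetical, simplified classifier: a character is "strong" if its Unicode
# name marks it as Latin or as a common East Asian script; everything else
# (quotes, spaces, digits, ...) is treated as weak.
ASIAN_NAME_PREFIXES = ("CJK", "HIRAGANA", "KATAKANA", "HANGUL")


def strong_script(ch):
    """Return 'western', 'asian', or None for a weak character."""
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "western"
    if name.startswith(ASIAN_NAME_PREFIXES):
        return "asian"
    return None


def assign_scripts_greedy(paragraph, default="western"):
    """Assign a script to every character, greedily, in reading order."""
    strong = [strong_script(ch) for ch in paragraph]
    # Step 1: leading weak characters take the first strong script seen.
    current = next((s for s in strong if s is not None), default)
    result = []
    # Step 2: every later weak character inherits the previously seen script.
    for script in strong:
        if script is not None:
            current = script
        result.append(current)
    return result


# The reporter's example: an East Asian character before a quoted Western word.
scripts = assign_scripts_greedy("\u65e5\u201csun\u201d")
```

With this sketch, the opening quotation mark in 日“sun” inherits "asian" from the preceding character while the closing one inherits "western" from the letters, which reproduces the mismatched pair described in the original report.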
Jonathan Clark committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/537645c0834eab2d277113f1e3fcf039c994832d tdf#66791 sw: Treat weak punctuation as Asian in Asian paragraphs It will be available in 25.8.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Created attachment 198154 [details] Change illustration Screenshots comparing Word to LO, both with and without this patch. Blue-highlighted quotation marks have the Asian script type. Orange-highlighted quotation marks have the Complex script type.
While investigating this bug, I found that our script assignment implementation is broadly similar to (and shares many problems with) the algorithm used by Microsoft Word. The main difference is that Word treats certain punctuation characters as Asian script group when used in paragraphs containing CJ characters. Rather than risk compatibility, I applied a similar heuristic to our implementation. I also restructured the code so it will be easier to make changes in the future. This fix is narrow and sub-optimal. It's not possible to write an algorithm that perfectly assigns characters to script groups. The ideal solution is to let users specify language manually, and this is tracked by bug 151290.
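The heuristic described above might look roughly like the following sketch. It is not the code from the patch: the set of punctuation in `WEAK_PUNCTUATION`, the name-based `has_cj` test, and the reading of "CJ" as Chinese and Japanese characters are all assumptions made for illustration.

```python
import unicodedata

# Assumed, illustrative set of weak punctuation; the list used by the actual
# patch is not given in this thread.
WEAK_PUNCTUATION = frozenset("\u2018\u2019\u201c\u201d")

# Unicode name prefixes taken here to mean "CJ" (Chinese/Japanese) characters.
CJ_NAME_PREFIXES = ("CJK UNIFIED IDEOGRAPH", "HIRAGANA", "KATAKANA")


def has_cj(paragraph):
    """Return True if any character in the paragraph looks Chinese or Japanese."""
    return any(
        unicodedata.name(ch, "").startswith(CJ_NAME_PREFIXES) for ch in paragraph
    )


def punctuation_scripts(paragraph):
    """Map the index of each weak punctuation mark to 'asian' or 'western'."""
    script = "asian" if has_cj(paragraph) else "western"
    return {i: script for i, ch in enumerate(paragraph) if ch in WEAK_PUNCTUATION}


mixed = punctuation_scripts("\u65e5 \u201csun\u201d")
plain = punctuation_scripts("\u201csun\u201d")
```

Because the decision is made once per paragraph in this sketch, both marks of a quotation pair always receive the same script type, at the cost of ignoring the immediate neighbours of each mark.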
Jonathan Clark committed a patch related to this issue. It has been pushed to "libreoffice-25-2": https://git.libreoffice.org/core/commit/73a96633d672f344f6415f050405b19174031f37 tdf#66791 sw: Treat weak punctuation as Asian in Asian paragraphs It will be available in 25.2.0.2. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Created attachment 198343 [details] Screenshot after the last commit

The problem still occurs if a line starts with “ (U+201C, LEFT DOUBLE QUOTATION MARK).

Version: 25.2.0.1.0+ (X86_64) / LibreOffice Community
Build ID: 16b35a9ea05c9a1a566baf502236b45cfd628d11
CPU threads: 4; OS: Windows 10 X86_64 (10.0 build 19045); UI render: default; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: threaded

Test file: comment 23
Jonathan Clark committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/9b505f583954c88ce7b72a07c9bfd65d78d863ef tdf#66791 sw: Apply first-seen script type to leading weak characters It will be available in 25.8.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
(In reply to Volga from comment #37)
> The problem still occurs if a line starts with “ (U+201C, LEFT
> DOUBLE QUOTATION MARK).

This was intentional, to avoid regressing an earlier fix (#94331#). However, while looking up that bug number for this post, I found a code comment deleted 24 years ago explaining that the current behavior was meant to be temporary.
Jonathan Clark committed a patch related to this issue. It has been pushed to "libreoffice-25-2": https://git.libreoffice.org/core/commit/e4b74e8bc282d0fd396265ec893491b0bebe576d tdf#66791 sw: Apply first-seen script type to leading weak characters It will be available in 25.2.0.2. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Created attachment 198496 [details] Another screenshot after the last commit

I found another problem after the last commit. In this case, Chinese quotation marks are used inside English brackets.

Version: 25.2.0.1.0+ (X86_64) / LibreOffice Community
Build ID: 5acb7648c3eff7371385df442a627768762a7aa6
CPU threads: 4; OS: Windows 10 X86_64 (10.0 build 19045); UI render: default; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: threaded

Test file: https://bz.apache.org/ooo/attachment.cgi?id=81108 (also from comment 23)
I think additional rules are needed for text within brackets.
Resetting to fixed.

(In reply to Volga from comment #41)
> I found another problem after the last commit. In this case, Chinese
> quotation marks are used inside English brackets.

Correctness in script assignment is subjective. No algorithm, or even human editor, can perfectly reconstruct authorial intent from raw text.

The current algorithm guarantees matching pairs of quotation marks have the same font, which was the most distracting part of this bug. It also makes LibreOffice behave more like other office suites (which don't handle the parenthesized text case shown in this screenshot, either). In my opinion, the current state is good enough to consider this bug fixed.

Instead of adding more complex language processing, we should fix bug 151290. This would give users/documents more control. We could also use that feature to handle cases like this example at proofing time, and not risk changing the behavior of existing documents from version-to-version.
I think there should be special exceptions for variation selectors: when these quotation marks are followed by VS01, they should always be rendered with the Western text font; when they are followed by VS02, they should always be rendered with the Asian text font.

See: https://www.unicode.org/charts/PDF/Unicode-16.0/U160-2000.pdf
Created attachment 198523 [details] Test cases for variation selectors

These test cases accompany the previous comment.
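The proposed variation-selector exception could be sketched like this. It is only an illustration of the suggestion, not existing LibreOffice behavior; it assumes that "VS01" and "VS02" mean VARIATION SELECTOR-1 (U+FE00) and VARIATION SELECTOR-2 (U+FE01), and the function name `forced_script` is hypothetical.

```python
VS1 = "\ufe00"  # VARIATION SELECTOR-1, assumed to be "VS01" in the comment
VS2 = "\ufe01"  # VARIATION SELECTOR-2, assumed to be "VS02" in the comment

# The four curly quotation marks from the General Punctuation block.
QUOTES = frozenset("\u2018\u2019\u201c\u201d")


def forced_script(text, index):
    """Return 'western' or 'asian' if text[index] is a quotation mark that is
    immediately followed by VS1 or VS2; return None otherwise, meaning the
    usual context-based assignment would apply."""
    if text[index] not in QUOTES or index + 1 >= len(text):
        return None
    return {VS1: "western", VS2: "asian"}.get(text[index + 1])
```

A quotation mark without a following selector returns None here, so the exception would only override the context-based result when the author has marked the character explicitly.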
(In reply to Volga from comment #44)
> I think there should be special exceptions for variation selectors

That would be better discussed in a new, separate bug report.

I feel it is very impolite to repeatedly reopen a bug that you didn't report yourself when the developer considers it fixed and has marked it so.
OK, I'm sorry, so let's go to bug 164700.