| Summary: | EDITING / FORMATTING Bad management of em dash in Spanish language texts | ||
|---|---|---|---|
| Product: | LibreOffice | Reporter: | José Moya <josemoya> |
| Component: | Writer | Assignee: | Not Assigned <libreoffice-bugs> |
| Status: | NEW --- | ||
| Severity: | enhancement | CC: | buzea.bogdan, erack, heiko.tietze, sophi, vsfoote, xiscofauli |
| Priority: | medium | Keywords: | needsDevAdvice |
| Version: | Inherited From OOo | ||
| Hardware: | All | ||
| OS: | All | ||
| See Also: | https://bz.apache.org/ooo/show_bug.cgi?id=122337 | ||
| Whiteboard: | |||
| Crash report or crash signature: | | Regression By: | |
| Bug Depends on: | |||
| Bug Blocks: | 103341 | ||
|
Description
José Moya
2019-11-24 15:21:22 UTC
Reading Unicode UAX#14 [1], this seems to be covered by the B2 class and the LB17 rule for handling U+2014 EM DASH, although it is not clear that the matter is settled for all Unicode participants, and ICU-8061 [2] remains unresolved. At some point a fix should make it into the ICU libs and become available for possible use in the LibreOffice edit engine(s). Until then it remains incumbent on users to mind their formatting.

As suggested in the AOO issue listed under See Also (i122337), while writing dialogue one could set the AutoCorrect localized options for opening and closing 'Double Quotes' to U+2014 by picking the EM DASH character. This is done from Tools -> AutoCorrect -> AutoCorrect Options -> Localized Options.

=-ref-=
[1] http://www.unicode.org/reports/tr14/tr14-40.html
[2] https://unicode-org.atlassian.net/browse/ICU-8061?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel

It is not clear whether the dev effort needed to control line break handling of U+2014 EM DASH for es-XX locales should be done outside the provisions of the ICU lib handling. During editing, inserting a ZWNBSP (U+FEFF) before or after the EM DASH is trivial. Annoying to have to do it, but easily done. Worth keeping open on the back burner? Or is it best to issue a WONTFIX until Unicode and ICU can resolve it?

If resolved, it has to be a NOB. But I would keep it open. It would be interesting to know whether the line break depends on OS and program; e.g., I wonder how simple text editors deal with it.

Hi! I have reported this to the Unicode Consortium. I can't believe the Spanish companies and institutions represented at the Unicode Consortium did not report this, but, hey, they are Spanish, so they are not supposed to do their job.

I want to clarify that I am the same "Arcalaus" who wrote about quotes at https://bz.apache.org/ooo/show_bug.cgi?id=122337. But there is a misunderstanding. I am not talking about using Autoquotes to change quotes into em dashes. That would not solve the situation. I am talking about what, in Unicode jargon, would be "removing em dash from B2 and assigning it to QU if the language setting is Spanish".

Also, please notice that inserting a ZWNBSP before every em dash would be a pain in the *ss. Just imagine a 50k word novel with 10000 lines of dialogue, every one featuring between one and three em dashes. Inserting the ZWNBSP by hand is difficult, and programming a macro to do it is far more difficult (I can program an MS Word macro in minutes, but an OOo / LibreOffice macro takes me weeks).

Finally, there is the question of compatibility between platforms. I am writing to you because at my new job they use LibreOffice. I have Word at home. A literature exam prepared at home, with the em dashes in the right places, turns into a piece of crap (em dashes put in places where they should not go) when I print it at my workplace. Yes, I could use PDF, but I am against PDF for my own reasons.

Yours,
José Moya

(In reply to José Moya from comment #4)
> I have reported this to the Unicode Consortium. I can't believe the Spanish companies
> and institutions represented at the Unicode Consortium did not report this...

The issue was already reported and open with Unicode (at least the ICU project) as ICU-8061 [1], and it looks like your report was ICU-10754 [2]. Pending any LibreOffice dev comment, we'll either close as WF, or set it aside to see what comes out in the ICU libs (or maybe in CLDR) that we might implement.

=-ref-=
[1] https://unicode-org.atlassian.net/browse/ICU-8061
[2] https://unicode-org.atlassian.net/browse/ICU-10754

(In reply to José Moya from comment #4)
> please notice that inserting a ZWNBSP before every em dash would be a pain

Not a zero-width space but the narrow no-break space, U+202F, is available under Insert > Formatting Mark or via Shift+Alt+Space since 6.3, IIRC. You could also create a new AutoCorrect rule. This is just a workaround; a correct implementation is needed, of course. |
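For anyone who wants to try the "insert a non-breaking character next to each em dash" workaround in bulk rather than by hand, here is a minimal sketch in Python. It is not a LibreOffice macro and nothing in this report prescribes it: it only operates on plain text (for example, text exported from the document), the function name `glue_em_dashes` is hypothetical, and the default joiner U+2060 WORD JOINER is my own assumption. The thread mentions U+FEFF and U+202F, and either can be passed instead. The heuristic is simply that a dash touching a non-space character must not be separated from it.

```python
import re

EM_DASH = "\u2014"
# Assumption: U+2060 WORD JOINER as the default. The thread itself mentions
# U+FEFF (ZWNBSP) and U+202F (narrow no-break space); pass one of those instead.
WORD_JOINER = "\u2060"


def glue_em_dashes(text: str, joiner: str = WORD_JOINER) -> str:
    """Insert `joiner` between an em dash and any adjacent non-space character."""
    # Characters that should NOT trigger an insertion: whitespace and the
    # joiner itself (so running the function twice adds nothing new).
    skip = r"\s" + re.escape(joiner)
    dash_then_char = re.compile(f"{EM_DASH}(?=[^{skip}])")
    char_then_dash = re.compile(f"(?<=[^{skip}]){EM_DASH}")
    # Opening dash: keep it attached to the character that follows.
    text = dash_then_char.sub(EM_DASH + joiner, text)
    # Closing dash: keep it attached to the character that precedes.
    return char_then_dash.sub(joiner + EM_DASH, text)


if __name__ == "__main__":
    sample = "\u2014Hola \u2014dijo Ana\u2014. Adi\u00f3s."
    # repr() makes the otherwise invisible joiners visible as escapes.
    print(repr(glue_em_dashes(sample)))
```

A dash with spaces on both sides is left alone, so English-style spaced dashes are not affected. Whether the chosen joiner actually suppresses the break depends on the application's line breaking, which is the open question in this report.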