Bug 97191 - AutoCorrect entry of Emoji using : to define is conflicting with entry of time in format HH:MM:SS -- the :10:, :11:, or :12: clock faces emoji replace input value
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 5.0.1.1 rc
Hardware: All All
Importance: high major
Assignee: László Németh
URL:
Whiteboard: target:5.3.0 target:5.2.2
Keywords: bibisected, bisected, implementationError
Duplicates: 97252 99974 100231 100232 101194 101304 104136 108775 114741
Depends on:
Blocks:
 
Reported: 2016-01-16 16:51 UTC by Frank
Modified: 2017-12-28 23:28 UTC (History)
CC List: 21 users

See Also:
Crash report or crash signature:


Attachments
libreoffice calc file containing the bug I described (16.36 KB, application/vnd.oasis.opendocument.spreadsheet)
2016-01-16 16:51 UTC, Frank
Issue with HH:MM:SS (9.50 KB, application/x-ole-storage)
2016-02-08 20:29 UTC, Paul Ivaska

Description Frank 2016-01-16 16:51:00 UTC
Created attachment 122004
libreoffice calc file containing the bug I described

This refers to LibreOffice ver 5.0.4.2 Writer and Calc.

LibreOffice Language settings: 
User Interface: Default-English (USA), or English (USA)
Locales affected (of those I have tested so far*):

English (Aus)
English (Ghana)
English (New Zealand)
English (Philippines)
English (South Africa)
English (Trinidad)
English (UK)
English (USA)
Default-English (USA)
Amharic

The bug does not affect other English Locales.
* I have tested language locales alphabetically from "Afrikaans (Namibia)" through "Basque".

The problem:
When entering a time in the format HH:MM:SS into a Calc cell or on a Writer line, the entered value is corrupted immediately after typing the second colon (:) when the minutes (MM) value is 10, 11, or 12.
This does not occur for any other minutes value, and it is not affected by the value entered for hours.

If time is copied and pasted from cells into new cells such that a sequence including the affected values (10, 11, 12) is generated, the bug does not show.
In other words, I can enter time values of say 01:01:00 through 01:09:00, and then copy and paste into additional cells so that the series continues:
01:10:00, 01:11:00, etc; there is no corruption.

When corruption occurs, it is displayed as digits being transposed over each other.
If I have entered into a cell or on a Writer line the following value of time:

01:11: (stopping with the entry of the 2nd colon), what I see in the cell is the digits 01 right-justified in the cell, but the 0 is transposed with a 1, such that the two digits appear in the same space, with the 1 to the right.

If I continue to enter the seconds digits after the 2nd colon, the value in the cell changes such that I now see the seconds digits I entered immediately to the right of the minutes (and hours) digits but without the colon between MM and SS, and the value is now left-justified in the cell.

The same appears in Writer when the values are typed onto a line.

If I change the locale from USA to one of the unaffected ones, such as Canada, values are displayed normally.

When I use print preview to check the values, I still see the same corruption.

Note: I have two copies of Ubuntu 14.04.3 LTS installed into separate partitions on the same machine. This bug occurs in either installation with LibreOffice version 5.0.4.2.

I also have a Windows 7 X64 install on the same machine, with LibreOffice 4.x installed. I will shortly upgrade that install to LibreOffice 5.0.4.2 (if it is available) and check for this bug. I will update when I have this data.

I am attaching a LibreOffice Calc file which contains the bad data.
Comment 1 Frank 2016-01-16 18:49:00 UTC
Tested LibreOffice 5.0.4 on Windows 7 X64 and found the same bug to exist.
Comment 2 FutureProject 2016-01-17 01:45:44 UTC
Hello, and thank you for bringing this issue to our attention.

I can confirm the described behaviour. After some testing, I found out that the issue emerged somewhere after

Version: 4.4.0.1
Build ID: 1ba9640ddd424f1f535c75bf2b86703770b8cf6f
Locale: de_DE

but before or at

Version: 5.0.3.2 (x64)
Build ID: e5f16313668ac592c1bfb310f4390624e3dbfb75-GL
Locale: de-DE (de_DE)

To reproduce quickly and efficiently:
1. Open new Writer document
2. If not set: Tools -> Language -> For all text -> English (USA)
3. Write: 01:11:12
4. When writing the second colon the line resets back to reading "01"

I'm setting this report to NEW since I found it to be reliably reproducible across a number of builds.

--
Windows 10 Pro, Version 1511 (OS Build 10586.36)
Version: 5.0.4.2 Build ID: 2b9802c1994aa0b7dc6079e128979269cf95bc78
Locale: de-DE (de_DE)
Comment 3 V Stuart Foote 2016-01-17 15:39:56 UTC
This was introduced in the 5.0 release with autocorrect support for Emoji

The :10:, :11:, and :12: strings in Calc or Writer are being replaced with nothing--but could have been replaced with the clock face Emoji (I think that was the intent -- László? )

A simple workaround is to delete those entries from the autocorrect tables:

Tools -> AutoCorrect -> AutoCorrect Options
Comment 4 Frank 2016-01-17 16:19:21 UTC
(In reply to V Stuart Foote from comment #3)
> This was introduced in the 5.0 release with autocorrect support for Emoji
> 
> The :10:, :11:, and :12: strings in Calc or Writer are being replaced with
> nothing--but could have been replaced with the clock face Emoji (I think
> that was the intent -- László? )
> 
> Simple work around to delete those lines from the autocorrect tables
> 
> Tools -> AutoCorrect -> AutoCorrect Options

Thanks for the work-around. I assume this will be fixed in next release?
Comment 5 V Stuart Foote 2016-01-17 16:39:16 UTC
(In reply to V Stuart Foote from comment #3)
> This was introduced in the 5.0 release with autocorrect support for Emoji
> 
> The :10:, :11:, and :12: strings in Calc or Writer are being replaced with
> nothing--but could have been replaced with the clock face Emoji (I think
> that was the intent -- László? )
> 
> Simple work around to delete those lines from the autocorrect tables
> 
> Tools -> AutoCorrect -> AutoCorrect Options

Actually the autocorrect replacement is happening, but the emoji are not receiving font fallback replacement (pulling from OpenSymbol, or from the Segoe UI Emoji or Segoe UI Symbol fonts on Windows).

:10: gets U+1F559
:11: gets U+1F55A
:12: gets U+1F55B

Of course with unreliable fallback font substitution no glyph is being written to the canvas. So the time format is visually garbled as well as corrupted by the autocorrect.
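
The collision described above can be reproduced with a small sketch. This is only an illustration, assuming a simplified replace-as-you-type model; the table and the `type_text` helper are hypothetical and are not LibreOffice's actual autocorrect code. Only the three patterns and code points named in this comment are used.

```python
# Illustrative table only: the three entries named in this report and the
# code points they map to. This is not LibreOffice's real data structure.
AUTOCORRECT = {
    ":10:": "\U0001F559",  # clock face ten o'clock
    ":11:": "\U0001F55A",  # clock face eleven o'clock
    ":12:": "\U0001F55B",  # clock face twelve o'clock
}


def type_text(keys, table=AUTOCORRECT):
    """Feed keystrokes one at a time; replace a pattern as soon as the
    text typed so far ends with it (hypothetical simplified model)."""
    buf = ""
    for ch in keys:
        buf += ch
        for pattern, replacement in table.items():
            if buf.endswith(pattern):
                buf = buf[: -len(pattern)] + replacement
                break
    return buf


# ascii() makes the substituted code point visible even without a font
# that can render it.
print(ascii(type_text("01:09:00")))
print(ascii(type_text("01:11:12")))
```

In this model 01:09:00 passes through unchanged, while 01:11:12 loses its ":11:" (both colons included) to a single clock-face code point, which matches the missing colon between MM and SS described in the report.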
Comment 6 V Stuart Foote 2016-01-17 16:58:15 UTC
(In reply to V Stuart Foote from comment #5)
> (In reply to V Stuart Foote from comment #3)
> > 
> > Simple work around to delete those lines from the autocorrect tables
> > 
> corrupted by the autocorrect.

Just verified that, as an alternative to editing the autocorrect table, you can also simply press <Ctrl>+Z after each incorrectly applied autocorrect.

So, :10: becomes 🕙 U+1F559, but immediate use of <Ctrl>+Z reverts to :10:

Just have to remember to do it--without the visual clue of the fallback font Emoji.
Comment 7 raal 2016-01-18 14:24:51 UTC
386d64546a50a4abfafc1c309bf4f12b31aa7289 is the first bad commit
commit 386d64546a50a4abfafc1c309bf4f12b31aa7289
Author: Norbert Thiebaud <nthiebaud@gmail.com>
Date:   Sun Aug 2 10:04:17 2015 -0700

    source sha:fa8089e1099c6c6668fef2cd3ac01373269d6ef9

    source sha:fa8089e1099c6c6668fef2cd3ac01373269d6ef9

:040000 040000 98617e71cc180b3c141980417a9949b72b0c87b7 714dc3d4608d9a92a27cf771617631ec18391b37 M      instdir

author	Christian Lohmaier <lohmaier+LibreOffice@googlemail.com>	2015-07-24 17:45:49 (GMT)
committer	Christian Lohmaier <lohmaier+LibreOffice@googlemail.com>	2015-07-24 17:46:21 (GMT)
commit	fa8089e1099c6c6668fef2cd3ac01373269d6ef9 (patch)
update emoji autocorrect entries from po-files
Comment 8 V Stuart Foote 2016-01-18 16:27:42 UTC
Not sure this is really a bug. We added support for Emoji input, choosing to bracket the mnemonic with a single ":", but that also happens to be a valid input format for dates and times.

The autocorrect affects just these three minute values in HH:MM:SS entries--:10:, :11:, :12:--and can be reverted immediately with <Ctrl>+Z, or disabled in the autocorrect tables (by deleting or replacing the entries). The same applies to those using YYYY:MM:DD or similar date entry--the middle value gets corrupted.

A solution might be to change the bracketing for entering the emoji--there is no standard--so bracketing with "::", or some other sequence, that would eliminate this autocorrect collision.

The autocorrect mechanism is performing the codepoint substitution as designed. But the fallback font handling to render a glyph from a grapheme including a BMP or SMP codepoint is not correct--that is bug 71603. The resulting substitution here is visually incorrect, but not unintended programmatically. And as noted, it can be reverted or prevented.
Comment 9 V Stuart Foote 2016-01-24 21:00:55 UTC
@László, * 

Any comment on feasibility/need to change the EMOJI delimiter from a single ":" to a double "::" and avoiding this conflict with date/time formats?  

Would changing the input to require two extra characters detract from using the EMOJI feature? IMHO it seems a reasonable adjustment.
Comment 10 Paul Ivaska 2016-02-08 20:29:12 UTC
Created attachment 122459
Issue with HH:MM:SS

Attached file is the issue of HH:MM:SS with the :MM being a problem when entering 01 etc.
Comment 11 V Stuart Foote 2016-02-08 21:08:07 UTC
(In reply to Paul Ivaska from comment #10)
> Created attachment 122459 [details]
> Issue with HH:MM:SS
> 
> Attached file is the issue of HH:MM:SS with the :MM being a problem when
> entering 01 etc.

Known issue. Simple work around of opening Tools -> Autocorrect Options and deleting the "Emoji" replacements for

:10: ten-o'clock
:11: eleven-o'clock
:12: twelve-o'clock

Changing to enhancement -- removing the regression keyword

Changing auto-correct Emoji entry to use a double "::" bracketing (or possibly something else if more suitable) seems the correct way to consistently resolve this anti-feature.
Comment 12 V Stuart Foote 2016-05-20 22:05:19 UTC
*** Bug 99974 has been marked as a duplicate of this bug. ***
Comment 13 Aron Budea 2016-06-06 01:33:59 UTC
*** Bug 100231 has been marked as a duplicate of this bug. ***
Comment 14 Aron Budea 2016-06-06 01:54:21 UTC
Stuart, could this be considered a bug, and not an enhancement?

I don't think it's normal having to tinker with emoji autocorrect so it won't clash with the common way of inputting certain times. It's nice that there are several possible workarounds, but many users won't even know what's going on at first.
Comment 15 V Stuart Foote 2016-06-06 02:27:50 UTC
(In reply to Aron Budea from comment #14)

> I don't think it's normal having to tinker with emoji autocorrect so it
> won't clash with the common way of inputting certain times. It's nice that
> there are several possible workarounds, but many users won't even know
> what's going on at first.

The common practice for entry is ":<emoji>:", and we can't change what has become the norm. That means we otherwise leave users to cope with <Ctrl>+Z undo. Alternatively, we remove--or adjust the entry values--for those *few* emoji that actually conflict with other common data entry formats.

But that is László's call--the project has done the right thing to jump in with thousands of defined emoji, ~900+ for each localization translated and available as autocorrect substitutions.

So yes, IMHO the ":10:", ":11:" and ":12:" clock faces (U+1F559, U+1F55A, U+1F55B) could simply be redefined to "::10::", "::11::", and "::12::" as suggested. Done either individually by the user--or in some other consistent fashion by László or other devs. But we've laid down the feature, and I see no reason to back away from it. It's an enhancement to what is otherwise working correctly.

Also, when bug 71603 (fallback for SMP codepoints) is resolved, allowing the glyphs to show, users will have a visual cue that the substitution has occurred. I believe that is what is causing the most grief here: not that the substitution is occurring, but that the result is not visible without fallback rendering.
Comment 16 Aron Budea 2016-06-06 06:00:07 UTC
*** Bug 100232 has been marked as a duplicate of this bug. ***
Comment 17 Aron Budea 2016-06-06 07:11:06 UTC
(In reply to V Stuart Foote from comment #15)
> The common practice for entry is as ":<emoji>:", we can't change what has
> become the norm. Meaning we otherwise leave users to cope with <Ctrl>+Z
> undo. Or alternatively,  we remove--or adjust the entry values--for those
> *few* emoji that actually conflict with other common data entry formats.

Isn't entering time as XX:YY:ZZ also the norm? (or one of the norms)
I do consider this clash an issue, and would like to have it admitted as such. This doesn't mean anything is wrong with emoji recognition, but if two features don't work together properly, that's still a bug in the software, isn't it?

At the same time, I understand and appreciate the efforts made at looking for a solution. Let me also share an idea.

What if the code checked whether the emoji pattern to be substituted matched a valid ISO 8601 time prefix, and didn't substitute if it did? Only if the user then continues with whitespace would it follow through with the substitution.
There are already a few autocorrect entries that work this way (e.g. "-->", "--"), so even if normally :emoji: is supposed to be a well-defined format, this behavior wouldn't be completely unexpected.
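
The idea in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the regular expression is a loose stand-in for "a valid ISO 8601 time prefix", and `TABLE`, `TIME_PREFIX` and `type_text_guarded` are hypothetical names, not anything from the LibreOffice code base.

```python
import re

# Illustrative table: the three conflicting entries from this report.
TABLE = {":10:": "\U0001F559", ":11:": "\U0001F55A", ":12:": "\U0001F55B"}

# Assumption: "looks like a time prefix" means the typed text ends in
# one or two digits, a colon, a minutes value 00-59 and another colon.
TIME_PREFIX = re.compile(r"\d{1,2}:[0-5]\d:$")


def replace_suffix(buf):
    """Replace a table pattern at the end of buf, if there is one."""
    for pattern, replacement in TABLE.items():
        if buf.endswith(pattern):
            return buf[: -len(pattern)] + replacement
    return buf


def type_text_guarded(keys):
    """Replace-as-you-type, but defer the substitution while the text
    looks like HH:MM: and follow through only on whitespace."""
    buf = ""
    for ch in keys:
        if ch.isspace():
            # Whitespace: apply any substitution that was deferred.
            buf = replace_suffix(buf) + ch
            continue
        buf += ch
        if not TIME_PREFIX.search(buf):
            buf = replace_suffix(buf)
    return buf


print(ascii(type_text_guarded("01:11:12")))  # time entry
print(ascii(type_text_guarded(":11:")))      # bare emoji short name
print(ascii(type_text_guarded("AA:11:22")))  # hex fields, not a time
```

With this guard a time such as 01:11:12 is left alone and a bare :11: is still replaced. Colon-separated values whose preceding field is not numeric, such as the hexadecimal fields of a MAC address, do not match the time check and would still be substituted.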
Comment 18 Aron Budea 2016-07-29 15:00:14 UTC
Apparently, this is also an issue when entering MAC addresses in XX:XX:XX:XX:XX:XX format (bug 101194). Unfortunately my previous idea wouldn't solve this.
Comment 19 Aron Budea 2016-07-29 15:01:13 UTC
*** Bug 101194 has been marked as a duplicate of this bug. ***
Comment 20 V Stuart Foote 2016-07-29 15:46:45 UTC
@László, *

To fix this, would like to formally propose that the autocorrect emoji entry for the clock face glyphs all be modified to use a pair of COLONs rather than a single COLON.

🕐🕑🕒🕓🕔🕕🕖🕗🕘🕙🕚🕛🕜🕝🕞🕟🕠🕡🕢🕣🕤🕥🕦🕧

Then add a usage note for entering the clock faces in the release notes and in the help article(s) (which has yet to be written).

So the autocorrect entries would become
::1::
::1.30::
::2::
::2.30::
::3::
::3.30::
::4::
::4.30::
::5::
::5.30::
::6::
::6.30::
::7::
::7.30::
::8::
::8.30::
::9::
::9.30::
::10::
::10.30::
::11::
::11.30::
::12::
::12.30::
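
For reference, the proposed entries can be generated together with their code points. This is a sketch, assuming the standard Unicode layout in which the twelve full-hour clock faces start at U+1F550 and the twelve half-hour faces start at U+1F55C; `clock_face_entries` is a hypothetical helper, not part of LibreOffice.

```python
import unicodedata


def clock_face_entries():
    """Map the proposed double-colon short names to clock-face emoji."""
    entries = {}
    for hour in range(1, 13):
        # Full hours: U+1F550 (one o'clock) .. U+1F55B (twelve o'clock)
        entries[f"::{hour}::"] = chr(0x1F550 + hour - 1)
        # Half hours: U+1F55C (one-thirty) .. U+1F567 (twelve-thirty)
        entries[f"::{hour}.30::"] = chr(0x1F55C + hour - 1)
    return entries


ENTRIES = clock_face_entries()
for name, glyph in ENTRIES.items():
    print(name, f"U+{ord(glyph):04X}", unicodedata.name(glyph))
```

None of these short names occurs inside a normally typed HH:MM:SS value, because such a value never contains two adjacent colons.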
Comment 21 Aron Budea 2016-08-04 17:39:05 UTC
*** Bug 101304 has been marked as a duplicate of this bug. ***
Comment 22 Eike Rathke 2016-08-04 17:51:08 UTC
This is a bug, not an enhancement, and gets in the way of inexperienced users when entering data in a spreadsheet who don't know how to disable the relevant AutoCorrect options or which entries or even where.

So, change all emojis that could lead to garbled data input as suggested above in comment 20.
Comment 23 Eike Rathke 2016-09-03 08:52:18 UTC
*** Bug 97252 has been marked as a duplicate of this bug. ***
Comment 24 László Németh 2016-09-10 06:59:13 UTC
Fixed in the master, and the proposed fix for LibreOffice 5.2 is
there in gerrit (https://gerrit.libreoffice.org/#/c/28761/).

Many thanks for your reports.

@Stuart, sorry for my late feedback. Thanks for your suggestion.

(My explanation from the l10n mailing list:

I have changed the bad

:1:
:2:
...
:12:

en-US emoji short names to

:1 h:
:2 h:
...
:12 h:

ones. After pushing the related commit
(https://gerrit.libreoffice.org/#/c/28756/3), it will be possible to
translate the new short names for the master branch, keeping this
international standard form in the translation, or giving better ones,
if they exist in your languages (for example, using :12.00: instead of
:12 h:)

(In libreoffice-5-2, possibly in the affected older branches the
clockface emoji replacements will be removed completely to solve their
conflict with time format, see
https://bugs.documentfoundation.org/show_bug.cgi?id=97191)

Please avoid using a middle colon in the translations. This
is exactly the ISO 8601 time format standard, but it doesn't work in
LibreOffice. For example, do not use "12:00" as a translation of the
recent en-US "12.00".

It's the same for the newer "11.30" like short names. I have removed
the never working ":1:30:"-like patterns from the da, en-GB, fr, ko,
nl-BE, lt, nl, ro, sk, sv, tr DocumentList.xml files (these files are
the final place of the translations).

(If I think right, it would be better to change the recent en-US
:11.30: format to the more natural :11 h 30:, but it will still be
possible to translate it to :11.30:. You can also skip their
translations, using "nil" or doing nothing here.)

Thank you for your great work to translate the big emoji dictionary of LO.)
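
A translated short name could be checked against the rule above with something like the following sketch. The pattern is an assumption about what "conflicts with time entry" means (only digits and inner colons between the bracketing colons); `collides_with_time` is a hypothetical helper and not part of any LibreOffice or translation tooling.

```python
import re

# Assumption: a short name conflicts with time entry when everything
# between the bracketing colons is one- or two-digit groups separated
# by colons, e.g. ":12:" or ":1:30:".
TIME_LIKE = re.compile(r"^:\d{1,2}(?::\d{1,2})*:$")


def collides_with_time(short_name):
    """Return True if typing a time could trigger this short name."""
    return TIME_LIKE.match(short_name) is not None


for name in [":12:", ":1:30:", ":12:00:", ":12 h:", ":11.30:", ":11 h 30:"]:
    print(name, collides_with_time(name))
```

Under this check the old :12: and :1:30: style names are flagged, while :12 h:, :11.30: and :11 h 30: are not.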
Comment 25 Commit Notification 2016-09-15 11:40:23 UTC
László Németh committed a patch related to this issue.
It has been pushed to "libreoffice-5-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=14f914eb105765386438ae53c83c5ce295648414&h=libreoffice-5-2

tdf#97191 fix emoji correction conflict with time format

It will be available in 5.2.3.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 26 Eike Rathke 2016-09-15 11:42:05 UTC
Pending review https://gerrit.libreoffice.org/28922 for 5-2-2
Comment 27 Commit Notification 2016-09-16 12:55:21 UTC
László Németh committed a patch related to this issue.
It has been pushed to "libreoffice-5-2-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=d17c9626f8997cef8ded6f4736003fcf694bc175&h=libreoffice-5-2-2

tdf#97191 fix emoji correction conflict with time format

It will be available in 5.2.2.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 28 tommy27 2016-10-05 08:13:09 UTC
Thanks László,
I leave to you the honour of setting the status to RESOLVED FIXED
Comment 29 Aron Budea 2016-10-05 18:55:12 UTC
Would it be possible to backport this to 5.1.6 in time?
Comment 30 tommy27 2016-10-06 08:27:48 UTC
If it's not technically too hard, there's still a lot of time to push the fix to 5.1.6 as well.

https://wiki.documentfoundation.org/ReleasePlan/5.1#5.1.6_release
Comment 31 László Németh 2016-10-10 11:56:59 UTC
Thanks. Solved in master and 5.2. I'll check the 5.1 back port soon.
Comment 32 Aron Budea 2016-11-24 14:44:26 UTC
*** Bug 104136 has been marked as a duplicate of this bug. ***
Comment 33 V Stuart Foote 2017-06-26 12:33:29 UTC
*** Bug 108775 has been marked as a duplicate of this bug. ***
Comment 34 V Stuart Foote 2017-12-28 23:28:48 UTC
*** Bug 114741 has been marked as a duplicate of this bug. ***