Bug 106350 - FORMATTING: literal tab \0x09 in text in cell lost after FILESAVE/FILEOPEN
Status: RESOLVED DUPLICATE of bug 103829
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 5.2.5.1 release (earliest affected)
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-06 02:38 UTC by James D Howard
Modified: 2017-11-12 05:44 UTC

See Also:
Crash report or crash signature:


Attachments
.ODS file containing <HT> written out by Excel (2.77 KB, application/vnd.oasis.opendocument.spreadsheet)
2017-03-14 00:20 UTC, James D Howard
.XLS file written by Excel containing <HT> (34.00 KB, application/vnd.ms-excel)
2017-03-14 00:21 UTC, James D Howard
ods file not saving text with unicode 0-9,11,12,14-31 (19.05 KB, application/vnd.oasis.opendocument.spreadsheet)
2017-03-14 15:17 UTC, _sox_

Description James D Howard 2017-03-06 02:38:54 UTC
Description:
If prose text is entered into a spreadsheet cell and that text contains a literal ASCII/UTF-8 <HT> (0x09) character, it will display as expected UNTIL the spreadsheet is saved and re-opened.  On re-opening, the 'tab' <HT> character is lost in the cell's text.

I do not know if the <HT> (0x09) is lost during the FILESAVE operation, or lost in the subsequent FILEOPEN operation.

Steps to Reproduce:
1. Open a new Calc spreadsheet
2. Enter text into a cell
3. Enter a literal ASCII/UTF-8 'tab' <HT> (0x09) in amongst the
      characters of the cell
4. Notice that text displayed in the cell correctly 'obeys' the
      intent of a horizontal tab, and stays that way even if Calc
      saves the file, as long as the file remains open (no [Close])
5. Go ahead and [Save] and [Exit]
6. Re-open the file with Calc
7. Notice that the 'tab' (amongst the characters originally entered)
      is now gone - not even converted to a space (0x20)
8. Re-insertion of a 'tab' <HT> character (as by paste from an entry
      in another editor) restores desired/expected visual appearance
      and alignment

Actual Results:  
'tab' <HT> (0x09) character completely eliminated from spreadsheet cell's textual data

Expected Results:
'tab' <HT> (0x09) character preserved through FILESAVE and FILEOPEN


Reproducible: Always

User Profile Reset: No

Additional Info:
This operation sequence works as expected in MSFT Excel, for example.


User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
Comment 1 Xisco Faulí 2017-03-08 11:33:29 UTC Comment hidden (obsolete)
Comment 2 James D Howard 2017-03-11 22:56:36 UTC
Regrettably, a saved file example will not show the bug, because the editing process destroys the presence of the <HT> (0x09 tab).  The following detailed process easily and repeatably shows the editing bug.

1.) Start LibreOffice Calc from Windows Start menu icon
   -- Calc shows initial, blank, new unnamed spreadsheet
   -- Allow cells to remain in default formatting
2.) Select Calc spreadsheet cell A1 and enter some text
   -- Enter "ABCD"
3.) Start a plain text editor like MSFT Notepad
   -- Enter the text "abc<HT>def"
      -- where <HT> is 0x09, an ASCII horizontal tab (^I)
   -- Visual effect of tab <HT> between 'c' and 'd' clearly visible
      -- multi-width whitespace shown between 'c' and 'd' verifies
            <HT> entry
   -- Select and [Ctrl]+C (copy) the <HT> within the text (1 char)
4.) Again, select Calc spreadsheet cell A1
   -- Note: Do this by double-click while cursor is pointing inside the cell
      -- Activates in-place, on-screen intra-cell editing
      -- Cursor turns to vertical bar text edit cursor
   -- Use left- or right-arrow keys (if needed) to move vertical bar text
         edit cursor to be between the 'B' and 'C'
   -- Do a 'paste' [Ctrl]+V into the cell, placing <HT> between 'B' and 'C'
      -- 'wide' whitespace now visible in cell
      -- Cell cursor remains in text-entry mode
   -- Additional verification of <HT>:
      -- Move text entry cursor to left of cell with [Home] key
      -- Again use [Ctrl]+V paste
      -- see visual effect of <HT>
      -- Cell cursor remains in text-entry mode
   -- Triple click on cell to get all cell text selected
   -- Copy selected text by either:
      -- Right-click on cell to get context menu & use [Copy] item, or
      -- [Ctrl]+C copy keyboard command
5.) Verify <HT> in cell's entry data
   -- Move GUI focus back to text editor
   -- Do [Ctrl]+V on a new line in text editor
   -- Note that visual effect of <HT> is present
      -- If your editor can show non-printing glyphs, you will see "tab"
            indicator(s)
6.) Return focus to Calc window
   -- (Should still see cell's selection highlight of all text in cell)
   -- Use cursor-and-click to select another cell
   -- Note: entered text in cell changes, eliminating <HT> effect
         and turning <HT> chars into spaces
7.) At this point, the user's desired text entry has been damaged/destroyed,
      and no longer contains <HT>

This is distinctly UN-like Excel, which preserves <HT>
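
The paste-back verification described above can also be done without relying on how an editor renders whitespace. The following is only a sketch: the two sample strings are hypothetical stand-ins for the text copied out of the cell before and after leaving edit mode, and the helper names are made up for illustration.

```python
def codepoints(text):
    """Return the code point of every character as a hex string."""
    return [f"0x{ord(ch):02X}" for ch in text]


def tab_positions(text):
    """Return the indexes at which a literal <HT> (0x09) occurs."""
    return [i for i, ch in enumerate(text) if ch == "\t"]


# Hypothetical samples: cell text while still in edit mode, and the
# same cell's text after another cell has been selected.
before = "AB\tCD"
after = "AB CD"

print(codepoints(before))
print(codepoints(after))
print(tab_positions(before), tab_positions(after))
```

If the tab has been turned into a space, the second list shows 0x20 where the first shows 0x09, and the second position list is empty.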
Comment 3 raal 2017-03-12 18:49:09 UTC
(In reply to James D Howard from comment #2)

> This is distinctly UN-like Excel, which preserves <HT>


Please attach an Excel test file so this can be reproduced easily. Thanks.
Comment 4 James D Howard 2017-03-14 00:20:46 UTC
Created attachment 131870
.ODS file containing <HT> written out by Excel
Comment 5 James D Howard 2017-03-14 00:21:44 UTC
Created attachment 131871
.XLS file written by Excel containing <HT>
Comment 6 James D Howard 2017-03-14 00:28:26 UTC
The 2 attached files written by Excel show <HT> (tab) characters within cell data.  The chars in the .XLS file can be seen in plain text.  The chars in the .ODS file can be seen if the file is unzipped - look for the sequence
    ...<text:tab>ab<text:tab/>cd...
in the unzipped data.
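
The manual unzip-and-search check can be scripted. This is a minimal sketch, assuming the cell text lives in the package's content.xml and that tabs are stored as elements named tab in the ODF text namespace; count_text_tabs and the in-memory demo package are hypothetical, and the demo is not a valid spreadsheet, only a stand-in so the snippet runs without the attachment.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# ODF "text" namespace, which the text: prefix is bound to.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def count_text_tabs(ods_file):
    """Count <text:tab/> elements in content.xml of an .ods package."""
    with zipfile.ZipFile(ods_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    return sum(1 for _ in root.iter(f"{{{TEXT_NS}}}tab"))


def demo_package():
    """Build a tiny in-memory stand-in for an .ods file (illustration only)."""
    xml = (
        f'<root xmlns:text="{TEXT_NS}">'
        "<text:p>ab<text:tab/>cd</text:p>"
        "</root>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("content.xml", xml)
    buf.seek(0)
    return buf


print(count_text_tabs(demo_package()))
```

Passing the path of a real .ods file instead of the demo package, once for a file written by Excel and once for the same file re-saved by Calc, would show whether the tab elements survive the save.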

Regrettably, I find that the display of text with literal <tab> chars within a cell varies with version of Excel.  I'm most familiar with Excel 2007; a more recent Excel (2013) displays <tab> at the end of the text within the cell.  Nonetheless, all versions I've tried are tab-preserving with save and restore (re-read).

Sigh -- it appears there is general inconsistency about the behaviour around <HT> handling.
Comment 7 _sox_ 2017-03-14 15:17:17 UTC
Created attachment 131885
ods file not saving text with unicode 0-9,11,12,14-31
Comment 8 Buovjaga 2017-03-27 09:33:56 UTC
(In reply to _sox_ from comment #7)
> Created attachment 131885
> ods file not saving text with unicode 0-9,11,12,14-31

What is the relevance of your file to this report?
Comment 9 _sox_ 2017-04-01 12:40:45 UTC
All of the following Unicode characters (0-9, 11, 12, 14-31) can be generated with this file. If you select the characters, copy them, and use "paste-special-text", the characters really are present in the corresponding cells.

If you save the file and reopen it, the Unicode characters (0-9, 11, 12, 14-31) are gone
(both as single characters and in the middle of a word),
so the disappearance of <HT> observed here also happens with those other Unicode characters.
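
One possible reason for most of that list, offered only as an assumption and not confirmed anywhere in this report: .ods content is stored as XML, and XML 1.0 permits only three of the C0 control characters (0x09, 0x0A, 0x0D). The sketch below applies the XML 1.0 "Char" production to code points 0-31; the function name is made up for illustration.

```python
def is_xml10_char(cp):
    """True if code point cp matches the XML 1.0 'Char' production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


c0_controls = range(0x00, 0x20)
legal = [cp for cp in c0_controls if is_xml10_char(cp)]
illegal = [cp for cp in c0_controls if not is_xml10_char(cp)]

print("legal in XML 1.0:  ", legal)
print("illegal in XML 1.0:", illegal)
```

The illegal set is 0-8, 11, 12 and 14-31, which matches the list above except for 9. <HT> itself is a legal XML character, so its loss would need a different explanation than the other characters.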
Comment 10 Carlos 2017-04-03 17:12:23 UTC
In my experience to input a tab you should:
  -  Select a cell 
  -  Click on the INPUT LINE
  -  Copy text containing tabs from another file already containing them
  -  Paste the text in the INPUT LINE. 

In my experience the tabs are lost (converted to spaces) after either:
  *  hitting ENTER
  *  clicking on another cell
  *  hitting TAB. 

So it happens before the FILESAVE operation.
Comment 11 Carlos 2017-04-03 20:04:09 UTC
This is similar to bug 
https://bugs.documentfoundation.org/show_bug.cgi?id=98815
Comment 12 Buovjaga 2017-04-04 04:21:18 UTC

*** This bug has been marked as a duplicate of bug 98815 ***
Comment 13 Carlos 2017-04-04 13:42:18 UTC
(In reply to _sox_ from comment #9)
> All of the following Unicode characters (0-9, 11, 12, 14-31) can be generated
> with this file. If you select the characters, copy them, and use
> "paste-special-text", the characters really are present in the corresponding
> cells.
> 
> If you save the file and reopen it, the Unicode characters (0-9, 11, 12,
> 14-31) are gone
> (both as single characters and in the middle of a word),
> so the disappearance of <HT> observed here also happens with those other
> Unicode characters.

_sox_, please attach the file you uploaded to this bug to the following one:
https://bugs.documentfoundation.org/show_bug.cgi?id=98815
Comment 14 Xisco Faulí 2017-10-28 18:24:23 UTC

*** This bug has been marked as a duplicate of bug 103829 ***