Bug 131025 - Writer document with tables lost data in cells (apparently) replacing with 0
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 6.0.0.3 release (earliest affected)
Hardware: All All
Importance: highest critical
Assignee: Not Assigned
URL:
Whiteboard: target:7.3.0 target:7.2.3
Keywords: bibisected, bisected, dataLoss, regression
Duplicates: 119377 131904 135521 136730 137371 139889 141176 142202 142539
Depends on:
Blocks: Writer-Tables 106322
Reported: 2020-02-29 11:54 UTC by asuntoswebpablo
Modified: 2021-10-25 08:42 UTC (History)
CC List: 26 users

See Also:
Crash report or crash signature:


Attachments
A document with a table containing data. It shows 0 in cells if the bug in the application exists. (28.13 KB, application/vnd.oasis.opendocument.text)
2020-02-29 11:57 UTC, asuntoswebpablo
ODT documents capturing results from the test scenario described in comment #23 (35.87 KB, application/zip)
2021-02-06 11:59 UTC, Tom
XSLT Stylesheet to restore text values corrupted by this bug, see #41 (1.80 KB, text/plain)
2021-05-01 22:42 UTC, Jim DeLaHunt

Description asuntoswebpablo 2020-02-29 11:54:58 UTC
Description:
After saving an OpenOffice Writer document (.odt) that has tables and opening the same document in OpenOffice Writer again, it is apparent that the data in numerous cells has been replaced by the '0' character (one per cell), losing the content.
In reality the data is still there (I was able to recover it by opening the document with Windows WordPad), but the fright is great.


Steps to Reproduce:
1. Open the file attached to this bug report.

Actual Results:
Most of the cells in the tables show 0.

Expected Results:
They must show the data that I wrote in them.


Reproducible: Always


User Profile Reset: No



Additional Info:
I will attach a file that shows the bug when opened, but if it is opened with WordPad it is clear that the data is still there. I am now using the RTF format to save those texts and tables in LibreOffice, and in this file format I am not having that problem.
Comment 1 asuntoswebpablo 2020-02-29 11:57:25 UTC
Created attachment 158273
A document with a table containing data. It shows 0 in cells if the bug in the application exists.
Comment 2 Roman Kuznetsov 2020-02-29 21:08:05 UTC
no repro from scratch. Table has the same data after saving and reopening
Comment 3 Oliver Brinzing 2020-03-01 07:44:48 UTC
reproducible with:

Version: 7.0.0.0.alpha0+ (x64)
Build ID: f2db813374b8d65e1edec1387fa0c534b40885e1
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: default; VCL: win; 
Locale: de-DE (de_DE); UI-Language: en-US
Calc: threaded

steps to reproduce:

- new writer document
- insert a table
- format cell A1: #.##0,00
- insert text: Hello
- save & reload document
- cell A1 has format: @
- format cell A1 again: #.##0,00
- save & reload document
-> A1 has value 0,00
Comment 4 Oliver Brinzing 2020-03-01 07:54:54 UTC
reproducible with:

Version: 6.1.6.3 (x64)
Build ID: 5896ab1714085361c45cf540f76f60673dd96a72
CPU threads: 4; OS: Windows 10.0; UI render: default; 
Locale: de-DE (de_DE); Calc: 

and with:

Version: 5.4.7.2 (x64)
Build ID: c838ef25c16710f8838b1faec480ebba495259d0
CPU threads: 4; OS: Windows 6.19; UI render: default; 
Locale: de-DE (de_DE); Calc: single

the cell value changes immediately after reformatting - a second save & reload cycle is not necessary
Comment 5 Oliver Brinzing 2020-03-01 09:51:13 UTC
> - format cell A1 again: #.##0,00
> - save & reload document
> -> A1 has value 0,00

also reproducible with:

Version: 6.0.7.3 (x64)
Build ID: dc89aa7a9eabfd848af146d5086077aeed2ae4a5
CPU threads: 4; OS: Windows 10.0; UI render: default; 
Locale: de-DE (de_DE); Calc: 


> the cell value will change immediatelly after reformat - a second save & reload
> cycle is not necessary

same with

Version 3.6.7.2 (Build ID: e183d5b)
AOO 4.1.5
Comment 6 Oliver Brinzing 2020-03-01 12:33:42 UTC
This issue seems to be a regression from tdf#106322. As mentioned above, changing the cell format used to change the cell content to "0" immediately. With tdf#106322 the cell value is now kept, but after a save & reload cycle the cell value changes to "0".

content.xml after changing cell format:
<table:table-cell table:style-name="Table1.A1" 
                  office:value-type="float" office:value="0">
  <text:p text:style-name="P1">Hello</text:p>
</table:table-cell>

https://gerrit.libreoffice.org/plugins/gitiles/core/+/acf7e4c0a3dc0cca986bf4d4b7a65bafe7e70abc

commit	acf7e4c0a3dc0cca986bf4d4b7a65bafe7e70abc	[log]
author	  Eike Rathke <erack@redhat.com>	Fri Dec 01 19:46:45 2017 +0100
committer Eike Rathke <erack@redhat.com>	Fri Dec 01 20:05:50 2017 +0100
tree	ae49d7b966f3229caffe4642bf0f3acccde691ca
parent	350eec67a5989365560e38e9270990dcd0a019e8 [diff]

Resolves: tdf#106322 keep original cell content when assigning number format
... and content can't be parsed as number. Instead of converting 0.
Change-Id: Ief0c0a0284762fc0e801d6cc598720a97d733e31

/cygdrive/d/sources/bibisect/bibisect-win32-6.1
$ git bisect good
473fa5e7649113f1fbdded753cd7ea961fdf2de6 is the first bad commit
commit 473fa5e7649113f1fbdded753cd7ea961fdf2de6
Author: Norbert Thiebaud <nthiebaud@gmail.com>
Date:   Tue Dec 12 00:24:31 2017 -0800
    source sha:acf7e4c0a3dc0cca986bf4d4b7a65bafe7e70abc
    source sha:acf7e4c0a3dc0cca986bf4d4b7a65bafe7e70abc

:040000 040000 e2d939849e4bb08597fb9e76a7f43c5f4e85bdd5 7c628f65d132cafb24a544a1f75f6788529f9cc7 M      instdir

/cygdrive/d/sources/bibisect/bibisect-win32-6.1
$ git bisect log
# bad: [75d131082ce51ed5a898d97bdc2b7a9fe5ddb340] source sha:5b3765f4d881e7ddefd0c4aad6886a46f000b4fc
# good: [29d08f54c2f71ffee4fe12dbb24c5f5cbedecfd2] source sha:6eeac3539ea4cac32d126c5e24141f262eb5a4d9
git bisect start 'master' 'oldest'
# bad: [6227e15df9be101688e37cd891817cd858b49e03] source sha:b8b7f8a8f8d97088181d287bb75e74facece16c6
git bisect bad 6227e15df9be101688e37cd891817cd858b49e03
# bad: [f73e8407e9dd38ee6588002e02c30e29880abdca] source sha:27938e1bbd5f3405c47b9933be7489eeb03920f3
git bisect bad f73e8407e9dd38ee6588002e02c30e29880abdca
# bad: [9fc78e2f29e8823251694faba1a762cfc347fcb8] source sha:4f0a97a81dc9aa93dd6579590110a1ea71154351
git bisect bad 9fc78e2f29e8823251694faba1a762cfc347fcb8
# bad: [e451888cda13c56f0cfac959ffeeef3661f1bd83] source sha:c9904bdd5bf2645c9723a8135c5fbceadb6b9aed
git bisect bad e451888cda13c56f0cfac959ffeeef3661f1bd83
# good: [ac768805b7c0d6149638a7c0ee956a52b6b683e6] source sha:bb1fd2c9819d1ee5ba26c181d8fea8272b89b673
git bisect good ac768805b7c0d6149638a7c0ee956a52b6b683e6
# bad: [7e30547f5641156a964354b0bb75b98b66f346f9] source sha:aa0b08980aba7bc82ab75151129b0c643cde7dfa
git bisect bad 7e30547f5641156a964354b0bb75b98b66f346f9
# bad: [1012ee5cf876d567f685bf78c573a5127105b82a] source sha:fdd63585a802e158abb06aa9b87fad2635db5103
git bisect bad 1012ee5cf876d567f685bf78c573a5127105b82a
# bad: [a8a1a1814262b63b2932152335f6d4a19caa9bd6] source sha:aeff59771431dd273159c767080b3db0a4f93565
git bisect bad a8a1a1814262b63b2932152335f6d4a19caa9bd6
# good: [b3c6edc5b5075ca456b179d715204bcc553a8cc7] source sha:1d44bcf18712d899f9e53676b9bc54ddc88147eb
git bisect good b3c6edc5b5075ca456b179d715204bcc553a8cc7
# good: [3ad46a7e29451f573c2209617b0c00e354530922] source sha:dccfe8765c25caf8485e659711a6df6c43ed63a9
git bisect good 3ad46a7e29451f573c2209617b0c00e354530922
# bad: [5af3bd7d2c3a45cc0312ad713a099e3451efdbb2] source sha:ae745789704fd7ad86c84ff9875cda810ff915b0
git bisect bad 5af3bd7d2c3a45cc0312ad713a099e3451efdbb2
# bad: [473fa5e7649113f1fbdded753cd7ea961fdf2de6] source sha:acf7e4c0a3dc0cca986bf4d4b7a65bafe7e70abc
git bisect bad 473fa5e7649113f1fbdded753cd7ea961fdf2de6
# good: [ca79be85102959ed6c455b05683f68728506432c] source sha:350eec67a5989365560e38e9270990dcd0a019e8
git bisect good ca79be85102959ed6c455b05683f68728506432c
# first bad commit: [473fa5e7649113f1fbdded753cd7ea961fdf2de6] source sha:acf7e4c0a3dc0cca986bf4d4b7a65bafe7e70abc
Comment 7 Mike Kaganski 2020-03-02 11:47:03 UTC
The problem is when a cell has a structure like this:

> <table:table-cell table:style-name="Table1.A1" office:value-type="float" office:value="0">
>     <text:p text:style-name="P1">Some Text</text:p>
> </table:table-cell>

In this case, the value of the cell is 0, but its content is the text string "Some Text".

When the import filter calls SwTableBox::ActualiseValueBox(), it sees that the value 0, formatted using the cell's number format, gives a string different from "Some Text", and replaces the text.

Questions are:
1. Why it doesn't happen if using Standard number format?
2. What does standard say about such cases?
3. How should Writer tell if normalizing should or should not happen: say, in a document using "default" language; in case when a cell contains a formatted numeric text, using, e.g., en-US format, and the document is opened in Russia, the new formatted text would also be different, but in this case the change is legitimate.
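
The cell structure described in this comment can be searched for mechanically. The following is only a rough sketch in Python, assuming content.xml has already been extracted from the .odt package. The function name find_suspect_cells and the "text contains a letter" heuristic are illustrative assumptions, not anything Writer itself does, and the heuristic can misfire on formatted numbers that include letters (for example a currency code).

```python
import xml.etree.ElementTree as ET

# ODF namespace URIs as used in content.xml.
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"


def find_suspect_cells(content_xml):
    """Return the text of cells stored as float 0 whose text contains letters.

    Hypothetical helper: a cell with office:value-type="float" and
    office:value="0" whose paragraph text has alphabetic characters is
    likely to be replaced by 0 on the next load.
    """
    root = ET.fromstring(content_xml)
    suspects = []
    for cell in root.iter("{%s}table-cell" % TABLE):
        if cell.get("{%s}value-type" % OFFICE) != "float":
            continue
        if cell.get("{%s}value" % OFFICE) != "0":
            continue
        text = "".join(cell.itertext()).strip()
        if any(ch.isalpha() for ch in text):
            suspects.append(text)
    return suspects
```

Running this over the content.xml of a document before reopening it in Writer gives a list of cell texts that are at risk, which can be compared against what Writer shows after reload.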
Comment 8 Mike Kaganski 2020-03-02 11:51:52 UTC
So two problems here:

1. How should Writer open existing documents with such data
2. It should *not* generate such data itself as shown in comment 3. If the data is textual, it must not write cell value type "float".
Comment 9 Oliver Brinzing 2020-04-05 14:49:27 UTC
*** Bug 131904 has been marked as a duplicate of this bug. ***
Comment 10 Maxim Monastirsky 2020-09-14 19:18:01 UTC
*** Bug 136730 has been marked as a duplicate of this bug. ***
Comment 11 Joao Carvalho 2020-09-17 00:11:39 UTC
Shouldn't the importance of this bug be changed to "major" or "critical"?

Users may lose data because of this bug, because the data loss is only noticed after the file is closed and then opened again. At that point, you can't "undo" to recover the data you lost...
Comment 12 Mark van Rossum 2020-10-06 22:12:18 UTC
Here is a workaround to recover the data:

Given file.odt:

* unzip file.odt in a temporary directory.

* edit content.xml:

  replace "float" => "string"

  delete all occurrences of office:value="0"
 
  save

* zip file_new.odt *

* open file_new.odt 

LO will see it is broken but can repair it.
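
The two text replacements above can also be scripted. What follows is a minimal sketch in Python, not a tested recovery tool: the function name repair_cell_tags is made up for illustration, and it assumes the cells use the office: and table: prefixes exactly as LibreOffice writes them. It only touches table-cell start tags that carry both office:value-type="float" and office:value="0", and leaves the rest of content.xml byte-for-byte unchanged. Like the manual workaround, it also turns genuine numeric zeros into text cells.

```python
import re

# Start tag of a table cell; quoted attribute values may contain '>'.
CELL_TAG = re.compile(r"<table:table-cell\b(?:[^>\"']|\"[^\"]*\"|'[^']*')*>")
FLOAT_TYPE = 'office:value-type="float"'
ZERO_VALUE = re.compile(r'\s+office:value="0"')


def repair_cell_tags(content_xml):
    """Turn float-0 table cells into string cells (hypothetical helper)."""

    def fix(match):
        tag = match.group(0)
        # Leave cells alone unless they have both attributes.
        if FLOAT_TYPE not in tag or not ZERO_VALUE.search(tag):
            return tag
        tag = tag.replace(FLOAT_TYPE, 'office:value-type="string"')
        return ZERO_VALUE.sub("", tag)

    return CELL_TAG.sub(fix, content_xml)
```

The result still has to be written back into the .odt package, and how the package is zipped matters for whether LO reports it as broken.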
Comment 13 digg33 2020-11-16 10:08:29 UTC
Also facing the same serious problem on:
Version: 6.4.5.2
Build ID: 6.4.5.2-5.fc32
CPU threads: 8; OS: Linux 5.8; UI render: default; VCL: gtk3; 
Locale: en-GB (en_GB.UTF-8); UI-Language: en-US
Calc: threaded

Steps to reproduce:
1. Create new Writer doc based on Default template.
2. Make new table [Table > Insert Table...]
   Options: Academic style
            3 Cols, 7 Rows
3. Add random data to rows; my test data:
   - Filled 3 rows + header
   - 1st Col: Numeric
   - 2nd Col: Alpha
   - 3rd Col: Alphanumeric
4. Save the doc
5. Now reload [File > Reload] - The error *might* manifest
6. If not:
   - Add a row [Table > Insert > Rows Below]
   - Save
   - Reload [File > Reload]
   - Error shown every time for me

Observations:
- Error did not occur for default table style in my case
  Created academic and default tables in same document and
  in different documents, only academic table style showed
  this problem
- With the above steps, the first column seems to remain
  untouched
Comment 14 LeroyG 2020-11-27 14:57:57 UTC
Also seen with 64-bit Ubuntu 20.04.1 and LibreOffice Writer version 6.4.6.2 (https://ask.libreoffice.org/es/question/268771/). The original file (now .odt) was a .doc(x).
Comment 15 Jim DeLaHunt 2020-12-08 11:06:15 UTC
I encountered this problem with the following current LibreOffice version on macOS:

Version: 7.0.3.1
Build ID: d7547858d014d4cf69878db179d326fc3483e082
CPU threads: 8; OS: Mac OS X 10.13.6; UI render: default; VCL: osx
Locale: en-CA (en_CA.UTF-8); UI: en-US
Calc: threaded

I had success fixing the problem, guided by Mark van Rossum's comment 12. I unzipped the .odt file, edited content.xml, changed the value of "office:value-type" attributes from "float" to "string", and deleted "office:value" attributes.

I did this with the following XSLT file. I won't attempt to explain how to use XSLT here. It is tough to get working, but once it works, it's a great tool for the job (of patching XML files).

Once I edited content.xml, re-zipped the document directory, and opened the new .odt file in LibreOffice, I saw a "document corrupted" message as Mark did. LibreOffice was able to repair the document, and it seems OK.

=====
<?xml version="1.0" encoding="UTF-8"?>
<!-- Restore_odt_table_cell_text.xslt
  by Jim DeLaHunt (jdlh.com), 2020-12-08. Donated to the public domain (cc0).
  
  This XSLT fixes a LibreOffice bug where Writer table cells containing 
  text turn into digit 0 when the document is re-opened. 
  This is Bug 131025 - Writer document with tables lost data in cells (apparently) replacing with 0
  <https://bugs.documentfoundation.org/show_bug.cgi?id=131025>
  It does this by fixing table cells like
  <table:table-cell table:style-name="…" office:value-type="float" 
  office:value="0">
	<text:p text:style-name="P71">desired text</text:p>
  </table:table-cell>
  
  by changing to office:value-type="string" and deleting office:value="0".
  -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0">

	<xsl:output method="xml" indent="no" encoding="UTF-8" />

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

	<xsl:template match="/office:document-content/office:body/office:text/table:table/table:table-row/table:table-cell[@office:value-type='float'][@office:value='0'][not(text()='')]" >
		<xsl:copy>
			<xsl:attribute name="office:value-type">string</xsl:attribute>
			<xsl:apply-templates select="@*[name(.)!='office:value' and name(.)!='office:value-type']" />
            <xsl:apply-templates select="node()"/>
		</xsl:copy>
	</xsl:template>
</xsl:stylesheet>
=====
Comment 16 Eike Rathke 2020-12-09 19:39:35 UTC
(In reply to Jim DeLaHunt from comment #15)
> Once I edited content.xml, re-zipped the document directory, and opened the
> new .odt file in LibreOffice, I saw a "document corrupted" message
That is probably because of this:

(In reply to Mark van Rossum from comment #12)
> * zip file_new.odt *
> LO will see it is broken but can repair it.
Zipping everything is wrong (apart from the fact that it omits subdirectories). Instead, copy the old document to file_new.odt and then *freshen* the zip using

zip -f file_new.odt content.xml

The reason is that the 'mimetype' file
a) MUST be the first entry in the zip
b) MUST be stored uncompressed plain text

Zipping everything (in shell expansion order if * is used) will violate both.

Alternatively, in a subdirectory that contains *only* the document's files and directories, create a new zip with

zip -0 file_new.odt mimetype
zip -r file_new.odt * -x mimetype
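
For those who prefer a script over the zip command line, the "freshen" approach could be sketched in Python roughly as follows. This is an illustration only: the function name freshen_content is an assumption, and it relies on the standard zipfile module keeping each entry's compression method when the original ZipInfo is passed to writestr(). Because entries are copied in their original order, 'mimetype' stays first and stays stored uncompressed, provided the source document was a valid package.

```python
import zipfile


def freshen_content(src_odt, dst_odt, new_content):
    """Copy src_odt to dst_odt, replacing only content.xml (hypothetical helper).

    src_odt and dst_odt may be paths or binary file objects;
    new_content is the repaired content.xml as bytes.
    """
    with zipfile.ZipFile(src_odt) as src, zipfile.ZipFile(dst_odt, "w") as dst:
        for info in src.infolist():
            if info.filename == "content.xml":
                data = new_content
            else:
                data = src.read(info)
            # Reusing the original ZipInfo preserves the entry order and
            # the per-entry compression method (stored vs. deflated).
            dst.writestr(info, data)
```

A call such as freshen_content("file.odt", "file_new.odt", repaired_bytes) leaves the original document untouched, which is useful if the repair needs to be retried.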
Comment 17 Jim DeLaHunt 2020-12-09 20:35:28 UTC
(In reply to Eike Rathke from comment #16)
...
> The reason is that the 'mimetype' file
> a) MUST be the first entry in the zip
> b) MUST be stored uncompressed plain text
> 
> Zipping everything (in shell expansion order if * is used) will violate both.
...

Eike Rathke, thank you for this valuable insight. I think this points to the value of having some documentation on how to open up an .odt archive into separate files, and how to generate the .odt archive again.  

Is this written down anywhere, that you know of?

If it is not written down, and I were to write it, where in your experience would be a good place for me to put that documentation?

Is there documentation about the file format which includes the constraint about the 'mimetype' file?

(In an ideal world, there would be no bugs, and no need to modify the contents of an .odt file outside an application. But in this imperfect world, it is useful to have that option.)
Comment 19 Jim DeLaHunt 2020-12-10 21:33:33 UTC
Awesome, thank you for the link.
Comment 20 Jim DeLaHunt 2020-12-14 08:10:45 UTC
Following the instructions from Eike Rathke in comment 16, I was able to reconstruct the .odt package using the ZIP command on my directory of XML files (with edits as noted in my comment 15). LibreOffice opened this .odt document with no error messages.
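
Before opening a rebuilt package in LibreOffice, the two 'mimetype' constraints from comment 16 can be checked with a few lines of Python. This is a sketch under the assumption that those two rules are the only thing being verified; the function name mimetype_problems is hypothetical, and passing this check does not prove the rest of the package is valid.

```python
import zipfile


def mimetype_problems(odt):
    """List violations of the two 'mimetype' rules (hypothetical helper)."""
    with zipfile.ZipFile(odt) as package:
        entries = package.infolist()
    names = [entry.filename for entry in entries]
    if "mimetype" not in names:
        return ["mimetype entry is missing"]
    problems = []
    # Rule a): mimetype must be the first entry in the zip.
    if names[0] != "mimetype":
        problems.append("mimetype is not the first entry")
    # Rule b): mimetype must be stored uncompressed.
    entry = entries[names.index("mimetype")]
    if entry.compress_type != zipfile.ZIP_STORED:
        problems.append("mimetype is compressed")
    return problems
```

An empty list means both constraints hold; a package built with a plain "zip file_new.odt *" would typically report at least one problem.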

I agree that this is a severe bug. It causes data loss, for reasons which we can't expect users to understand. The workaround of opening up the .odt package and editing XML files is something we can't expect ordinary users to accept.

Part of what makes this problem harder to understand is the behaviour of the Table… Number Format… menu item when multiple cells are selected. When I selected cells where some were formatted as "Text" and some as "Number" "General", the Number Format displayed was "Text" only. This lulled me into thinking that all the cells were formatted as Text. But when I selected cells which were all formatted as "Number" "General", then Number Format displayed the format "Number" "General".

My table happened to have "Text" format for column A, and "Number" "General" format for remaining columns. When I selected the entire table, it looked like it was entirely formatted as "Text". That was incorrect.

Thus I suspect that the Number Format UI doesn't display clearly when a selection of cells have different formats. It should. That would be the subject of a different bug, an enhancement request.
Comment 21 Jim DeLaHunt 2020-12-17 08:40:05 UTC
For the record, I tried to make the repair process simpler by saving my document as a flat XML document file (.fodt), and modifying the XML in that file. Unfortunately, LibreOffice replaced the non-numeric cell contents by "0" when saving to the flat XML document. It was not possible to repair after that. 

I also tried opening the document suffering from this problem, selecting the non-numeric table cells displaying "0" values, selecting the Table… Number Format… menu item, and setting the format to Text. That did not restore the original non-numeric values to the cells. 

Thus, it appears that the only way to repair a document suffering from this problem is to uncompress the .odt document package. 

Also, anyone trying to repair documents of theirs affected by this problem might be interested in the following Q&A threads. I'm attempting to gather ideas for the best way to perform each step of the repair.

How can I uncompress a LibreOffice document to get its XML internals, then make a new document from them?
https://ask.libreoffice.org/en/question/282816/how-can-i-uncompress-a-libreoffice-document-to-get-its-xml-internals-then-make-a-new-document-from-them/

How to use XSLT to modify a LibreOffice document at XML level?
https://ask.libreoffice.org/en/question/282817/how-to-use-xslt-to-modify-a-libreoffice-document-at-xml-level/

What XSLT copies an XML file, deleting one attribute and modifying the value of another?
https://stackoverflow.com/questions/65318418/what-xslt-copies-an-xml-file-deleting-one-attribute-and-modifying-the-value-of/65318627
Comment 22 Timur 2021-01-26 11:26:54 UTC
*** Bug 139889 has been marked as a duplicate of this bug. ***
Comment 23 Tom 2021-02-04 13:45:31 UTC
Still problem in 7.0.4.2 - table cell values change to "0" after a save & reload cycle.

However, the problem seems to be only with tables that have a Table Style applied and/or those that have cell format applied. Plain "vanilla" tables with unedited formatting are saved as expected.

Steps to reproduce:

1. Open a new Writer document
2. Insert a table (3x4)
3. Populate cells with some text (e.g. A1, A2, A3, B1, B2, B3, ...)
4. Save & Reload
* PASS: all OK 

5. Apply "Elegant" Table Style to the table 
6. Save & Reload
* FAIL: this time cells that were not blank were saved as "0" (apart from the values in first row & first column which were preserved in my case).

7. Edit the cells again
8. Save & Reload
* PASS: after re-editing the values it was saved as expected

9. Now apply another ("Financial") style to the table
10. Save & Reload
* FAIL: the cells are again forced to "0" (again, all apart from the first row & the first column).

Note that when this happens, the original data is lost since the affected cells are saved with '0' as the value. The worrying thing is that even a backup file won't help much because this happens silently. This means, the user may not be even aware that parts of the document (some table cells) have been silently replaced with zeros until the next time she/he is going through all the tables in the affected document.

Just thinking: shouldn't LO at least warn that there was a problem converting values (text to number in the above case)? I don't feel this should be allowed to happen silently, without some sort of warning. Imagine this happening to a commercial document, an invoice, a terms sheet, etc.
Comment 24 Tom 2021-02-06 11:53:24 UTC
Following up on my previous comment #23, I just edited a table in an old document that had a custom Table Style applied. After adding several new rows, saving and re-opening the document, all new as well as previous rows were replaced with '0'.

I thought I will give it a go and see if anything has improved in the daily version (LibreOfficeDev-7.2.0.0.alpha0_2021-02-04-x86_64.AppImage), unfortunately the result is exactly the same - data loss in tables.

I am attaching 4 files that capture the results using the steps in comment #23:

Steps 1-4: LO720-Table_test_bug_131025_Result1.odt
Steps 5-6: LO720-Table_test_bug_131025_Result2.odt
Steps 7-8: LO720-Table_test_bug_131025_Result3.odt
Steps 9-10: LO720-Table_test_bug_131025_Result4.odt

If anyone would like to repeat the test, I suppose one could start with the first file (LO720-Table_test_bug_131025_Result1.odt) and then use it to execute the remaining steps 5..10.

Hope this helps.

In my opinion, this bug should be *critical* since the Writer is silently modifying data (text content) entered by the user and there is no other way to find out about it than eyeballing each single table in a document every time the document is saved, and even then it's too late as the data has already been lost (not just those cells that have been recently edited, but potentially also other cells that previously saved OK).
Comment 25 Tom 2021-02-06 11:59:11 UTC
Created attachment 169525
ODT documents capturing results from the test scenario described in comment #23
Comment 26 Tim Nelson 2021-02-13 00:13:30 UTC
Just adding my vote to those who say that this bug should be critical.  I lose data to this regularly.
Comment 27 digg33 2021-02-14 22:09:10 UTC
Sorry for adding noise to bug reports, but I too would like to see this given critical priority. As noted previously, the data loss is only noticed after the file is saved, with no means to recover afterwards. For something as commonly used as a table, this could have disastrous consequences. What if a business report full of data in a table was being made and suddenly all the data were to disappear?

As tables are such an integral part of word processing, a lot of users stand to benefit from fixing this bug. I hate having to ask open source developers much, but please understand that not being able to correctly use something as essential as a table in a word processor could potentially lose LibreOffice many users (especially if, say, an organisation were to be affected by it) and possibly harm its reputation.

Thanks,
digg33
Comment 28 Timur 2021-02-15 14:24:43 UTC
Upon multiple calls, let's call it critical - but it will still need a dev.
Comment 29 hgp 2021-02-15 17:02:39 UTC Comment hidden (off-topic)
Comment 30 Eike Rathke 2021-02-15 17:44:37 UTC Comment hidden (off-topic)
Comment 31 hgp 2021-02-15 18:42:01 UTC Comment hidden (off-topic)
Comment 32 Tom 2021-02-16 09:25:07 UTC
In an attempt to pinpoint the earliest affected release, I did some additional tests on earlier builds (using LibreOffice AppImages from https://libreoffice.soluzioniopen.com/old-versions/).

Following the scenario from my earlier comment #23, I was able to reproduce this with 6.0.0.3 (LibreOffice-6.0.0-x86_64.AppImage, Version: 6.0.0.3, Build ID: 64a0f66915f38c6217de274f0aa8e15618924765).

In 5.4.7, changing the number formatting / table style, results in table cells being immediately overwritten with 0's - this part was resolved (bug #106322) and the fix landed in 6.0.0.x.

Unfortunately, it appears that since 6.0.0.3 something else happens behind the scenes when the document is being saved to an ODT file, and which results in data loss that we experience now. Note: saving as RTF or DOCX is fine so perhaps this could be another clue as to which part of the code may be the culprit?

Btw, I've now updated the 'earliest affected' to 6.0.0.3
Comment 33 Mike Kaganski 2021-02-16 09:34:08 UTC
(In reply to Tom from comment #32)

The code pointers are in comment 7.
Comment 34 Mike Kaganski 2021-02-16 10:12:16 UTC
When a cell has a numeric format, and one types there something that does not convert to a number, the cell format is automatically converted to text (@). This makes sure that cell content (text) matches cell value type.

When a format is applied onto a cell with existing value, the format is applied, but (since the fix to tdf#106322) the value is retained in this session (and gets saved to ODF), so the problem arises (as mentioned in comment 7, it somehow does not affect "General" number format). It seems bad that the cell value type (and thus cell value) might differ from cell content type.

The standard says [1] that only for textual value type, the value is defined by cell content:

> If the value type is not string or if the <table:table-cell> element content
> differs from the value of the element, the corresponding Value Attribute(s)
> (Table 14 - Value attributes) shall contain the value(s) of the element.
> ...
> If the value type is string and the office:string-value attribute is not
> present, the element content defines the value.

So LO behaves correctly when opening the file: the 'office:value' attribute is authoritative in this case, and rightfully overrides the text contained in the cell.

It seems that the change is needed when a format is *applied* to a cell: it should try to convert the existing data, and if failed, keep the previous number format along with previous data. So if one applies a numeric format string, and there's "123" text, it successfully converts to number, and the format is accepted (and office:value becomes 123); if there were "abc", then the conversion fails, and the format stays '@'. This way, the value type, value, and content will always be in sync. The drawback is that when user applies a number format string to a range of cells, some of the cells might refuse that format, and keep older format (unexpectedly to user). Possibly that is acceptable inconvenience (which would need a documentation) compared to data loss.

[1] http://docs.oasis-open.org/office/OpenDocument/v1.3/OpenDocument-v1.3-part3-schema.html#attribute-office_value-type
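
To make the proposed rule concrete, here is a small Python model of it. This is not Writer code: the Cell class and apply_number_format function are invented for illustration, and float() merely stands in for Writer's locale-aware number parsing. The point is only that a format is accepted when the existing text converts to a number, and refused otherwise, so value type, value, and content cannot drift apart.

```python
TEXT_FORMAT = "@"


class Cell:
    """Toy model of a table cell (hypothetical, for illustration only)."""

    def __init__(self, text, number_format=TEXT_FORMAT, value=None):
        self.text = text
        self.number_format = number_format
        self.value = value  # None means the cell is textual


def apply_number_format(cell, new_format):
    """Apply new_format only if the cell text converts to a number.

    Returns True if the format was accepted, False if the cell kept
    its previous format and data.
    """
    try:
        # Stand-in for Writer's locale-aware conversion.
        number = float(cell.text)
    except ValueError:
        return False
    cell.number_format = new_format
    cell.value = number
    return True
```

With this rule a cell containing "123" accepts the new format and gets the value 123, while a cell containing "abc" stays at '@' with no numeric value. The boolean return value is where a caller could hook in a warning that some cells refused the format.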
Comment 35 Cris 2021-02-16 10:17:46 UTC
(In reply to Mike Kaganski from comment #34)

(...)
> value, and content will always be in sync. The drawback is that when user
> applies a number format string to a range of cells, some of the cells might
> refuse that format, and keep older format (unexpectedly to user). Possibly
> that is acceptable inconvenience (which would need a documentation) compared
> to data loss.

I think this would be a desirable behavior, as long as the user is warned of the fact that not all of the cells accepted the new format due to data conversion problems.

Just my 2cents
Comment 36 Jim DeLaHunt 2021-02-16 19:10:31 UTC
The circumstance when I ran into this problem involved:

1. A stub table of just a couple of rows in a LO Write document. The stub table had the right number of columns. I set up the character and paragraph formatting I wanted for the cells in that table. I didn't define Table… Number Format for the cells in that table, because the cells contained text not numbers.

2. I defined Table Styles for the rows of the stub table.

3. A data table in a LO Calc document. This table had the same number of columns, and many rows, and no formatting. 

4. In LO Calc, I selected and copied the data table cells.

5. In LO Write, I selected the data rows of the stub table (not the heading row), and pasted. LO Writer extended the table with as many rows as were in the data table cells on the clipboard. LO Writer applied the Table Styles to the rows of the table.  

6. After the paste, all the rows of the table in LO Write had the expected, text values. 

7. I saved and closed the LO Write document. LO Write gave no indication that it was changing any values.

8. I opened the LO Write document. Now, several text value cells had changed to 0 values.

So the important thing about this scenario is that the cell contents are inserted by a Paste operation, not by user text entry; and the cell's Number Formats are being applied as part of Table Styles, not by individual user action.

I would like a fix for this bug to also cover the Paste and Table Styles scenario.
Comment 37 Xisco Faulí 2021-03-08 09:05:16 UTC
*** Bug 135521 has been marked as a duplicate of this bug. ***
Comment 38 Timur 2021-03-08 09:46:20 UTC
*** Bug 137371 has been marked as a duplicate of this bug. ***
Comment 39 LG 2021-03-09 11:46:27 UTC
In my little case (#137371) and after reading this thread, I checked the data format of my faulty cells : all of them are numerics... I don't know why, I never used this function. 
After changing the format to text, everything is back to normal.
Comment 40 Dieter 2021-04-07 16:08:47 UTC
*** Bug 141176 has been marked as a duplicate of this bug. ***
Comment 41 Jim DeLaHunt 2021-05-01 22:40:09 UTC
I wrote a series of blog posts to explain how I fixed the corruption resulting from this bug in my LibreOffice document. It involves opening the OpenDocument archive file, and using XSLT, so I spread the explanation over four blog posts. Hopefully it can help others with tables corrupted by this bug. 

"How to fix table contents turned to “0” in LibreOffice"
http://blog.jdlh.com/en/2021/04/30/fix-libreoffice-table-turned-to-0/
This is the overview, and links to other blog posts to explain techniques.

The method I used to fix this is:

1. Open up the working copy of the OpenDocument file, following the instructions in my earlier blog, "How to crack open LibreOffice .ODT documents for fun and bug fixing" <http://blog.jdlh.com/en/2020/12/31/crack-open-odt-documents/>. The result is a directory containing XML and other files.

2. Copy the XML file content.xml out of the directory to a place where you can work on it. Name it content_corrupted.xml, or something similar.

3. Install the tool xsltproc or similar. See my earlier blog, "How to use XSLT to modify XML files inside .ODT documents" <http://blog.jdlh.com/en/2021/01/31/xslt-odt-documents/>, for an explanation of this tool, and how to use it with OpenDocument files.

4. Apply the XSLT stylesheet below to content_corrupted.xml, as shown by the sample command below. It creates content_repaired.xml .

5. Move the file content_repaired.xml back into the OpenDocument directory, naming it content.xml .

6. Turn the directory of XML and other files back into an OpenDocument file, following the further instructions in "How to crack open LibreOffice .ODT documents for fun and bug fixing".

7. Open the repaired OpenDocument file and verify that your table contents are restored.

The xsltproc command I used to fix the problem was more or less:

xsltproc -o content_repaired.xml repair.xslt content_corrupted.xml 

I have attached the XSLT stylesheet as file "Restore_odt_table_cell_text.xslt".
Comment 42 Jim DeLaHunt 2021-05-01 22:42:42 UTC
Created attachment 171580 [details]
XSLT Stylesheet to restore text values corrupted by this bug, see #41

This XSLT stylesheet attempts to reverse the corruption caused by this bug. It changes the number style of table cells to "text" and removes the incorrect numeric value of "0". See comment 41.
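
For readers who prefer not to use XSLT, the same idea can be sketched in Python. This is not the attached stylesheet, only a rough equivalent under stated assumptions: that a corrupted cell is a table:table-cell carrying office:value="0" while its visible paragraph text is not a number, and that setting office:value-type to "string" is an acceptable way to mark it as text. The function name repair_cells is hypothetical.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def looks_numeric(s):
    """True if the displayed text parses as a number ('.' or ',' decimals)."""
    try:
        float(s.replace(",", "."))
        return True
    except ValueError:
        return False


def repair_cells(root):
    """Mark falsely numeric cells as text; return how many were changed."""
    fixed = 0
    for cell in root.iter(f"{{{TABLE}}}table-cell"):
        if cell.get(f"{{{OFFICE}}}value") != "0":
            continue
        # Visible text of the cell's own paragraphs (direct children only).
        shown = "".join(
            "".join(p.itertext()) for p in cell.findall(f"{{{TEXT}}}p")
        ).strip()
        if shown == "" or looks_numeric(shown):
            continue  # empty or genuinely numeric: leave untouched
        # Assumption: the cell was text that got mislabelled as the number 0.
        del cell.attrib[f"{{{OFFICE}}}value"]
        cell.set(f"{{{OFFICE}}}value-type", "string")
        fixed += 1
    return fixed


# Small made-up sample: one corrupted text cell and one genuine zero.
SAMPLE = f"""<table:table xmlns:table="{TABLE}" xmlns:office="{OFFICE}" xmlns:text="{TEXT}">
  <table:table-row>
    <table:table-cell office:value-type="float" office:value="0"><text:p>Apples</text:p></table:table-cell>
    <table:table-cell office:value-type="float" office:value="0"><text:p>0</text:p></table:table-cell>
  </table:table-row>
</table:table>"""
root = ET.fromstring(SAMPLE)
changed = repair_cells(root)
```

Note that ElementTree does not keep the original namespace prefixes when it writes a tree back out, so for a real content.xml a prefix-preserving tool (xsltproc as in comment 41, or lxml) is probably the safer choice.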
Comment 43 Timur 2021-05-12 11:03:30 UTC
*** Bug 142202 has been marked as a duplicate of this bug. ***
Comment 44 Chandanathil P. Geevan 2021-05-12 15:18:49 UTC
(In reply to Joao Carvalho from comment #11)
> Shouldn't the importance of this bug be changed to "major" or "critical"?
> 
> Users may lose data because of this bug, because the data loss is only
> noticed after the file is closed and then opened again. At that point, you
> can't "undo" to recover the data you lost...

It is a critical flaw and a huge problem. Until it is resolved, the best temporary solution is to remove the Table Autoformat Styles and keep only default and none, both of which are trouble-free.
Comment 45 Chandanathil P. Geevan 2021-05-12 15:27:21 UTC
How this bug manifests varies:
1) It is not the same in all Table Autoformat Styles. Some of those with a number format as the default have the problem: everything seems fine when the style is applied, but text becomes '0' after save/close/reopen.

2) When a row is inserted, entries in the added cells do not have any problem.

3) When a blank table is inserted with one of the predefined Table Formats and entries are added later, the problem does not occur.
----------

That said, until the problem is resolved, please remove these options. Let the distribution be made without the new Table Autoformat Styles.
Comment 46 Geoff 2021-08-04 13:12:03 UTC Comment hidden (obsolete)
Comment 47 Geoff 2021-08-06 13:12:35 UTC
I encountered the lost data in cells bug on a Windows 10 machine. Wordpad, provided with earlier versions of Windows, is still available in Windows 10.

I opened the file in Wordpad, which displayed a message "Wordpad does not support all the features of this document's format. Some content might be missing or displayed improperly." It worked in my case. I then saved the file under a new name as OpenDocument Text, opened the new file in LibreOffice Writer, saved it, then reopened it. It seems to have recovered. I cannot say what is going on in technical terms.

LibreOffice Version: 7.0.6.2 (x64)
Build ID: 144abb84a525d8e30c9dbbefa69cbbf2d8d4ae3b
CPU threads: 12; OS: Windows 10.0 Build 19042; UI render: Skia/Vulkan; VCL: win
Locale: en-GB (en_GB); UI: en-US
Calc: CL

For systems other than Windows, there may be an alternative to Wordpad that can provide recovery. Otherwise the file could be transferred to a Windows 10 machine and hopefully recovered there.
Comment 48 Justin L 2021-10-20 08:49:18 UTC
*** Bug 142539 has been marked as a duplicate of this bug. ***
Comment 49 Justin L 2021-10-21 18:17:08 UTC
I have an attempted patch at https://gerrit.libreoffice.org/c/core/+/123904, but it might be a bit heavy-handed.

I thought it might be an export problem (since it only seems to affect certain locales using the Standard numbering format and not the General format), but in both cases it wrote out what we see in comment 12 etc.

An easier workaround to fix it (and a clue that this might actually be an import bug) is to edit styles.xml and remove the language/country attributes, e.g.
    number:language="ru" number:country="RU"
from the <number:number-style> element.
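
The styles.xml edit can be sketched in Python as follows. This is only an illustration of the workaround described above, not part of any patch: the function name strip_number_style_locale is hypothetical, and the sample XML is made up. It removes number:language and number:country from every number:number-style element it finds.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "number:" prefix in ODF files.
NUMBER = "urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0"


def strip_number_style_locale(root):
    """Drop language/country from number-style elements; return count changed."""
    changed = 0
    for style in root.iter(f"{{{NUMBER}}}number-style"):
        removed = [
            style.attrib.pop(f"{{{NUMBER}}}{attr}", None)
            for attr in ("language", "country")
        ]
        if any(value is not None for value in removed):
            changed += 1
    return changed


# Made-up fragment resembling the attributes quoted above.
SAMPLE = f"""<styles xmlns:number="{NUMBER}">
  <number:number-style number:language="ru" number:country="RU">
    <number:number number:min-integer-digits="1"/>
  </number:number-style>
</styles>"""
root = ET.fromstring(SAMPLE)
changed = strip_number_style_locale(root)
```

As with any ElementTree round trip, namespace prefixes are not preserved on output, so editing the two attributes by hand in a text editor may be the simpler route for a single document.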
Comment 50 Justin L 2021-10-22 18:04:17 UTC
*** Bug 119377 has been marked as a duplicate of this bug. ***
Comment 51 Justin L 2021-10-22 19:42:40 UTC
(In reply to Justin L from comment #49)
> A clue that this might actually be an import bug

Well, it took 3 days of digging just to unearth the details, but I think this is a more consistent/less dangerous way to resolve this bug.

http://gerrit.libreoffice.org/c/core/+/124080

-bug 136730: writer_table_insert_column_bug.odt  (fixed)
-bug 142202: table.odt (fixed)
-bug 142202: Table-test1A-save-close-reopen.odt (couldn't reproduce)
-bug 142539: Test_1.odt (fixed)
-bug 119377: Sample44.odt (fixed)
-bug 119377: bug119377.odt (fixed)
-bug 133611: ABCV3.odt (fixed)
-bug 133732: fruit test.odt (fixed)
-bug 137977: 5 Luminous Cities.odt (fixed)
-bug 137977: bug report.odt (fixed)
-bug 137977: bug report0.odt (fixed)
Comment 52 Commit Notification 2021-10-23 12:52:26 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/3e1d316734354c6b49696c8904e0fc431cfb5143

tdf#131025 ODF import: recognize SV_COUNTRY_LANGUAGE_OFFSET

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 53 Commit Notification 2021-10-25 08:17:59 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/d3b4ef0f7726ef1619717d9e3327963ceb4c065a

tdf#131025 ODF import: recognize SV_COUNTRY_LANGUAGE_OFFSET

It will be available in 7.2.3.

Comment 54 Justin L 2021-10-25 08:42:00 UTC
This bug should be fixed now. Previously broken documents will import the text now (as long as they were only saved once).


The alternative/companion patch from comment 49 is still being considered. It would help prevent the dilemma of whether non-number text should be considered a number or straight text.