Bug 100458 - Data loss on loading Calc document with hidden 0
Summary: Data loss on loading Calc document with hidden 0
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.0.3.3 release
Hardware: All Linux (All)
: medium critical
Assignee: Jan-Marek Glogowski
URL:
Whiteboard: target:5.3.0
Keywords: bibisected, regression
Depends on:
Blocks:
 
Reported: 2016-06-17 15:00 UTC by Jan-Marek Glogowski
Modified: 2017-10-23 16:24 UTC (History)
1 user (show)

See Also:
Crash report or crash signature:


Attachments
LO Calc looses the 0 in A1 on load (8.11 KB, application/vnd.oasis.opendocument.spreadsheet)
2016-06-17 15:00 UTC, Jan-Marek Glogowski
Details
Same document with re-added 0 saved on master (9.89 KB, application/vnd.oasis.opendocument.spreadsheet)
2016-06-17 15:11 UTC, Jan-Marek Glogowski
Details
Diff of content.xml, except all the style changes (1.97 KB, patch)
2016-06-17 15:11 UTC, Jan-Marek Glogowski
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jan-Marek Glogowski 2016-06-17 15:00:29 UTC
Created attachment 125708 [details]
LO Calc looses the 0 in A1 on load

To reproduce:
 * load the document.

Result:
 * cell A1 is empty
 * cell C1 contains the string "ist leer" AKA "is empty"

Expected:
 * cell A1 contains a hidden 0
 * cell C1 contains the string "enthält 0" AKA "contains 0"

The attached document was produced with OOo 3.2.1.

Bibisect release repo pointed to: 1182cffff1081b6907d4568a093ecec2411e20da is the first bad commit
commit 1182cffff1081b6907d4568a093ecec2411e20da
Author: Matthew Francis <mjay.francis@gmail.com>
Date:   Tue Apr 14 22:42:38 2015 +0800

    libreoffice-4.0.3.1
    
    LibreOffice_4.0.3.1_Linux_x86-64_deb.tar.gz

Bibisect of repo 43all pointed to:
 89740762f0af849e492932bd71e59149cdcd5a00 is the first bad commit
commit 89740762f0af849e492932bd71e59149cdcd5a00
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Mon Dec 10 01:57:45 2012 +0000

    source-hash-06f20d73da21342046a480a6b22af69901351328
    
    commit 06f20d73da21342046a480a6b22af69901351328
    Author:     Stephan Bergmann <sbergman@redhat.com>
    AuthorDate: Fri Jul 20 11:10:05 2012 +0200
    Commit:     Stephan Bergmann <sbergman@redhat.com>
    CommitDate: Fri Jul 20 11:10:05 2012 +0200
    
        Fix SAL_LOG area usage
    
        Change-Id: If8acc5e9fee2730796637dfb505e0c514f96f1a3

$ git log b0f170d7df9cff12535d2ecfd146b32b745a8ef8~1..06f20d73da21342046a480a6b22af69901351328 --pretty=oneline | wc -l
65

$ git log b0f170d7df9cff12535d2ecfd146b32b745a8ef8~1..06f20d73da21342046a480a6b22af69901351328 --pretty=oneline sc/ b37bcdded66ae82c5677fe60d8347b87281f3fa2 Prefer prefix ++/-- operators for non-primitive types
 61c39eae570d6d6040b65bfe93127b30e6080cc8 Display blank cell instead of zero in matrix cells with blank text result abdffd2e442356782ee6bd947936afc7fb9e4287 Add test case for matrix reference cells with blank text result

^^^ those look promising
 15929afc1a0da30ece4b193f4d2cf327886eb440 Revert "Let's see if stripping the const here will fix the clang build..." 52022b5f82385b28687bf46424d0d24be1964a4a Let's see if stripping the const here will fix the clang build... ce8ecf1d343224a6db823860e4dcf46d2dfe5654 Specify function object as template argument rather than local variable. 4c1d32263e580d41e4b8fea58aab6588f7579e96 This should be bool. 9f36df7943127d82aba46590aeebd8abec110d70 Added comment. 7ba5f709030a78616d2331bd3894752c99fbc1fa Pass contiguous cell data as an array to matrix. This is faster. ed5d2c652cd053cd10f54fb028f513f20d6c6086 Less indentations via early bailout. 32d9ed46f6075c66e233b3c8f8036beceb60d87c Slightly more efficient compareMatrix. 8a7ba10c71229f10ee56b263aca4174d7a856f6c It's now faster to start empty and fill non-empty elements. 229d86b238186e75621b6f617ebc4ee29756324b Use macro to define callbacks. 93c8b68ba62e392db6af85d0663e2ff31201eb22 Reorganized code to remove redundant calls to get_type(). 96610e8ed700f73592c65bee94d75d6fa9f9de96 Some cleanup and comments. e0704701ded22352f058acde6fe5d36bbd278d47 Turns out that C++03 doesn't support use of static double inside template. d3629a3942b681fcff006967c50394b3a9c344ee It's no longer possible to unionize value and string here... d13804609737114f30dbc7580e4bc76e66cb02a8 We no longer need density types. 891e5892b2798a6bc326a688ed50e79a4a9d336b This is no longer true. 2789bcb41b9b71551f2acd5e2d9df959c599a608 Support resizing matrix with a default value. c79305f2a1218a3cbf74c7db3c9745429e8b7883 For sum product, the initial accumulator value should be 1, not 0. 608fdfe12329094604d9424f1225e23d94ac7537 Don't forget to initialize the flag matrix. cc72060199d83e7d919edc3b1a295dc9f8f8b97b A little cleanup. 0ddcb7a07a42294200b86dac7e67a4c82e03fcf4 Implement Sum(), SumSquare(), etc... b4c774a51fab40569abf6298789f6dedfbebdd12 Now, we need to explicitly pass 0.0 as the initial value of a matrix. c31905c88ca6160a2c12565e036f5aa02a5be086 Initial cut on matrix backend swapping. Still lots of things to fix. b7dbc768a71ccfb567e3b2979e57d0d1318977cf resolved fdo#52205 do not force all text cells in CSV import
Comment 1 Jan-Marek Glogowski 2016-06-17 15:11:03 UTC
Created attachment 125710 [details]
Same document with re-added 0 saved on master
Comment 2 Jan-Marek Glogowski 2016-06-17 15:11:39 UTC
Created attachment 125711 [details]
Diff of content.xml, except all the style changes
Comment 3 Jan-Marek Glogowski 2016-06-17 18:49:45 UTC
 61c39eae570d6d6040b65bfe93127b30e6080cc8 Display blank cell instead of zero in matrix cells with blank text result

actually breaks the import.

Couldn't really prove it, as I wasn't able to build 4.0 to do a bisecting, but reading the commit message makes it quite clear.

And since in the content.xml diff "calcext:value-type" is set to "float" for current saved document from Calc on master and the current code runs into 

void ScXMLTableRowCellContext::EndElement()
{
    HasSpecialCaseFormulaText();
    if( ... )
    {
        CellType = util::NumberFormat::TEXT;

in debugrun for the original document, breaking the import.

Commenting the "CellType = util::NumberFormat::TEXT;" line fixes the import, but is definitly wrong. I still try to understand, what the code is expected to do.

There is this comment, which I think is exactly my use case

// If a matrix formula has a matrix reference cell that is intended to have
// a blank text result, the matrix reference cell is actually saved(export)
// as a float cell with 0 as the value and empty <text:p/>.
// Import works around this by setting these cells as text cells so that
// the blank text is used for display instead of the number 0.
Comment 4 Jan-Marek Glogowski 2016-06-17 18:51:13 UTC
Switch "commit message" in previous comment with code, especially the quoted code comment.
Comment 5 Jan-Marek Glogowski 2016-06-17 19:20:29 UTC
Ok - probably commenting

 CellType = util::NumberFormat::TEXT;

is actually correct … in a way.

I don't know, why we ever want to forcefully change the cell type.

Probably we should just change the type, if the current cell type is undefined, so we get

if (util::NumberFormat::UNDEFINED == nCellType)
    CellType = util::NumberFormat::TEXT;

That code survives: sc.check, sc.slowcheck and sc.subsequentcheck

And it fixes my bug.
Comment 6 Jan-Marek Glogowski 2016-06-17 19:55:56 UTC
To create the initial document:

1. Use OOo 3.2.1
2. Put 0 in cell A1
3. Change cell format to 'user-defined', value:
 [>0]Standard;[<0]"";""
 => The value 0 is hidden
4. Evaluate cell A1 via C1 by using 
 =IF(A1="";"is empty";"contains 0")
5. Save document.

Load in any LO >= 4.0.3
Comment 7 Jan-Marek Glogowski 2016-06-23 11:31:47 UTC
So my latest Gerrit patch @ https://gerrit.libreoffice.org/#/c/26435/ simply removes this line. At the end I still don't know the expected behaviour.

Commit 61c39eae570d6d6040b65bfe93127b30e6080cc8 introduced the overwriting cell type as a workaround.

// If a matrix formula has a matrix reference cell that is intended to have
// a blank text result, the matrix reference cell is actually saved(export)
// as a float cell with 0 as the value and empty <text:p/>.

This is the same in my document AKA 
 <table:table-cell office:value-type="float" office:value="0">
   ... annotation
   <text:p/>
  </table:table-cell>

And strangely this overwriting was just done exactly for this case, with the quoted comment in front of the code:

+    if( rStr.isEmpty() )
+    {
+        rnCellType = util::NumberFormat::TEXT;

The current code expands the overwriting to a broader range of cases (sc/source/filter/xml/xmlcelli.cxx:1511), and this might also be wrong. Actually my document works correctly _without_ any special handling of cell number type, just by resetting the text, and breaks _with_ the overwrite.

// Import works around this by setting these cells as text cells so that
// the blank text is used for display instead of the number 0.

This reads as we already fix the import problem with a workaround, and if the comment is correct, it's not needed any more, which is proved by my error case.
Comment 8 Jan-Marek Glogowski 2016-06-23 13:42:09 UTC
As bubli noticed, and I already forgot, there was also a commit for a unit test, which still seems to pass (I have to check, if the code is still there, to be sure).
 abdffd2e442356782ee6bd947936afc7fb9e4287 Add test case for matrix reference cells with blank text result
Comment 9 Commit Notification 2016-07-04 03:36:11 UTC
Jan-Marek Glogowski committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=21a3d78cf080dc4d86edab2a7378055a2d848bfe

tdf#100458 Don't forcefully change cell type

It will be available in 5.3.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 10 Caolán McNamara 2016-07-07 20:19:58 UTC
So this is fixed in master given the above commit, right ?