Bug 105840 - FILESAVE XLS: file size increases to 21 MB after re-saving a particular Calc document
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 5.2.0.0.alpha0+
Hardware: All All
Importance: medium normal
Assignee: Bartosz
URL:
Whiteboard: target:5.4.0 target:5.3.1 target:5.2.6
Keywords: bibisected, bisected, filter:xls, regression
Duplicates: 105716 105928 106104 106589
Depends on:
Blocks:
 
Reported: 2017-02-07 18:17 UTC by Justin L
Modified: 2017-05-18 16:33 UTC

See Also:
Crash report or crash signature:


Attachments
filesizeBug.xls: when saved by LibreOffice 5.2, size grows to 20MB (6.00 KB, application/vnd.ms-excel)
2017-02-07 18:17 UTC, Justin L
Original .xlsx file which causes issues with LibreOffice (8.86 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2017-02-08 11:11 UTC, Bartosz
00-PRS-Orientation.xls: original, complex sample document (520.00 KB, application/vnd.ms-excel)
2017-02-10 14:14 UTC, Justin L
Minimal .xlsx file with the zeroHeight parameter, which causes a dramatic increase in size (14.41 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2017-02-10 16:29 UTC, Bartosz

Description Justin L 2017-02-07 18:17:12 UTC
Created attachment 130996
filesizeBug.xls: when saved by LibreOffice 5.2, size grows to 20MB

A regression in 5.2 or 5.3 causes this 6.7 KB XLS file to grow to 21 MB when it is re-saved. Saving also takes a couple of minutes.

Using the Linux daily bibisect-52 repository, the document saves at a normal size up to the point where the original document can no longer be opened. The exact commit on that day could not be determined.
daily bibisect: ca5013c262274bd546f4cdc61622e6bf3a4329f5 is the first bad commit
5.2 master at that time:  author Norbert Thiebaud 2016-03-12 20:13:35
  commit 02de3a5206c7633d62ebc43edad37747e2c7a1de
  vcl graph: stop abusing a pointer for a bool

I couldn't open the document again until 5 months later in daily bibisect-53.
There are only 'skip'ped commits left to test. The first bad commit could be any of: c9f6e8694b6b58f421c843f5c55e58ff5b6e314f a7f2154ea5cf13f74acf57e6ae4baad5427ab7f5 fe9c60dd98b4b2e4a12bdde37c029d5a54792ac5 05b36582da87aa073ece143551b279ae00fab5ca d84f59360b9facd12144cab6f23191bc8781b6ef a7e3e7008c4f8aa164590a42ce8d2cd3e46d488a
somewhere around: 2016-08-03: source-hash-1b52171752d5e4f9fc101a8bc15f6feb6599aaa2

Saving to .xlsx format gives a clue. In early 5.2, the dimension was ref="A2:A1" with two row entries:
    <row r="2" s="2" customFormat="true" ht="12.75" hidden="false" customHeight="false" outlineLevel="0" collapsed="false"></row>
    <row r="1048576" customFormat="false" ht="12.75" hidden="true" customHeight="false" outlineLevel="0" collapsed="false"></row>

In late 5.3, it was <dimension ref="A1"/> and one million rows were defined, ending again with <row r="1048576" customFormat="false" ht="15" hidden="true" customHeight="false" outlineLevel="0" collapsed="false"></row>
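
The comparison above can be repeated on any saved .xlsx file by counting the `<row>` elements in each worksheet part. The following is a minimal Python sketch, not part of LibreOffice; the function name `summarize_sheets` is hypothetical, and it assumes worksheet parts live under `xl/worksheets/` inside the archive, which is the usual layout.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts in .xlsx files.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def summarize_sheets(xlsx):
    """Return {worksheet part name: (dimension ref, number of <row> elements)}.

    `xlsx` may be a path or a binary file-like object.
    """
    summary = {}
    with zipfile.ZipFile(xlsx) as archive:
        for name in archive.namelist():
            # Assumption: worksheets are stored as xl/worksheets/<name>.xml.
            if not re.fullmatch(r"xl/worksheets/[^/]+\.xml", name):
                continue
            dimension, rows = None, 0
            with archive.open(name) as part:
                for _event, elem in ET.iterparse(part, events=("end",)):
                    if elem.tag == NS + "dimension":
                        dimension = elem.get("ref")
                    elif elem.tag == NS + "row":
                        rows += 1
                        elem.clear()  # discard the cells of rows already counted
            summary[name] = (dimension, rows)
    return summary
```

A sheet that reports a dimension of "A1" together with roughly a million rows matches the late 5.3 behaviour described above.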
Comment 1 Xisco Faulí 2017-02-07 21:37:02 UTC
Regression introduced by:

author	Bartosz Kosiorek <gang65@poczta.onet.pl>	2016-06-17 14:21:06 (GMT)
committer	Markus Mohrhard <markus.mohrhard@googlemail.com>	2016-06-22 23:04:47 (GMT)
commit 228c25fd17727660a3372307e3f73dbcff5e71d2 (patch)
tree 9ba8688e731c96677288236d1d3dacf4ba29aaae
parent 92cee94a262a3a2f43c87bb940c50cb90a2ebd89 (diff)
tdf#98106 Preserving hidden and empty rows after xlsx export

Adding Cc: to Bartosz Kosiorek
Comment 2 Bartosz 2017-02-08 06:28:35 UTC
Is this bug reproducible with the .xlsx format?
If yes, please attach the file to the bug report.
Comment 3 Justin L 2017-02-08 08:22:44 UTC
This looks fixable by comparing the hidden state with that of the previous row, something like
  ( bHidden != rDoc.RowHidden(nFrom - 1, nScTab) ) ||

Perhaps that also needs to be done with the other items that were recently added?
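
The idea of comparing each row with the previous one can be illustrated outside the exporter. The following Python sketch is only an illustration under stated assumptions and is not the code of the eventual patch: the function `rows_to_export` and its input (a plain list of per-row hidden flags) are hypothetical. It picks the most common hidden state as the sheet default and reports only the runs of rows that differ from it, so a sheet whose rows are almost all hidden needs very few row records.

```python
from collections import Counter
from itertools import groupby


def rows_to_export(hidden_flags):
    """Return (default_hidden, runs) for a sheet's per-row hidden flags.

    `runs` lists (first_row, last_row, hidden) with 1-based row numbers and
    contains only the runs whose state differs from the chosen default.
    """
    if not hidden_flags:
        return False, []
    # Use the most common state as the sheet default so it needs no row records.
    default_hidden = Counter(hidden_flags).most_common(1)[0][0]
    runs = []
    row = 1
    # groupby starts a new group wherever a row differs from the previous row.
    for hidden, group in groupby(hidden_flags):
        length = sum(1 for _ in group)
        if hidden != default_hidden:
            runs.append((row, row + length - 1, hidden))
        row += length
    return default_hidden, runs
```

For two visible rows followed by a long tail of hidden rows, the sketch returns a hidden default and a single visible run, instead of one entry per hidden row.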
Comment 4 Xisco Faulí 2017-02-08 09:59:00 UTC
Please, do not change the status to NEEDINFO once it has been confirmed.
Comment 5 Xisco Faulí 2017-02-08 10:18:55 UTC
*** Bug 105716 has been marked as a duplicate of this bug. ***
Comment 6 Bartosz 2017-02-08 11:11:56 UTC
Created attachment 131008
Original .xlsx file which causes issues with LibreOffice
Comment 7 Bartosz 2017-02-08 12:09:52 UTC
In the attached files, all columns are hidden by default:
    <cols>
        <col min="1" max="16384" width="0" style="1" hidden="1" />
    </cols>

Calc does not recognize this and tries to save all rows, even the empty ones.
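
To check whether another file has the same "everything hidden" column definition, the `<cols>` fragment can be inspected directly. The following is a small Python sketch; the helper names `hidden_column_count` and `all_columns_hidden` are hypothetical, and the limit of 16384 columns is taken from the max attribute quoted above.

```python
import xml.etree.ElementTree as ET

MAX_COLUMNS = 16384  # column limit of an .xlsx sheet, as in max="16384" above


def hidden_column_count(cols_xml):
    """Count the columns marked hidden by the <col> entries of a <cols> fragment."""
    hidden = set()
    for elem in ET.fromstring(cols_xml).iter():
        # Compare local names so the sketch works with or without a namespace.
        if elem.tag.rsplit("}", 1)[-1] != "col":
            continue
        if elem.get("hidden") in ("1", "true"):
            first, last = int(elem.get("min")), int(elem.get("max"))
            hidden.update(range(first, last + 1))
    return len(hidden)


def all_columns_hidden(cols_xml):
    """Return True when the fragment hides every column of the sheet."""
    return hidden_column_count(cols_xml) == MAX_COLUMNS
```

Applied to the fragment quoted in this comment, `all_columns_hidden` returns True.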
Comment 8 Bartosz 2017-02-10 01:28:13 UTC
I created an initial review showing how I would like to resolve it:
https://gerrit.libreoffice.org/#/c/34111/
Comment 9 Justin L 2017-02-10 14:08:16 UTC
(In reply to Xisco Faulí from comment #1)
> Regression introduced by:
> tdf#98106 Preserving hidden and empty rows after xlsx export

The tdf#98106 commit has been reverted on the libreoffice-5-2 branch:
revert tdf#98106 Preserving hidden and empty rows after xlsx export
It will be available in 5.2.6.
http://cgit.freedesktop.org/libreoffice/core/commit/?id=3e67dc9dbbd802dd82b92304098aaa44e70c014c&h=libreoffice-5-2

I am still working with Bartosz on a fix for 5.3, but for the 5.2.x series the change will just be reverted.
Comment 10 Justin L 2017-02-10 14:14:18 UTC
Created attachment 131077
00-PRS-Orientation.xls: original, complex sample document
Comment 11 Bartosz 2017-02-10 16:29:19 UTC
Created attachment 131081
Minimal .xlsx file with the zeroHeight parameter, which causes a dramatic increase in size
Comment 13 Xisco Faulí 2017-02-12 11:34:19 UTC
*** Bug 105928 has been marked as a duplicate of this bug. ***
Comment 14 Commit Notification 2017-02-14 01:16:07 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1cde2eb9d128c9b1b658b1380074461429ab2214

tdf#105840 EXCEL export: fixes for hidden defaultRow

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 Bartosz 2017-02-14 01:29:41 UTC
@Justin Could you please backport this fix to LO 5.3?
Comment 16 Commit Notification 2017-02-15 16:06:16 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "libreoffice-5-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=7003415978b162bdd9f84d3e2ea0d05e5599137a&h=libreoffice-5-3

tdf#105840 EXCEL export: fixes for hidden defaultRow

It will be available in 5.3.1.

Comment 17 Mark Mclean 2017-02-15 18:29:59 UTC
Well. I think I am to be here again?? I loaded 5.3.0.3. 
My ods file was OK, looked good, I entered date, time, saved, 3 times, worked good.
xls file opened OK, looked good, entered date, time, saved, it took a long time saving, and went from 3.7 to 150.2mb.
Opened xls 2 more times, blank. I took screen shots.
Comment 18 Justin L 2017-02-15 18:39:57 UTC
(In reply to Mark Mclean from comment #17)
> Well. I think I am to be here again?? I loaded 5.3.0.3. 
> My ods file was OK, looked good, I entered date, time, saved, 3 times,
> worked good.
> xls file opened OK, looked good, entered date, time, saved, it took a long
> time saving, and went from 3.7 to 150.2mb.
> Opened xls 2 more times, blank. I took screen shots.

Hi Mark,
   We know that it didn't work in 5.3.0. The fix from today goes into 5.3.1, which will be released in about a month, and also into 5.2.6, which will similarly be released in about a month. Until then, you will need to use 5.2.4. Sorry for the trouble.
Comment 19 Xisco Faulí 2017-02-20 13:26:27 UTC
*** Bug 106104 has been marked as a duplicate of this bug. ***
Comment 20 Xisco Faulí 2017-03-17 10:54:42 UTC
*** Bug 106589 has been marked as a duplicate of this bug. ***
Comment 21 Artem 2017-03-17 11:12:27 UTC
I confirm the problem is fixed in 5.3.1.