Bug 107140 - FILESAVE XLSX Additional empty columns and larger filesize after deleting a column
Status: CLOSED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 4.1.0.4 release (earliest affected)
Hardware: x86-64 (AMD64) All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:xlsx
Depends on:
Blocks:
 
Reported: 2017-04-13 13:16 UTC by Michał Bultrowicz
Modified: 2021-03-24 04:11 UTC

See Also:
Crash report or crash signature:


Attachments
file for bug reproduction (47.72 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2017-04-13 13:17 UTC, Michał Bultrowicz

Description Michał Bultrowicz 2017-04-13 13:16:29 UTC
Description:
When I delete a column (say, column M in the first sheet) in the attached spreadsheet and save the file, the file grows in size, and when I inspect it with the openpyxl Python library I can see that it has many more columns.

Also, when I unzip the XLSX file and look at the first <row> of <sheetData> in xl/worksheets/sheet1.xml, there is now a strange new cell in a far-away column: <c r="AMJ1" s="0"/>.
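
The same check can be done without unzipping by hand. The following is only a sketch using the Python standard library: it assumes the sheet is stored as xl/worksheets/sheet1.xml, as in this report, and the helper name first_row_cell_refs is made up for this illustration. It lists the cell references found in the first <row> of <sheetData>.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def first_row_cell_refs(xlsx_path, sheet_part="xl/worksheets/sheet1.xml"):
    """Return the r="..." references of the cells in the first <row>.

    sheet_part is an assumption taken from the report; other files may
    store the sheet under a different part name.
    """
    with zipfile.ZipFile(xlsx_path) as archive:
        root = ET.fromstring(archive.read(sheet_part))
    first_row = root.find("m:sheetData/m:row", NS)
    if first_row is None:
        return []
    return [cell.get("r") for cell in first_row.findall("m:c", NS)]
```

Running it on the re-saved file should show "AMJ1" among the returned references if the stray cell is present.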

Steps to Reproduce:
1. Open the attached .xlsx
2. Delete column M from 'DANE1' sheet.
3. Save file.

Actual Results:  
The file is bigger and has more columns.

Expected Results:
The file is smaller and has fewer columns.


Reproducible: Always

User Profile Reset: No

Additional Info:


User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/57.0.2987.98 Chrome/57.0.2987.98 Safari/537.36
Comment 1 Michał Bultrowicz 2017-04-13 13:17:52 UTC
Created attachment 132536
file for bug reproduction
Comment 2 Michał Bultrowicz 2017-04-13 13:21:17 UTC
Update to the reproduction steps:
1. Open the attached .xlsx
2. Say 'No' when you're asked if you want to update the links in the document.
3. Delete column M from 'DANE1' sheet.
4. Save file.
Comment 3 Xisco Faulí 2017-04-13 14:14:06 UTC
I can't reproduce it in

Version: 5.4.0.0.alpha0+
Build ID: 7635e0c1c7f821a1081f8e3868f641ae74a172d6
CPU threads: 4; OS: Linux 4.8; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group

(In reply to Michał Bultrowicz from comment #0)
> Actual Results:
> The file is bigger and has more columns.

What do you mean by 'bigger'?
Comment 4 Michał Bultrowicz 2017-04-13 14:21:41 UTC
It was 47.7 KB before, and after saving it's 50.6 KB.
Also, the column count goes from 16 to 1024.
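
For reference, the jump from 16 to 1024 columns matches the AMJ1 cell from the description: P is the 16th column and AMJ is the 1024th. A small sketch of the letter-to-number conversion (the function name column_number is invented for this illustration):

```python
def column_number(cell_ref):
    """Convert the column letters of a reference like "AMJ1" to a 1-based index."""
    letters = "".join(ch for ch in cell_ref if ch.isalpha()).upper()
    if not letters:
        raise ValueError(f"no column letters in {cell_ref!r}")
    number = 0
    for ch in letters:
        # Column names are bijective base 26: A=1 ... Z=26, AA=27, ...
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


print(column_number("P1"))    # 16
print(column_number("AMJ1"))  # 1024
```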
Comment 5 Buovjaga 2017-04-23 18:02:19 UTC
After saving it is 11,8 kB for me.

Does the same happen if you reproduce it after Help - Restart in Safe Mode, then Continue in Safe Mode, without doing anything else?

Set to NEEDINFO.
Change back to UNCONFIRMED after you have provided the information.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.3.2.2
Build ID: 5.3.2-1
CPU Threads: 8; OS Version: Linux 4.10; UI Render: default; VCL: kde4; Layout Engine: new; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group

Arch Linux 64-bit, KDE Plasma 5
Version: 5.4.0.0.alpha0+
Build ID: d69e6321fbb2c9f5b4d30890074a230ee6b39d2d
CPU threads: 8; OS: Linux 4.10; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on April 17th 2016
Comment 6 Michał Bultrowicz 2017-04-23 21:05:34 UTC
I've restarted in safe mode, deleted column M on the first sheet, saved in Excel format and the file grew to 50,6 KB again.
Comment 7 XTR 2017-12-07 15:36:19 UTC
Reproduced with
Version: 6.0.0.0.beta1 (x64)
Build ID: 97471ab4eb4db4c487195658631696bb3238656c
CPU threads: 4; OS: Windows 6.1; UI render: default; 
Locale: ru-RU (ru_RU); Calc: CL

In the original attached file there is some underline-like formatting on the FULL first row of the first sheet (a bottom border? I cannot turn it off).

It's not specific to XLSX; it happens with ODS too.

Steps to reproduce with a new file:
1. Create a new spreadsheet.
2. Apply some format to the full first row, for example a red background fill.
3. Delete some column.
4. Go to cell AMJ1 using the "Name Box" and see one cell without the format.
5. So there are now 1023 formatted cells and one unformatted cell, and they will be in the saved file too.
6. hehe

Maybe it's not a bug but normal behavior (this needs more thought :)

The file size does not increase much, because repeated cells are stored compactly in the file.
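
To illustrate why the size barely changes, here is a rough sketch of run-length grouping, the general idea behind storing repeated cells compactly (ODS, for example, has a table:number-columns-repeated attribute for this). It is not LibreOffice's actual export code; the style names and the helper runs are made up.

```python
from itertools import groupby


def runs(styles):
    """Collapse consecutive equal styles into (style, repeat_count) pairs."""
    return [(style, len(list(group))) for style, group in groupby(styles)]


# First row after the steps above: 1023 formatted cells plus one unformatted.
row = ["red"] * 1023 + ["default"]
print(runs(row))  # [('red', 1023), ('default', 1)]
```

However wide the row is, only two runs need to be described, so the extra cell costs very little space.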
Comment 8 Buovjaga 2017-12-09 17:56:09 UTC
Now I could repro the file size growth. I don't know what I missed before. Maybe I did not switch to the DANE1 sheet. 
How can I determine the column count increase?

Despite comment 7, let's set to NEW and see what happens.

Not seen in LibreOffice 3.6.7 on Linux or 3.5.0 on Windows. Let's call it a regression even though we don't know for sure.

Version: 6.1.0.0.alpha0+ (x64)
Build ID: 0bb0299b29960c3a27427eba5d5fc34e5e913a8b
CPU threads: 4; OS: Windows 10.0; UI render: default; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2017-12-09_00:15:04
Locale: fi-FI (fi_FI); Calc: group threaded

Arch Linux 64-bit
Version: 6.1.0.0.alpha0+
Build ID: d4a54ec92674773bc0f9358a3d9090915a3c8fb0
CPU threads: 8; OS: Linux 4.14; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group threaded
Built on December 9th 2017
Comment 9 Buovjaga 2018-07-08 20:08:24 UTC
Bisected (targeting the file size increase) on Ubuntu 14.04 with 41max to:
commit d4723d83549aaba53d9aca7055f7831c70527a1d
Author: Matthew Francis <mjay.francis@gmail.com>
Date:   Fri Sep 18 10:25:49 2015 +0800

    source-hash-b75bf09a5b905a3ed9c251869983a400c70c7fc6
    
    commit b75bf09a5b905a3ed9c251869983a400c70c7fc6
    Author:     Noel Power <noel.power@suse.com>
    AuthorDate: Tue Jan 29 14:51:49 2013 +0000
    Commit:     Noel Power <noel.power@suse.com>
    CommitDate: Wed Jan 30 18:01:45 2013 +0000
    
        correctly handle repeated row heights for empty rows ( fdo#59973 )
    
        it seems both xls & xlsx export suffer from problems with multiple row heights
        repeated ( if those rows are empty )
Comment 10 Markus Mohrhard 2018-11-06 19:04:10 UTC
(In reply to Buovjaga from comment #9)
> Bisected (targeting the file size increase) on Ubuntu 14.04 with 41max to:
> commit d4723d83549aaba53d9aca7055f7831c70527a1d
> Author: Matthew Francis <mjay.francis@gmail.com>
> Date:   Fri Sep 18 10:25:49 2015 +0800
> 
>     source-hash-b75bf09a5b905a3ed9c251869983a400c70c7fc6
>     
>     commit b75bf09a5b905a3ed9c251869983a400c70c7fc6
>     Author:     Noel Power <noel.power@suse.com>
>     AuthorDate: Tue Jan 29 14:51:49 2013 +0000
>     Commit:     Noel Power <noel.power@suse.com>
>     CommitDate: Wed Jan 30 18:01:45 2013 +0000
>     
>         correctly handle repeated row heights for empty rows ( fdo#59973 )
>     
>         it seems both xls & xlsx export suffer from problems with multiple
> row heights
>         repeated ( if those rows are empty )

That commit just fixed an export problem. It is not related to the actual problem, if we can even call it a problem. When a column is deleted, a new empty column with the default format is created. During OOXML export, this might result in more columns being written, depending on our default column format detection.

I don't think we have a regression here and would not consider it a bug.
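
The mechanism described above can be modelled roughly as follows. This is a toy model, not Calc's implementation: it assumes a fixed sheet width of 1024 columns (A to AMJ) and represents each column of the first row by a format name. The names delete_column, MAX_COLUMNS and the format labels are hypothetical.

```python
MAX_COLUMNS = 1024  # assumed sheet width: columns A..AMJ
DEFAULT = "default"


def delete_column(formats, index):
    """Remove one column and pad with a default-format column at the end,
    so the sheet keeps its fixed width."""
    if len(formats) != MAX_COLUMNS:
        raise ValueError("expected one format per sheet column")
    return formats[:index] + formats[index + 1:] + [DEFAULT]


row = ["bordered"] * MAX_COLUMNS        # whole first row shares one format
after = delete_column(row, 12)          # index 12 is column M
print(len(set(row)), len(set(after)))   # 1 2
print(after[-1])                        # default
```

While the row is uniform, an exporter can describe its format once. After the deletion the last column differs from the rest, so it may have to be written out separately, which would fit the AMJ1 cell seen in the saved file.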