Bug 152104 - Long export to ods from xls / xlsx since 7.4.0beta1
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 7.4.0.0 beta1+
Hardware: All; OS: All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard: target:25.2.0
Keywords: bibisected, bisected, filter:ods, haveBacktrace, perf, regression
Depends on:
Blocks: Performance
Reported: 2022-11-18 09:32 UTC by Maxim Britov
Modified: 2024-07-10 18:08 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
testcase (11.55 MB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2022-11-18 12:08 UTC, Maxim Britov
gdb backtrace with debug build of LO 7.6 (84.40 KB, text/x-log)
2023-04-13 23:23 UTC, Stéphane Guillou (stragu)
Perf flamegraph (615.31 KB, image/svg+xml)
2024-07-03 10:14 UTC, Buovjaga

Description Maxim Britov 2022-11-18 09:32:54 UTC
We have an XLS file, a trade report from our ERP (1С 8.3).
The file is large: 62700 rows, columns up to BS, with text, numbers, structures and a lot of formatting.
The XLS file is 62 MB; the content.xml in the exported ODS is 220 MB.

Exporting it to ODS with LO 7.4 takes ~16 min.
Compared with 7.3, opening takes ~20 s longer and exporting ~15 min longer.


Tests:

$ time /home/lo/libreoffice7.3.7.2/program/scalc --convert-to ods test2017.xls 
(~30s here)convert /tmp/test2017.xls -> /tmp/test2017.ods using filter : calc8

real	0m36,829s
user	0m36,791s
sys	0m1,356s

$ rm test2017.ods 

$ time /home/lo/libreofficedev7.4a1/program/scalc --convert-to ods test2017.xls 
(~50s here)convert /tmp/test2017.xls -> /tmp/test2017.ods using filter : calc8

real	1m0,918s
user	1m0,811s
sys	0m1,346s

$ rm test2017.ods 

$ time /home/lo/libreofficedev7.4b1/program/scalc --convert-to ods test2017.xls 
(~50s here)convert /tmp/test2017.xls -> /tmp/test2017.ods using filter : calc8

real	16m43,210s
user	16m39,229s
sys	0m2,712s
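
To put the three runs side by side, the `real` values above can be converted to seconds. This is a minimal Python sketch, assuming the comma-decimal `XmY,ZZZs` format that `time` prints in this locale; the helper name `parse_real` is hypothetical and not part of any tool used in this report.

```python
import re


def parse_real(value: str) -> float:
    """Convert a `time` duration such as '16m43,210s' to seconds.

    Accepts either ',' or '.' as the decimal separator.
    """
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


# `real` values copied from the three runs above, keyed by install directory.
runs = {
    "libreoffice7.3.7.2": "0m36,829s",
    "libreofficedev7.4a1": "1m0,918s",
    "libreofficedev7.4b1": "16m43,210s",
}

baseline = parse_real(runs["libreoffice7.3.7.2"])
for build, real in runs.items():
    seconds = parse_real(real)
    # Report each run relative to the 7.3.7.2 baseline.
    print(f"{build}: {seconds:.1f} s ({seconds / baseline:.1f}x the 7.3.7.2 time)")
```

With these values the 7.4 beta1 run comes out at roughly 27 times the 7.3.7.2 run.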


I'm on Gentoo Linux. All binaries are from the official RPMs.


Bisected down to:
 632982840d84e0ea9afd096068fce54784e8167f is the first bad commit
commit 632982840d84e0ea9afd096068fce54784e8167f
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Wed May 18 06:43:30 2022 +0200

    source 9b06af2adddc49414adc135f3d08dcc88c896058
    
    source 9b06af2adddc49414adc135f3d08dcc88c896058

 instdir/program/libsclo.so | Bin 22044272 -> 22044216 bytes
 instdir/program/setuprc    |   2 +-
 instdir/program/versionrc  |   2 +-
 3 files changed, 2 insertions(+), 2 deletions(-)


$ git bisect log
# bad: [7735fbe0babcb698a78c4dc176442c8a9c55676a] source c30306ad19f7ba022628f4f88ba5b92b8a1af402
# good: [35f037427068121d5fe2111125caee10eef817ef] source 339fbb7bc30f227b9d4c9b9eea03b25f49533dee
git bisect start '7735fbe0babcb698a78c4dc176442c8a9c55676a' '35f037427068121d5fe2111125caee10eef817ef'
# good: [12ef6e781c31578c40e800abe0954d9201ce77a9] source 86039563de87149a01ffb980b5ec99074b98fd5e
git bisect good 12ef6e781c31578c40e800abe0954d9201ce77a9
# bad: [4e3f2d3431805f685006932af79d6a7687019b5f] source ecae51e5f14dad2591d6a15f2e70b4d024fb7985
git bisect bad 4e3f2d3431805f685006932af79d6a7687019b5f
# bad: [a950de20e0070d5d1f0a7f44883ff85bd5a9c138] source d8454627bd058a2d4166c4de3bd0572435c44178
git bisect bad a950de20e0070d5d1f0a7f44883ff85bd5a9c138
# good: [0c844154172f67bf237011f948842a0a3d6e6be0] source 429a960e157f3375e795cdec8f265ace1c5bdc9e
git bisect good 0c844154172f67bf237011f948842a0a3d6e6be0
# bad: [2c94a1cbc09b79ecd4eebb98d2de7a08a6d08501] source 037cae112958be3894e734d308f5f4b468a2d710
git bisect bad 2c94a1cbc09b79ecd4eebb98d2de7a08a6d08501
# bad: [e8943ef53ed8dba5b9eccd31df94f034c0cdd2e4] source a394b45125c6ebaae5d5dcc2154f6408e9a32d92
git bisect bad e8943ef53ed8dba5b9eccd31df94f034c0cdd2e4
# good: [7634a830fe2bf10b6c13f504c2c3f6f2741fe439] source f0d3727322207b3a547313e14305440ad7009079
git bisect good 7634a830fe2bf10b6c13f504c2c3f6f2741fe439
# good: [c921ada5d391e40d7b4355f657c37ff1d3eff8d3] source b6e0528ca31341239cb4ba990141a66ad4b76d6c
git bisect good c921ada5d391e40d7b4355f657c37ff1d3eff8d3
# good: [db3c1e75f456d68a14301e1817b760b179142ee8] source b2467d6c7af988f8ed4e090ebf9472be6c84fb06
git bisect good db3c1e75f456d68a14301e1817b760b179142ee8
# bad: [ba8c0369c8f399b0dc751be58295644be95f142b] source 6fc3ec85a32cd70216b4bbf21e479b4fc32a38dc
git bisect bad ba8c0369c8f399b0dc751be58295644be95f142b
# bad: [632982840d84e0ea9afd096068fce54784e8167f] source 9b06af2adddc49414adc135f3d08dcc88c896058
git bisect bad 632982840d84e0ea9afd096068fce54784e8167f
# first bad commit: [632982840d84e0ea9afd096068fce54784e8167f] source 9b06af2adddc49414adc135f3d08dcc88c896058
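
A bisection like the one logged above can be automated with `git bisect run`, which treats exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad. The following is a minimal Python sketch, assuming it is started from the root of a bibisect repository where the build lives under `instdir/program/` (as in the diffstat above). The script name, the 120-second threshold and the helper names are hypothetical choices for illustration, not what was actually used for this report.

```python
import subprocess
import sys
import time

SOFFICE = "instdir/program/soffice"  # assumed path inside the bibisect checkout
THRESHOLD_S = 120.0                  # hypothetical cut-off between good and bad


def verdict(elapsed_s: float, threshold_s: float = THRESHOLD_S) -> int:
    """Map a conversion time to a `git bisect run` exit code (0 good, 1 bad)."""
    return 0 if elapsed_s < threshold_s else 1


def time_conversion(testcase: str) -> float:
    """Time the conversion, giving up once the threshold is exceeded."""
    start = time.monotonic()
    try:
        subprocess.run(
            [SOFFICE, "--convert-to", "ods", testcase],
            check=True,
            timeout=THRESHOLD_S,
        )
    except subprocess.TimeoutExpired:
        # Slower than the cut-off: no need to wait for the full export.
        return float("inf")
    return time.monotonic() - start


# Only run when a test file is given, e.g. `python3 check_export.py test2017.xls`.
if __name__ == "__main__" and len(sys.argv) > 1:
    try:
        sys.exit(verdict(time_conversion(sys.argv[1])))
    except (OSError, subprocess.CalledProcessError):
        sys.exit(125)  # this build could not be run: ask git bisect to skip it
```

Assuming the file is saved as `check_export.py`, it would be driven with `git bisect run python3 check_export.py test2017.xls` after marking the initial good and bad commits. The timeout only stops the process that was started directly, so a lingering `soffice.bin` may need to be cleaned up separately between steps.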
Comment 1 Maxim Britov 2022-11-18 12:08:47 UTC
Created attachment 183664
testcase

OK, I have a blank test case:

$ time /home/lo/libreoffice7.3.7.2/program/scalc --convert-to ods 152104_testcase.xlsx 
convert /tmp/152104_testcase.xlsx -> /tmp/152104_testcase.ods using filter : calc8

real	1m4,418s
user	1m5,138s
sys	0m2,372s

$ rm 152104_testcase.ods 

$ time /home/lo/libreofficedev7.4a1/program/scalc --convert-to ods 152104_testcase.xlsx 
convert /tmp/152104_testcase.xlsx -> /tmp/152104_testcase.ods using filter : calc8

real	0m54,555s
user	0m54,812s
sys	0m2,306s

$ rm 152104_testcase.ods 

$ time /home/lo/libreofficedev7.4b1/program/scalc --convert-to ods 152104_testcase.xlsx 
convert /tmp/152104_testcase.xlsx -> /tmp/152104_testcase.ods using filter : calc8

real	17m44,107s
user	17m43,807s
sys	0m2,558s
Comment 2 Stéphane Guillou (stragu) 2022-11-18 16:02:57 UTC
Thanks, Maxim.

Confirmed: 7.4.2 takes more than 4 minutes (I didn't wait for it to finish).

Version: 7.4.2.3 / LibreOffice Community
Build ID: 382eef1f22670f7f4118c8c2dd222ec7ad009daf
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

... whereas 7.3.6 takes about 12 seconds.

Version: 7.3.6.2 / LibreOffice Community
Build ID: c28ca90fd6e1a19e189fc16c05f8f8924961e12e
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded 

Luboš, can you please have a look?
Comment 3 Stéphane Guillou (stragu) 2023-04-13 23:23:39 UTC
Created attachment 186647
gdb backtrace with debug build of LO 7.6

Backtrace; I killed soffice after a few seconds.
Comment 4 Stéphane Guillou (stragu) 2023-04-13 23:45:47 UTC
This does not only happen from the command line: Save As > ODS does the same. Setting importance to High - Major because it hangs on a basic function and is a regression.
Comment 5 Buovjaga 2024-07-02 14:17:43 UTC
If my numbers are accurate, there has been a recent worsening.

24.2.4:
real    10m33,732s
user    10m35,002s
sys     0m0,705s

Fresh master (non-debug):
real    13m6,207s
user    13m8,077s
sys     0m0,377s
Comment 6 Buovjaga 2024-07-02 16:35:25 UTC
(In reply to Buovjaga from comment #5)
> If my numbers are accurate, there has been a recent worsening.
> 
> 24.2.4:
> real    10m33,732s
> user    10m35,002s
> sys     0m0,705s
> 
> Fresh master (non-debug):
> real    13m6,207s
> user    13m8,077s
> sys     0m0,377s

I didn't check it for this issue, but this might be due to the (still somewhat ongoing) item handling rework, for example commit ae7807c889c19145f89cec40afac82eee191837c, which I just identified by bisecting another perf regression.
Comment 7 Buovjaga 2024-07-03 10:14:02 UTC
Created attachment 195098
Perf flamegraph

perf record -F 200 --call-graph dwarf,62000 ~/libreofficetwo/instdir/program/soffice --convert-to ods 152104_testcase.xlsx

Version: 25.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 85fd526fc681a994415bb422090d1d23aa7d54f6
CPU threads: 8; OS: Linux 6.9; UI render: default; VCL: kf6 (cairo+wayland)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: CL threaded
Comment 8 Commit Notification 2024-07-04 07:18:33 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/890916578fc765845922284101599dcf4ece1e58

tdf#152104 speed xls->ods convert part 1

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 9 Commit Notification 2024-07-04 07:56:40 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/38514aecba9bede86d1ca195f9e30592157d1681

tdf#152104 speed xls->ods convert part 2

It will be available in 25.2.0.

Comment 10 Buovjaga 2024-07-04 09:51:43 UTC
Great job, now the time went from 13 mins to:

real    0m47,277s
user    0m49,939s
sys     0m0,424s