Bug 80381 - FILEOPEN: MS Excel 2003 XML file loses empty rows on import, which consequently results in broken formulas
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): release
Hardware: Other All
Importance: medium major
Assignee: Not Assigned
Whiteboard: BSA
Keywords: bibisected, bisected, regression
Depends on: 62129
Blocks: MSO-XML2003
Reported: 2014-06-23 08:27 UTC by Daniil Bubnov
Modified: 2017-12-21 00:36 UTC

Sample Excel 2003 (Windows XP) XML sheet - looks different in MSExcel and LOCalc (2.40 KB, text/xml)
2014-06-23 08:27 UTC, Daniil Bubnov

Description Daniil Bubnov 2014-06-23 08:27:31 UTC
Created attachment 101558
Sample Excel 2003 (Windows XP) XML sheet - looks different in MSExcel and LOCalc

Problem description: 
It seems that the import of Excel XML was improved recently: styles now look great! Nevertheless, a new serious bug has appeared (it did not occur in 4.1.4, which I had installed before). It seems to be related to incorrect handling of empty rows (they are dropped somehow). As a result, formulas are invalid.
There are some other problems with the handling of empty rows.

Steps to reproduce:

Here is a simple sheet: just open it in Excel and in the latest LibreOffice (mine is 4242) and see the difference (empty rows 2 and 3 are dropped, and the row reference in the formula is shifted down by 1 row).
Operating System: Windows XP
Version: release
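
To illustrate why the position of the formula cell matters, here is a minimal Python sketch that resolves the relative R1C1 range used in the attached sheet, `=SUM(R[-5]C:R[-1]C)`, into A1 notation. It assumes the formula cell sits in row 9, column 2, as the `ss:Index` attributes in the attachment imply. The helper names (`col_letter`, `r1c1_to_a1`, `resolve_range`) are hypothetical and only purely relative references are handled; this is not how any importer is actually implemented.

```python
import re

# Matches purely relative R1C1 references such as "R[-5]C" or "RC[2]".
_REF = re.compile(r"R(?:\[(-?\d+)\])?C(?:\[(-?\d+)\])?")


def col_letter(col: int) -> str:
    """Convert a 1-based column number to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def r1c1_to_a1(ref: str, row: int, col: int) -> str:
    """Resolve one relative R1C1 reference against the cell at (row, col)."""
    match = _REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"unsupported reference: {ref!r}")
    d_row = int(match.group(1) or 0)  # a missing offset means "same row"
    d_col = int(match.group(2) or 0)  # a missing offset means "same column"
    return f"{col_letter(col + d_col)}{row + d_row}"


def resolve_range(rng: str, row: int, col: int) -> str:
    """Resolve a 'first:last' relative R1C1 range against (row, col)."""
    first, last = rng.split(":")
    return f"{r1c1_to_a1(first, row, col)}:{r1c1_to_a1(last, row, col)}"


# Formula cell where Excel places it: row 9, column B.
print(resolve_range("R[-5]C:R[-1]C", row=9, col=2))  # B4:B8
```

Because the reference is relative, the same formula text addresses a different range whenever the formula cell or the data rows land in a different row than the file specifies, which is consistent with the broken formula described above.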
Comment 1 Daniil Bubnov 2014-06-23 08:35:22 UTC
Comment on attachment 101558
Sample Excel 2003 (Windows XP) XML sheet - looks different in MSExcel and LOCalc

><?xml version="1.0"?>
><?mso-application progid="Excel.Sheet"?>
><Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
> xmlns:o="urn:schemas-microsoft-com:office:office"
> xmlns:x="urn:schemas-microsoft-com:office:excel"
> xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
> xmlns:html="http://www.w3.org/TR/REC-html40">
> <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
>  <Author>bubnovDI</Author>
>  <LastAuthor>bubnovDI</LastAuthor>
>  <Created>2014-06-23T07:50:50Z</Created>
>  <Company>***</Company>
>  <Version>11.9999</Version>
> </DocumentProperties>
> <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
>  <WindowHeight>11760</WindowHeight>
>  <WindowWidth>21915</WindowWidth>
>  <WindowTopX>120</WindowTopX>
>  <WindowTopY>75</WindowTopY>
>  <ProtectStructure>False</ProtectStructure>
>  <ProtectWindows>False</ProtectWindows>
> </ExcelWorkbook>
> <Styles>
>  <Style ss:ID="Default" ss:Name="Normal">
>   <Alignment ss:Vertical="Bottom"/>
>   <Borders/>
>   <Font ss:FontName="Arial Cyr" x:CharSet="204"/>
>   <Interior/>
>   <NumberFormat/>
>   <Protection/>
>  </Style>
> </Styles>
> <Worksheet ss:Name="Лист1">
>  <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="9" x:FullColumns="1"
>   x:FullRows="1">
>   <Row>
>    <Cell><Data ss:Type="String">AAAAA</Data></Cell>
>   </Row>
>   <Row ss:Index="4">
>    <Cell ss:Index="2"><Data ss:Type="Number">1</Data></Cell>
>   </Row>
>   <Row>
>    <Cell ss:Index="2"><Data ss:Type="Number">2</Data></Cell>
>   </Row>
>   <Row>
>    <Cell ss:Index="2"><Data ss:Type="Number">3</Data></Cell>
>   </Row>
>   <Row>
>    <Cell ss:Index="2"><Data ss:Type="Number">4</Data></Cell>
>   </Row>
>   <Row>
>    <Cell ss:Index="2"><Data ss:Type="Number">5</Data></Cell>
>   </Row>
>   <Row>
>    <Cell ss:Index="2" ss:Formula="=SUM(R[-5]C:R[-1]C)"><Data ss:Type="Number">15</Data></Cell>
>   </Row>
>  </Table>
>  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
>   <PageSetup>
>    <PageMargins x:Bottom="0.984251969" x:Left="0.78740157499999996"
>     x:Right="0.78740157499999996" x:Top="0.984251969"/>
>   </PageSetup>
>   <Selected/>
>   <Panes>
>    <Pane>
>     <Number>3</Number>
>     <ActiveRow>2</ActiveRow>
>     <RangeSelection>R3</RangeSelection>
>    </Pane>
>   </Panes>
>   <ProtectObjects>False</ProtectObjects>
>   <ProtectScenarios>False</ProtectScenarios>
>  </WorksheetOptions>
> </Worksheet>
></Workbook>
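
The expected row layout of the sheet above can be checked with a short Python sketch. In SpreadsheetML, a `Row` with `ss:Index` is placed at that 1-based position and a `Row` without it follows the previous row. The embedded `SAMPLE` is a cut-down copy of the attachment (styles and options omitted, sheet name replaced by a placeholder), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"

# Cut-down copy of the attached sheet: only the rows and their indexes.
SAMPLE = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Sheet1">
  <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="9">
   <Row><Cell><Data ss:Type="String">AAAAA</Data></Cell></Row>
   <Row ss:Index="4"><Cell ss:Index="2"><Data ss:Type="Number">1</Data></Cell></Row>
   <Row><Cell ss:Index="2"><Data ss:Type="Number">2</Data></Cell></Row>
   <Row><Cell ss:Index="2"><Data ss:Type="Number">3</Data></Cell></Row>
   <Row><Cell ss:Index="2"><Data ss:Type="Number">4</Data></Cell></Row>
   <Row><Cell ss:Index="2"><Data ss:Type="Number">5</Data></Cell></Row>
   <Row><Cell ss:Index="2" ss:Formula="=SUM(R[-5]C:R[-1]C)"><Data ss:Type="Number">15</Data></Cell></Row>
  </Table>
 </Worksheet>
</Workbook>"""


def row_positions(table):
    """Return the 1-based sheet row of every <Row>, honouring ss:Index."""
    positions = []
    current = 0
    for row in table.findall(f"{{{SS}}}Row"):
        index = row.get(f"{{{SS}}}Index")
        # An explicit index jumps to that row; otherwise continue from the last one.
        current = int(index) if index is not None else current + 1
        positions.append(current)
    return positions


def empty_rows(positions):
    """Rows up to the last populated one that have no <Row> element."""
    used = set(positions)
    return [r for r in range(1, max(positions) + 1) if r not in used]


root = ET.fromstring(SAMPLE)
table = root.find(f"{{{SS}}}Worksheet/{{{SS}}}Table")
positions = row_positions(table)
print(positions)              # [1, 4, 5, 6, 7, 8, 9]
print(empty_rows(positions))  # [2, 3]
```

Under this reading, rows 2 and 3 must stay empty and the formula cell must end up in row 9, which matches `ss:ExpandedRowCount="9"` in the attachment.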
Comment 2 retired 2014-06-23 09:28:17 UTC
Confirmed on 4.3RC1 OS X 10.9.3.

NEW, platform: all, regression (since reporter says this did not happen under LO 4.1)
Comment 3 Xisco Faulí 2014-08-12 09:55:00 UTC
 7a454addef42971c41393dd4f668123884973601 is the first bad commit
commit 7a454addef42971c41393dd4f668123884973601
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Thu Oct 17 09:21:19 2013 +0000

    commit 23583553d1a9951eaa33dfb598606cdf55d3f01a
    Author:     Michael Stahl <mstahl@redhat.com>
    AuthorDate: Sun Jun 2 13:26:30 2013 +0200
    Commit:     Michael Stahl <mstahl@redhat.com>
    CommitDate: Sun Jun 2 20:37:57 2013 +0200
        mysqlcppconn: MSVC 2010 finally has grown a stdint.h
        Change-Id: I5b8d948aad94ba492075245c18c8ed781baa469e

:100644 100644 27848ba16c148657f41ac7b1df02e091a44dd29f 7beb466ac333cd27a5756f959c24f4514c12f47b M	ccache.log
:100644 100644 601e47632607a385493e43c480061748c2ca4c7b 57fbcee71fd8e9eb24ab4c293f9023f03784d884 M	commitmsg
:100644 100644 21e4be7670edb7af70f3d7bf4f3a21a45c2e09bd 35f68343cf886768eaa96263b6e4a8164eb92a05 M	dev-install.log
:100644 100644 fc4c2507cb79b70a4e64b27a328ce2b4507374f9 43d9c0d800f3e49309b6b8e5393c27810a50c177 M	make.log
:040000 040000 787f0b310a532028f6ef2a3b46651a6000091148 827d6a3eb295e2aabe319c0683bab9adc4e2e26e M	opt

# bad: [423a84c4f7068853974887d98442bc2a2d0cc91b] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e
# good: [65fd30f5cb4cdd37995a33420ed8273c0a29bf00] source-hash-d6cde02dbce8c28c6af836e2dc1120f8a6ef9932
git bisect start 'latest' 'oldest'
# bad: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
git bisect bad e02439a3d6297a1f5334fa558ddec5ef4212c574
# good: [8f4aeaad2f65d656328a451154142bb82efa4327] source-hash-1885266f274575327cdeee9852945a3e91f32f15
git bisect good 8f4aeaad2f65d656328a451154142bb82efa4327
# good: [9995fae0d8a24ce31bcb5e9cd0459b69cfbf7a02] source-hash-8600bc24bbc9029e92bea6102bff2921bc10b33e
git bisect good 9995fae0d8a24ce31bcb5e9cd0459b69cfbf7a02
# good: [8ad82bc1416a07501651e8d96fe268e47d3931d3] source-hash-13821254f88d2c5488fba9fe6393dcf4ae810db4
git bisect good 8ad82bc1416a07501651e8d96fe268e47d3931d3
# good: [d084d250b04446535ca1d7c29cf2062e6bd042b3] source-hash-688f72e3a2c3ef923389bbd21f6aea3afe1114db
git bisect good d084d250b04446535ca1d7c29cf2062e6bd042b3
# good: [c2069a369d738078124812312d51f21ea1ce2421] source-hash-f160e4935c474a5293b3d3c11b3d538efb4767a0
git bisect good c2069a369d738078124812312d51f21ea1ce2421
# good: [a0f20bc04a32a7791ba765d2de2f44f1b74033d1] source-hash-1de66ba440855050a794b3b2a8647c1b02c210b8
git bisect good a0f20bc04a32a7791ba765d2de2f44f1b74033d1
# bad: [a48fbf799e4d4d555fe383b7233c804f573eca4e] source-hash-bb6ecd8b40313b7cc83d4e619029f4e001334a52
git bisect bad a48fbf799e4d4d555fe383b7233c804f573eca4e
# bad: [7a454addef42971c41393dd4f668123884973601] source-hash-23583553d1a9951eaa33dfb598606cdf55d3f01a
git bisect bad 7a454addef42971c41393dd4f668123884973601
# good: [bb1ef709fce943598a8bcab0234b9a4ba1b2e69a] source-hash-c4cca49f49408bc4094bdfcf782de2f7cd16ce6a
git bisect good bb1ef709fce943598a8bcab0234b9a4ba1b2e69a
# first bad commit: [7a454addef42971c41393dd4f668123884973601] source-hash-23583553d1a9951eaa33dfb598606cdf55d3f01a
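
For readers unfamiliar with the log above, the search it records can be sketched in Python as a binary search over an ordered list of builds. This is only an illustration of the idea, not git's implementation; the commit labels and the `is_bad` predicate are hypothetical, and the sketch assumes the first entry is good, the last is bad, and there is a single good-to-bad transition.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in an oldest-to-newest list.

    Assumes commits[0] is good, commits[-1] is bad, and the behaviour
    changes exactly once in between.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the change happened at mid or earlier
        else:
            lo = mid  # the change happened after mid
    return commits[hi]


# Hypothetical history of ten builds in which the behaviour changes at "c6".
history = [f"c{i}" for i in range(10)]
print(first_bad(history, lambda c: int(c[1:]) >= 6))  # c6
```

If a history contains more than one change in behaviour, the result depends on which change the tester judges as "bad" at each step, so the search can converge on a different transition than the one being reported.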
Comment 4 Matthew Francis 2015-01-05 12:39:30 UTC
The bibisect results in comment 3 appear to point to the wrong change in behaviour (which actually seems more of a fix than a break). The correct bad commit in 43all is below, where the empty rows 2 and 3 disappear.

# first bad commit: [bc819bc0c4d8592212f84069eb7f65e539517166] source-hash-d9412fb4755377b8358a46a249cfe29a22ea9451
Comment 5 Matthew Francis 2015-01-05 14:21:04 UTC
It was painful to track this down due to some breakage near the problem commits, but the issue appears to have started in the range 3420be984986bcff03d6d127b913fc07372fe89f..eadb83f281b596e441a82798660f1a27c177b2c6
(until the end of the range, the problem file opens as blank; in addition, many commits in the vicinity of the problem need a2a10b59876951b6493419713e9054ceabd3d6cc to be cherry-picked in order for Calc to be able to open the file)

Adding Cc: to pjotr@guineapics.de; I'm not sure if you're still active in developing for LO, but if you are, could you possibly take a look at this? Thanks

commit eadb83f281b596e441a82798660f1a27c177b2c6
Author: Peter Jentsch <pjotr@guineapics.de>
Date:   Sat May 5 23:45:56 2012 +0200

    register exslt functions for libxslt filter
    Change-Id: I23bb8a3cf00a9152362794281a617ad4a780faee

commit b5107faa150aab3c5480708219fc8d392a97f718
Author: Peter Jentsch <pjotr@guineapics.de>
Date:   Tue May 1 00:26:25 2012 +0200

    add for exslt:set:distinct template
    ..for processors not supporting it natively, namely Saxon > 8.2
    Change-Id: I33ceedd7f70f0469c039b8e90aa8d492d5c27ce2

commit 9f29890d4e4fa916d46eeae081ef6e04eb1bfe81
Author: Peter Jentsch <pjotr@guineapics.de>
Date:   Tue May 1 00:24:51 2012 +0200

    fix a problem when handling style named for conditional formatting.
    Change-Id: Ia8deda31dc4624b1d05d2388c90dbcb17d033269
Comment 6 Peter Jentsch 2015-01-05 21:30:44 UTC
I can take a look at the issue. Unfortunately I no longer have access to any version of Excel.
Comment 7 Matthew Francis 2015-01-06 03:47:34 UTC
(In reply to Peter Jentsch from comment #6)

Thanks for looking at the problem. If you do need anything specific which would require Excel, please feel free to comment here, send me an email and/or drop into #libreoffice-qa, and I or someone else will hook you up with what's needed.
Comment 8 Peter Jentsch 2015-02-08 22:39:45 UTC
will be fixed along with fdo#62129
Comment 9 Robinson Tryon (qubit) 2015-12-13 11:16:18 UTC Comment hidden (obsolete)
Comment 10 Xisco Faulí 2017-09-29 08:52:22 UTC Comment hidden (obsolete)
Comment 11 Kohei Yoshida 2017-12-21 00:36:44 UTC
The build on the master branch, which uses the orcus-based filter, no longer exhibits this problem.  I'll mark this resolved.