Bug 84229 - FILEOPEN: small XLS document using small OLE2 storage blocks.
Summary: FILEOPEN: small XLS document using small OLE2 storage blocks.
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 3.5.7.2 release
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: BSA
Keywords: bibisected, bisected, regression
Depends on:
Blocks:
 
Reported: 2014-09-23 08:57 UTC by Alexey
Modified: 2015-12-17 08:36 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Excel document .xls (14.50 KB, application/vnd.ms-excel)
2014-09-23 08:57 UTC, Alexey
Screenshot of XLS content as displayed in MSO 2007. (50.77 KB, image/png)
2014-09-23 10:59 UTC, Owen Genat (retired)
Screenshot of XLS content as displayed in LOv3462. (118.74 KB, image/png)
2014-09-23 11:10 UTC, Owen Genat (retired)

Description Alexey 2014-09-23 08:57:08 UTC
Created attachment 106715
Excel document .xls

Problem description: 

Steps to reproduce:
1. Open the attached file.

Current behavior: the spreadsheet opens empty.

Expected behavior: the content is displayed, as it is when the document is opened in Excel 2003.

              
Operating System: All
Version: 4.3.0.4 release
Last worked in: 4.1.6.2 release
Comment 1 Owen Genat (retired) 2014-09-23 10:59:09 UTC
Created attachment 106723
Screenshot of XLS content as displayed in MSO 2007.
Comment 2 Owen Genat (retired) 2014-09-23 11:10:56 UTC
Created attachment 106724
Screenshot of XLS content as displayed in LOv3462.
Comment 3 Owen Genat (retired) 2014-09-23 11:15:29 UTC
Ouch. This appears to be an old bug. Under GNU/Linux using:

- v3.3.4.1 OOO330m19 Build: 401
- v3.4.6.2 OOO340m1 Build: 602
- v3.5.7.2 Build ID: 3215f89-f603614-ab984f2-7348103-1225a5b
- v3.6.7.2 Build ID: e183d5b
- v4.0.6.2 Build ID: 2e2573268451a50806fcd60ae2d9fe01dd0ce24
- v4.1.6.2 Build ID: 40ff705089295be5be0aae9b15123f687c05b0a
- v4.2.6.3 Build ID: 3fd416d4c6db7d3204c17ce57a1d70f6e531ee21
- v4.3.1.2 Build ID: 958349dc3b25111dbca392fbc281a05559ef6848
- v4.4.0.0.alpha0+ Build ID: 6ee5be0e1dc300120439c3579430d35e7d31131c TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:master, Time: 2014-09-17_10:09:07

... only v3.3 and v3.4 display the contents of the XLS as expected. All other versions display an empty spreadsheet. Given that the content contains Cyrillic, this may be an encoding issue. Refer to comment 1 and comment 2 for screenshots. Status set to NEW. Summary amended for clarity. Version set to v3.5.7.2.
Comment 4 Jacques Guilleron 2014-09-23 12:47:28 UTC
Hi,

This could be true, Owen.
I opened this file in Excel 2010 and saved it in the French locale. The saved copy opens in
LO 4.3.2.1 Build ID: f9b3ad49d92181b0a1fe7e76f785a2c2cd0847d3
on Windows 7 Home Premium, but with all Cyrillic characters replaced by "?".

This document opens correctly in LO 3.5.3.2
Version ID : 235ab8a-3802056-4a8fed3-2d66ea8-e241b80

Regards,

Jacques
Comment 5 raal 2014-10-16 19:02:13 UTC
 d101b9946a6a04e65e3923038503436c790b7e12 is the first bad commit
commit d101b9946a6a04e65e3923038503436c790b7e12
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Sun Dec 9 11:37:59 2012 +0000

    source-hash-18e6e7d929c2be209407ed2e56b8ec4d5e6c4900
    
    commit 18e6e7d929c2be209407ed2e56b8ec4d5e6c4900
    Author:     Julien Nabet <serval2412@yahoo.fr>
    AuthorDate: Mon May 14 18:59:35 2012 +0200
    Commit:     Julien Nabet <serval2412@yahoo.fr>
    CommitDate: Mon May 14 19:01:02 2012 +0200
    
        WaE : XKeycodeToKeysym deprecated
    
        Replaced by XkbKeycodeToKeysym
        (cf http://nabble.documentfoundation.org/PATCH-Proposed-patch-for-XKeycodeToKeysym-deprecated-td3978158.html)
    
        Change-Id: Ide8331705369d0c38e72bfe693102625e62a87e1

:100644 100644 13e11be9938c5079b96e4eb4cd2b4acf2f9a3b05 5aa1dfc68ecb9ac57316a995424b2d3683cb4774 M	autogen.log
:100644 100644 20a85200f0d859066fecafde8cdef513d59724ec 72da0ea5e9ec1223cb456558a2e0254561faa98c M	ccache.log
:100644 100644 00d946c601c37d4463365a27a04dac440b0e86a4 5ef3324ce1c257155c9e095fdeb7d912b2681ae1 M	commitmsg
:100644 100644 ff02681e8cefd9f0b1a9b8f19c25a3e77944e88a 8b14489bddefe04fcfaecb0be901837505c64b67 M	dev-install.log
:100644 100644 151ff28cf8b64f65d7b206588dd3c4e18553adce 68ac6a90c73f1f7c8776a70772a40ae1ce41e13d M	make.log
:040000 040000 6bde6ac28b39c6b41b55491b1a6a9900d26e65c1 8b906c6863615fd1253b393b35b18a883201b310 M	opt

git bisect log
# bad: [423a84c4f7068853974887d98442bc2a2d0cc91b] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e
# good: [65fd30f5cb4cdd37995a33420ed8273c0a29bf00] source-hash-d6cde02dbce8c28c6af836e2dc1120f8a6ef9932
git bisect start 'latest' 'oldest'
# bad: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
git bisect bad e02439a3d6297a1f5334fa558ddec5ef4212c574
# bad: [8f4aeaad2f65d656328a451154142bb82efa4327] source-hash-1885266f274575327cdeee9852945a3e91f32f15
git bisect bad 8f4aeaad2f65d656328a451154142bb82efa4327
# good: [369369915d3582924b3d01c9b01167268ed38f3b] source-hash-45295f3cdceb4c289553791071b5d7f4962d2ec4
git bisect good 369369915d3582924b3d01c9b01167268ed38f3b
# bad: [6fce03a944bf50e90cd31e2d559fe8705ccc993e] source-hash-47e4a33a6405eb1b5186027f55bd9cb99b0c1fe7
git bisect bad 6fce03a944bf50e90cd31e2d559fe8705ccc993e
# good: [8a39227e344637eb7154a10ac825d211e64d584c] source-hash-f5080ebb7022c9f5d7d7fdca4fe9d19f9bb8cabf
git bisect good 8a39227e344637eb7154a10ac825d211e64d584c
# bad: [e4c742a9e244bd7ebeabc50c90182df28ac3daaf] source-hash-c52ba433491afbca70aa1977a624c795bdd5b9ef
git bisect bad e4c742a9e244bd7ebeabc50c90182df28ac3daaf
# good: [96a055e15ee7171a28888973a3c3a7307dd9867f] source-hash-9ca02a663c3eee2698eb360dd5dc7afb1951e743
git bisect good 96a055e15ee7171a28888973a3c3a7307dd9867f
# bad: [e87a0055deae2c9e25ae1d1a365cec8418b785ce] source-hash-67ff63988f3b8eef2cc2b5bdf917918b93c3f070
git bisect bad e87a0055deae2c9e25ae1d1a365cec8418b785ce
# bad: [5b4693bb72eca5e38e3f56d036bca425c9a21b37] source-hash-e3633f60b349022994e291aa3d1a0c90c3403b2e
git bisect bad 5b4693bb72eca5e38e3f56d036bca425c9a21b37
# bad: [d101b9946a6a04e65e3923038503436c790b7e12] source-hash-18e6e7d929c2be209407ed2e56b8ec4d5e6c4900
git bisect bad d101b9946a6a04e65e3923038503436c790b7e12
# first bad commit: [d101b9946a6a04e65e3923038503436c790b7e12] source-hash-18e6e7d929c2be209407ed2e56b8ec4d5e6c4900
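
The log above follows the usual bisection pattern: each step tests a commit near the middle of the remaining range and discards the half that cannot contain the first bad commit. A minimal Python sketch of that search, assuming a linear history that flips from good to bad exactly once. The function name `first_bad`, the `is_bad` predicate, and the `c0..c15` history are hypothetical and are not taken from the bibisect repository:

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and the history flips from good to bad once.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad]


# Hypothetical history c0..c15 with the regression introduced at c11.
history = [f"c{i}" for i in range(16)]
result = first_bad(history, lambda c: int(c[1:]) >= 11)
```

Each test halves the candidate range, which is why a range of several hundred builds needs only about ten good/bad verdicts.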
Comment 6 Julien Nabet 2014-10-21 22:06:28 UTC
On pc Debian x86-64 with master sources updated, I could reproduce this.
Comment 7 Matthew Francis 2014-12-29 02:20:08 UTC
The commit below is the one at which the behaviour changed.

Adding Cc: to michael.meeks@collabora.com; This was a long time ago and I'm sure you must be busy, but is there any chance you could have a look at this? (or suggest someone who can?)
Thanks

commit 1d32c56f36adbd0d5801f0fedec3111011ea4d65
Author: Michael Meeks <michael.meeks@suse.com>
Date:   Mon May 14 09:41:02 2012 +0100

    sot: re-work OLE2 offset-to-page computation
    
    The gotcha here is that if we get ahead of ourselves, and read to
    the end of the stream, we detect bad chains too early, so instead
    incrementally build the page chain cache, which is also quicker
    and behaves more similarly to the previous code.
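
To illustrate what "incrementally build the page chain cache" can mean, here is a small Python sketch. It is not the sot code (which is C++); the class name `PageChainCache`, its fields, and the sample FAT are hypothetical. The assumption is that the allocation table is a list where `fat[p]` names the page that follows page `p`, with -2 marking the end of a chain:

```python
ENDOFCHAIN = -2  # assumed end-of-chain marker for this sketch


class PageChainCache:
    """Maps a stream offset to a page number, walking the table lazily."""

    def __init__(self, fat, first_page, page_size):
        self.fat = fat
        self.page_size = page_size
        valid = 0 <= first_page < len(fat)
        self.chain = [first_page] if valid else []
        self.ended = not valid

    def page_for_offset(self, offset):
        """Return the page holding offset, or None if the chain stops short."""
        index = offset // self.page_size
        # Extend the cached chain only as far as this lookup needs.
        while len(self.chain) <= index and not self.ended:
            nxt = self.fat[self.chain[-1]]
            if nxt == ENDOFCHAIN:
                self.ended = True
            elif not 0 <= nxt < len(self.fat) or len(self.chain) >= len(self.fat):
                # Bad link or a loop: stop, but keep the pages already found.
                self.ended = True
            else:
                self.chain.append(nxt)
        return self.chain[index] if index < len(self.chain) else None


# Hypothetical table: 0 -> 1 -> 2 -> 99, where 99 is an invalid page number.
cache = PageChainCache([1, 2, 99, ENDOFCHAIN], first_page=0, page_size=512)
first = cache.page_for_offset(0)       # succeeds without seeing the bad link
third = cache.page_for_offset(1024)    # walks two links, still valid
fourth = cache.page_for_offset(1536)   # only now is the bad link reached
```

Because the chain is extended on demand, a damaged link late in the chain does not prevent reading the pages that precede it, which matches the "detect bad chains too early" concern in the commit message.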
Comment 8 Michael Meeks 2014-12-30 09:28:01 UTC
Matthew - interesting, thanks for that; can you confirm that it works before that commit and afterwards it does not? =) That's interesting of course. Let me build with debugutil there and see what can be seen (if anything).

I suspect we have a problem with objects in the small block storage, which is infrequently used, and not something to do with Cyrillic ;-)
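
For readers unfamiliar with the small block storage: in the OLE2 compound file layout, streams shorter than a cutoff recorded in the file header are packed into small blocks inside a separate mini stream instead of full sectors. A hedged Python sketch of reading those header fields follows. The field offsets (30, 32, 56) reflect the published compound file header layout as I understand it; the function names and the synthetic header are hypothetical, and nothing here was run against the attached document:

```python
import struct

OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def read_block_sizes(header: bytes) -> dict:
    """Extract the block-size fields from the first 64 bytes of an OLE2 file."""
    if len(header) < 64 or header[:8] != OLE2_MAGIC:
        raise ValueError("not an OLE2 compound file header")
    sector_shift, mini_shift = struct.unpack_from("<HH", header, 30)
    (cutoff,) = struct.unpack_from("<I", header, 56)
    return {
        "sector_size": 1 << sector_shift,
        "mini_sector_size": 1 << mini_shift,
        "mini_stream_cutoff": cutoff,
    }


def stored_in_small_blocks(stream_size: int, cutoff: int) -> bool:
    """Streams shorter than the cutoff live in the small-block (mini) stream."""
    return stream_size < cutoff


# Synthetic header with the usual values: 512-byte sectors, 64-byte small
# blocks, 4096-byte cutoff. This is test data, not the attached document.
header = bytearray(64)
header[:8] = OLE2_MAGIC
struct.pack_into("<HH", header, 30, 9, 6)
struct.pack_into("<I", header, 56, 4096)
sizes = read_block_sizes(bytes(header))
```

With these values, a 3000-byte stream would be read through the small-block path, while a 4096-byte stream would use regular sectors.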
Comment 9 Matthew Francis 2014-12-30 10:42:07 UTC
Michael - I'm doing a sweep of bibisected regressions to build/bisect the exact commits which introduced them, so I can confirm that the breakage did occur exactly at that commit. 

It's still mostly possible to build from 2012 with a modern distribution and toolchain, although the earlier you go the more challenging it is :)
Comment 10 Commit Notification 2014-12-30 17:48:40 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=0376706c11458c269d6736e17367ee3f0404c4e9&h=libreoffice-4-4

fdo#84229 - add sot storage unit test.

It will be available in 4.4.0.2.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 11 Commit Notification 2014-12-30 17:48:47 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=859b59ce572f5781f3e2fa9ae6416cfd65116ca3&h=libreoffice-4-4

fdo#84229 - don't set error when seeking beyond end of valid data.

It will be available in 4.4.0.2.

Comment 12 Commit Notification 2014-12-30 17:52:15 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=dcf947534778520bad32167f181b42ef6a451531

fdo#84229 - add sot storage unit test.

It will be available in 4.5.0.

Comment 13 Commit Notification 2014-12-30 17:52:22 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=dc5383e2fa487a7599f2e317bba409dc3cde8339

fdo#84229 - don't set error when seeking beyond end of valid data.

It will be available in 4.5.0.

Comment 14 Michael Meeks 2014-12-30 17:52:50 UTC
Heh, thanks for isolating that, I confirm you got the right commit; a semantic change that caused problems higher up - just pushed a fix to master and (accidentally) -4-4 too. That should clear up a whole lot of older / broken XLS files which we should try harder with now. I wonder what impact that will have on fuzzing too =)
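
A toy Python model of how a seek-related semantic change can cause problems higher up. This is only an illustration built from the commit title "don't set error when seeking beyond end of valid data"; the class `ChainStream`, its `strict` flag, and the caller `probe_then_read` are hypothetical and do not mirror the actual sot classes:

```python
class ChainStream:
    """Toy stream over a bytes buffer with a sticky error flag."""

    def __init__(self, data: bytes, strict: bool):
        self.data = data
        self.strict = strict  # True: treat a seek past the end as an error
        self.pos = 0
        self.error = False

    def seek(self, offset: int) -> int:
        if offset > len(self.data):
            if self.strict:
                self.error = True  # poisons every later operation
                return self.pos
            offset = len(self.data)  # lenient: clamp to the end of valid data
        self.pos = offset
        return self.pos

    def read(self, count: int) -> bytes:
        if self.error:
            return b""
        chunk = self.data[self.pos:self.pos + count]
        self.pos += len(chunk)
        return chunk


def probe_then_read(stream: ChainStream) -> bytes:
    """Caller that seeks too far first, then goes back for the real data."""
    stream.seek(10_000)
    stream.seek(0)
    return stream.read(4)


strict_result = probe_then_read(ChainStream(b"BIFF-data", strict=True))
lenient_result = probe_then_read(ChainStream(b"BIFF-data", strict=False))
```

In the strict variant the caller gets nothing back even though the data is intact, which is one way an import filter could end up showing an empty sheet. In the lenient variant the same caller reads the data normally.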

Thanks for reporting.
Comment 15 Robinson Tryon (qubit) 2015-12-17 08:36:08 UTC
Migrating Whiteboard tags to Keywords: (bibisected)
[NinjaEdit]