Bug 97597 - FILEOPEN: XLSX file on server opens with some blank tab pages (workaround: export MAX_CONCURRENCY=1)
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: unspecified (earliest affected)
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Kohei Yoshida
URL:
Whiteboard: target:5.4.0 target:5.3.0.2 target:5.2.6
Keywords: dataLoss
Duplicates: 94424 103044
Depends on:
Blocks:
 
Reported: 2016-02-05 15:23 UTC by Cor Nouws
Modified: 2017-01-20 13:40 UTC
CC List: 11 users

See Also:
Crash report or crash signature:


Attachments
Example file: four tabsheets are missing data (2.80 MB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2016-09-19 01:43 UTC, kees
Description Cor Nouws 2016-02-05 15:23:22 UTC
Hi,

The organization works with 5.0.3.2 64 bits Windows in a Citrix environment.

The problem occurs with a file of 10+ sheets, not much data, few formulas, and filters on half of the sheets. (People are checking whether a test file without confidential data can be provided.)

The problem does not occur when
 - opening the same file on the local machine,
 - saving and opening as ODS.

Setting for OpenGL was OFF. ON could not be tested: wouldn't start.
Hard refresh: no difference.
Setting ... Calc > Formulas .. Recalculate on loading..: no difference.

Choosing File > Reload directly after opening does help.

The situation is similar to the one described here:
 https://ask.libreoffice.org/en/question/53599/some-sheets-in-excel-workbook-is-blank-when-opened-in-libreoffice-calc/
Comment 1 m.a.riosv 2016-02-06 01:29:59 UTC
Hi Cor,
does clearing direct formatting on those sheets change anything? Or are the results zeroes hidden by the format or by an option? Are there cells with external links? Does changing the view options for LO/Calc make any difference?
Comment 2 raal 2016-02-06 12:23:07 UTC
XLSX file on server -  you mean samba/windows sharing or webdav?
Comment 3 raal 2016-02-20 15:35:17 UTC
Please attach test file, thanks
Comment 4 Cor Nouws 2016-03-03 14:03:42 UTC
Hi Raal, Miguel

(In reply to m.a.riosv from comment #1)
> does clearing direct formatting on those sheets change anything? Or are the
> results zeroes hidden by the format or by an option? Are there cells with
> external links?

None of these.

> Does changing the view options for LO/Calc make any difference?

What options are you thinking about?


(In reply to raal from comment #2)
> XLSX file on server -  you mean samba/windows sharing or webdav?

Sharing.
But I hear that it's not reliable locally either.


(In reply to raal from comment #3)
> Please attach test file, thanks

Will ask again..
Comment 5 Cor Nouws 2016-03-03 14:05:50 UTC
Note that, once confirmed, it will have to be marked as data loss.

The file opens and some sheets are (maybe) not fully loaded, but the user does not notice this and saves the file: the data is simply gone...
Comment 6 m.a.riosv 2016-03-03 21:30:56 UTC
(In reply to Cor Nouws from comment #4)
> Hi Raal, Miguel
> 
> (In reply to m.a.riosv from comment #1)
.....
> > Does changing the view options for LO/Calc make any difference?
> 
> What options are you thinking about?
> 
I think I was thinking about OpenGL; lately it's another option to verify.
I'd better not ask about a clean profile. :)
Comment 7 trevor 2016-07-24 02:44:18 UTC
I am experiencing the same issue on Mac OS X 10.11.5 when trying to load Excel files from a network share.

All of the tabs except one loaded fine. The one tab contained nothing in it. Another tab referenced data from the blank tab and had data in it, but this appears to have been a saved calculation. When I tried to add a formula that referenced the same cell, I got a value of 0. Also, when I changed detail rows that should have updated the formula, the formula remained unchanged.

Luckily I made a backup of the file before saving it, since I am new to Libre Office and didn't trust it wouldn't corrupt my .xlsx file. I copied the backup to my local drive and opened it in Libre Office and it worked fine.
Comment 8 Cor Nouws 2016-07-29 10:58:42 UTC
(In reply to trevor from comment #7)
> I am experiencing the same issue on Mac OS X 10.11.5 when trying to load
> Excel files from a network share.

Thus marking as confirmed. Thanks!
Comment 9 Cor Nouws 2016-08-02 15:01:28 UTC
@Eike, @Markus,

Any advice on what we may be able to test / help with?
Any (distant) bell ringing?

Thanks - Cor
Comment 10 Eike Rathke 2016-08-02 19:35:00 UTC
@Cor:
I've no idea about Windows and Citrix and why it would happen there but not locally or with another file format.
I can't investigate or reproduce this because I don't have such an environment.
Similar for MacOSX and network share.

Guesswork: .xlsx files consist of several zipped XML streams, among others one separate stream for each sheet. Maybe when opening/unzipping the file to memory for one (or more) of these streams an empty or otherwise undetected broken XML stream is returned. This might depend on network share timing issues. However, this is poking around and someone in the know having access to such environment would have to check.
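To make the package layout concrete, here is a minimal Python sketch that lists the per-sheet streams of an .xlsx-style ZIP. It builds a tiny stand-in archive in memory rather than a valid workbook; the helper names `build_demo_xlsx_like_zip` and `list_sheet_streams` are made up for this illustration, and only the `xl/worksheets/sheetN.xml` naming is taken from the error message quoted later in this report.

```python
import io
import zipfile


def build_demo_xlsx_like_zip(sheet_count):
    """Build an in-memory ZIP laid out like an .xlsx package (not a valid workbook)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("xl/workbook.xml", "<workbook/>")
        for i in range(1, sheet_count + 1):
            # One separate XML stream per sheet, as described in the comment above.
            zf.writestr(f"xl/worksheets/sheet{i}.xml",
                        f"<worksheet><!-- sheet {i} --></worksheet>")
    return buf.getvalue()


def list_sheet_streams(package_bytes):
    """Return (name, uncompressed size) for each worksheet stream in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return [(info.filename, info.file_size)
                for info in zf.infolist()
                if info.filename.startswith("xl/worksheets/")
                and info.filename.endswith(".xml")]


if __name__ == "__main__":
    for name, size in list_sheet_streams(build_demo_xlsx_like_zip(3)):
        print(name, size)
```

Running the same listing against a real .xlsx file shows how many independent streams an importer has to read, which is where a per-stream failure could leave a single sheet blank.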
Comment 11 Cor Nouws 2016-09-01 19:59:46 UTC
Should we start searching for someone who can do some network investigation?
Comment 12 kees 2016-09-19 00:55:46 UTC
I can also confirm this bug - it comes back to bite us regularly, to the extent that we now have a ban on opening xlsx over the network in LO.

Both computers would be running Windows 7 64-bit, the file would be on a Windows share, and the client would be running an up-to-date 64-bit LO (the last time it happened this was 5.2.1.2).

Typically the files would be between 1 and 2 MB, there would be between 10 and 30 tabsheets in them, and when the bug occurs, typically between 1 and 5 tabsheets would be blank. The time to open the file would be between 5 and 10 seconds.

Can't share the data but will try to reproduce using a test file.
Comment 13 kees 2016-09-19 01:43:49 UTC
Created attachment 127421 [details]
Example file: four tabsheets are missing data
Comment 14 kees 2016-09-19 01:47:39 UTC
Comment on attachment 127421 [details]
Example file: four tabsheets are missing data

- there are missing data tabsheets pretty much every time the file is opened over the network from a windows share
- there are no missing data tabsheets when the file is opened locally
- the problem does not occur with ods files
- the problem occurs with xlsx files created and touched only by LO.
Comment 15 Eike Rathke 2016-09-19 09:47:15 UTC
@mmeeks:
Could it be that the .xlsx import threaded sheet-loading has some hiccups in this Windows network shares constellation?
Comment 16 Michael Meeks 2016-09-19 10:58:28 UTC
Seems unlikely - I believe we download the whole file first to get to the ZIP manifest at the end of the file, and then package will work on it - so network effects shouldn't have any effect. But of course it is easy to test.

Can you re-try with: export MAX_CONCURRENCY=1 into your environment, and re-test ?

Thanks !
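For testers who start the application from a script rather than a shell, the variable can also be set per process. The following Python sketch only prepares an environment with MAX_CONCURRENCY=1 and passes it to a child process; a child Python interpreter stands in for the office binary, whose path is not given in this report. The helper name `env_with_single_thread_import` is hypothetical, and how the variable is interpreted is an assumption based on this thread, not on documentation.

```python
import os
import subprocess
import sys


def env_with_single_thread_import(base_env=None):
    """Return a copy of the environment with MAX_CONCURRENCY=1 set.

    Assumption from this thread: the variable limits the import to one worker.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MAX_CONCURRENCY"] = "1"
    return env


if __name__ == "__main__":
    # Stand-in child process; replace the command with the real office
    # executable on the test machine.
    child = [sys.executable, "-c",
             "import os; print(os.environ['MAX_CONCURRENCY'])"]
    result = subprocess.run(child, env=env_with_single_thread_import(),
                            capture_output=True, text=True, check=True)
    print(result.stdout.strip())
```

Setting the variable this way affects only the launched process, which makes it easy to compare runs with and without the setting on the same machine.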
Comment 17 kees 2016-09-20 09:43:03 UTC
(In reply to Michael Meeks from comment #16)
> Can you re-try with: export MAX_CONCURRENCY=1 into your environment, and
> re-test ?

I tested this: opened the same file from a windows share 4 times with this setting and 4 times without. 

WITH the setting it opened correctly 4 times
WITHOUT the setting it opened correctly only 1 time, and three times the data of 1 sheet was missing. It was a different sheet every time by the way.

So it definitely seems to make a difference.
Comment 18 m.a.riosv 2016-10-08 11:42:46 UTC
*** Bug 103044 has been marked as a duplicate of this bug. ***
Comment 19 m.a.riosv 2016-11-06 14:24:45 UTC
*** Bug 94424 has been marked as a duplicate of this bug. ***
Comment 20 Kohei Yoshida 2016-11-30 02:29:43 UTC
(In reply to kees from comment #13)
> Created attachment 127421 [details]
> Example file: four tabsheets are missing data

I'm guessing this file was saved *after* the data loss had occurred.  Ideally we need a file that contains all the data prior to the data loss so that we can try and see if the data loss occurs...

If that's not possible, we'll have to start by creating such a test file.
Comment 21 kees 2016-11-30 02:50:37 UTC
(In reply to Kohei Yoshida from comment #20)
> (In reply to kees from comment #13)
> > Created attachment 127421 [details]
> > Example file: four tabsheets are missing data
> 
> I'm guessing this file was saved *after* the data loss had occurred. 
> Ideally we need a file that contains all the data prior to the data loss so
> that we can try and see if the data loss occurs...
> 
> > If that's not possible, we'll have to start by creating such a test
> > file.

Correct. I don't have the original version anymore, but from my experience it will work with any xlsx, as long as it has a fair number of tabs (say >=15) and it takes a bit of time to open the file (say >= 4 seconds). It will still work even with this file - just duplicate some tabsheets if the file opens too fast for the problem to occur.

By the way, we have now set MAX_CONCURRENCY=1 on all our workstations, and the problem has completely gone away. We have lifted the ban on opening xlsx over the network.
Comment 22 Kohei Yoshida 2016-12-19 23:48:01 UTC
I'm looking into this right now.
Comment 23 Kohei Yoshida 2016-12-23 00:45:16 UTC
I'm still looking into this, but I'm reasonably certain that the root cause is FastSaxParser throwing an exception (css::xml::sax::SAXParseException to be precise), which halts the parsing of whichever sheet is being parsed.  This matches the way the sheet data is partially imported when the failure happens.  The fact that this can be reproduced reliably on Windows and only when opening from a network share is probably just a thread timing issue.

I'm currently putting my effort into understanding what exactly fails on the parser side, which will probably take a while, I'm afraid.
Comment 24 Commit Notification 2016-12-23 03:39:49 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4252096c68ce01ed8a06bcaf57260dbe46502cd3

tdf#97597: Make the document import state more multi-thread friendly.

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 25 Kohei Yoshida 2016-12-23 03:41:10 UTC
FYI, the above commit is not the fix.
Comment 26 Kohei Yoshida 2016-12-23 22:10:57 UTC
Some updates.

When the parser throws an exception and halts parsing for the sheet being imported, the exception message contains something along the lines of:

  [xl/worksheets/sheet7.xml line 2]: Couldn't find end of Start Tag c

It's not always tag 'c' and sometimes it's tag 'v3', and the same XML stream that previously caused the exception to be thrown may parse just fine on the next attempt.

Reading this page

  http://xmlsoft.org/threads.html

suggests that, while libxml2 is thread-safe, we do need to call xmlInitParser() on the "main" thread once before starting multi-threaded parsing, yet we are calling this function once on each thread.  Also, it's not clear whether we build our libxml2 package with the necessary thread support turned on (the doc says we need to build with the --enable-threads option).  I don't see us explicitly enabling the thread support on libxml2.
Comment 27 Michael Meeks 2016-12-24 18:02:38 UTC
The line 2 is a pain - I guess pretty printing the XML might help (?)

> "Couldn't find end of Start Tag c"

is generated by libxml2 here:

https://git.gnome.org/browse/libxml2/tree/parser.c#n11652

Luckily a unique message for the XML parser ...

I think it's very unlikely that libxml2 has a bug - so this is further confirmation that this is likely to be in the package / unzipping code. I guess we need a unit test that un-zips lots of different streams concurrently in different threads and CRCs them (I guess) =)
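The shape of such a test can be sketched in Python. Note the assumptions: Python's `zipfile` module stands in for the LibreOffice package code, which this sketch does not exercise, and the helper names `build_archive` and `crc_mismatches` are invented for the illustration. Each worker thread decompresses one stream from a single shared archive object and compares its CRC-32 with the value stored in the ZIP directory.

```python
import io
import zipfile
import zlib
from concurrent.futures import ThreadPoolExecutor


def build_archive(stream_count, rows_per_stream):
    """Create an in-memory ZIP with several sheet-like XML streams."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for i in range(stream_count):
            body = "".join(f"<row r='{r}'><c><v>{i * r}</v></c></row>"
                           for r in range(rows_per_stream))
            zf.writestr(f"xl/worksheets/sheet{i + 1}.xml",
                        f"<worksheet>{body}</worksheet>")
    return buf.getvalue()


def crc_mismatches(archive_bytes, workers):
    """Read every stream concurrently; return names whose CRC-32 does not match."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        def check(info):
            data = zf.read(info.filename)  # all threads share one archive object
            return info.filename, zlib.crc32(data) == info.CRC

        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(check, zf.infolist()))
    return [name for name, ok in results if not ok]


if __name__ == "__main__":
    print(crc_mismatches(build_archive(12, 200), workers=8))
```

A test of the real package code would follow the same pattern: many streams, one shared archive, one thread per stream, and a checksum comparison that fails as soon as any thread receives bytes belonging to another stream.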
Comment 28 Kohei Yoshida 2017-01-14 15:35:47 UTC
I think I know what the problem is.

Each input stream that a thread uses to parse an XML stream is actually a wrapper, and all the wrappers share the same underlying raw byte stream that represents the whole zipped archive.  So, when a wrapper tries to read bytes, it performs a series of seek-read-seek-read calls on the raw byte stream.  While each individual seek and read call is protected by a mutex, that doesn't guarantee that each seek-read pair happens atomically when multiple threads perform them in parallel.  Sometimes one thread calls seek to change the current position on the byte stream, and another thread calls seek to change it again before the first thread has had the chance to read the bytes from the original position.

The solution I have is to fully buffer the input stream for each thread to avoid sharing the raw byte stream between multiple threads.  I've tested my change on Windows, and now I can no longer reproduce the problem.
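The race and the buffering approach described above can be illustrated with a small Python model. This is a sketch of the idea only, not the LibreOffice code: the class `SharedRawStream` and both helper functions are hypothetical, and the bad thread schedule is replayed by hand so that the outcome is deterministic.

```python
import io
import threading


class SharedRawStream:
    """One byte stream shared by all readers; seek and read are locked separately."""

    def __init__(self, data):
        self._raw = io.BytesIO(data)
        self._lock = threading.Lock()

    def seek(self, pos):
        with self._lock:
            self._raw.seek(pos)

    def read(self, size):
        with self._lock:
            return self._raw.read(size)

    def copy_range(self, pos, size):
        """Seek and read under a single lock acquisition; return a private buffer."""
        with self._lock:
            self._raw.seek(pos)
            return io.BytesIO(self._raw.read(size))


def interleaved_unbuffered_read(shared):
    """Replay the bad schedule by hand: A seeks, B seeks, then A reads."""
    shared.seek(0)         # thread A wants bytes 0..3
    shared.seek(4)         # thread B moves the shared position first
    return shared.read(4)  # thread A now receives thread B's bytes


def buffered_read(shared, pos, size):
    """Take a private copy up front, then read only from that copy."""
    private = shared.copy_range(pos, size)
    return private.read()


if __name__ == "__main__":
    shared = SharedRawStream(b"AAAABBBB")
    print(interleaved_unbuffered_read(shared))  # b'BBBB', not the b'AAAA' that A asked for
    print(buffered_read(shared, 0, 4))          # b'AAAA'
```

The individual locks in `seek` and `read` keep the stream object consistent but do nothing for the pair as a whole, which is why a parser can suddenly see bytes from another sheet and report a malformed tag.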
Comment 29 Commit Notification 2017-01-14 17:31:15 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4ae705d02df0ddf75b97d0e94add6994626f487e

tdf#97597: Ensure that each parsing thread has its own buffer.

It will be available in 5.4.0.

Comment 30 Michael Meeks 2017-01-16 09:56:41 UTC
Nice catch Kohei ! =) It's an interesting case where the UNO API design is really unhelpful. I guess lots of file-system APIs have a "ReadBytesAt" construction for exactly this race, which lets people read without needing to make each seek-then-read pair atomic.
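As an illustration of the "ReadBytesAt" idea, the Python sketch below reads from one shared file descriptor by passing the offset with every call. It uses `os.pread` where the platform provides it (POSIX systems) and otherwise falls back to a seek-and-read pair held under one lock. The class `PositionalFile` and the function `demo` are names invented for this example and are not part of any LibreOffice or UNO API.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


class PositionalFile:
    """Read-only file whose reads carry their own offset, so no cursor is shared."""

    def __init__(self, path):
        self._fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
        self._lock = threading.Lock()  # used only by the fallback path

    def read_bytes_at(self, offset, size):
        if hasattr(os, "pread"):
            # Positional read: does not move or depend on the file position.
            return os.pread(self._fd, size, offset)
        with self._lock:
            # Fallback: keep seek and read together under one lock.
            os.lseek(self._fd, offset, os.SEEK_SET)
            return os.read(self._fd, size)

    def close(self):
        os.close(self._fd)


def demo():
    """Read eight distinct chunks concurrently; return True if none got mixed up."""
    chunk = 4096
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for i in range(8):
            tmp.write(bytes([65 + i]) * chunk)
        path = tmp.name
    reader = PositionalFile(path)
    try:
        with ThreadPoolExecutor(max_workers=8) as pool:
            parts = list(pool.map(
                lambda i: reader.read_bytes_at(i * chunk, chunk), range(8)))
    finally:
        reader.close()
        os.remove(path)
    return all(part == bytes([65 + i]) * chunk for i, part in enumerate(parts))


if __name__ == "__main__":
    print(demo())
```

With an interface like this the caller never needs a separate seek, so there is no window in which another thread can move the position between the two steps.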
Comment 31 Commit Notification 2017-01-16 20:43:03 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "libreoffice-5-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6e55e41632aabbca25ec263cd26a631ca105d0ab&h=libreoffice-5-3

tdf#97597: Ensure that each parsing thread has its own buffer.

It will be available in 5.3.0.2.

Comment 32 Cor Nouws 2017-01-16 21:48:44 UTC
thanks Kohei!
Comment 33 Kohei Yoshida 2017-01-16 21:49:46 UTC
I'll mark this fixed for now.  Hopefully my fix resolves this for good.
Comment 34 Commit Notification 2017-01-17 02:06:43 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=294f2e627cc6f1d0483f7affcf96467a4bd3ba5a

tdf#97597: attempt to add test for multithreaded input stream buffering.

It will be available in 5.4.0.

Comment 35 Commit Notification 2017-01-17 21:17:26 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "libreoffice-5-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c418c2c310d2d0b1dbb119d1c25c2418e3de06ef&h=libreoffice-5-2

tdf#97597: Ensure that each parsing thread has its own buffer.

It will be available in 5.2.6.

Comment 36 Mark Batten-Carew 2017-01-19 21:54:37 UTC
My problem (empty tabs after opening an XLSX file over a network from LO) seems to match this bug, but I have another symptom as well.

Sometimes many rows of a tab are there, but halfway through one row, the data cuts off and all following rows are missing.

Is this bug likely to cause this kind of missing-rows symptom as well?  The tab with lots of rows missing had thousands of rows, roughly 26 columns wide.
Comment 37 Michael Meeks 2017-01-20 10:05:59 UTC
> Sometimes many rows of a tab are there, but halfway through one row, the
> data cuts off and all following rows are missing.

Sounds remarkably similar. Can you reproduce it with this fix ? =)