Bug 93553 - FILESAVE: Writer crashes or hangs due to too many parallel threads or data
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 4.4.5.2 release (earliest affected)
Hardware: x86 (IA32) All
Importance: medium major
Assignee: Armin Le Grand
URL:
Whiteboard: target:5.2.0 target:5.1.4
Keywords: haveBacktrace
Depends on:
Blocks:
 
Reported: 2015-08-21 01:01 UTC by Ben
Modified: 2016-10-25 19:02 UTC
CC List: 8 users

See Also:
Crash report or crash signature:


Attachments
Writer file causing threading error (3.15 MB, application/vnd.oasis.opendocument.text)
2016-02-06 23:02 UTC, Ben
Thread error message (10.60 KB, image/jpeg)
2016-02-06 23:03 UTC, Ben
Test file used on Ubuntu (3.10 MB, application/vnd.oasis.opendocument.text)
2016-03-13 23:28 UTC, Ben
Writer file causing threading error DEBUG (27.42 KB, text/plain)
2016-03-14 17:36 UTC, Timur
WinDbg stack trace with TB39 (master of 2016-03-16) (94.96 KB, text/plain)
2016-03-17 18:50 UTC, V Stuart Foote
Debugger process attachment error (3.86 KB, image/png)
2016-04-02 16:41 UTC, Ben
Document used for load and save time testing (9.12 MB, application/vnd.oasis.opendocument.text)
2016-04-02 20:16 UTC, Ben

Description Ben 2015-08-21 01:01:10 UTC
I upgraded from 4.3.7 to 4.4.5 and then to 5.0.0. The 4.3.7 version is completely stable. After either upgrade, Writer crashes (4.4.5) or hangs (5.0.0) after opening a 500 page document, making a single character change, and then saving it. After opening the file again, Writer goes through recovery mode and opens the document, but a single character change has the same effect. Backing down to 4.3.7 stops the crashes.

The document has 22 figures and many tables.
Comment 1 Joel Madero 2015-08-21 01:21:51 UTC
We need the document in order to do anything at all.

Setting to NEEDINFO - attach the document and then set to UNCONFIRMED. If it's confidential, try to scrub it, but really without the document we can't confirm the bug let alone fix it.

Thanks
Comment 2 Joel Madero 2015-08-21 01:22:49 UTC
Changing version also as it's supposed to be earliest version not latest and removing BLOCKER - changing to Major.
Comment 3 Ben 2015-08-22 20:54:11 UTC
Looking in the LibreOffice folder, there are many Python scripts. The Python in C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.3\bin did not execute correctly, saying that python33.dll does not exist. After removing the Python version on the system and installing Python 3.3.3, the executable runs correctly. This small fix does not solve the crash/hang problem. However, the Python version included in the LibreOffice 5 package should be modified to work upon install.

I will keep looking into the crash problem a bit more.
Comment 4 Ben 2015-08-22 21:27:09 UTC
I have backed down to the working version 4.3.7 again. I would need two computers (which I don't have) to test the issue in more detail, since LibreOffice does not allow two different versions on the same PC. I would have to scrub a large amount of my document to forward it to you. Maybe it would be possible for you to test LibreOffice with 1000 pages, many figures, and a few hundred tables to see what happens. In the meantime, I will try to get another PC to test it, but this may take several weeks.

Thank you.
Comment 5 Ben 2015-09-07 03:22:42 UTC
Tested if the latest version 'LibreOffice 5.0.1 for Windows' would solve the issue, but the issue remains the same: hang/crash.

Additional info: file size is 14.78 MB.
Comment 6 Ben 2016-02-06 22:55:51 UTC
I upgraded again to further diagnose the issue: On Windows, I upgraded from 4.3.7.2 to LibreOffice 5.0.4. The 4.3.7.2 version is completely stable. Version 5.0.4 crashed when I opened a file and generated a threading error (see attachments). When I deleted pieces of the original file, I could sometimes save it without any issue, but the behavior seems to be random.
Backing down to 4.3.7.2 does not cause crashes anymore.
Comment 7 Ben 2016-02-06 23:02:32 UTC
Created attachment 122418
Writer file causing threading error
Comment 8 Ben 2016-02-06 23:03:17 UTC
Created attachment 122419
Thread error message
Comment 9 Buovjaga 2016-03-02 19:52:24 UTC
No crash on Linux.

Please try to get a backtrace of the crash: https://wiki.documentfoundation.org/How_to_get_a_backtrace_with_WinDbg

I also recommend trying out 5.1.1 when it is released this week.

This might be worth a shot as well: https://wiki.documentfoundation.org/UserProfile#Resolving_corruption_in_the_user_profile

Set to NEEDINFO.
Change back to UNCONFIRMED after further testing.

Tested on:
64-bit, KDE Plasma 5
Build ID: 5.1.0.3 Arch Linux build-1
CPU Threads: 8; OS Version: Linux 4.4; UI Render: default; 
Locale: fi-FI (fi_FI.UTF-8)
Comment 10 Ben 2016-03-05 18:59:40 UTC
I re-created the user profile a while ago, when I already had the issue. I will try the newer version when it is out. I also haven't run the debugger yet.
Comment 11 Ben 2016-03-13 23:27:05 UTC
Installed VirtualBox with Ubuntu, which came with LibreOffice 5, on my Windows PC. I appended formulas (1.1), (1.2), and (1.3) with %alpha, %beta%Gamma, and %alpha%Gamma%delta, respectively. After each formula change, I tried to save the file. Writer still crashed 2 out of 3 times, and the last time I could not even recover the file when I tried to open it after the crash. See the attached file, which I started with to do the test.
Comment 12 Ben 2016-03-13 23:28:39 UTC
Created attachment 123548
Test file used on Ubuntu
Comment 13 Ben 2016-03-14 03:44:44 UTC
Installed LibreOffice 5.1.1 on Ubuntu (Debian).
So far, the crash no longer seems to occur. I will perform some more testing, then install 5.1.1 on Windows as well, and will continue to report the status. One major issue I noticed is that most (if not all) of the formulas in the document have been changed to a different font. Is there an easy way to revert them to the text font used in the document?
Comment 14 Joel Madero 2016-03-14 04:05:22 UTC
(In reply to Ben from comment #13)
> Installed LibreOffice 5.1.1 on Ubuntu (Debian).
> So far, it seems that the crash issue does not appear anymore. I will
> perform some more testing and then install 5.1.1 also on Windows and will
> continue to report the status. One major issue I noticed is that most (if not
> all) of the formulas in the document have been changed to a different font.
> Is there an easy way to revert them to the text font used in the document?

Not as far as I know :( Do you see the different fonts with the attached test file?
Comment 15 Timur 2016-03-14 17:36:46 UTC
Created attachment 123568
Writer file causing threading error DEBUG
Comment 16 Ben 2016-03-14 22:26:17 UTC
I did some further testing with 64-bit 5.1.1 on Windows 10. Editing formulas in the 450 page file I created with Writer version 4.3.7.2 no longer seems to crash Writer when saving the file. However, I observed some new phenomena with 5.1.1:

- After pressing the save button, it takes several seconds before a "Saving document" message appears at the bottom of the screen.
- While a document is being saved, no progress bar is shown at the bottom of the screen. With 4.3.7.2, the progress bar is traversed twice.
- Saving a document seems to be quite a bit slower in 5.1.1 (my impression is at least a factor of 2).
- Opening formulas for editing takes about 4 seconds, which is also slower than before.
- Loading a document into Writer does not show a progress bar.
- Inline formulas are sometimes shown as a box without content.
- Once, I got a "bad allocation" pop-up error.
- When I opened a read-only document and then attempted to save it under a new name, the save did not complete. I had to stop Writer in the Task Manager. After reopening the document, it recovered successfully.

Please let me know if these are new artifacts for which new tickets need to be created. I could not find related tickets from 2016.

Thank you for all the good work.
Comment 17 Joel Madero 2016-03-14 22:29:02 UTC
Based on the latest info I'm closing this as WFM. As far as reporting - we just ask that each new problem be put in a new report :) We'll triage them as needed. One bug per report. Thanks Ben!
Comment 18 Timur 2016-03-15 07:50:44 UTC
Joel, I confirmed the bug because I got a crash on the first save, as indicated in the title. And I made a backtrace. 
So, how can you close it as WFM? Did you try to use the file? Even then, there's a backtrace. Did you look at it? If it's not useful, that's something else.
Comment 19 Joel Madero 2016-03-15 14:57:23 UTC
@Timur - did you test on 5.1? Ben (original reporter) says that he's not having a crash on 5.1 .... that's why I closed it. I'll set to NEW again. If you can't confirm on 5.1 please close it :)
Comment 20 Timur 2016-03-15 15:18:57 UTC
I reproduced it with master~2016-03-07_03.36.17_LibreOfficeDev_5.2.0.0.alpha0_Win_x86.
Comment 21 Ben 2016-03-15 16:13:32 UTC
Hi Timur, Joel,

I performed more editing on my 450 page document. I copied some formulas, changed their order, and attempted to save the document. The save action did not complete. I had to go into Task Manager to terminate the Writer process and then recover the document. I have experienced this several times. I sometimes also get a "bad allocation" pop-up message. I don't know if this is all related to formulas or if there is another underlying issue. I am probably running into limitations that do not occur with small documents.

At this point, I will have to re-install 4.3.7.2, because the 5.1.1 version is unstable.

Thank you for looking into this issue.
Comment 22 Ben 2016-03-16 00:05:21 UTC
I made the same changes to one page of the document in a different order. Sometimes the save file action fails, sometimes it successfully completes.
Comment 23 Armin Le Grand 2016-03-16 10:45:35 UTC
Loaded, then Save As: got bad alloc exceptions in the construction of ZipOutputEntry; the uno::Sequence<sal_Int8>, initialized to a standard size of 32768 bytes, could not be created (see ZipOutputEntry::ZipOutputEntry, n_ConstBufferSize). Memory currently in use is about 1.235.988K on Win7, using the 32Bit Office version.
@Ben: Does this happen with the 64Bit Win version, too?
Comment 24 Timur 2016-03-16 11:12:48 UTC
Mine is Win7 64-bit. Can you please confirm backtrace is useful?
Comment 25 Armin Le Grand 2016-03-16 11:20:16 UTC
Re-checked Save As: memory goes up to 250.000K, stays there a while, then goes up to 1.230.664K. That happens when ZipPackageStream::saveChild is triggered. Each call adds a ZipOutputEntry when parallel processing is allowed.
It stops at 6055 such data packages created in parallel and dies out of memory.
As nice as it is to save and zip in parallel, this should depend not only on encryption (which it does due to tdf#89236) and on the number of cores (which I cannot see in the code), but also on the number of tasks to perform, at least on 32bit OS versions.

As a workaround it should be possible to save encrypted (even if not needed), since that seems to switch off parallel zipping at save time.

Checking whether this can be limited to a number of threads; more threads than cores should not be useful anyway...
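The figures from comments 23 and 25 can be put into a rough back-of-envelope calculation. This is only a sketch: it assumes that "K" means KiB and that the whole memory growth during the save is attributable to the 6055 queued entries, neither of which the thread confirms.

```python
# Figures quoted in comments 23 and 25 of this report.
BUFFER_BYTES = 32768      # initial buffer per ZipOutputEntry (n_ConstBufferSize)
ENTRIES = 6055            # entries created before the save ran out of memory
BEFORE_K = 250_000        # memory before the parallel fan-out, in K
PEAK_K = 1_230_664        # memory when the save died, in K

# Memory taken by the fixed-size buffers alone, in MiB.
buffers_mib = BUFFER_BYTES * ENTRIES / 2**20

# Assumption: all growth during the save is caused by the queued entries.
growth_k = PEAK_K - BEFORE_K
per_entry_k = growth_k / ENTRIES

print(f"buffers alone: {buffers_mib:.1f} MiB")
print(f"observed growth: {growth_k} K, about {per_entry_k:.0f} K per entry")
```

Under these assumptions the fixed buffers account for only part of the growth; the rest would be the data and thread state held per entry. Either way the total grows linearly with the number of streams in the document, which is why a large file can exhaust a 32-bit address space.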
Comment 26 Armin Le Grand 2016-03-16 11:26:28 UTC
@Timur: Yes, it shows where it happens, too. It's not about Win7 64Bit, but about the Office being 32 or 64 bit. Seems you also used the Office 32Bit version.
On 64Bit it may work, but will sooner or later also fail, because too many threads, or too much data for them, will be allocated eventually. It is a design flaw to change something to parallelism and not limit it to a useful number of threads from the beginning, as can be seen ;-)
Comment 27 Timur 2016-03-16 11:49:41 UTC
Thank you. Does it mean it's "inherited" then? It doesn't sound like it's a regression, and bibisectRequest wouldn't make sense. If I understood well, it's not Windows only, so I remove it.
Comment 28 Armin Le Grand 2016-03-16 12:08:45 UTC
Hard-suppressed parallelism (using bParallelDeflate) and the save went well, which proves the cause.
@Timur: Whether it is inherited or a regression depends on whether parallel execution was there from the beginning or added later; I do not know. As for systems: on 64Bit it will be much less probable, so mainly 32Bit versions should be affected.
Comment 29 Armin Le Grand 2016-03-16 14:03:19 UTC
Working on a version that uses hardware information about the available cores to reduce the number of threads being processed early on. The threads cannot just be removed; they need to be processed as intended. This works, but I need to run some more tests on it.
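The approach described here can be illustrated with a minimal Python sketch. It is not the LibreOffice C++ code: `deflate_bounded` and `max_in_flight` are hypothetical names, and `zlib.compress` merely stands in for the per-stream deflate step. The point is that at most as many compression tasks as there are cores are alive at once, and the oldest one is consumed before a new one is scheduled, so no task is dropped.

```python
import os
import zlib
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def deflate_bounded(streams, max_in_flight=None):
    """Compress every byte string in `streams`, in order, while keeping
    at most `max_in_flight` tasks (and their buffers) alive at a time.

    Returns (compressed_results, peak_number_of_pending_tasks).
    """
    if max_in_flight is None:
        # Use the core count as the limit; fall back to 1 if it is unknown.
        max_in_flight = max(os.cpu_count() or 1, 1)

    results = []
    pending = deque()
    peak = 0
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        for data in streams:
            # Window full: consume the oldest task before scheduling more.
            if len(pending) >= max_in_flight:
                results.append(pending.popleft().result())
            pending.append(pool.submit(zlib.compress, data))
            peak = max(peak, len(pending))
        # Drain whatever is still queued once all input is scheduled.
        while pending:
            results.append(pending.popleft().result())
    return results, peak


if __name__ == "__main__":
    demo = [bytes([i % 256]) * 1000 for i in range(50)]
    compressed, peak = deflate_bounded(demo, max_in_flight=4)
    print(len(compressed), "streams compressed, peak in flight:", peak)
```

Because results are consumed in submission order, the output order matches the input order, and memory use is bounded by the window size instead of by the number of streams in the document.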
Comment 30 Armin Le Grand 2016-03-16 14:19:29 UTC
Looks good, needs much less memory (stays around 250000K now) and is even faster. Committed solution to gerrit.
Comment 31 Ben 2016-03-16 14:26:53 UTC
Armin,

Thank you very much for taking this on. Could you also take a look at my other tickets. I wonder if they could be related to this one. See e.g. 98665 and 98666.
Comment 32 Armin Le Grand 2016-03-16 14:33:58 UTC
Hi Ben,

you may check yourself :) Use gerrit (https://gerrit.libreoffice.org/#/c/23305/) to cherry-pick the change into a built checkout (top right: Download, choose Cherry Pick), paste the command into your shell at the root of the checkout where you build, then build.
Comment 33 Armin Le Grand 2016-03-16 14:56:44 UTC
Took a short look at the tasks mentioned: the save speed may be improved by this fix, the load progress probably not.
Comment 34 V Stuart Foote 2016-03-16 20:24:13 UTC
(In reply to Armin Le Grand (CIB) from comment #33)
> Took a short look at the tasks mentioned - the save speed may be kicked by
> this taskfix, the load progress probably not.

Ben reported in bug 98664 and bug 98666 that disabling OpenGL rendering restored the save/open progress bar widget.
Comment 35 Ben 2016-03-17 14:33:14 UTC
So far so good. I have not experienced crashes upon document editing.
However, I did experience two lock-ups when I worked with the Bibliography database, added entries to the database, and then added Bibliography entries to the document while having the database open. I had to use the Task Manager to terminate Writer and then recover the document. This may be unrelated to the original issue flagged in this ticket. I will keep an eye on it, will try to come up with a better description of the new issue if possible, and create a new ticket if needed.
Comment 36 V Stuart Foote 2016-03-17 18:50:50 UTC
Created attachment 123670
WinDbg stack trace with TB39 (master of 2016-03-16)

(In reply to Armin Le Grand (CIB) from comment #30)
> Looks good, needs much less memory (stays around 250000K now) and is even
> faster. Committed solution to gerrit.

On Windows 8.1 Ent 64-bit en-US with
Version: 5.2.0.0.alpha0+
Build ID: 6eb7cd38e348e8a9d6498bfc2d41e91725eb34aa
CPU Threads: 8; OS Version: Windows 6.29; UI Render: GL; 
TinderBox: Win-x86@39, Branch:master, Time: 2016-03-16_12:53:35
Locale: en-US (en_US)

Got it to hang on save, pulled a stack trace using TB39 symbols, result duplicates Timur's, i.e. at the same package2 calls.

When https://gerrit.libreoffice.org/#/c/23305/ rolls, will retest.
Comment 37 Ben 2016-03-17 19:05:56 UTC
Same hang on filesave experience as Stuart after about one day of work performing regular editing actions, without the Bibliography database open.
Comment 38 Ben 2016-03-17 22:55:50 UTC
I edited about 10 large formulas and then performed a filesave which made Writer hang. Terminated Writer in Task Manager, was able to recover file, but lost last edits.
Comment 39 V Stuart Foote 2016-03-18 00:14:04 UTC
@Ben,

Armin's refactoring in https://gerrit.libreoffice.org/#/c/23305/ is not yet integrated into the program--unless you have built your own instance from source and included it.

Until then nothing has changed with the 5.1.2 rc1 or the nightly TB builds of master, and you will continue to have instability with the parallel compression. Of course there could be another OLE related issue, as the Math formulas are OLE objects linked into the Writer canvas and converted to meta.

For now, sit tight; there is nothing more to be gained. We'll be able to test once Armin's patch makes it out of code review in Gerrit and rolls into a TB build (or we can roll our own), and go on from there.

Stuart
Comment 40 Commit Notification 2016-03-22 08:57:55 UTC
Armin Le Grand committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=7e2ea27e5d56f5cf767a6718a0f5edc28e24af14

tdf#93553 limit parallelism at zip save time to useful amount

It will be available in 5.2.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 41 Timur 2016-03-25 08:37:16 UTC
I guess it's fixed. No immediate crash on save or scroll with master~2016-03-22_23.57.30_LibreOfficeDev_5.2.0.0.alpha0_Win_x86.
Feel free to reopen if proved otherwise.

Ben, once you tested and confirmed yourself, please set status to "Verified".
I recommend using Separate Install GUI tool http://tdf.io/siguiexe in Windows. It downloads and extracts different LO versions, without installing, so you may test different versions. It only needs MS Visual C++ Runtime installed. 
For future bug reporting, please provide a backtrace of the crash, as requested here.
And I kindly ask you to keep the number and length of comments to a minimum, and to search for existing reports before raising various side issues.

Armin, please consider backporting to 5.1. 

BTW: There's Bug 98558 that may be related to this one, but I couldn't test because this Writer file had fileopen problem with LO prior to 4.0.
Comment 42 V Stuart Foote 2016-03-25 12:35:43 UTC
On Windows 10 Pro 64-bit en-US with
Version: 5.2.0.0.alpha0+
Build ID: 15b53976e5d119877e53f34b34cee33a5f2883fd
CPU Threads: 8; OS Version: Windows 6.19; UI Render: default; 
TinderBox: Win-x86@39, Branch:master, Time: 2016-03-22_23:57:30
Locale: en-US (en_US)

Test file now saves cleanly, and faster. Thanks Armin!
Comment 43 Ben 2016-03-29 00:47:24 UTC
Armin

I installed master~2016-03-22_23.57.30_LibreOfficeDev_5.2.0.0.alpha0_Win_x86.
It worked for a while, but it locked up today. I had to go into Task Manager to terminate LO. A restart recovered the file.

I don't have Visual C++. I will try to provide a backtrace, but have never performed this before, since I'm not a developer.
Comment 44 V Stuart Foote 2016-03-29 03:12:28 UTC
(In reply to Ben from comment #43)
> I don't have Visual C++. I will try to provide a backtrace, but have never
> performed this before, since I'm not a developer.

Ben,

You don't need the full Visual C++ just the Debugging tools from the SDK, pretty clear instructions for doing the stacktrace on the TDF Wiki here:

https://wiki.documentfoundation.org/How_to_get_a_backtrace_with_WinDbg

PM email to me if you get stuck.

Stuart
Comment 45 Ben 2016-04-02 16:38:47 UTC
1. Installed master~2016-03-22_23.57.30_LibreOfficeDev_5.2.0.0.alpha0_Win_x86

2. From Help Menu: LibreOffice Version: 5.1.1.3 (x64)
Build ID: 89f508ef3ecebd2cfb8e1def0f0ba9a803b88a6d
CPU Threads: 8; OS Version: Windows 6.19; UI Render: default; 
Locale: en-US (en_US)
is installed in C:\Program Files\LibreOffice 5

3. There is a discrepancy between the version indicated in 1 and 2. I don't know if this is normal or not.

4. Installed vcredist_x64.exe from: https://www.microsoft.com/en-us/download/details.aspx?id=40784

5. Installed backtrace with WinDbg sdksetup.exe for Windows 10 from https://wiki.documentfoundation.org/How_to_get_a_backtrace_with_WinDbg

6. Install location: C:\Program Files (x86)\Windows Kits\10\

7. Cleared all checkboxes except 'Debugging Tools for Windows'.

8. Entered CACHE*C:\symbols;SRV*http://dev-builds.libreoffice.org/daily/master/Win-x86@39/symbols;SRV*http://dev-downloads.libreoffice.org/symstore/symbols;SRV*http://msdl.microsoft.com/download/symbols

9. Started debugger.

10. Started LibreOffice Writer.

11. Attempted to attach to the soffice.bin process using File --> Attach process in debugger.

12. Got error message popup: Could not attach to process 1020. See attachment.
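For readers unfamiliar with the symbol path entered above, the following small Python sketch splits it into its parts. It only illustrates the string's structure (semicolon-separated elements, each of the form KIND*location); `split_symbol_path` is a hypothetical helper, not part of WinDbg or LibreOffice.

```python
# The symbol path exactly as entered in the steps above.
SYMBOL_PATH = (
    r"CACHE*C:\symbols;"
    "SRV*http://dev-builds.libreoffice.org/daily/master/Win-x86@39/symbols;"
    "SRV*http://dev-downloads.libreoffice.org/symstore/symbols;"
    "SRV*http://msdl.microsoft.com/download/symbols"
)


def split_symbol_path(path):
    """Return a list of (kind, location) pairs, one per ';'-separated element."""
    entries = []
    for element in path.split(";"):
        # Split only on the first '*': everything after it is the location.
        kind, _, location = element.partition("*")
        entries.append((kind.upper(), location))
    return entries


for kind, location in split_symbol_path(SYMBOL_PATH):
    print(f"{kind:5} {location}")
```

The first element names a local cache directory, and the remaining three name symbol servers: two for LibreOffice builds and one for Microsoft's own symbols.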
Comment 46 Ben 2016-04-02 16:41:43 UTC
Created attachment 124031
Debugger process attachment error
Comment 47 Ben 2016-04-02 16:45:51 UTC
After using LibreOffice Writer master~2016-03-22_23.57.30_LibreOfficeDev_5.2.0.0.alpha0_Win_x86 for a while, I ran into the same degree of instability as before. When I save a 15 MB file, Writer frequently hangs. I get the impression this occurs more often when I switch to another application (like Chrome) while Writer is saving the file.
Comment 48 Buovjaga 2016-04-02 16:49:26 UTC
(In reply to Ben from comment #45)
> 1. Installed master~2016-03-22_23.57.30_LibreOfficeDev_5.2.0.0.alpha0_Win_x86
> 
> 2. From Help Menu: LibreOffice Version: 5.1.1.3 (x64)
> Build ID: 89f508ef3ecebd2cfb8e1def0f0ba9a803b88a6d
> CPU Threads: 8; OS Version: Windows 6.19; UI Render: default; 
> Locale: en-US (en_US)
> is installed in C:\Program Files\LibreOffice 5
> 
> 3. There is a discrepancy between the version indicated in 1 and 2. I don't
> know if this is normal or not.

The dev version is installed in C:\Program Files (x86)\LibreOffice 5 dev
Comment 49 Ben 2016-04-02 20:13:11 UTC
Ok, so I used the wrong version. I am now editing with the dev version. From the help menu: Version: 5.2.0.0.alpha0+
Build ID: 15b53976e5d119877e53f34b34cee33a5f2883fd
CPU Threads: 8; OS Version: Windows 6.19; UI Render: default; 
TinderBox: Win-x86@39, Branch:master, Time: 2016-03-22_23:57:30
Locale: en-US (en_US)

1. I rebooted my PC and made sure that only Writer was running.

2. The times below are what I experienced for load and save:

- Opened the attached document: 2 minutes to load
- Save the document after some minor edits: 5 minutes
- Restarted Writer and opened the document again: 2 minutes
- Saved the document after minor edits: 1:45 minutes.
- Save the document again after minor edits: 2 minutes.

3. I have not experienced hangs yet, because my initial focus was on load and save.

Conclusions: Load and save are much slower than before. My real document is almost 2x as large as the attached document, and with it I experienced much longer load and save times.
Comment 50 Ben 2016-04-02 20:16:18 UTC
Created attachment 124032
Document used for load and save time testing
Comment 51 Buovjaga 2016-04-03 09:59:31 UTC
Regarding the slowness of load and save: one thing I learned recently is that debug builds have different behavior related to such performance. The build you are using has the debug features enabled.

Maybe you should try this, which is not a debug build: http://dev-builds.libreoffice.org/daily/master/Win-x86@62-merge-TDF/current/
Comment 52 Ben 2016-04-16 21:46:06 UTC
I have tried http://dev-builds.libreoffice.org/daily/master/Win-x86@62-merge-TDF/current/ as suggested by Buovjaga. So far, after using this version for 5 hours, I haven't had any hangs or crashes. I will keep editing my document and report my results again in a few weeks, after more experience with the new version.
Comment 53 Ben 2016-05-07 17:23:07 UTC
So far, I have not experienced any issues with the dev build. LibreOffice Writer does not hang anymore during edits or when I keep it open for days. As far as I'm concerned, the ticket can be closed.

Thank you for all your help to get this issue resolved!
Comment 54 Michael Stahl (CIB) 2016-05-10 09:05:47 UTC
the bugzilla script is having a bad day

was pushed to libreoffice-5-1 for 5.1.4.1 in commit 97f46313d304e63e8efc9761ac1e490b186683c1