Bug 129895 - LOOL: downloading as pdf causes CRASH (in NixOS)
Summary: LOOL: downloading as pdf causes CRASH (in NixOS)
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice Online
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL: https://github.com/NixOS/nixpkgs/pull...
Whiteboard: target:7.0.0
Keywords:
Duplicates: 129894
Depends on:
Blocks: Crash
Reported: 2020-01-09 01:39 UTC by Martin Milata
Modified: 2020-05-21 16:26 UTC (History)

See Also:
Crash report or crash signature:


Attachments
backtrace from gdb (32.66 KB, text/plain)
2020-01-09 01:39 UTC, Martin Milata
journald snippet (14.46 KB, text/plain)
2020-01-09 01:39 UTC, Martin Milata

Description Martin Milata 2020-01-09 01:39:36 UTC
Created attachment 157023 [details]
backtrace from gdb

Hi! I'm trying to package LibreOffice Online for the NixOS Linux distribution. It's mostly working now, but there are some crashes. Please note that NixOS has an unusual filesystem layout, so the systemplate setup script had to be rewritten from scratch and may be missing files.

Versions:
core: git tag CODE-4.2.0-2 (commit 3d7bdc46)
online: git tag CODE-4.2.0-2 (commit 3b4be91f)
nextcloud: 17.0.2
collabora app: 3.5.1
(loolwsd is running behind an HTTPS nginx proxy)

Steps to reproduce:
1. Create a New Document in Nextcloud.
2. Use the menu File > Download As > PDF Document (.pdf).
3. Observe the child process segfaulting.

Let me know if any other information is needed. I can also provide access to a disposable VM with the running system.
Comment 1 Martin Milata 2020-01-09 01:39:59 UTC
Created attachment 157024 [details]
journald snippet
Comment 2 Aron Budea 2020-01-24 08:23:11 UTC
I couldn't reproduce this one, either.

What is interesting, though, is the piece of code that crashes:
https://opengrok.libreoffice.org/xref/core/comphelper/source/misc/hash.cxx?r=368f2000#78

Likely mpContext is a null pointer, and a check and a sane failure state would be warranted there, but at the same time, this code tries to initialize NSS for an MD5 hash (in this case, for the PDF output). How could that fail? Is there something special about NSS in NixOS? Does exporting PDF work in a similarly built LibreOffice (desktop)?
Comment 3 Martin Milata 2020-01-30 16:34:41 UTC
Thanks for the NSS pointer! The problem was similar to #121429 in that libnss tried to load libsoftokn3.so at runtime and couldn't find it, resulting in a segfault.

The file couldn't be found because in my systemplate the library was located under the "lib" directory, with a "lib64" symlink pointing to it. During the child root initialization this symlink was lost, so when NSS looked for the library under "lib64", it failed.

The symlink was lost because kit/Kit.cpp uses the nftw() function to populate child roots with the systemplate contents. According to its manual page, without the FTW_PHYS flag, "symbolic links are followed, but no file is reported twice". What makes it worse is that the order of the traversal is not defined, so sometimes the right directory existed and sometimes it did not.

This is not a problem for existing installations, because loolwsd-systemplate-setup creates the systemplate without symlinks. Still, I think this source of nondeterminism might bite someone else in the future, so I'm going to submit a patch that changes linkOrCopy to use the FTW_PHYS flag.
Comment 4 Martin Milata 2020-01-30 16:36:13 UTC
*** Bug 129894 has been marked as a duplicate of this bug. ***
Comment 5 Martin Milata 2020-02-03 21:59:11 UTC
I've submitted the patch as https://gerrit.libreoffice.org/c/online/+/87749/1 but Jenkins doesn't like it. Can't access https://cpci.cbg.collabora.co.uk/job/Gerrit%20for%20online%20master/738/ to find out why, though. Can you give a hint please, Aron?
Comment 6 Aron Budea 2020-02-04 10:22:26 UTC
(In reply to Martin Milata from comment #5)
> I've submitted the patch as
> https://gerrit.libreoffice.org/c/online/+/87749/1 but Jenkins doesn't like
> it. Can't access
> https://cpci.cbg.collabora.co.uk/job/Gerrit%20for%20online%20master/738/ to
> find out why, though. Can you give a hint please, Aron?
This is the error. I assume the CI build system is using an outdated Poco version (linkTo(...) was added in 1.8.1); we will take care of that and retrigger the build.

kit/Kit.cpp: In function ‘int {anonymous}::linkOrCopyFunction(const char*, const stat*, int, FTW*)’:
kit/Kit.cpp:298:30: error: ‘class Poco::File’ has no member named ‘linkTo’
                 File(target).linkTo(newPath.toString(), Poco::File::LinkType::LINK_SYMBOLIC);
                              ^~~~~~
kit/Kit.cpp:298:69: error: ‘Poco::File::LinkType’ has not been declared
                 File(target).linkTo(newPath.toString(), Poco::File::LinkType::LINK_SYMBOLIC);
                                                                     ^~~~~~~~
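
For illustration only, a symlink can also be recreated without Poco::File::linkTo by using the POSIX readlink() and symlink() calls directly. The sketch below is an assumption about one possible workaround for older Poco versions, not the code from the submitted patch; the helper name copySymlink is hypothetical.

```cpp
#include <unistd.h>
#include <climits>
#include <string>

// Recreates the symlink found at `source` as a new symlink at `dest`,
// keeping the link target string exactly as stored (relative targets
// stay relative). Returns false if `source` is not a readable symlink
// or if `dest` cannot be created.
bool copySymlink(const std::string& source, const std::string& dest)
{
    char target[PATH_MAX];
    const ssize_t len = readlink(source.c_str(), target, sizeof(target) - 1);
    if (len < 0)
        return false;
    target[len] = '\0'; // readlink() does not NUL-terminate the buffer
    return symlink(target, dest.c_str()) == 0;
}
```

Copying the target string verbatim means a relative link such as "lib64" -> "lib" keeps working inside the child root, provided the directory it points to is copied as well.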
Comment 7 Commit Notification 2020-04-07 11:22:57 UTC
Martin Milata committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/online/commit/c571d9286df907f05838e6f1fca3139aae62cbc5

tdf#129895: handle symlinks when populating chroot
Comment 8 Aron Budea 2020-05-21 12:03:53 UTC
Thanks for the patch, Martin. Did it fix the issue in the end?
Comment 9 Martin Milata 2020-05-21 16:23:49 UTC
It did. Thanks for merging it!
Comment 10 Aron Budea 2020-05-21 16:26:05 UTC
Glad to hear that, and thanks for the confirmation! Closing as FIXED, then.