Created attachment 157023
backtrace from gdb

Hi! I'm trying to package LibreOffice Online for the NixOS Linux distribution. It's mostly working now; however, there are some crashes. Please note that NixOS has an unusual filesystem layout, meaning that the systemplate setup script had to be rewritten from scratch and may be missing files.

Versions:
core: git tag CODE-4.2.0-2 (commit 3d7bdc46)
online: git tag CODE-4.2.0-2 (commit 3b4be91f)
nextcloud: 17.0.2
collabora app: 3.5.1
(loolwsd is running behind https nginx proxy)

Steps to reproduce:
1. create New Document in Nextcloud
2. use menu File > Download As > PDF Document (.pdf)
3. observe child process segfaulting

Let me know if any other information is needed. I can also provide access to a disposable VM with the running system.
Created attachment 157024
journald snippet
I couldn't reproduce this one, either. What seems interesting, though, is the piece of code that crashes:
https://opengrok.libreoffice.org/xref/core/comphelper/source/misc/hash.cxx?r=368f2000#78

Likely mpContext is a null pointer, and a check and a sane failure state would be warranted there. At the same time, this code tries to initialize NSS for an MD5 hash (in this case, for the PDF output). How could that fail? Is there something special about NSS in NixOS? Does exporting PDF work in a similarly built LibreOffice (desktop)?
Thanks for the NSS pointer! The problem was similar to #121429 in that libnss tried to load libsoftokn3.so at runtime, couldn't find it, and segfaulted.

The file couldn't be found because in my systemplate the library was located under the "lib" directory, with a "lib64" symlink pointing to it. During child root initialization this symlink was lost, so when NSS looked for the library under "lib64", it failed.

The symlink was lost because kit/Kit.cpp uses the nftw() function to populate child roots with the systemplate contents. According to its manual page, without the FTW_PHYS flag, "symbolic links are followed, but no file is reported twice". What makes it worse is that the order of the traversal is not defined, so sometimes the right directory existed and sometimes it did not.

This is not a problem for existing installations because loolwsd-systemplate-setup creates the systemplate without symlinks. Still, I think this source of nondeterminism might bite someone else in the future, so I'm going to submit a patch that changes linkOrCopy to use the FTW_PHYS flag.
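For illustration, the effect of FTW_PHYS described above can be sketched roughly as follows. This is a minimal, self-contained sketch and not the code from kit/Kit.cpp or the submitted patch: the names mirrorTree, mirrorEntry, sourceRoot and targetRoot are hypothetical, regular files are always hard-linked (no copy fallback), and symlinks are recreated with plain readlink()/symlink() instead of Poco.

```cpp
#include <ftw.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <climits>
#include <string>

namespace
{
// nftw() callbacks take no user-data pointer, so the two roots live in
// file-local state. Both names are hypothetical.
std::string sourceRoot;
std::string targetRoot;

int mirrorEntry(const char* path, const struct stat* sb, int typeflag, struct FTW* /*ftw*/)
{
    // Map the source path onto the destination tree; the suffix is empty
    // for the root directory itself.
    const std::string dest = targetRoot + std::string(path).substr(sourceRoot.size());

    switch (typeflag)
    {
    case FTW_D: // directory, reported before its contents (no FTW_DEPTH)
        if (mkdir(dest.c_str(), sb->st_mode & 07777) != 0 && errno != EEXIST)
            return -1;
        break;
    case FTW_F: // regular file: hard-link it into the destination
        if (link(path, dest.c_str()) != 0)
            return -1;
        break;
    case FTW_SL: // symlink, reported as such because FTW_PHYS is set
    {
        char linkTarget[PATH_MAX];
        const ssize_t len = readlink(path, linkTarget, sizeof(linkTarget) - 1);
        if (len < 0)
            return -1;
        linkTarget[len] = '\0'; // readlink() does not NUL-terminate
        if (symlink(linkTarget, dest.c_str()) != 0)
            return -1;
        break;
    }
    default: // e.g. unreadable directories: skipped in this sketch
        break;
    }
    return 0; // keep walking
}
} // namespace

// Mirrors the tree at src into dst. src must not end with a slash.
// Returns 0 on success, non-zero if the walk was aborted.
int mirrorTree(const std::string& src, const std::string& dst)
{
    assert(!src.empty() && src.back() != '/');
    sourceRoot = src;
    targetRoot = dst;
    // FTW_PHYS: do not follow symlinks; report them to the callback instead.
    return nftw(src.c_str(), mirrorEntry, 16, FTW_PHYS);
}
```

With a tree that contains lib/ and a lib64 -> lib symlink, mirrorTree(src, dst) should leave dst/lib64 as a symlink regardless of the order in which nftw() visits the two entries, since each symlink is recreated on its own rather than being followed.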
*** Bug 129894 has been marked as a duplicate of this bug. ***
I've submitted the patch as https://gerrit.libreoffice.org/c/online/+/87749/1 but Jenkins doesn't like it. Can't access https://cpci.cbg.collabora.co.uk/job/Gerrit%20for%20online%20master/738/ to find out why, though. Can you give a hint please, Aron?
(In reply to Martin Milata from comment #5)
> I've submitted the patch as
> https://gerrit.libreoffice.org/c/online/+/87749/1 but Jenkins doesn't like
> it. Can't access
> https://cpci.cbg.collabora.co.uk/job/Gerrit%20for%20online%20master/738/ to
> find out why, though. Can you give a hint please, Aron?

This is the error, and I assume the CI build system is using an outdated Poco version (linkTo(...) was added in 1.8.1); we will take care of that and retrigger the build.

kit/Kit.cpp: In function ‘int {anonymous}::linkOrCopyFunction(const char*, const stat*, int, FTW*)’:
kit/Kit.cpp:298:30: error: ‘class Poco::File’ has no member named ‘linkTo’
                 File(target).linkTo(newPath.toString(), Poco::File::LinkType::LINK_SYMBOLIC);
                              ^~~~~~
kit/Kit.cpp:298:69: error: ‘Poco::File::LinkType’ has not been declared
                 File(target).linkTo(newPath.toString(), Poco::File::LinkType::LINK_SYMBOLIC);
                                                                     ^~~~~~~~
Martin Milata committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/online/commit/c571d9286df907f05838e6f1fca3139aae62cbc5

tdf#129895: handle symlinks when populating chroot
Thanks for the patch, Martin. Did it fix the issue in the end?
It did. Thanks for merging it!
Glad to hear that, and thanks for the confirmation! Closing as FIXED, then.