Bug 95843 - Headless mode leaves zombie process
Summary: Headless mode leaves zombie process
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 4.4.6.3 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium minor
Assignee: Stephan Bergmann
URL:
Whiteboard: target:6.1.0 target:6.0.5
Keywords: needsDevEval
Depends on:
Blocks:
 
Reported: 2015-11-16 05:43 UTC by ufaowl
Modified: 2018-05-13 14:11 UTC
CC List: 8 users

See Also:
Crash report or crash signature:


Description ufaowl 2015-11-16 05:43:54 UTC
I'm running soffice in headless mode and it leaves 1 zombie process. I start it with: /opt/libreoffice4.4/program/soffice.bin --headless

"ps aux| grep sof" shows me this:

root 3067 1.0 3.9 591180 40312 pts/0 Sl+ 08:31 0:00 /opt/libreoffice4.4/program/soffice.bin --headless

root 3069 0.0 0.0 0 0 pts/0 Z+ 08:31 0:00 [soffice.bin] <defunct>

The process receives the SIGCHLD signal but never calls wait():

--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3069, si_status=1, si_utime=0, si_stime=0} ---
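
For readers unfamiliar with the mechanism, here is a minimal Python sketch (not LibreOffice code) of a parent that does answer SIGCHLD by reaping its children. The names `reap_children`, `spawn_child` and `reaped` are made up for this illustration, and the child exits with status 1 only to mirror the si_status=1 in the trace above.

```python
import os
import signal
import time

reaped = []  # (pid, exit_code) pairs collected by the handler


def reap_children(signum, frame):
    """SIGCHLD handler: collect every child that has already exited."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # this process has no children left
        if pid == 0:
            return  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            reaped.append((pid, os.WEXITSTATUS(status)))


def spawn_child(exit_code: int) -> int:
    """Fork a child that exits immediately with the given code."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: leave without Python cleanup
    return pid


if __name__ == "__main__":
    signal.signal(signal.SIGCHLD, reap_children)
    child = spawn_child(1)
    # Give the signal time to arrive; the handler fills `reaped`.
    deadline = time.monotonic() + 5
    while not any(p == child for p, _ in reaped) and time.monotonic() < deadline:
        time.sleep(0.01)
    print(reaped)
```

Because the handler calls waitpid() for every exited child, no `<defunct>` entry should remain once it has run. A parent that ignores the notification, as described above, leaves the entry in place.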
Comment 1 Alex Thurgood 2015-11-16 10:28:37 UTC
This appears to have nothing to do with the database (Base) module of LibreOffice ? Setting to component LibreOffice.
Comment 2 Nicholas Niro 2015-11-19 18:14:10 UTC
It's not really a good idea to run programs like LibreOffice as root (or any program, other than to actually change settings or run privileged apps). LibreOffice has absolutely no need to be run as root under any conditions (the same goes for Wine and many other programs).

I tried to run soffice with the flag --headless. It shows this in my process list : 

$ ps aux | grep soffice
nik_89   28745  0.0  0.7 505176 57112 pts/17   Sl+  12:51   0:00 /opt/libreoffice4.4/program/soffice.bin --headless
nik_89   28747  0.0  0.0      0     0 pts/17   Z+   12:51   0:00 [soffice.bin] <defunct>

The program soffice is still running while I dumped the process list. If I SIGINT the running process by pressing Ctrl + C (in the window where soffice --headless is currently running), it is killed and it no longer shows in the process list.

A zombie process is one that stays in the process list regardless of killing attempts. (You could kill -9 <pid> repeatedly and it would still stay there).

Try to run soffice as a normal user to see if there's any differences.
set this back to UNCONFIRMED when you answer please.

Libreoffice version : 4.4.3.2 
Build Id : 88805f81e9fe61362df02b9941de8e38a9b5fd16
distro : Void linux x86-64
Comment 3 ufaowl 2015-11-20 03:24:51 UTC
Hi.
I've tried to run it as a non-root user, but the problem persists:
[owl@CentOSo ~]$ ps aux|grep soffice
owl       8507  0.6  4.1 591184 42368 pts/0    Sl+  06:16   0:00 ./soffice.bin --headless
owl       8509  0.0  0.0      0     0 pts/0    Z+   06:16   0:00 [soffice.bin] <defunct>
owl       8545  0.0  0.0 112664   968 pts/1    R+   06:17   0:00 grep --color=auto soffice

It still doesn't get a wait() from the parent. If you kill the parent, then init (or systemd) becomes the zombie's new parent, and it finally gets its long-awaited wait().
Comment 4 Nicholas Niro 2015-11-20 19:26:42 UTC
So in your example, if you killed process 8507 (kill -9 8507), the process 8509 would still run and be impossible to kill by doing (kill -9 8509)?

I'd be curious to see your output of 'ps aux | grep soffice' after you kill the parent process (like it was for 8507).
Comment 5 ufaowl 2015-11-23 08:39:44 UTC
The child process (8509) disappears immediately after I kill the parent (8507).
Comment 6 Wahrendorff 2016-11-14 11:45:57 UTC
There is still a zombie process in version 5.1.4.2 when starting in headless mode. It does not seem to cause any problems.
Comment 7 Xisco Faulí 2017-08-03 16:16:52 UTC Comment hidden (obsolete)
Comment 8 Tom Turelinckx 2017-08-11 13:17:37 UTC
This problem is still present in 5.4.0.3.

Starting in headless mode results in one soffice.bin process and one defunct child process:

resin     2692     1  0 12:23 pts/4    00:00:01 /usr/lib/libreoffice/program/soffice.bin --headless
resin     2693  2692  0 12:23 pts/4    00:00:00 [soffice.bin] <defunct>

The defunct process has state code Z, which the ps man page describes as:

Z    defunct ("zombie") process, terminated but not reaped by its parent

Killing 2692 results in 2693 disappearing as well.
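
The same check can be scripted instead of reading `ps` output by eye. Below is a minimal Linux-only sketch that scans /proc for processes in state Z whose parent is a given PID. The helper names `read_stat` and `zombie_children` are invented for this example, and it assumes /proc is mounted and readable.

```python
import os
from typing import List, Tuple


def read_stat(pid: int) -> Tuple[str, str, int]:
    """Return (command, state, parent PID) parsed from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # Layout is "pid (comm) state ppid ..."; comm may contain spaces or ')',
    # so split on the last closing parenthesis.
    head, tail = data.rsplit(")", 1)
    comm = head.split("(", 1)[1]
    fields = tail.split()
    return comm, fields[0], int(fields[1])


def zombie_children(parent: int) -> List[Tuple[int, str]]:
    """List (pid, command) for every zombie whose parent is `parent`."""
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            comm, state, ppid = read_stat(int(entry))
        except OSError:
            continue  # process vanished (or is unreadable) during the scan
        if state == "Z" and ppid == parent:
            found.append((int(entry), comm))
    return found


if __name__ == "__main__":
    # Report zombies that belong to the current process (normally none).
    for pid, comm in zombie_children(os.getpid()):
        print(pid, comm)
```

Called with the PID of the running soffice.bin (2692 in the listing above), this would be expected to report the defunct child until it is reaped.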

This child process also disappears when connecting to the headless instance. For example:

/usr/lib/libreoffice/program/soffice.bin --headless --accept="socket,host=localhost,port=8100;urp;StarOffice.ServiceManager" &

resin     2851     1  0 12:44 pts/4    00:00:01 /usr/lib/libreoffice/program/soffice.bin --headless --accept=socket,host
resin     2852  2851  0 12:44 pts/4    00:00:00 [soffice.bin] <defunct>

Once JODConverter connects to port 8100, process 2852 disappears. Subsequently disconnecting from the instance and connecting again does NOT result in any new defunct child processes. The headless instance works fine, functionality-wise.

We're typically using a command-line like this to start in headless mode:

/usr/lib/libreoffice/program/soffice.bin --headless --nologo --nodefault --norestore --nocrashreport --nolockcheck --nofirststartwizard --accept="socket,host=localhost,port=8100;urp;StarOffice.ServiceManager" "-env:UserInstallation=file:///home/resin/.ooinst0" &

None of those additional parameters has any impact on this problem: the behavior remains the same.
Comment 9 Chris Sherlock 2017-08-29 12:22:33 UTC
So basically, headless mode seems to fork some child processes, then it kills off the parent process without first calling wait().
Comment 10 Chris Sherlock 2017-08-29 12:27:25 UTC
Sorry, my bad. I wrote that last comment wrong: the parent process forks off child processes, then the child process is killed, but the parent process does not wait on the child process. It has nothing to do with the parent process dying or not... in fact, if the parent process dies, the OS should reparent all the children to PID 1, and these zombies will eventually get reaped.
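
The life cycle described in this comment can be reproduced in a few lines. This is a minimal Linux-only Python sketch, not LibreOffice code: a parent forks a child, the child exits, and the child stays in state Z until the parent calls waitpid(). The helper names are hypothetical and the state is read from /proc, which is assumed to be available.

```python
import os
import time


def proc_state(pid: int) -> str:
    """Return the one-letter process state from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state follows the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def spawn_short_lived_child() -> int:
    """Fork a child that exits immediately with status 1; return its PID."""
    pid = os.fork()
    if pid == 0:
        os._exit(1)  # child: terminate without running Python cleanup
    return pid


def wait_until_zombie(pid: int, timeout: float = 5.0) -> str:
    """Poll until the child shows state 'Z' or the timeout expires."""
    deadline = time.monotonic() + timeout
    state = proc_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)
    return state


if __name__ == "__main__":
    child = spawn_short_lived_child()
    print("state before wait:", wait_until_zombie(child))
    reaped_pid, status = os.waitpid(child, 0)  # this is the missing step
    print("exit status:", os.WEXITSTATUS(status))
    print("still listed:", os.path.exists(f"/proc/{child}"))
```

The child needs no signal to become a zombie here. Exiting normally is enough, as long as the parent is alive and has not yet waited.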
Comment 11 Jean-Baptiste Faure 2017-09-09 09:32:40 UTC
Ok, I see the zombie when I launch LO 5.4.1 in headless mode. This zombie is killed when I close LO by ctrl+C in the terminal where I launched it. Tested under Ubuntu 16.04 x86-64.

So the question is: is it a bug or not?
Set keyword needsDevEval

Best regards. JBF
Comment 12 Travers Carter 2017-12-03 09:28:43 UTC
I'm seeing the same behaviour on 5.4.3.2 (Fedora 27) and 5.0.6.2 (CentOS 7.4.1708)

It looks to me like the issue is that fire_glxtest_process() launches a child process to test for GLX regardless of --headless, but that child is reaped by the X display initialisation code, which only runs when the GLX test passes. In --headless mode the test never passes, and there doesn't appear to be any path via which the child is reaped in that case (or possibly after any GLX test failure?).

I'd guess that either the glxtest child should simply not be launched for --headless, or it should be reaped elsewhere on the alternate path(s).
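
The shape of the problem described above can be modelled with a toy startup routine. This Python sketch is only an illustration of the control flow, not the LibreOffice implementation: `launch_probe`, `startup` and the `reap_on_every_path` flag are all hypothetical names, and the probe is a child that simply exits with status 1.

```python
import os
from typing import Optional


def launch_probe() -> int:
    """Stand-in for the GLX test helper: fork a child that exits at once."""
    pid = os.fork()
    if pid == 0:
        os._exit(1)  # pretend the probe failed, as it would without X
    return pid


def startup(headless: bool, reap_on_every_path: bool) -> Optional[int]:
    """Run the toy startup; return the PID of an unreaped probe, if any."""
    probe = launch_probe()  # launched regardless of `headless`
    if not headless:
        # Display initialisation path: the probe is collected here.
        os.waitpid(probe, 0)
        return None
    if reap_on_every_path:
        # Second option from the comment: the headless path reaps it too.
        os.waitpid(probe, 0)
        return None
    return probe  # nobody waited, so the probe lingers as a zombie


if __name__ == "__main__":
    leftover = startup(headless=True, reap_on_every_path=False)
    print("unreaped probe:", leftover)
    if leftover is not None:
        os.waitpid(leftover, 0)  # clean up after the demonstration
```

The sketch only models the second of the two options (reaping on the alternate path). Not launching the probe at all in headless mode would avoid the zombie equally well.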

The following stack trace (5.4.3.2) shows the non-headless version reaping the glxtest child from X11OpenGLDeviceInfo::GetData()

Thread 1 "soffice.bin" hit Breakpoint 5, 0x00007ffff75c5d70 in waitpid () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff75c5d70 in waitpid () at /lib64/libc.so.6
#1  0x00007fffeff019a0 in X11OpenGLDeviceInfo::GetData() () at /usr/lib64/libreoffice/program/libvcllo.so
#2  0x00007fffeff01e90 in X11OpenGLDeviceInfo::X11OpenGLDeviceInfo() () at /usr/lib64/libreoffice/program/libvcllo.so
#3  0x00007fffeff0022c in OpenGLHelper::isDeviceBlacklisted() () at /usr/lib64/libreoffice/program/libvcllo.so
#4  0x00007fffeff002a4 in OpenGLHelper::supportsVCLOpenGL() () at /usr/lib64/libreoffice/program/libvcllo.so
#5  0x00007fffeff00398 in OpenGLHelper::isVCLOpenGLEnabled() () at /usr/lib64/libreoffice/program/libvcllo.so
#6  0x00007fffd31d74cd in SalDisplay::BestVisual(_XDisplay*, int, XVisualInfo&) () at /usr/lib64/libreoffice/program/libvclplug_genlo.so
#7  0x00007fffd31dc75f in SalDisplay::initScreen(SalX11Screen) const () at /usr/lib64/libreoffice/program/libvclplug_genlo.so
#8  0x00007fffd31e22cd in vcl_sal::WMAdaptor::WMAdaptor(SalDisplay*) () at /usr/lib64/libreoffice/program/libvclplug_genlo.so
#9  0x00007fffd31e26ab in vcl_sal::NetWMAdaptor::NetWMAdaptor(SalDisplay*) () at /usr/lib64/libreoffice/program/libvclplug_genlo.so
#10 0x00007fffd31e3571 in vcl_sal::WMAdaptor::createWMAdaptor(SalDisplay*) () at /usr/lib64/libreoffice/program/libvclplug_genlo.so
#11 0x00007fffd31ddf3d in SalDisplay::Init() () at /usr/lib64/libreoffice/program/libvclplug_genlo.so
#12 0x00007fffd31de0e4 in SalX11Display::SalX11Display(_XDisplay*) () at /usr/lib64/libreoffice/program/libvclplug_genlo.so
#13 0x00007fffd60cec5e in SalKDEDisplay::SalKDEDisplay(_XDisplay*) () at /usr/lib64/libreoffice/program/libvclplug_kde4lo.so
#14 0x00007fffd60d3d26 in KDESalInstance::CreateDisplay() const () at /usr/lib64/libreoffice/program/libvclplug_kde4lo.so
#15 0x00007fffd31de7c2 in X11SalInstance::AfterAppInit() () at /usr/lib64/libreoffice/program/libvclplug_genlo.so
#16 0x00007fffefdf89e2 in InitVCL() () at /usr/lib64/libreoffice/program/libvcllo.so
#17 0x00007fffefdfa00d in ImplSVMain() () at /usr/lib64/libreoffice/program/libvcllo.so
#18 0x00007fffefdfa070 in SVMain() () at /usr/lib64/libreoffice/program/libvcllo.so
#19 0x00007ffff791d515 in soffice_main () at /usr/lib64/libreoffice/program/libsofficeapp.so
#20 0x000055555555478b in main ()
Comment 13 Xisco Faulí 2018-01-29 17:37:16 UTC Comment hidden (obsolete)
Comment 14 Travers Carter 2018-01-29 17:58:11 UTC
My current command is:

soffice.bin --headless --nologo --nodefault --norestore --nofirststartwizard -env:JFW_PLUGIN_DO_NOT_CHECK_ACCESSIBILITY=1 --accept=pipe,name=PyUNO;urp;

The key to reproduction is in the X11 environment, though: the problem only occurs when X is unavailable. If I recall correctly, just unsetting the DISPLAY environment variable may not be sufficient to reproduce it; it needs to be run without access to X.

I will set up a proper reproduction case with steps and then update the ticket (including its status).
Comment 15 ufaowl 2018-01-30 05:01:58 UTC
My command is:

/opt/libreoffice4.4/program/soffice.bin --nofirststartwizard --invisible --headless --norestore --nologo --nodefault --accept=socket,host=localhost,port=8100;urp;

The environment is indeed X-less.
Comment 16 Michael Balzer 2018-03-30 19:27:35 UTC
I'm having the same issue; occasionally the headless process gets stuck and blocks all further processing of documents.

This is with LibreOffice 5.0.6.2 00(Build:2) on CentOS Linux 7 (Core), also without X11 environment.

From the current event:

sdafmo   29853  0.0  0.1 702152 62224 ?        Sl   Mär28   0:00 /usr/lib64/libreoffice/program/soffice.bin --headless --invisible --nocrashreport --nodefault --nofirststartwizard --nologo --norestore --accept=socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext
sdafmo   30832  0.0  0.0      0     0 ?        Z    Mär28   0:00  \_ [soffice.bin] <defunct>

gdb -p 29853
…
(gdb) where
#0  0x00007f91edbe4a3d in poll () at /lib64/libc.so.6
#1  0x00007f91e847bd04 in SvpSalInstance::DoReleaseYield(int) () at /usr/lib64/libreoffice/program/libvcllo.so
#2  0x00007f91e847c090 in SvpSalInstance::DoYield(bool, bool, unsigned long) () at /usr/lib64/libreoffice/program/libvcllo.so
#3  0x00007f91e83adef1 in Application::Yield() () at /usr/lib64/libreoffice/program/libvcllo.so
#4  0x00007f91e83adf85 in Application::Execute() () at /usr/lib64/libreoffice/program/libvcllo.so
#5  0x00007f91edee23a3 in desktop::Desktop::Main() () at /usr/lib64/libreoffice/program/libsofficeapp.so
#6  0x00007f91e83b2da6 in ImplSVMain() () at /usr/lib64/libreoffice/program/libvcllo.so
#7  0x00007f91e83b2e92 in SVMain() () at /usr/lib64/libreoffice/program/libvcllo.so
#8  0x00007f91edf08d02 in soffice_main () at /usr/lib64/libreoffice/program/libsofficeapp.so
#9  0x00000000004006fb in main ()

Regards,
Michael
Comment 17 Commit Notification 2018-04-19 14:50:17 UTC
Stephan Bergmann committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4bacf58f4af44ac8c4632b43289ccfcc07e5820c

tdf#95843: Wait for fire_glxtest_process also in --headless mode

It will be available in 6.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 18 Stephan Bergmann 2018-04-19 14:53:06 UTC
Thanks to Travers Carter for the excellent analysis!
Comment 19 Commit Notification 2018-04-26 14:17:12 UTC
Stephan Bergmann committed a patch related to this issue.
It has been pushed to "libreoffice-6-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6839b7714b80cf28614dcd793edcdeb70dc6ed5f&h=libreoffice-6-0

tdf#95843: Wait for fire_glxtest_process also in --headless mode

It will be available in 6.0.5.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 20 g4827387 2018-04-30 16:24:52 UTC
Built off of 6.0 branch this morning - the issue appears to be not fixed.

Running doc to pdf conversions:
./instdir/program/soffice --headless --invisible --nodefault --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf --outdir somePath someDocFile.docx

Running `ps aux`:
484 86 0.0 0.0 0 0 ? Z 16:06 0:00 [soffice.bin] <defunct>
484 124 0.0 0.0 0 0 ? Z 16:06 0:00 [gpgconf] <defunct>
484 126 0.0 0.0 0 0 ? Z 16:06 0:00 [gpgconf] <defunct>
484 128 0.0 0.0 0 0 ? Z 16:06 0:00 [gpg2] <defunct>
484 130 0.0 0.0 0 0 ? Z 16:06 0:00 [soffice.bin] <defunct>
484 132 0.0 0.0 0 0 ? Z 16:06 0:00 [gpgconf] <defunct>
484 175 0.0 0.0 0 0 ? Z 16:06 0:00 [gpgconf] <defunct>
484 177 0.0 0.0 0 0 ? Z 16:06 0:00 [gpgconf] <defunct>
484 179 0.0 0.0 0 0 ? Z 16:06 0:00 [gpg2] <defunct>
484 181 0.0 0.0 0 0 ? Z 16:06 0:00 [soffice.bin] <defunct>
484 183 0.0 0.0 0 0 ? Z 16:06 0:00 [gpgconf] <defunct>
484 307 0.0 0.0 0 0 ? Z 16:08 0:00 [gpgconf] <defunct>
484 309 0.0 0.0 0 0 ? Z 16:08 0:00 [gpgconf] <defunct>
484 311 0.0 0.0 0 0 ? Z 16:08 0:00 [gpg2] <defunct>
484 313 0.0 0.0 0 0 ? Z 16:08 0:00 [soffice.bin] <defunct>
484 315 0.0 0.0 0 0 ? Z 16:08 0:00 [gpgconf] <defunct>
484 375 0.0 0.0 0 0 ? Z 16:09 0:00 [gpgconf] <defunct>
484 377 0.0 0.0 0 0 ? Z 16:09 0:00 [gpgconf] <defunct>
484 379 0.0 0.0 0 0 ? Z 16:09 0:00 [gpg2] <defunct>
484 381 0.0 0.0 0 0 ? Z 16:09 0:00 [soffice.bin] <defunct>
484 383 0.0 0.0 0 0 ? Z 16:09 0:00 [gpgconf] <defunct>
484 484 0.0 0.0 0 0 ? Z 16:11 0:00 [gpgconf] <defunct>
484 486 0.0 0.0 0 0 ? Z 16:11 0:00 [gpgconf] <defunct>
484 488 0.0 0.0 0 0 ? Z 16:11 0:00 [gpg2] <defunct>
484 490 0.0 0.0 0 0 ? Z 16:11 0:00 [soffice.bin] <defunct>
484 492 0.0 0.0 0 0 ? Z 16:11 0:00 [gpgconf] <defunct>
484 549 0.0 0.0 0 0 ? Z 16:11 0:00 [gpgconf] <defunct>
484 551 0.0 0.0 0 0 ? Z 16:11 0:00 [gpgconf] <defunct>
484 553 0.0 0.0 0 0 ? Z 16:11 0:00 [gpg2] <defunct>
484 555 0.0 0.0 0 0 ? Z 16:11 0:00 [soffice.bin] <defunct>
484 557 0.0 0.0 0 0 ? Z 16:11 0:00 [gpgconf] <defunct>
484 600 0.0 0.0 0 0 ? Z 16:12 0:00 [gpgconf] <defunct>
484 602 0.0 0.0 0 0 ? Z 16:12 0:00 [gpgconf] <defunct>
484 604 0.0 0.0 0 0 ? Z 16:12 0:00 [gpg2] <defunct>
484 606 0.0 0.0 0 0 ? Z 16:12 0:00 [soffice.bin] <defunct>
484 608 0.0 0.0 0 0 ? Z 16:12 0:00 [gpgconf] <defunct>
484 641 0.0 0.0 0 0 ? Z 16:12 0:00 [gpgconf] <defunct>
484 643 0.0 0.0 0 0 ? Z 16:12 0:00 [gpgconf] <defunct>
484 645 0.0 0.0 0 0 ? Z 16:12 0:00 [gpg2] <defunct>
484 647 0.0 0.0 0 0 ? Z 16:12 0:00 [soffice.bin] <defunct>
484 649 0.0 0.0 0 0 ? Z 16:12 0:00 [gpgconf] <defunct>
484 686 0.0 0.0 0 0 ? Z 16:12 0:00 [gpgconf] <defunct>
484 688 0.0 0.0 0 0 ? Z 16:12 0:00 [gpgconf] <defunct>
484 690 0.0 0.0 0 0 ? Z 16:12 0:00 [gpg2] <defunct>
484 692 0.0 0.0 0 0 ? Z 16:12 0:00 [soffice.bin] <defunct>
484 694 0.0 0.0 0 0 ? Z 16:12 0:00 [gpgconf] <defunct>
484 733 0.0 0.0 0 0 ? Z 16:13 0:00 [gpgconf] <defunct>
484 735 0.0 0.0 0 0 ? Z 16:13 0:00 [gpgconf] <defunct>
484 737 0.0 0.0 0 0 ? Z 16:13 0:00 [gpg2] <defunct>
484 739 0.0 0.0 0 0 ? Z 16:13 0:00 [soffice.bin] <defunct>
484 741 0.0 0.0 0 0 ? Z 16:13 0:00 [gpgconf] <defunct>
484 888 0.0 0.0 0 0 ? Z 16:15 0:00 [gpgconf] <defunct>
484 890 0.0 0.0 0 0 ? Z 16:15 0:00 [gpgconf] <defunct>
484 892 0.0 0.0 0 0 ? Z 16:15 0:00 [gpg2] <defunct>
484 894 0.0 0.0 0 0 ? Z 16:15 0:00 [soffice.bin] <defunct>
484 896 0.0 0.0 0 0 ? Z 16:15 0:00 [gpgconf] <defunct>
484 1010 0.0 0.0 0 0 ? Z 16:16 0:00 [gpgconf] <defunct>
484 1012 0.0 0.0 0 0 ? Z 16:16 0:00 [gpgconf] <defunct>
484 1014 0.0 0.0 0 0 ? Z 16:16 0:00 [gpg2] <defunct>
484 1016 0.0 0.0 0 0 ? Z 16:16 0:00 [soffice.bin] <defunct>
484 1018 0.0 0.0 0 0 ? Z 16:16 0:00 [gpgconf] <defunct>
484 1061 0.0 0.0 0 0 ? Z 16:17 0:00 [gpgconf] <defunct>
484 1063 0.0 0.0 0 0 ? Z 16:17 0:00 [gpgconf] <defunct>
484 1065 0.0 0.0 0 0 ? Z 16:17 0:00 [gpg2] <defunct>
484 1067 0.0 0.0 0 0 ? Z 16:17 0:00 [soffice.bin] <defunct>
484 1069 0.0 0.0 0 0 ? Z 16:17 0:00 [gpgconf] <defunct>
484 1077 0.0 0.0 117216 2376 ? R 16:17 0:00 ps aux
Comment 21 Stephan Bergmann 2018-05-03 08:45:57 UTC
(In reply to g4827387 from comment #20)

I cannot reproduce your findings with a current libreoffice-6-0 Linux build (21f9812eb53272d50522a649b8c4b3492d95450b).

Doing `while true; do echo; ps aux | grep pts/3; done` in one terminal and `instdir/program/soffice --headless --invisible --nodefault --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf --outdir /tmp sw/qa/core/data/ooxml/pass/fdo80514.docx` in terminal pts/3, I see the "[soffice.bin] <defunct>" appearing and disappearing quickly again, long before the main soffice.bin process terminates.  I do not see any gpg-related zombies at all.

The fix <https://cgit.freedesktop.org/libreoffice/core/commit/?id=4bacf58f4af44ac8c4632b43289ccfcc07e5820c> "tdf#95843: Wait for fire_glxtest_process also in --headless mode" deliberately uses WNOHANG in reap_glxtest_process (vcl/unx/glxtest.cxx), so that if that additional process ever takes excessively long, it doesn't block the main soffice.bin process (but instead manifests again as a zombie soffice.bin process).  In my tests, the additional process had always already terminated by the time it was reaped with WNOHANG, but timing may be different for other people, of course.  You can try if replacing that WNOHANG with 0 makes a difference.
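
To make the WNOHANG trade-off concrete, here is a minimal Python sketch of the two waitpid() modes. It is not the C++ code from vcl/unx/glxtest.cxx. The names `spawn_child` and `reap` are invented for the illustration, and the 0.5 second delay is an arbitrary stand-in for a helper that takes "excessively long".

```python
import os
import time


def spawn_child(delay: float) -> int:
    """Fork a child that sleeps for `delay` seconds and then exits."""
    pid = os.fork()
    if pid == 0:
        time.sleep(delay)
        os._exit(0)
    return pid


def reap(pid: int, block: bool) -> bool:
    """Try to reap `pid`; return True if this call collected it."""
    options = 0 if block else os.WNOHANG
    reaped_pid, _status = os.waitpid(pid, options)
    # With WNOHANG, waitpid returns pid 0 when the child is still running.
    return reaped_pid == pid


if __name__ == "__main__":
    slow = spawn_child(0.5)
    # Non-blocking: returns immediately, child not yet collected.
    print("WNOHANG reaped it:", reap(slow, block=False))
    # Blocking (options 0): waits until the child has exited.
    print("blocking reaped it:", reap(slow, block=True))
```

If the non-blocking call is the only attempt and the child is still running at that moment, the child becomes a zombie when it later exits. That is the residual case described in the paragraph above.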

@g4827387:

1  I'm not sure what your "Running `ps aux`" is meant to show.  If the second column is PID, that means you have multiple soffice.bin zombies?  What exactly did you do?  (Also, there's the "ps aux" process itself in your---apparently edited---`ps aux` output, but no non-zombie oosplash and soffice.bin?)

2  Please try replacing WNOHANG with 0, as explained above.

3  For the gpg-related zombies, it is probably best to file a separate issue.  Unless Thorsten (now on cc) has an idea?
Comment 22 Xisco Faulí 2018-05-07 11:41:13 UTC
Hi g4827387@trbvm.com,
Could you please answer Stephan's questions from comment 21?
Setting to RESOLVED FIXED for the time being...
Comment 23 g4827387 2018-05-08 12:31:48 UTC
(In reply to Stephan Bergmann from comment #21)
> (In reply to g4827387 from comment #20)
> 
> I cannot reproduce your findings with a current libreoffice-6-0 Linux build
> (21f9812eb53272d50522a649b8c4b3492d95450b).
> 
> Doing `while true; do echo; ps aux | grep pts/3; done` in one terminal and
> `instdir/program/soffice --headless --invisible --nodefault
> --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf
> --outdir /tmp sw/qa/core/data/ooxml/pass/fdo80514.docx` in terminal pts/3, I
> see the "[soffice.bin] <defunct>" appearing and disappearing quickly again,
> long before the main soffice.bin process terminates.  I do not see any
> gpg-related zombies at all.
> 
> The fix
> <https://cgit.freedesktop.org/libreoffice/core/commit/
> ?id=4bacf58f4af44ac8c4632b43289ccfcc07e5820c> "tdf#95843: Wait for
> fire_glxtest_process also in --headless mode" deliberately uses WNOHANG in
> reap_glxtest_process (vcl/unx/glxtest.cxx), so that if that additional
> process ever takes excessively long, it doesn't block the main soffice.bin
> process (but instead manifests again as a zombie soffice.bin process).  In
> my tests, the additional process had always already terminated by the time
> it was reaped with WNOHANG, but timing may be different for other people, of
> course.  You can try if replacing that WNOHANG with 0 makes a difference.
> 
> @g4827387:
> 
> 1  I'm not sure what your "Running `ps aux`" is meant to show.  If the
> second column is PID, that means you have multiple soffice.bin zombies? 
> What exactly did you do?  (Also, there's the "ps aux" process itself in
> your---apparently edited---`ps aux` output, but no non-zombie oosplash and
> soffice.bin?)
> 
> 2  Please try replacing WNOHANG with 0, as explained above.
> 
> 3  For the gpg-related zombies, it is probably best to file a separate
> issue.  Unless Thorsten (now on cc) has an idea?

I actually tried 0 instead of WNOHANG and got a similar result.

1. Yes, exactly: multiple soffice.bin zombies. I run multiple document conversions; that's what I'm trying to use LibreOffice for. There should be no oosplash since I'm running with --nologo.

2. Already tried before reopening this issue with similar results.

3. I'm guessing they are probably related.


Additional information: I'm running LibreOffice on AWS Lambda, using a Node.js wrapper to execute it:

const { execSync } = require('child_process');

execSync('/tmp/instdir/program/soffice --headless --invisible ....');
console.log(execSync('ps aux').toString()); // reveals zombie processes (soffice.bin plus multiple gpg per execution)

This should be easily reproducible. Another example is https://github.com/vladgolubev/serverless-libreoffice. I compiled it myself to include the fix, but that does not resolve the issue.
Comment 24 Stephan Bergmann 2018-05-08 12:55:17 UTC
(In reply to g4827387 from comment #23)
> I actually tried 0 instead of WNOHANG and got a similar result.
> 
> 1. Yes, that's exactly - multiple soffice.bin zombies. I run multiple
> document conversions, that's what I'm trying to use LibreOffice for.

From those two statements, it follows that the soffice.bin zombies you see must be from different forks than the fire_glxtest_process that this bug concentrated on.  Please file a new bug (even if the symptoms match this bug's subject, it is easier to keep track of the apparently different root causes that way).

> There
> should be no oosplash since I'm running with --nologo.

--nologo is unrelated to the presence of an oosplash process.
Comment 25 Jean-Baptiste Faure 2018-05-13 14:11:55 UTC
(In reply to Stephan Bergmann from comment #24)
> [...]
> From those two statements, it follows that the soffice.bin zombies you see
> must be from different forks than the fire_glxtest_process that this bug
> concentrated on.  Please file a new bug (even if the symptoms match this
> bug's subject, it is easier to keep track of the apparently different root
> causes that way).

The new bug report is there: https://bugs.documentfoundation.org/show_bug.cgi?id=117523

Best regards. JBF