Bug 131985 - CPU / Memory burn ...
Summary: CPU / Memory burn ...
Status: CLOSED FIXED
Alias: None
Product: LibreOffice Online
Classification: Unclassified
Component: LibreOffice (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: All All
: medium critical
Assignee: Michael Meeks
URL:
Whiteboard: target:7.0.0
Keywords:
Depends on:
Blocks:
 
Reported: 2020-04-08 13:20 UTC by Michael Meeks
Modified: 2021-11-30 18:05 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Meeks 2020-04-08 13:20:19 UTC
Sometimes we have documents that burn lots of CPU & Memory; interestingly their main-loops are still running:

5456  clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=80195, tv_nsec=502097667}) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Regular.ttf", R_OK) = 0
5456  access("/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/DejaVuSans.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/opens___.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Regular.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/opens___.ttf", R_OK) = 0
5456  access("/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSerif-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/DejaVuSans.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/opens___.ttf", R_OK) = 0

	+ did something here [!] ...

5456  write(14, "w", 1)                 = -1 EAGAIN (Ressource temporairement non disponible)
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Bold.ttf", R_OK) = 0
5456  clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=80195, tv_nsec=648905941}) = 0
5456  write(14, "w", 1)                 = -1 EAGAIN (Ressource temporairement non disponible)
5456  clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=80196, tv_nsec=300254527}) = 0

	+ 761 of these ... (happens each time)

5456  clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=80197, tv_nsec=38938770}) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Regular.ttf", R_OK) = 0
5456  access("/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/DejaVuSans.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/opens___.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Regular.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/opens___.ttf", R_OK) = 0
5456  access("/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSerif-Bold.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/DejaVuSans.ttf", R_OK) = 0
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/opens___.ttf", R_OK) = 0
5456  write(14, "w", 1)                 = -1 EAGAIN (Ressource temporairement non disponible)
5456  access("/opt/collaboraoffice6.2/share/fonts/truetype/LiberationSans-Bold.ttf", R_OK) = 0
5456  clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=80197, tv_nsec=186801155}) = 0
5456  write(14, "w", 1)                 = -1 EAGAIN (Ressource temporairement non disponible)
5456  clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=80197, tv_nsec=852122768}) = 0

and some repeat of this. The wakeup pipe is full, so it is clear that we never get to the poll.

We are clearly getting stuck here:

bool SvpSalInstance::DoYield(bool bWait, bool bHandleAllCurrentEvents)
{
    // first, process current user events
    bool bEvent = DispatchUserEvents(bHandleAllCurrentEvents);
    if (!bHandleAllCurrentEvents && bEvent)

** Either we bail out here - and never get the poll

        return true;

    bEvent = CheckTimeout() || bEvent;

** Or we have a very short timeout that we always take - setting bEvent to true

    SvpSalYieldMutex *const pMutex(static_cast<SvpSalYieldMutex*>(GetYieldMutex()));

    if (IsMainThread())
    {
        if (bWait && ! bEvent)
        {
... call back to the kit poll() ...

** Or we are in a 'Yield' loop where bWait is always false - so we never get

More details on that much appreciated =)
Comment 1 Michael Meeks 2020-04-08 13:21:07 UTC
Am looking at this... Gabriel you might be interested too =)
Comment 2 Michael Meeks 2020-04-08 14:17:56 UTC
Potential patch here:
   https://gerrit.libreoffice.org/c/core/+/91927
   tdf#131985 - ensure the Kit poll callback gets into fast idle loops. [NEW]

with a cleanup that might help things here too:
   https://gerrit.libreoffice.org/c/core/+/91925

This would/should let us get a stream of commands processed, some of which may perturb the app out of its busy/idle/timeout loop: though really diagnosing why we get stuck in that idle loop is prolly more useful & interesting generally.

For that - would need to attach gdb to a wedged process and see what timeouts / idles are on the stack; prolly by stepping through pTask->Invoke() in the scheduler.cxx a few times & seeing where they go.
Comment 3 Commit Notification 2020-04-08 21:12:34 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/486d32f1e83002e6ff43db32463fb49393cc3664

tdf#131985 - ensure the Kit poll callback gets into fast idle loops.

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 4 Michael Meeks 2020-04-08 21:14:19 UTC
Would love feedback on whether this helps people before closing. Obviously - I'd really like to fix the busy-loops there too - which will now burn even more CPU than before but less RAM perhaps, and possibly be break-able-out-of ... ;->
Comment 5 Michael Meeks 2020-04-10 09:14:57 UTC
This prolly also wants: https://gerrit.libreoffice.org/c/online/+/92000 to reduce busy waiting at the end of milliseconds for the next timeout to trigger =)
Comment 6 Michael Meeks 2020-04-10 13:09:10 UTC
Also this one - cleaning that up: https://gerrit.libreoffice.org/c/online/+/92032
Comment 7 Fay Anabelle 2021-11-30 15:54:15 UTC Comment hidden (spam)
Comment 8 Michael Warner 2021-11-30 16:02:31 UTC
@Michael Meeks, I see that the patches you submitted have been merged. Are you still working on this? Has it been resolved? Please update the status of this bug if you get a chance. Thanks.
Comment 9 Michael Meeks 2021-11-30 16:21:17 UTC
> Would love feedback on whether this helps people before closing.

I expect this is closed =)