Bug 45307 - LibreOffice hangs on HTML Table paste (multiple image from URL, see comment 33)
Summary: LibreOffice hangs on HTML Table paste (multiple image from URL, see comment 33)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: haveBacktrace, perf
Duplicates: 66547 70758 83493 117667 130244
Depends on:
Blocks: Image-Caching Paste
Reported: 2012-01-27 06:12 UTC by Marc Schipperheyn
Modified: 2024-02-21 08:55 UTC
CC List: 20 users

See Also:
Crash report or crash signature:


Attachments
example HTML tables to compare (7.21 KB, application/zip)
2023-03-31 11:13 UTC, Stéphane Guillou (stragu)

Description Marc Schipperheyn 2012-01-27 06:12:32 UTC
If we take this page 
http://en.wikipedia.org/wiki/List_of_cities,_towns_and_villages_in_Friesland,_M-Z

And copy the table of cities and try to paste it into Calc or Writer, all open LibreOffice windows hang.

As far as I'm concerned, there are three major problems here:
1. A hang in one window can hang all windows.
2. Pasting something that LibreOffice can't process properly leads to an unrecoverable hang instead of an error message.
3. A paste of a fairly standard HTML table is not processed properly.
Comment 1 Marc Schipperheyn 2012-01-27 06:13:19 UTC
I'm on Windows 7 64bit
Comment 2 Michael Salem 2012-03-07 06:59:20 UTC
I have had this problem for a long time with OpenOffice (as far as I remember) and LibreOffice, various versions up to and including 3.5.0, under Microsoft Windows 7/32 (possibly XP too). Basically, I can either copy almost any web page into LibreOffice, or enter the URL of one as the filename in LibreOffice Writer. The web page will appear, and some editing can be started, but soon LibreOffice abends. Probably the same as
http://comments.gmane.org/gmane.comp.documentfoundation.libreoffice.user/17196
During the time I have been having this problem I have uninstalled and reinstalled Open/LibreOffice to no avail.

After the abend, LibreOffice Writer will sometimes not run again; Windows Task Manager shows it as running (though with no display). Killing all running instances of Writer allows normal operation again.

I report what I find; it may also affect components other than Writer, and other operating systems.

Possibly related bugs I've found: 690796, 39865
Comment 3 retired 2013-07-03 09:42:10 UTC
OS -> ALL

Still valid with LO 4.1.0.1
Comment 4 retired 2013-07-03 09:43:30 UTC
*** Bug 66547 has been marked as a duplicate of this bug. ***
Comment 5 Jose Luis Triana 2013-12-03 23:34:13 UTC
Sir Hangs-A-lot is still present on LibreOffice 4.1.3.2 and in x86_64 as well, as my OS is 64 bit, Linux based OS.  

HANGS when copying, pasting, manipulating tables that I copied from HTML. 

And when saving documents with those tables. 

counterproductive
annoying. 

I have the portable version of LO 4.0.3 32-bit for Windows, and it hangs as well.
Comment 6 Matthew Francis 2015-01-21 07:55:34 UTC
This seems to be a result of loading - slowly - the large number of small images in the copied HTML. Pasting a smaller subset of the table finishes after a few seconds
Comment 7 Matthew Francis 2015-01-21 07:56:03 UTC
*** Bug 83493 has been marked as a duplicate of this bug. ***
Comment 8 Timur 2015-05-12 11:11:03 UTC
(In reply to Marc Schipperheyn from comment #0)
> If we take this page 
> http://en.wikipedia.org/wiki/List_of_cities,_towns_and_villages_in_Friesland,_M-Z
> And copy the table of cities and try to copy it into Calc or Write, you get
> a hang on all open LibreOffice windows. 

I cannot reproduce it with any LO version on a computer with a Core-7 processor and 8 GB of RAM. 
Can you please retest with a current LO and report the LO version and hardware specs (processor, RAM)?
Comment 9 Marc Schipperheyn 2015-06-01 14:23:01 UTC
I'm on a 2011 15-inch MBP with OS X Mavericks.
Comment 10 Chris Halls 2015-09-08 12:40:22 UTC
(In reply to Matthew Francis from comment #6)
> This seems to be a result of loading - slowly - the large number of small
> images in the copied HTML. Pasting a smaller subset of the table finishes
> after a few seconds

I agree with this analysis. This backtrace shows the code waiting for image data from the remote server.

#0  0x00007f536ede1438 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f53712a344e in osl_waitCondition (Condition=0x539f1f0, pTimeout=pTimeout@entry=0x7ffea15882f0) at libreoffice/sal/osl/unx/conditn.cxx:201
#2  0x00007f5368b13812 in salhelper::ConditionWaiter::ConditionWaiter (this=0x7ffea1588320, aCond=..., milliSec=<optimized out>)
    at libreoffice/salhelper/source/condition.cxx:113
#3  0x00007f5373ec9cce in utl::Moderator::getResult (this=this@entry=0x539f100, milliSec=milliSec@entry=5000)
    at libreoffice/unotools/source/ucbhelper/ucblockbytes.cxx:570
#4  0x00007f5373ecd9b1 in utl::UCBOpenContentSync (xLockBytes=..., xContent=..., rArg=..., xSink=..., xInteract=...)
    at libreoffice/unotools/source/ucbhelper/ucblockbytes.cxx:788
#5  0x00007f5373ecfb54 in utl::UcbLockBytes::CreateLockBytes (xContent=..., rProps=..., eOpenMode=eOpenMode@entry=(StreamMode::READ | StreamMode::SHARE_DENYNONE), 
    xInteractionHandler=..., pHandler=pHandler@entry=0x0) at libreoffice/unotools/source/ucbhelper/ucblockbytes.cxx:1436
#6  0x00007f5373ed23e8 in utl::lcl_CreateStream (rFileName=..., eOpenMode=(StreamMode::READ | StreamMode::SHARE_DENYNONE), xInteractionHandler=..., pHandler=0x0, 
    bEnsureFileExists=bEnsureFileExists@entry=true) at libreoffice/unotools/source/ucbhelper/ucbstreamhelper.cxx:119
#7  0x00007f5373ed397e in utl::UcbStreamHelper::CreateStream (rFileName=..., eOpenMode=<optimized out>, pHandler=<optimized out>)
    at libreoffice/unotools/source/ucbhelper/ucbstreamhelper.cxx:144
#8  0x00007f53741dd789 in GraphicFilter::ImportGraphic (
    this=0x7f53759cac00 <rtl::Static<(anonymous namespace)::StandardGraphicFilter, (anonymous namespace)::theGraphicFilter>::get()::instance>, rGraphic=..., rPath=..., 
    nFormat=nFormat@entry=65535, pDeterminedFormat=pDeterminedFormat@entry=0x0, nImportFlags=nImportFlags@entry=GraphicFilterImportFlags::NONE)
    at libreoffice/vcl/source/filter/graphicfilter.cxx:1316
#9  0x00007f53493bde81 in SwHTMLParser::InsertImage (this=this@entry=0x50c79c0) at libreoffice/sw/source/filter/html/htmlgrin.cxx:475
#10 0x00007f53493fd290 in SwHTMLParser::NextToken (this=0x50c79c0, nToken=268) at libreoffice/sw/source/filter/html/swhtml.cxx:1493
#11 0x00007f53493d7375 in SwHTMLParser::BuildTableCell (this=this@entry=0x50c79c0, pCurTable=pCurTable@entry=0x5bd3990, bReadOptions=bReadOptions@entry=true, bHead=<optimized out>)
    at libreoffice/sw/source/filter/html/htmltab.cxx:3859
#12 0x00007f53493d9868 in SwHTMLParser::BuildTableRow (this=this@entry=0x50c79c0, pCurTable=pCurTable@entry=0x5bd3990, bReadOptions=<optimized out>, eGrpAdjust=<optimized out>, 
    eGrpVertOri=<optimized out>) at libreoffice/sw/source/filter/html/htmltab.cxx:4284
#13 0x00007f53493da247 in SwHTMLParser::BuildTableSection (this=0x50c79c0, pCurTable=0x5bd3990, bReadOptions=<optimized out>, bHead=<optimized out>)
    at libreoffice/sw/source/filter/html/htmltab.cxx:4472
#14 0x00007f53493dabcf in SwHTMLParser::BuildTable (this=this@entry=0x50c79c0, eParentAdjust=<optimized out>, bIsParentHead=bIsParentHead@entry=false, 
    bHasParentSection=bHasParentSection@entry=true, bHasToFly=bHasToFly@entry=false) at libreoffice/sw/source/filter/html/htmltab.cxx:5261
#15 0x00007f53493fd7e7 in SwHTMLParser::NextToken (this=0x50c79c0, nToken=660) at libreoffice/sw/source/filter/html/swhtml.cxx:1693
#16 0x00007f537353fae1 in HTMLParser::Continue (this=this@entry=0x50c79c0, nToken=<optimized out>) at libreoffice/svtools/source/svhtml/parhtml.cxx:337
#17 0x00007f53493f5acb in SwHTMLParser::Continue (this=0x50c79c0, nToken=0) at libreoffice/sw/source/filter/html/swhtml.cxx:630
#18 0x00007f537353c81c in HTMLParser::CallParser (this=this@entry=0x50c79c0) at libreoffice/svtools/source/svhtml/parhtml.cxx:319
#19 0x00007f53493eea01 in SwHTMLParser::CallParser (this=this@entry=0x50c79c0) at libreoffice/sw/source/filter/html/swhtml.cxx:561
#20 0x00007f53493eed24 in HTMLReader::Read (this=this@entry=0x4b90b10, rDoc=..., rBaseURL=..., rPam=..., rName=...)
    at libreoffice/sw/source/filter/html/swhtml.cxx:219
#21 0x00007f53493737d6 in SwReader::Read (this=this@entry=0x7ffea1589ef0, rOptions=...) at libreoffice/sw/source/filter/basflt/shellio.cxx:175
#22 0x00007f53494a3eaa in SwTransferable::_PasteFileContent (rData=..., rSh=..., nFormat=nFormat@entry=SotClipboardFormatId::HTML, bMsg=bMsg@entry=true)
    at libreoffice/sw/source/uibase/dochdl/swdtflvr.cxx:1703
#23 0x00007f53494ae8a4 in SwTransferable::PasteData (rData=..., rSh=..., nAction=<optimized out>, nFormat=SotClipboardFormatId::HTML, 
    nDestination=nDestination@entry=SotExchangeDest::SWDOC_FREE_AREA, bIsPasteFormat=bIsPasteFormat@entry=false, bIsDefault=false, pPt=0x0, nDropAction=0 '\000', 
    bPasteSelection=false) at libreoffice/sw/source/uibase/dochdl/swdtflvr.cxx:1334
#24 0x00007f53494aecb7 in SwTransferable::Paste (rSh=..., rData=...) at libreoffice/sw/source/uibase/dochdl/swdtflvr.cxx:1177
#25 0x00007f5349532dd7 in SwBaseShell::ExecClpbrd (this=<optimized out>, rReq=...) at libreoffice/sw/source/uibase/shells/basesh.cxx:288

The backtrace is from a build of current git master, i.e. the future version 5.1.

If I paste the text from the wikipedia article as unformatted text (Ctrl+Shift+Alt+V), the paste operation happens very quickly.

I think the problem, from a user's point of view, is that during this long loading process there is absolutely no feedback and the UI is unresponsive.

I wonder if there is any way to defer the loading of such images to a background job?
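
A minimal sketch of that idea, in Python rather than LibreOffice's C++ and not based on any existing LibreOffice code. The names `paste_table`, `fetch` and `PLACEHOLDER` are hypothetical. The paste returns at once with a placeholder per image, and a worker pool fills the cells in later, so the caller (the UI thread, in this analogy) is never blocked on the network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical marker shown in a cell until its image has arrived.
PLACEHOLDER = "<image pending>"


def paste_table(image_urls, fetch, executor):
    """Insert placeholders immediately and load the images in the background.

    fetch is an assumed callable taking a URL and returning image data.
    Returns (cells, futures): cells holds one entry per image and is
    updated in place as each background load finishes.
    """
    cells = [PLACEHOLDER] * len(image_urls)

    def load(index, url):
        try:
            cells[index] = fetch(url)
        except OSError:
            # Network failure: keep the placeholder instead of blocking or crashing.
            pass

    futures = [executor.submit(load, i, url) for i, url in enumerate(image_urls)]
    return cells, futures
```

The caller can redraw whenever a future completes, or ignore the futures entirely; either way the paste itself costs no network time.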

I'm setting to NEW as I think there is enough information here to work with.
Comment 11 kie000 2015-09-08 16:49:06 UTC
An office application shouldn't be loading anything from the web during a copy. 

When I copy something I expect data -> clipboard -> LibreOffice, not for LibreOffice to start acting like a web browser and fetching stuff from the internet. And no option is offered; typically I don't want the objects, just the textual data.
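
To illustrate what "just the textual data" could mean for clipboard HTML, here is a rough Python sketch using only the standard library's `html.parser`. It is not how LibreOffice's HTML import works; `TableText` and `table_text` are hypothetical names. It collects the text of each table cell and simply ignores `<img>` tags, so nothing is fetched from the network.

```python
from html.parser import HTMLParser


class TableText(HTMLParser):
    """Collect the text of table cells row by row, ignoring images."""

    def __init__(self):
        super().__init__()
        self.rows = []      # list of rows, each a list of cell strings
        self._cell = None   # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._cell = []
        # <img> and every other tag are deliberately ignored.

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def table_text(html):
    """Return the cell texts of the tables in html as a list of rows."""
    parser = TableText()
    parser.feed(html)
    parser.close()
    return parser.rows
```

The sketch assumes explicitly closed cells and no nested tables; real clipboard HTML would need more care.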

Separately, when I paste the example data from Wikipedia, Calc does paste the data, but the formatting goes so wrong that it is for all intents and purposes useless.
Comment 12 Timur 2017-05-24 16:21:02 UTC
As I wrote in Comment 8, I was never able to repro this (but I did with Bug 91237). So please retest this one with current LO.
Comment 13 raal 2017-05-24 16:44:57 UTC
No crash in Version: 5.5.0.0.alpha0+
Build ID: 07381c017cd2b4e3ce643d17ae7cbb11ddef2228
CPU threads: 4; OS: Linux 4.4; UI render: default; VCL: gtk2; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2017-05-19_23:08:45
Comment 14 Telesto 2017-05-24 18:04:03 UTC
LibO is fully unresponsive when copy/pasting the table from Firefox to Writer. It takes up to a minute or so (no crash):

Version: 5.5.0.0.alpha0+
Build ID: d57e6cd9dcc96112994ca2b14ac45896e86b26e5
CPU threads: 4; OS: Windows 6.19; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-05-18_22:43:07
Locale: nl-NL (nl_NL); Calc: CL
 
It seems to me (but I'm no Dev) that LibO is trying to download every image at once, creating a flood of TCP connections (I used TCPview). This is the normal behavior since LibreOffice 3.3.0 OOO330m19 (Build:6) tag libreoffice-3.3.0.4

OpenOffice AOO413m1(Build:9783), on the other hand, uses only 2 endpoints (and the pasting goes a lot faster)
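
For comparison, capping the number of simultaneous downloads is straightforward in principle. The Python sketch below is only an illustration of that technique, not a description of what either LibreOffice or OpenOffice actually does. `fetch_all` and `fetch` are hypothetical names, and the default of 2 merely mirrors the endpoint count observed above.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_connections=2):
    """Fetch every URL while running at most max_connections fetches at once.

    fetch is an assumed callable taking a URL and returning image data.
    Results come back in the same order as urls.
    """
    # The pool size is the cap: no more than max_connections workers exist,
    # so no more than that many fetches can be in flight at the same time.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        return list(pool.map(fetch, urls))
```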
Comment 15 Michelle 2017-06-13 19:51:00 UTC
I just tried to copy some text including tables from https://www.humanservices.gov.au/customer/enablers/working-out-child-support-payments-using-basic-formula and it crashed LibreOffice completely.  I had several Calc files open as well, and on trying to restart LibreOffice, it tried to recover those files I had open, but all had "disappeared" & could not be recovered.  The other Writer file I had open also reverted to unnamed document.  The information provided to me on first crash follows:
This bug was filed from the crash reporting server and is br-a3368511-e014-4417-8341-e75fe41a7b53.
Comment 16 kie000 2017-06-13 20:31:24 UTC
I just want to say me too, crash recovery is very hit and miss, it gets confused and does weird things.
Comment 17 Telesto 2017-06-13 20:54:53 UTC
(In reply to Michelle from comment #15)
Looks like bug 105769. This only affects the 64-bit version. Please try with the 32-bit build or use the x64 5.4 beta version
Comment 18 Kruno 2018-05-12 08:56:32 UTC
I don't think it only affects 64-bit systems, as I got a hang on 32-bit, but now in 6.0.4.1 it seems not to hang anymore.
Comment 19 Buovjaga 2018-05-20 07:18:50 UTC
*** Bug 117667 has been marked as a duplicate of this bug. ***
Comment 20 francois@synways.com 2018-08-17 10:24:07 UTC
I also have this issue when copying an HTML table.

Version: libreoffice 1:5.2.7-1+deb9u4 amd64 on debian stretch


I have the bug when my vpn is connected.

When using the VPN, I also use a remote proxy through an SSH tunnel: ssh $myproxy-host -L 3128:127.0.0.1:3128
Then I can set the proxy (http(s): 127.0.0.1 3128) either in the GNOME system settings or in the LibreOffice options (manual proxy settings).
In both cases the paste works without problems (it is just a bit slow).

If I set the proxy neither in the system nor in LibreOffice, and set it only in Firefox manually, then obviously Firefox can reach the internet and LibreOffice cannot.
In this case, copy-pasting an HTML table with picture elements hangs LibreOffice forever and I need to force-close it.

=> LibreOffice should not need internet access to paste from the clipboard.
=> If internet access cannot be avoided, LibreOffice should at least give up after some timeout and/or offer a cancel button, instead of freezing and losing changes in all open documents.
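
The "give up after some timeout and/or offer a cancel button" request can be sketched as one overall time budget for the whole paste plus a cancel check between images. This is a hedged Python illustration only. `load_images`, `fetch`, `budget_seconds` and `cancelled` are hypothetical names, and `fetch` is assumed to accept a `timeout` argument and raise `OSError` (or a subclass such as `TimeoutError`) on failure.

```python
import time


def load_images(urls, fetch, budget_seconds, cancelled=lambda: False,
                clock=time.monotonic):
    """Load images until the total time budget is spent or the user cancels.

    Returns (loaded, skipped): loaded maps URL to image data, skipped lists
    the URLs that were not loaded.
    """
    deadline = clock() + budget_seconds
    loaded, skipped = {}, []
    for url in urls:
        remaining = deadline - clock()
        if remaining <= 0 or cancelled():
            # Budget exhausted or cancel pressed: skip without touching the network.
            skipped.append(url)
            continue
        try:
            # Never let a single image wait longer than what is left of the budget.
            loaded[url] = fetch(url, timeout=remaining)
        except OSError:
            skipped.append(url)
    return loaded, skipped
```

With this shape the worst case is bounded by `budget_seconds` no matter how many images the table contains, and the text of the table can be pasted regardless of which images were skipped.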
Comment 21 kie000 2018-08-19 09:15:19 UTC
Just another me too - I always use VPN. (because snoopers charter makes 1984 look good).
Comment 22 e14051033 2018-11-29 09:13:54 UTC
Bug not reproducible in this version:

Version: 6.3.0.0.alpha0+ (x64)
Build ID: 0f25a3c36f27fd51453b9a9115f236b83c143684
CPU threads: 8; OS: Windows 10.0; UI render: default; VCL: win; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2018-11-27_20:06:55
Locale: zh-TW (zh_TW); UI-Language: en-US
Calc: threaded
Comment 23 Timur 2018-11-29 10:05:17 UTC
I guess that Bug 45307, Bug 43338 and Bug 91237 all have the same root cause: the management of remote image fetches. 
Whether it is reproducible probably depends more on the internet link and its load than on the type of computer. 
Maybe we can close all of those and open a new enhancement for TCP session management.
Comment 24 kie000 2018-11-29 11:00:31 UTC
It took an hour to copy the example table. Firefox hung for about 20 minutes and then LibreOffice took about 40 minutes of doing pretty much nothing. What's with all the waiting?

Is windows clipboard trying to use the internet? Is LibreOffice trying to use the internet? That's not how clipboard or copying should work.

During that hour+ of waiting LibreOffice was unresponsive. 

Windows clipboard was out of action for the initial 20 minutes when the Firefox tab was doing its thing. During these 20 minutes the only thing that could be pasted anywhere was the text-only part of the copy; other copy (not paste) actions failed.

Win7 64bit 16GB Ram, Fast SSD

LibreOffice 6.1.3.2 (x64) Calc

Calc might have crashed on lesser hardware.

I don't care whether LO crashed or not. LO should not be trying to use the internet when I'm doing a simple copy-paste; that's bad programming. Just paste the contents of the clipboard already. It should not take an hour on a fast machine with fibre internet.
Comment 25 Thomas Lendo 2019-08-11 20:35:21 UTC
*** Bug 70758 has been marked as a duplicate of this bug. ***
Comment 26 Timur 2020-01-28 20:49:32 UTC
*** Bug 130244 has been marked as a duplicate of this bug. ***
Comment 27 QA Administrators 2022-01-28 03:57:31 UTC Comment hidden (obsolete)
Comment 28 Roman Kuznetsov 2022-03-02 12:46:31 UTC
Still repro in

Version: 7.4.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: dc99d27f04b47c173de934a19b6d6a3cc572c20a
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: CL
Comment 29 Rajasekaran Karunanithi 2022-11-05 00:04:59 UTC
Can't reproduce in LO 7.4.2.3 under the LXLE (x64) Focal distro. Those tables got copied to Writer in a few seconds. No hangs at all.

Version: 7.4.2.3 / LibreOffice Community
Build ID: 40(Build:3)
CPU threads: 1; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Ubuntu package version: 1:7.4.2~rc3-0ubuntu0.20.04.1~lo1
Calc: threaded
Comment 30 Buovjaga 2022-11-05 07:59:03 UTC
(In reply to Rajasekaran Karunanithi from comment #29)
> Can't reproduce in LO 7.4.2.3 under LXLE(x64) Focal distro.Those tables got
> copied in few seconds to writer.No hangs at all.

Copying from A to Z, it takes about 2 minutes for me, and I have 32 GB of RAM while I remember you have 4 GB. I wonder if we copied & pasted the same way.

However, this is much better than in comment 24.

Arch Linux 64-bit
Version: 7.4.2.3 / LibreOffice Community
Build ID: 40(Build:3)
CPU threads: 8; OS: Linux 6.0; UI render: default; VCL: kf5 (cairo+xcb)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
7.4.2-1
Calc: threaded
Comment 31 Rajasekaran Karunanithi 2022-11-05 10:28:02 UTC
(In reply to Buovjaga from comment #30)
> (In reply to Rajasekaran Karunanithi from comment #29)
> > Can't reproduce in LO 7.4.2.3 under LXLE(x64) Focal distro.Those tables got
> > copied in few seconds to writer.No hangs at all.
> 
> Copying from A to Z, it takes about 2 minutes for me and I have 32 GB of RAM
> while I remember you have 4 GB. I wonder, if we copied & pasted the same way.
> 
> However, this is much better than in comment 24.
> 
> Arch Linux 64-bit
> Version: 7.4.2.3 / LibreOffice Community
> Build ID: 40(Build:3)
> CPU threads: 8; OS: Linux 6.0; UI render: default; VCL: kf5 (cairo+xcb)
> Locale: fi-FI (fi_FI.UTF-8); UI: en-US
> 7.4.2-1
> Calc: threaded

Actually, I copied the tables one by one. Anyway, I will try to copy all the tables together and see the results.
Comment 32 kie000 2022-11-05 16:10:13 UTC
I tried the copy-paste again, it took about an hour.
LO version 7.4.2.3, Win7 64-bit, i7 3770k, 16 GB DDR3 RAM, SSDs

Also tried on Win10 same version, it took about 2 minutes
Comment 33 Stéphane Guillou (stragu) 2023-03-31 11:13:52 UTC
Created attachment 186357 [details]
example HTML tables to compare

After saving the HTML page locally and testing with it, I confirmed that the slowdown does come from loading images from URLs.

The HTML table has this repeated image tag:

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/55/WMA_button2b.png/17px-WMA_button2b.png" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/5/55/WMA_button2b.png/17px-WMA_button2b.png 1x, //upload.wikimedia.org/wikipedia/commons/thumb/5/55/WMA_button2b.png/34px-WMA_button2b.png 2x" class="wmamapbutton noprint" title="Show location on an interactive map" alt="" style="padding: 0px 3px 0px 0px; cursor: pointer;">

Pasting a table that contains those (like in slow.html, or the original Wikipedia article) takes too long (6 seconds for a 42×3 table).

With the same page, but img tags pointing to a local file (like in fast.html), pasting the table is instant.

I wonder whether this has the same root cause as the slow loading of images from URLs in bug 141781.
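
Since the same image tag is repeated in every row, one conceivable mitigation is to fetch each distinct URL only once and reuse the result. The Python sketch below only illustrates that caching idea and is not taken from LibreOffice's code. `load_images_cached` and `fetch` are hypothetical names, and it assumes the `src` URL alone identifies the image (it ignores `srcset`).

```python
def load_images_cached(urls, fetch):
    """Return one image per entry in urls, fetching each distinct URL once.

    fetch is an assumed callable taking a URL and returning image data.
    """
    cache = {}
    images = []
    for url in urls:
        if url not in cache:
            # First occurrence of this URL: this is the only network access for it.
            cache[url] = fetch(url)
        images.append(cache[url])
    return images
```

For a table whose 42 rows all point at the same button image, this turns 42 network requests into one.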

Tested in:

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 9c7d3ce813c761b116232bc291e2737c59d383da
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded
Comment 34 Stéphane Guillou (stragu) 2023-03-31 11:23:45 UTC
From duplicates:

- backtrace when pasting in Writer in attachment 105826 [details] (master from ~2014-09-05, from bug 83493)
- Callgrind output in attachment 142160 [details] (LO 6.1.0.0.alpha1+, from bug 117667)

The difference between the two example tables is already noticeable in OOo 3.3, so the bug is inherited.
Comment 35 Matt K 2023-12-15 02:06:25 UTC
It looks like there is a timeout of 5000 ms in utl::UCBOpenContentSync for each image. Maybe we could change the code to stop trying to download once a timeout has already occurred?
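
If each image load really does wait up to the full 5000 ms in sequence, a table with 42 unreachable images would block for up to 42 × 5 s = 210 s. A rough Python sketch of the "stop trying after a timeout" idea follows. It is an illustration only, not LibreOffice code. `load_images_skipping_dead_hosts` and `fetch` are hypothetical names, `fetch` is assumed to raise `TimeoutError` when the wait expires, and tracking timeouts per host (rather than globally) is a design choice made here, not something proposed in this report.

```python
from urllib.parse import urlsplit


def load_images_skipping_dead_hosts(urls, fetch, timeout=5.0):
    """Load images, but stop contacting a host once it has timed out.

    Returns a dict mapping each URL to its image data, or to None when the
    image was skipped or its load timed out.
    """
    dead_hosts = set()
    images = {}
    for url in urls:
        host = urlsplit(url).netloc
        if host in dead_hosts:
            # A previous request to this host timed out: do not wait again.
            images[url] = None
            continue
        try:
            images[url] = fetch(url, timeout)
        except TimeoutError:
            dead_hosts.add(host)
            images[url] = None
    return images
```

Under the same assumption, the 42-image case would cost a single 5 s wait instead of 210 s.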