Bug 153380 - Pasting table copied from QGIS (as HTML) freezes because of large values in WKT column (comment 24)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: perf
Depends on:
Blocks: Paste
Reported: 2023-02-04 19:00 UTC by M-Rick
Modified: 2023-03-31 06:40 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:


Attachments
example QGIS project (66.05 KB, application/zip)
2023-02-05 15:20 UTC, Stéphane Guillou (stragu)
CSV containing spatial data from example geojson, stored as WKT (9.18 MB, text/csv)
2023-03-22 10:28 UTC, Stéphane Guillou (stragu)
example Geojson to open in QGIS (10.62 MB, application/geo+json)
2023-03-23 06:12 UTC, Stéphane Guillou (stragu)
contents of clipboard (TSV data) (11.28 MB, text/plain)
2023-03-27 14:31 UTC, Stéphane Guillou (stragu)

Description M-Rick 2023-02-04 19:00:14 UTC
Description:
Copying data from a QGIS data table into a Calc spreadsheet makes LibreOffice hang. It works well with Excel or any other spreadsheet application; only LibreOffice seems to be incompatible.
It affects all versions of LibreOffice.

Steps to Reproduce:
1. Open the data table in a QGIS project
2. Select the data and copy it
3. Open LibreOffice with an existing spreadsheet document, or create a new one
4. Paste the data; LibreOffice hangs. It happens with Writer as well.

Actual Results:
LibreOffice hangs and the application has to be force-closed.

Expected Results:
The data should be pasted in place in the spreadsheet, including the column headers, just as it appears in the QGIS table.


Reproducible: Always


User Profile Reset: No

Additional Info:
Works well with any other spreadsheet software.
Comment 1 m_a_riosv 2023-02-04 22:19:26 UTC
Could you please give a link to a QGIS table that makes LibreOffice hang when pasting?
Comment 2 Stéphane Guillou (stragu) 2023-02-05 15:18:39 UTC
I could not reproduce the issue when copying a layer's attribute table (select a layer in QGIS, F6, Ctrl + A, Ctrl + C, paste in Calc or Writer).

Tested with QGIS 3.22.15-Białowieża and LO 7.4.5 and a recent master build, on Ubuntu 20.04.
Comment 3 Stéphane Guillou (stragu) 2023-02-05 15:20:09 UTC
Created attachment 185136 [details]
example QGIS project

You can test with one of the "amenity_library" layers in this zipped QGIS project.
Comment 4 Julien Nabet 2023-02-06 11:47:09 UTC
On a Debian x86-64 PC with the LO Debian package 7.4.5.1, I can't reproduce this, but I can reproduce it with master sources updated today.
Comment 5 Stéphane Guillou (stragu) 2023-02-06 13:20:01 UTC
Could not reproduce with:

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: b2bd60b8c1937502857e12b0eea42323fd2353c8
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

@Julien, @M-Rick, which steps exactly do you use to make it hang? (Using the attachment)
Comment 6 Julien Nabet 2023-02-06 16:01:13 UTC
I ran some more tests: I can reproduce this only with a build using the enable-dbgutil option (so with all the debug code).
Comment 7 Julien Nabet 2023-02-06 16:30:15 UTC
I focused on the enable-dbgutil build; in fact, it's far slower there.
I had to wait for about 1 min.

Here are the steps I use to reproduce this on the build with enable-dbgutil:
1) Open a console/terminal and launch Calc (term1)
2) Open another console/terminal and launch Qgis (term2)
3) On Qgis, open the project, F6, Ctrl-A, Ctrl-C
4) On Calc, right click on any cell and choose Paste
=> the dialog "Import Options" appears
5) Keep the default options (Automatic and "Keep asking during this session") and click the OK button
=> hourglass for about a minute; I thought I would have to press Ctrl-C, but the result finally appeared.
Comment 8 Stéphane Guillou (stragu) 2023-02-06 16:41:01 UTC
I'm wondering if it has to do with QGIS also including the Well-Known Geometry of the feature as an extra first column in the resulting table in LO.
I can imagine the OP having a layer of very complex geometries and therefore pasting a very large number of coordinate values at once when expecting to only paste simple attributes.

Maybe the difference between pasting the point layer table vs the polygon layer table is noticeable in the debug build, Julien?
Comment 9 Julien Nabet 2023-02-06 16:46:09 UTC
(In reply to Stéphane Guillou (stragu) from comment #8)
> ...
> Maybe the difference between pasting the point layer table vs the polygon
> layer table is noticeable in the debug build, Julien?

Since I don't know QGIS at all, I can't tell which of these cases corresponds to the "layer's attribute table" you mentioned in comment 2, unless that is something else?

To be sure, could you indicate what to do for testing the paste from:
1) the point layer table 
2) the polygon layer table
?
Comment 10 Julien Nabet 2023-02-06 16:50:17 UTC
I must admit I don't understand why I can't reproduce this with a similar release build, since I suppose M-Rick used a standard non-debug 7.4.5 msi file to install LO.
Or perhaps these are distinct issues: a freeze on Windows, and slowness with debug builds, at least on Linux?
Comment 11 Stéphane Guillou (stragu) 2023-02-06 16:55:59 UTC
(In reply to Julien Nabet from comment #9)
> (In reply to Stéphane Guillou (stragu) from comment #8)
> To be sure, could you indicate what to do for testing the paste from:
> 1) the point layer table 
> 2) the polygon layer table
> ?

In the attached project, there's a polygon layer (rectangle icon) and point layer (dot icon) in the Layers panel, usually bottom-left of the window. Both of them are called "amenity_library_Brisbane, Queensland" but they are different geometries.
The attribute table is accessed by selecting the layer and pressing F6 (or right-clicking a layer > Open Attribute Table). The attribute table contains the data associated with each feature in the layer, mostly non-spatial data, but when copying it, QGIS adds to the clipboard an extra first column of spatial data that describes the features (the "wkt_geometry" column). I can see how it is useful, because it allows copying across all the information, both spatial and non-spatial, and importing that layer in one go in a different program. But I'm surprised I can't see an option to only copy the attribute table without the WKT geometry data.
Comment 12 Julien Nabet 2023-02-06 17:59:57 UTC
Hmm I don't see rectangle and point layer icons in layer panel.
I got 8 buttons (approximative English translation from French):
- open panel for layer style
- add a group
- manage map themes
- filter legend
- filter legend with expression
- expand all
- reduce all
- delete layer/group

Anyway, don't bother: I can't reproduce this with a release build, so even if I find the right buttons, it won't help.
Comment 13 M-Rick 2023-02-07 19:39:05 UTC
I tried with this dataset:
https://data.loire-atlantique.fr/api/explore/v2.1/catalog/datasets/224400028_communes-loire-atlantique-denominations-formegeo-interco/exports/geojson?lang=fr&timezone=Europe%2FBerlin

There are only 207 records and it hangs forever. On Excel it's pasted immediately.
I tried on macOS as well with the same result.

QGIS version: 3.22.3-Białowieża
Code revision: 1628765ec7
Qt version: 5.15.2
Python version: 3.9.5
GDAL/OGR version: 3.4.1
Proj version: 8.2.1
EPSG registry database version: v10.041 (2021-12-03)
GEOS version: 3.10.0-CAPI-1.16.0
SQLite version: 3.35.2
PDAL version: 2.3.0
PostgreSQL client version: 13.0
SpatiaLite version: 5.0.1
QWT version: 6.1.3
QScintilla2 version: 2.11.5
OS version: Windows 10 Version 2009
Comment 14 QA Administrators 2023-02-08 03:24:57 UTC Comment hidden (obsolete)
Comment 15 Stéphane Guillou (stragu) 2023-03-22 10:17:51 UTC
Reproduced with geojson from comment 13: copying the whole attribute table and pasting it in LO 7.5.1.2 hangs for ages.

This is a performance issue related to the WKT column that contains the spatial information and is copied along with the attributes.

You can check that it is indeed because of the WKT column by changing the QGIS settings so that it is not copied across:
Settings > Options > Data sources > Copy features as: plain text, no geometry.

Pasting the attribute table is then instant.

Version: 7.5.1.2 (X86_64) / LibreOffice Community
Build ID: fcbaee479e84c6cd81291587d2ee68cba099e129
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

OOo 3.3 was also struggling with such data, so marking as inherited.

Importing a CSV containing the same data triggers the warning dialog "The data could not be loaded completely because the maximum number of characters per cell was exceeded." I assume it then truncates the data.

MS Excel can paste it instantly, but it results in 462 rows of data, whereas the original geojson layer only has 207 features. That's because Excel silently splits those long values across several rows, to make it fit in a cell's maximum of 32759 characters.

Eike, what are your thoughts on Calc handling these large strings pastes? Surely, it should trigger the same "maximum number of characters" warning and be as snappy as a CSV import?
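
To check how many values in the attached CSV would actually hit a per-cell limit, something like the following could be used. This is only a rough sketch: the function name `oversized_cells` and the constant `ASSUMED_CELL_LIMIT` are made up for illustration, and the 65,535 figure is just the number mentioned elsewhere in this report, not a confirmed Calc constant.

```python
import csv
import io

# Assumed limit, for illustration only; not taken from the LibreOffice sources.
ASSUMED_CELL_LIMIT = 65_535


def oversized_cells(text, limit=ASSUMED_CELL_LIMIT, delimiter=","):
    """Return (row, column, length) for every cell longer than `limit`."""
    # The csv module rejects fields above 131,072 characters by default,
    # which long WKT strings can exceed, so raise that limit first.
    csv.field_size_limit(2**31 - 1)
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    hits = []
    for row_number, row in enumerate(reader, start=1):
        for col_number, cell in enumerate(row, start=1):
            if len(cell) > limit:
                hits.append((row_number, col_number, len(cell)))
    return hits


# Tiny made-up sample: one quoted WKT-like value and one short attribute.
sample = 'wkt_geom,name\n"POLYGON((' + "1 2, " * 20 + '))",Library\n'
print(oversized_cells(sample, limit=50))
```

For the real file, the text would be read from the attachment (with `delimiter="\t"` for tab-separated clipboard data) and the default limit left in place.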
Comment 16 Stéphane Guillou (stragu) 2023-03-22 10:28:34 UTC
Created attachment 186132 [details]
CSV containing spatial data from example geojson, stored as WKT

To test the CSV import and associated warning.
Comment 17 m_a_riosv 2023-03-22 21:17:53 UTC
With attached file, no slowness for me, only a few seconds (about 5-6), and shows the message about cell lengths:

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: b5c3a7502f7ff6ccf0f829c1f3a2ba50b8584c41
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: en-US Calc: CL threaded Jumbo
Version: 7.5.2.1 (X86_64) / LibreOffice Community
Build ID: e8bf3b441b8370f8440b0339fd9490765a8d57ca
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: en-US Calc: CL threaded
Comment 18 Stéphane Guillou (stragu) 2023-03-23 06:12:22 UTC
Created attachment 186147 [details]
example Geojson to open in QGIS

(In reply to m.a.riosv from comment #17)
> With attached file, no slowness for me, only a few seconds (about 5-6), and
> shows the message about cell lengths:

Which attachment? And which steps? The issue happens when copy-pasting from QGIS, not when opening the CSV. The CSV is only to show the difference between importing and pasting.
Comment 19 m_a_riosv 2023-03-23 19:56:51 UTC
There was only one file attached, comment#16
Comment 20 Stéphane Guillou (stragu) 2023-03-23 20:40:34 UTC
(In reply to m.a.riosv from comment #19)
> There was only one file attached, comment#16

There was also the QGIS project I just obsoleted (attachment 185136 [details]), but to guarantee seeing the slowness, I used the geojson M-Rick provided, with the following steps:

1. load the geojson (attachment 186147 [details]) as a layer in QGIS
2. open its attribute table with F6; copy everything with Ctrl + A and Ctrl + C
3. paste in Calc

If you haven't changed the options in QGIS, it will also copy a WKT column that defines the geometry of the features, which is the problematic part.

To see the difference when the WKT column is not included, use in QGIS:
Settings > Options > Data sources > Copy features as: plain text, no geometry.

Then repeat copying and pasting, which should be instant.
Comment 21 Eike Rathke 2023-03-27 13:14:00 UTC
The clipboard target format used on Paste is HTML with all the polygon information in the wkt_geom column for each record stuffed into one cell, which then is also cut at some point. From QGIS 3.24.3-Tisler (on Fedora), even with all data selected, after Copy there appear to be only 4 rows in the clipboard and data is truncated after 64k characters:

xclip -o | wc
      3    3108   65536

(3 instead of 4 because there's no final linefeed after truncated data). So the original performance problem couldn't be reproduced.

But for the HTML import, huge data in one cell might be a bottleneck.
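
For readers unfamiliar with `wc`: its first number counts linefeed characters rather than rows, which is why data cut off in the middle of the fourth row reports 3. A small sketch of the same counting, where the helper name `wc` and the sample bytes are made up for illustration:

```python
def wc(data: bytes):
    """Mimic `wc`: linefeed count, whitespace-separated word count, byte count."""
    return data.count(b"\n"), len(data.split()), len(data)


# Four rows, where the last one was cut off before its trailing linefeed.
truncated = b"header\nrow 1\nrow 2\nrow 3 cut of"
print(wc(truncated))  # (3, 9, 31)
```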
Comment 22 Stéphane Guillou (stragu) 2023-03-27 14:31:28 UTC
Created attachment 186248 [details]
contents of clipboard (TSV data)

(In reply to Eike Rathke from comment #21)
> xclip -o | wc
>       3    3108   65536

With QGIS 3.22, I get the full data in xclip:

xclip -o | wc
    207  557493 11824096

Attached, the contents of my clipboard. It is tab-separated data.
Comment 23 Eike Rathke 2023-03-28 09:47:57 UTC
And apparently TSV text/plain (available as Unformatted Text with Paste-Special) is not a problem, but the import of the HTML target is.
These targets are offered according to xclip -selection clipboard -o -t TARGETS:

text/plain;charset=utf-8
text/html
text/plain
UTF8_STRING
STRING
TARGETS
TIMESTAMP

HTML can be obtained with xclip -selection clipboard -o -t text/html
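
Once the HTML target has been dumped to a file with the command above, the size of each cell can be inspected outside of LO. The following is a minimal sketch using only Python's standard library; the class and function names are hypothetical, and it assumes the clipboard HTML is a plain table built from tr/td/th elements, which has not been verified against what QGIS actually emits.

```python
from html.parser import HTMLParser


class CellLengths(HTMLParser):
    """Collect the text length of every <td>/<th> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell lengths per table row
        self._in_cell = False
        self._length = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._in_cell = True
            self._length = 0

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate the length.
        if self._in_cell:
            self._length += len(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            if not self.rows:   # tolerate cells that appear without a <tr>
                self.rows.append([])
            self.rows[-1].append(self._length)
            self._in_cell = False


def cell_lengths(html):
    """Return a list of rows, each a list of per-cell character counts."""
    parser = CellLengths()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up miniature of a pasted table.
example = (
    "<table><tr><th>wkt_geom</th><th>name</th></tr>"
    "<tr><td>POLYGON((0 0, 1 1))</td><td>Library</td></tr></table>"
)
print(cell_lengths(example))  # [[8, 4], [19, 7]]
```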
Comment 24 Stéphane Guillou (stragu) 2023-03-31 06:36:12 UTC
There are a lot of comments, so here's a summary:


Steps to reproduce:

1. load the attachment 186147 [details] geojson as a layer in QGIS
2. copy all the features with Ctrl + A and Ctrl + C (can also be done in the attribute table, opened with F6)
3. paste in Calc, click "OK" in Import Options dialog.


Result:

QGIS copies the attribute table with a WKT column that contains very long strings defining the geometries. The default clipboard target used in LO is HTML.

In my test (using LO 7.5.2.2, QGIS 3.22.16, Ubuntu 20.04, the default paste target), pasting locks up LO at 100% of one core for more than 6 minutes until content is pasted.
Opening the resulting ~5-MB ODS takes the same amount of time.
Once the data is in LO (207 rows of data + headings, which matches data in QGIS), working with the data frequently freezes LO again for anything between a few seconds to several minutes, 1 core at 100%, making it impossible to work with it. (For example: select all, align to top: frozen for more than 20 minutes at 100% of 1 core, 2.4 GB of memory used, then crashed.)

In LO, the resulting WKT column contains strings of up to 184,545 characters. Rows have very large heights, which might be related to the slowdowns.


Workarounds:

- in QGIS: Settings > Options > Data sources > Copy features as: plain text, no geometry. Only the attributes are pasted (but no spatial data is lost).
- or, in LO: Edit > Paste Special > Paste unformatted text. WKT is included and the paste is near-instant, but data is _silently_ truncated to 65,535 characters per cell. So there is a data loss issue here.


Expected result:

If the number of characters per cell is too much to handle, LO should paste HTML tables similarly to text file imports: gracefully truncating the data to a sensible limit _while warning the user that data will be lost_.
But it looks like LO _can_ have more than 65,535 characters per cell, and this example paste has the same order of magnitude, so why is it struggling so much?
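
To make the expected behaviour concrete, here is a rough sketch of "truncate but tell the user". It is not how LO implements anything: `truncate_rows` and `ASSUMED_CELL_LIMIT` are hypothetical names, and the limit is only the figure observed with the unformatted-text paste.

```python
import warnings

# Assumed limit, for illustration only.
ASSUMED_CELL_LIMIT = 65_535


def truncate_rows(rows, limit=ASSUMED_CELL_LIMIT):
    """Cut every cell to `limit` characters and warn if anything was cut."""
    truncated = 0
    result = []
    for row in rows:
        new_row = []
        for cell in row:
            if len(cell) > limit:
                cell = cell[:limit]
                truncated += 1
            new_row.append(cell)
        result.append(new_row)
    if truncated:
        # The point of the request: never drop data without saying so.
        warnings.warn(
            f"{truncated} cell(s) exceeded {limit} characters and were truncated"
        )
    return result, truncated


rows = [["a" * 10, "b"], ["c" * 3, "d" * 7]]
clipped, count = truncate_rows(rows, limit=5)
print(count, clipped)
```

Returning the count alongside the data lets the caller decide how to surface the warning, for example in a dialog like the one shown for text file imports.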