Bug 138226 - Poor and slow HTML importation - vertically stretched elements
Summary: Poor and slow HTML importation - vertically stretched elements
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.0.2.2 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-11-15 05:51 UTC by Luke Kendall
Modified: 2022-04-11 17:38 UTC

See Also:
Crash report or crash signature:


Attachments
Sample document (44.38 KB, application/vnd.oasis.opendocument.text)
2020-11-15 05:51 UTC, Luke Kendall
Result with LO 7.1.0.0beta1 (96.58 KB, application/vnd.oasis.opendocument.text)
2020-12-09 08:13 UTC, Dieter

Description Luke Kendall 2020-11-15 05:51:50 UTC
Created attachment 167308 [details]
Sample document

When I select all the contents of https://www.mailerlite.com/pricing and paste into Writer (wanting to resize to fit onto a single sheet of paper), I find that the result would require an hour of editing to fix.
See attached document.

The conversion process is also slow: it took between 30 seconds and a minute for the contents to appear.

In contrast, I found that if I pasted the same contents into Thunderbird's rich text editor, the HTML appeared to be a very faithful copy of the web page, and also appeared and could be edited within a few seconds.

The only real error I noticed was the vastly stretched X marks on some of the rows, but that made the document too much work to fix.

Hope this is helpful.
Comment 1 Dieter 2020-12-09 08:13:17 UTC
Created attachment 167994 [details]
Result with LO 7.1.0.0beta1

I can't confirm it with

Version: 7.0.3.1 (x64)
Build ID: d7547858d014d4cf69878db179d326fc3483e082
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-GB
Calc: threaded

Steps I did:
1. I opened https://www.mailerlite.com/pricing
2. Ctrl+A
3. Ctrl+C
4. Open a new document in Writer
5. Paste => HTML format without comments


Could you please try to reproduce it with a master build from http://dev-builds.libreoffice.org/daily/master/current.html ? You can install it alongside the standard version. I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' if the bug is still present in the master build.
Comment 2 Dieter 2020-12-09 08:16:10 UTC
The previous test used LO 7.0.3.1, but I get the same result with

Version: 7.1.0.0.beta1 (x64)
Build ID: 828a45a14a0b954e0e539f5a9a10ca31c81d8f53
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-GB
Calc: threaded

Sorry for the confusion.
Comment 3 Luke Kendall 2020-12-10 04:15:07 UTC
I'm on a tight deadline right now, so I don't have time to test with the daily build.
But I just confirmed that the same problem still occurs for me in 7.0.3.1:

Version: 7.0.3.1
Build ID: 00(Build:1)
CPU threads: 4; OS: Linux 5.8; UI render: default; VCL: gtk3
Locale: en-GB (en_AU.UTF-8); UI: en-US
Ubuntu package version: 1:7.0.3-0ubuntu0.20.10.1
Calc: threaded

I wonder if it's a Linux-only problem?
Comment 4 starb1585 2020-12-14 02:11:50 UTC
(In reply to Dieter from comment #1)
> Created attachment 167994 [details]
> Result with LO 7.1.0.0beta1

I can confirm that it's doing the same thing on the master build from
http://dev-builds.libreoffice.org/daily/master/current.html

Steps I did:
1. I opened https://www.mailerlite.com/pricing
2. Copy and paste into Writer
3. Result: distortion on the X
Comment 5 Buovjaga 2021-11-25 15:06:10 UTC
I get a different result, which does not include any images or styles. Has the behaviour changed completely? Could everyone please check again?

Arch Linux 64-bit
Version: 7.2.2.2 / LibreOffice Community
Build ID: 20(Build:2)
CPU threads: 8; OS: Linux 5.14; UI render: default; VCL: kf5 (cairo+xcb)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
7.2.2-2
Calc: threaded
Comment 6 Timur 2022-04-11 10:16:18 UTC
No problem for me either, on Linux or Windows. I also get no images, but the page has none.
I used a simple paste.
This report didn't have a result and didn't specify whether the reporter was behind a proxy.
So I set NEEDINFO.
Comment 7 Luke Kendall 2022-04-11 17:36:10 UTC
No, it was not behind a proxy.

The only images were the green ticks and the red crosses.

In the older version of Writer, the X image (or character?) was stretched vertically to about ten times the size of the tick mark, making each row enormously tall. The problematic result is still visible in the sample document.

The version of Writer I still have installed on my old computer (7.3.0.3) now has no problem pasting the contents of that URL, except that browsers no longer select the ticks and crosses. I suspect the site has changed its HTML since I reported the bug, so no ticks or crosses are included in the copied content, and that page is no longer a good test example.

However, copying and pasting from another randomly chosen page with images and HTML (https://growthlab.com/how-to-self-publish-a-book-and-double-revenue/) seems to work fine, so maybe the problem has since been fixed.
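
For anyone re-testing with a different page: one quick way to check whether copied HTML still contains image elements (such as the ticks and crosses), and what sizes they declare, is to save the HTML and list its <img> tags. The following is only a minimal sketch in Python using the standard library. The function name list_images and the sample markup are made up for illustration; they are not taken from the MailerLite page or from LibreOffice.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src, width and height attributes of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <img ... />.
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append(
                {
                    "src": attr_map.get("src"),
                    "width": attr_map.get("width"),
                    "height": attr_map.get("height"),
                }
            )


def list_images(html_text):
    """Return one dict per <img> tag found in html_text (hypothetical helper)."""
    parser = ImageCollector()
    parser.feed(html_text)
    parser.close()
    return parser.images


if __name__ == "__main__":
    # Made-up sample markup; replace with the HTML saved from the page under test.
    sample = (
        '<table><tr>'
        '<td><img src="tick.png" width="16" height="16"></td>'
        '<td><img src="cross.png"></td>'
        '</tr></table>'
    )
    for image in list_images(sample):
        print(image)
```

An empty list would suggest the copied HTML carries no images at all, which would match the "no ticks or crosses" observation above. Missing or unusual width and height values would be a place to start looking if stretched images show up again.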