Bug 91985 - FILEOPEN: Link external data to html file with <PRE> in table prevent correct table load
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 4.4.3.2 release (earliest affected)
Hardware: x86-64 (AMD64) All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Calc-External-Datalink
Reported: 2015-06-10 15:40 UTC by Jo Bobit
Modified: 2018-09-28 15:31 UTC

See Also:
Crash report or crash signature:


Attachments
Simple example of unsupported HTML page (530 bytes, text/html)
2015-06-10 15:40 UTC, Jo Bobit
printscreen (14.15 KB, image/png)
2015-06-12 12:58 UTC, raal
Printscreen LO 4.4.3 (256.68 KB, image/jpeg)
2015-06-18 12:45 UTC, Jo Bobit
Example of HTML output from data server (530 bytes, text/html)
2015-06-18 14:56 UTC, Jo Bobit
screenshot LO 5.2.1.2 (108.32 KB, image/jpeg)
2016-09-27 15:40 UTC, Jo Bobit

Description Jo Bobit 2015-06-10 15:40:49 UTC
Created attachment 116440 [details]
Simple example of unsupported HTML page

If you try to link to an external data source in an HTML format and the table contains <PRE></PRE> tags in some columns, the table is not loaded correctly.

It looks like the <PRE> tag prevents Calc from detecting the <TD> column tag (see the attached example file). Web browsers such as Firefox or IE have no problem displaying such a table.

The problem first appeared with a URL from a web server, but it can be reproduced with a simple HTML file.
Unfortunately, I can't stop my data source (on the server) from putting the useless <PRE> tag in the page code (it's a commercial product).
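
A possible stopgap, not something confirmed anywhere in this report, would be to save the server output locally and strip the <PRE> tags before linking the cleaned copy in Calc. The sketch below is a minimal illustration in Python; the names `strip_pre_tags` and `clean_file` are hypothetical, and it assumes the <PRE> tags carry no information the sheet needs.

```python
import re
from pathlib import Path

# Matches opening and closing PRE tags in any letter case, with optional attributes.
PRE_TAG = re.compile(r"</?pre\b[^>]*>", re.IGNORECASE)


def strip_pre_tags(html: str) -> str:
    """Remove <PRE> and </PRE> tags while keeping the text between them."""
    return PRE_TAG.sub("", html)


def clean_file(src: str, dst: str) -> None:
    """Write a copy of the HTML file `src` to `dst` without <PRE> tags.

    Both paths are placeholders chosen by the caller.
    """
    text = Path(src).read_text(encoding="utf-8", errors="replace")
    Path(dst).write_text(strip_pre_tags(text), encoding="utf-8")
```

The cleaned copy could then be selected in the "Link to external data" dialog instead of the original. Whitespace that <PRE> would have preserved in a browser is no longer marked as preformatted after this step.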
Comment 1 Buovjaga 2015-06-11 15:57:08 UTC
How do I do this "link to an external data source"?
I tried with DDE function, but I got an error. https://help.libreoffice.org/4.4/Calc/Spreadsheet_Functions#DDE
Comment 2 raal 2015-06-12 12:58:29 UTC
Created attachment 116484 [details]
printscreen

LO 4.4.3, win7
I can load data with Insert->Link to external data

Is the problem the empty rows 6 and 7, or something else? Thanks
Comment 3 raal 2015-06-16 13:56:53 UTC
Please set the status back to Unconfirmed after you answer the questions from comment 1 and comment 2. Thank you.
Comment 4 Jo Bobit 2015-06-18 10:44:52 UTC
Yes, open the attached file with 

   Insert -> Link to external data

select the file after clicking the "..." button, then
select Table_all (or something like that; unfortunately my LO
now crashes at this step with the message "application has stopped
working").

The table loads, but the cell with the <PRE> in the HTML source
(i.e. Dat11 in table "wrong") is not loaded and a shift appears
in the resulting table.

A correctly imported table looks like this:

  +------+------+
  | Col1 | Col2 |
  +------+------+
  | Dat11| Dat12|
  +------+------+
  | Dat21| Dat22|
  +------+------+

The wrongly imported table looks like this:

  +------+------+
  | Col1 | Col2 |
  +------+------+
  | Dat12|      |
  +------+------+
  | Dat21| Dat22|
  +------+------+

As I mentioned above, my LO now crashes shortly after I start it,
so I can't reproduce or test further. I tried to reinstall it (I
use a portable version on Windows) but still have the crash problem.

Sorry for the ASCII art.
Comment 5 Buovjaga 2015-06-18 11:43:41 UTC
In 5.1 the menu item has changed to Sheet->Link to external data.
Language used for import: automatic
From the available tables/ranges, I selected HTML_tables

The only problem with the lower table is that there appears a gap of two empty rows before Dat21 and Dat22. So Dat11 is visible for me.

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: 437210d58f32177ef1829d704f7f4d2f1bbfbfdd
TinderBox: Win-x86@39, Branch:master, Time: 2015-06-18_07:21:56
Locale: fi-FI (fi_FI)
Comment 6 Jo Bobit 2015-06-18 12:45:46 UTC
Created attachment 116623 [details]
Printscreen LO 4.4.3

I finally got my LO working again.
So here is a screenshot of what I get.
Comment 7 Buovjaga 2015-06-18 12:57:28 UTC
(In reply to Jo Bobit from comment #6)
> Created attachment 116623 [details]
> Printscreen LO 4.4.3
> 
> I finally got my LO working again.
> So here is a screenshot of what I get.

So did you do the same as me, select HTML_tables from the list of available tables/ranges?
Comment 8 Jo Bobit 2015-06-18 14:56:43 UTC
Created attachment 116626 [details]
Example of HTML output from data server

Yes, I selected HTML_tables to get the result shown in the screenshot.
But I just realized I used a slightly different HTML source file; the
<PRE> tag was around the <TD></TD>, not inside it.

Attached is the "real" output of my data server (with <PRE> around
<TD></TD>).
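
For anyone without access to the attachments, the two variants described here can be approximated with a small generator. This is only a sketch in Python: the real attachment contents are not reproduced in this report, so the table layout, the `id` attribute, and the output file names are assumptions based on the Col1/Col2 example from comment 4.

```python
from pathlib import Path

# Cell values taken from the example table in comment 4.
ROWS = [("Dat11", "Dat12"), ("Dat21", "Dat22")]


def build_table(table_id: str, pre_around_td: bool) -> str:
    """Return an HTML table whose first data cell (Dat11) involves a <PRE> tag.

    pre_around_td=False -> <TD><PRE>Dat11</PRE></TD>  (PRE inside the cell)
    pre_around_td=True  -> <PRE><TD>Dat11</TD></PRE>  (PRE wrapped around the cell)
    """
    lines = [f'<TABLE id="{table_id}" border="1">',
             "<TR><TH>Col1</TH><TH>Col2</TH></TR>"]
    for index, (first, second) in enumerate(ROWS):
        if index == 0 and pre_around_td:
            first_cell = f"<PRE><TD>{first}</TD></PRE>"
        elif index == 0:
            first_cell = f"<TD><PRE>{first}</PRE></TD>"
        else:
            first_cell = f"<TD>{first}</TD>"
        lines.append(f"<TR>{first_cell}<TD>{second}</TD></TR>")
    lines.append("</TABLE>")
    return "\n".join(lines)


def build_page(pre_around_td: bool) -> str:
    """Wrap one test table in a minimal HTML page."""
    return "\n".join(["<HTML><BODY>",
                      build_table("wrong", pre_around_td),
                      "</BODY></HTML>"])


if __name__ == "__main__":
    # Hypothetical file names; link each file in Calc to compare the results.
    Path("pre_inside_td.html").write_text(build_page(False), encoding="ascii")
    Path("pre_around_td.html").write_text(build_page(True), encoding="ascii")
```

Linking each generated file in turn should show whether only the "PRE around TD" variant drops Dat11, which is what the comments in this thread suggest.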
Comment 9 Buovjaga 2015-06-18 15:33:48 UTC
(In reply to Jo Bobit from comment #8)
> Created attachment 116626 [details]
> Example of HTML output from data server
> 
> Yes, I selected HTML_tables to get the result shown in the screenshot.
> But I just realized I used a slightly different HTML source file; the
> <PRE> tag was around the <TD></TD>, not inside it.
> 
> Attached is the "real" output of my data server (with <PRE> around
> <TD></TD>).

Ok, with this I do get the same result as you :)

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: 437210d58f32177ef1829d704f7f4d2f1bbfbfdd
TinderBox: Win-x86@39, Branch:master, Time: 2015-06-18_07:21:56
Locale: fi-FI (fi_FI)
Comment 10 QA Administrators 2016-09-20 10:09:46 UTC Comment hidden (obsolete)
Comment 11 Jo Bobit 2016-09-27 15:40:40 UTC
Created attachment 127678 [details]
screenshot LO 5.2.1.2

I tested with LO 5.2.1.2 Portable (see screenshot added) and the bug is still there.

I used Sheet->Link to External Data... pointed to the Example of HTML file
and selected automatic language detection, then HTML_all.

I got the same result (missing Dat11 in the second table).


OS Name	Microsoft Windows 7 Professional
Version	6.1.7601 Service Pack 1 Build 7601
System Type	x64-based PC
Comment 12 QA Administrators 2018-08-29 02:42:06 UTC Comment hidden (obsolete)
Comment 13 Jo Bobit 2018-09-28 13:03:29 UTC
I tested the problem again with version 6.1.0 and it has changed a bit:
I can no longer link the file at all. When I try to link the test
file, the dialog lets me choose the file but does not detect any table in
the file, and the "OK" button stays disabled. So it's impossible to tell
whether the bug caused by the PRE is still present.

LibreOffice used:

Version: 6.1.0.3
Build ID: efb621ed25068d70781dc026f7e9c5187a4decd1
CPU threads: 8; OS: Windows 6.1; UI render: default; 
Locale: en-US (fr_CH); Calc: CL
Comment 14 Buovjaga 2018-09-28 15:31:46 UTC
(In reply to Jo Bobit from comment #11)
> Created attachment 127678 [details]
> screenshot LO 5.2.1.2
> 
> I tested with LO 5.2.1.2 Portable (see screenshot added) and the bug is
> still there.
> 
> I used Sheet->Link to External Data... pointed to the Example of HTML file
> and selected automatic language detection, then HTML_all.
> 
> I got the same result (missing Dat11 in the second table).

I still repro with attachment 116626 [details]

Arch Linux 64-bit
Version: 6.1.1.2
Build ID: 6.1.1-1
CPU threads: 8; OS: Linux 4.18; UI render: default; VCL: gtk3_kde5; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group threaded

Arch Linux 64-bit
Version: 6.2.0.0.alpha0+
Build ID: d357ea1d1ff95cb5ce2ee6b4828afa2484707256
CPU threads: 8; OS: Linux 4.18; UI render: default; VCL: gtk3_kde5; 
Locale: fi-FI (fi_FI.UTF-8); Calc: threaded
Built on 28 September 2018