Bug 53089 - Numbers from external URL are always interpreted as text
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.6.0.2 rc (earliest affected)
Hardware: All All
Importance: high normal
Assignee: Kohei Yoshida
URL:
Whiteboard: target:3.7.0 target:3.6.1
Keywords: regression
Depends on:
Blocks: mab3.6
Reported: 2012-08-03 09:40 UTC by Frank Richter
Modified: 2012-08-16 17:32 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
simple document showing this bug. (10.05 KB, application/vnd.oasis.opendocument.spreadsheet)
2012-08-03 09:40 UTC, Frank Richter
reformated html file for linking to external data (5.12 KB, text/html)
2012-08-09 16:17 UTC, Jean-Baptiste Faure

Description Frank Richter 2012-08-03 09:40:22 UTC
Created attachment 65080 [details]
simple document showing this bug.

When inserting a link to external data (HTTP URL), numbers in an HTML-formatted table are always interpreted as text, not as numbers. This differs from earlier versions of LibreOffice (up to version 3.5) and from OpenOffice.
As a result, all calculations with this data yield zero.
A workaround is to wrap every reference in VALUE(...), but that requires a lot of changes in a lot of tables here.
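
A minimal sketch of the symptom and the workaround, assuming that a function like SUM skips text cells and that VALUE() converts numeric-looking text to a number. The names `cell_sum` and `value` and the sample cells are hypothetical stand-ins, not Calc code:

```python
def cell_sum(cells):
    """Add up a range roughly like SUM: only numeric cells count, text is skipped."""
    return sum(c for c in cells if isinstance(c, (int, float)))


def value(cell):
    """Rough stand-in for VALUE(): turn numeric-looking text into a number."""
    if isinstance(cell, (int, float)):
        return cell
    return float(cell.strip())


# Hypothetical cells as they arrive from the linked HTML table in 3.6: all text.
imported = ["12.5", "7", "30.25"]

print(cell_sum(imported))                      # 0
print(cell_sum([value(c) for c in imported]))  # 49.75
```

Every formula that touches the imported range needs this kind of wrapping, which is why the workaround does not scale.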

Checking "Detect special numbers (such as dates)" in the "Import Options" dialog makes no difference.
Comment 1 Jean-Baptiste Faure 2012-08-04 17:45:51 UTC
LO says: impossible to update the link ...
Perhaps a cookie problem?

Best regards. JBF
Comment 2 Frank Richter 2012-08-06 11:11:15 UTC
The linked HTML document is static; no cookie handling is involved.
I'll recheck this from home this evening.

Regards, Frank.
Comment 3 Frank Richter 2012-08-06 23:52:02 UTC
I just rechecked updating the linked data; it works like a charm. You can get the file at http://www.ergora.eu/data/libreoffice3.6-external_data_bug.html
Nothing complicated and no API is involved; any other simple HTML table should show this bug as well.

Regards, Frank.
Comment 4 Jean-Baptiste Faure 2012-08-09 15:25:51 UTC
(In reply to comment #2)
> The linked HTML document ist static, no cookie handling involved.
> I'll recheck this today evening from home.

Sorry, I ran my test with my own build of LO 3.6. I can link the spreadsheet to your external data when I use the official build of LO 3.6.0.4.

I can reproduce the behavior you describe: numbers are imported as text, unlike in LO 3.5.5. So this is a regression.

Best regards. JBF
Comment 5 Jean-Baptiste Faure 2012-08-09 15:53:52 UTC
I did my test on Ubuntu 11.10 x86_64.

Best regards. JBF
Comment 6 Jean-Baptiste Faure 2012-08-09 16:17:32 UTC
Created attachment 65351 [details]
reformated html file for linking to external data

I experimented with a copy of your HTML file; here is a new version of it. I only opened it with LO 3.6.0.4, changed the format of the numbers (starting from column 4) from text to number, and saved it under a new name.
Using this HTML file as the external data source now works as expected.

Can you try with my file?

Best regards. JBF
Comment 7 Rainer Bielefeld Retired 2012-08-09 16:56:02 UTC
[Reproducible] with Server Installation of "LibreOffice 3.6.0.4 German UI/Locale [Build-ID: 932b512]" on German WIN7 Home Premium (64bit)

Steps to reproduce:
1. Open a new spreadsheet document, click cell A1
2. Menu 'Insert -> Link to external Data'
3. Copy/paste the URL from Comment 3 into the "URL of external Source" pane
4. <Enter>
   > Import options dialog appears
5. Language: Automatic
6. Check "Detect Numbers" (does not matter), <ok>
7. In the next dialog select "HTML_tables", <ok>
   > Contents appear in A1:L4
8. Menu 'View -> Value Highlighting'
   Expected: all contents that look like numbers are shown as numbers, in
             blue
   Actual: everything is black; what looks like numbers is text.

Still [Reproducible] with parallel installation of Master "LOdev 3.7.0.0.alpha0+ - WIN7 Home Premium (64bit) ENGLISH UI [Build ID: 66e4540]" (tinderbox: Win-x86@6, pull time 2012-07-26 02:09:47)

*Already* [Reproducible] with Server Installation of "LibreOffice 3.6.0.2 rc German UI/Locale [Build-ID: 815c576]" on German WIN7 Home Premium (64bit)

Was *still OK* with
- 3.6.0.0.beta3
- Server installation of Master "LOdev 3.6.0alpha0+  – WIN7 Home Premium (64bit) ENGLISH UI [Build ID: a502549]" (tinderbox: Win-x86@6-fast, pull time 2012-05-31 07:33:55)
- MinGW Build 2012-04-26

@Spreadsheet Team
Please set Status to ASSIGNED and add yourself to "Assigned To" if you accept this bug, or forward the bug if it's not your turf (and remove the other team members from CC).
Comment 8 Rainer Bielefeld Retired 2012-08-09 17:01:37 UTC
@Jean-Baptiste Faure
I reproduced this from scratch, so I did not do any further tests.
Comment 9 Frank Richter 2012-08-10 07:03:35 UTC
(In reply to comment #6)
> Created attachment 65351 [details]
> reformated html file for linking to external data
> 
> I played with a copy of your html file. Here is a new version of this file. I
> only opened it with LO 3.6.0.4, changed the format of numbers (starting to
> column 4) from text to number and saved under a new name.
> Now using this html file as external data provider, works as expected.
> 
> Can you try with my file ?
> 
> Best regards. JBF

With your file LibreOffice works as expected. Your HTML table contains the non-standard attributes SDVAL= and SDNUM=. I assume these are inserted by LibreOffice and cause LO to read the numbers as numbers again when the file is used as an external data source.
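
To check whether a given HTML file carries these attributes, something like the following sketch could be used. It relies only on Python's standard `html.parser`; the class name `SdvalCollector` and the sample markup (including the SDNUM value) are made up for illustration and are not taken from either attachment:

```python
from html.parser import HTMLParser


class SdvalCollector(HTMLParser):
    """Collect (sdval attribute, displayed text) for every <td> cell."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_td = False
        self._sdval = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            # HTMLParser lowercases attribute names, so SDVAL arrives as "sdval".
            self._sdval = dict(attrs).get("sdval")
            self._text = []

    def handle_data(self, data):
        if self._in_td:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_td:
            self.cells.append((self._sdval, "".join(self._text).strip()))
            self._in_td = False


# Hypothetical markup: one plain cell and one cell annotated with SDVAL/SDNUM.
sample = (
    '<table><tr><td>Item</td>'
    '<td SDVAL="1234.5" SDNUM="1033;">1,234.50</td></tr></table>'
)

collector = SdvalCollector()
collector.feed(sample)
collector.close()
print(collector.cells)  # [(None, 'Item'), ('1234.5', '1,234.50')]
```

Cells that report `None` for the attribute have only their displayed text to go on, which is the case the regression affects.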
Comment 10 Jean-Baptiste Faure 2012-08-10 20:57:50 UTC
Here is an example of a real website whose numbers are imported as text in LO 3.6 and as numbers in LO 3.5. It may be useful for testing.
http://www.insee.fr/fr/themes/conjoncture/serie_revalorisation.asp

Best regards. JBF
Comment 11 Yi Ding 2012-08-10 21:58:11 UTC
This sounds a lot like this bug: https://bugs.freedesktop.org/show_bug.cgi?id=52205
Comment 12 Eike Rathke 2012-08-11 09:55:01 UTC
@Kohei:
This seems to have the same cause as bug 52205, though the fix isn't as straightforward. See sc/source/filter/rtf/eeimpars.cxx line 334, in ScEEImport::WriteToDocument(): the if(bSimple) case with mbSetTextCellFormat=true, which is used for RTF and HTML import.
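
A rough model of the behaviour in question, written in Python rather than the actual C++. It assumes that a "force text" flag (standing in for mbSetTextCellFormat) makes every imported cell text, and that one possible remedy is to exempt strings that parse as valid numbers. All function names here are hypothetical, and `float()` is only a crude substitute for Calc's locale-aware number detection:

```python
def parse_number(raw):
    """Return the numeric value of raw, or None if it is not a plain number."""
    try:
        return float(raw)
    except ValueError:
        return None


def import_cell_forced(raw, force_text=True):
    """Modelled 3.6.0 behaviour: with the flag set, everything becomes text."""
    if force_text:
        return ("text", raw)
    number = parse_number(raw)
    return ("number", number) if number is not None else ("text", raw)


def import_cell_exempting_numbers(raw, force_text=True):
    """Possible remedy: the flag no longer affects valid numbers."""
    number = parse_number(raw)
    if number is not None:
        return ("number", number)
    # Anything else is kept verbatim as text, whether or not the flag is set.
    return ("text", raw)


print(import_cell_forced("42.5"))             # ('text', '42.5')
print(import_cell_exempting_numbers("42.5"))  # ('number', 42.5)
print(import_cell_exempting_numbers("n/a"))   # ('text', 'n/a')
```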
Comment 13 Kohei Yoshida 2012-08-11 14:17:16 UTC
I'll take a look at this on Monday.
Comment 14 Kohei Yoshida 2012-08-13 17:25:25 UTC
When fixing (or verifying) this bug, check Bug 52205 as well.
Comment 15 Kohei Yoshida 2012-08-13 17:27:56 UTC
And Bug 43109 too.
Comment 16 Not Assigned 2012-08-13 18:24:30 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=51f1fc69aa539dec8035195b98e0b128026388e9

fdo#53089: Avoid setting valid numbers as text during html import.
Comment 17 Not Assigned 2012-08-14 11:05:34 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=22571981cdae59bce508dfd2af4c873aa216d885&g=libreoffice-3-6

fdo#53089: Avoid setting valid numbers as text during html import.


It will be available in LibreOffice 3.6.1.
Comment 18 Kohei Yoshida 2012-08-16 17:32:33 UTC
Fixed.