Bug 99249 - CALCULATION WRONG FROM HTML TO TEXT. Problem detecting numbers when copying from html and pasting in Calc
Summary: CALCULATION WRONG FROM HTML TO TEXT. Problem detecting numbers when copying ...
Status: CLOSED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
5.1.0.3 release
Hardware: x86-64 (AMD64) Windows (All)
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: needsDevAdvice
Depends on:
Blocks:
 
Reported: 2016-04-12 17:33 UTC by John Shabanowitz
Modified: 2017-05-29 13:11 UTC (History)
2 users (show)

See Also:
Crash report or crash signature:


Attachments
Zip file of summary html page and folder of supporting files for problem (569.82 KB, application/x-zip-compressed)
2016-05-03 18:53 UTC, John Shabanowitz
Details
simplified html file (919 bytes, text/html)
2017-04-03 19:30 UTC, Carlos
Details

Note You need to log in before you can comment on or make changes to this bug.
Description John Shabanowitz 2016-04-12 17:33:10 UTC
Copy a list of accounts and totals from a web page. Paste into Calc as unformatted text. Sum the column of numbers. Calculation is wrong. I did this numerous times. I don't know if it is me or Calc 5.0.1. Paste the html into a text editor and then copy paste into Calc and the calculations are correct.
Comment 1 Julien Nabet 2016-04-12 19:05:54 UTC
If you use 5.1 branch, could you give a try to last 5.1.2?
Also, could you give a step by step process including the website you used?
Comment 2 John Shabanowitz 2016-04-24 00:12:37 UTC
As you requested, I tried the same thing in LO 5.1.2.2. The bug does not seem to apply. The website was https://banking2.capitalone.com. I don't want to publicly post any data files. 

The step by step process was to open and sign into the bank website. Open a Calc sheet to receive the data. First select and copy data from website. Click on LO, then click in first cell A1. Right click and select Paste Special. Choose Unformatted Text from the two choices, Unformatted Text and HTML Formatted Text.
Comment 3 Buovjaga 2016-05-02 15:17:30 UTC
I don't know, if this can be said to be the fault of Calc's. Websites can add any crap they like in the HTML markup and it might mess things up.

If you can't share the HTML, we will never know what the problem is.

You might copy the HTML and replace any sensitive data. It could just be a couple of rows so we see the structure.

Set to NEEDINFO.
Change back to UNCONFIRMED after you have provided the data.
Comment 4 John Shabanowitz 2016-05-03 18:53:06 UTC
Created attachment 124827 [details]
Zip file of summary html page and folder of supporting files for problem

This is the html page the bank serves. Sensitive information has been "REDACTED" or set to $0.00 value. The supporting folder is images, java script files, etc.
Comment 5 Buovjaga 2016-05-04 09:12:11 UTC
Something is not right with LibreOffice. I have "Detect special numbers" checkbox ticked when doing a special paste, yet the amounts are not detected as Currency format type.
The same happens with a simple CSV:
$1,00;$2,00;juu
$1,00;$2,00;juu

It gives them the default format. Right-click - Format cells - Numbers to check.

The same behavior is already in version 3.5 for me. Yet, the web is full of people saying Detect special numbers solved their currency import problem.

You could do a mass change of the format after pasting/importing, but that always adds an apostrophe so it reads '$1,00. So you would have to get rid of that as well.

Here is an extension that might help, I didn't test: http://extensions.libreoffice.org/extension-center/ct2n-convert-text-to-number-and-dates

This needs more testers and insight.
Comment 6 Carlos 2017-04-03 19:30:52 UTC
Created attachment 132316 [details]
simplified html file
Comment 7 Carlos 2017-04-03 19:43:19 UTC
* copy data from the html page. 
* click in first cell A1. 
* right click and select Paste Special. 
* choose Unformatted Text
* "Detect special numbers" checkbox is NOT ticked
RESULT: LO does not detect the numbers

* copy data from the html page. 
* click in first cell A1. 
* right click and select Paste Special. 
* choose Unformatted Text
* "Detect special numbers" checkbox IS ticked
RESULT: LO detects the numbers


* copy data from the html page. 
* click in first cell A1. 
* right click and select Paste Special. 
* choose HTML format
RESULT: LO detects the numbers


* copy data from the html page. 
* paste it in a text editor
* copy the text from the text editor
* click in first cell A1. 
* right click and select Paste
* Unformatted Text is automatically detected
* "Detect special numbers" checkbox IS ticked
RESULT: LO detects the numbers



* copy data from the html page. 
* paste it in a text editor
* copy the text from the text editor
* click in first cell A1. 
* right click and select Paste
* Unformatted Text is automatically detected
* "Detect special numbers" checkbox is NOT ticked
RESULT: LO does not detect the numbers
Comment 8 Eike Rathke 2017-05-29 13:10:10 UTC
Works as designed. The "number" is preceded with a dollar sign and contains a decimal separator. To be imported as a numeric number it has to be interpreted as English-US and "Detect special numbers" has to be checked.