Bug 56772 - FILTER: CALC moves some column content to another cell with some HTML file
Summary: FILTER: CALC moves some column content to another cell with some HTML file
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 3.6.2.2 release
Hardware: Other All
Importance: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: BSA target:4.1.0 target:3.6.7 target:...
Keywords: regression
Depends on:
Blocks:
 
Reported: 2012-11-05 13:12 UTC by MichaelB
Modified: 2013-05-17 09:21 UTC (History)
0 users

See Also:
Crash report or crash signature:


Attachments
Excel file (HTML) that bug on LibreOffice (5.89 KB, application/vnd.ms-excel)
2012-11-05 13:12 UTC, MichaelB

Description MichaelB 2012-11-05 13:12:53 UTC
Created attachment 69562
Excel file (HTML) that bug on LibreOffice

Problem description: 

With an Excel file exported from JasperAnalysis, cell contents are shifted to the left wherever a cell is empty. The document becomes unusable because some values end up in the wrong columns.

This problem appears in LibreOffice 3.5.x and 3.6.x but not in 3.3.x.


Steps to reproduce:
Open my attached file in Excel and then in LibreOffice. You will see that the numbers do not appear in the same columns. (Excel is correct.)

You can also open it with a text editor, because it is HTML, and see that LibreOffice puts the data in the wrong cells.


Platform (if different from the browser): 
              
Browser: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:16.0) Gecko/20100101 Firefox/16.0

The problem appears on all operating systems.
Comment 1 Markus Mohrhard 2012-11-11 08:04:49 UTC
This is just an HTML file and has nothing to do with Excel.
Comment 2 Markus Mohrhard 2012-11-18 12:46:29 UTC
So after spending one night trying to understand our HTML parser I still have no clue.

The problem is the <td> element without content, which is stripped at some point, either in the editengine or in ScHTMLParser. Sadly, this code is so complex that it is nearly impossible to follow the flow of execution.

I'm quite sure that the problem already occurs during parsing and not later, after the column is set.
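
To illustrate the failure mode described above, here is a minimal sketch in Python. It is not the ScHTMLParser code and makes no claim about how the actual fix works. It only shows the general idea that a table importer has to create a cell for every <td> ON token, even if no text arrives before the matching OFF token, or later values slide to the left. The names TableRows and parse_rows are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect table rows, creating one cell per <td>/<th> ON token.

    Hypothetical illustration only; unrelated to LibreOffice's parser.
    """

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None outside a row
        self._cell = None  # text chunks of the open cell, or None

    def _close_cell(self):
        # Emit the open cell even when it collected no text at all.
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()
            self._row = []
        elif tag in ("td", "th"):
            if self._row is None:
                self._row = []
            self._close_cell()  # implicit close if the OFF token is missing
            self._cell = []     # the ON token alone opens a cell

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        # Text outside any cell is ignored in this sketch.
        if self._cell is not None:
            self._cell.append(data)


def parse_rows(html):
    """Return the table content of html as a list of rows."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

With this approach an empty <td></td> yields an empty string in its own column instead of disappearing, so the values that follow it stay in their original columns.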
Comment 3 Kohei Yoshida 2012-11-21 19:33:59 UTC
Markus, I assume you've confirmed this bug at least?
Comment 4 Kohei Yoshida 2013-01-08 15:48:05 UTC
We have 'regression' keyword. No need to repeat that in the subject.
Comment 5 Eike Rathke 2013-05-16 20:07:10 UTC
Taking.
Comment 6 Commit Notification 2013-05-16 22:01:23 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=bb7360ca9929e9b395b3c903f460c9ed5efdce4d

resolved fdo#56772 keep track of HTML ON/OFF tokens



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 7 Eike Rathke 2013-05-16 22:24:29 UTC
Pending review
for 4-0 as https://gerrit.libreoffice.org/3925
for 3-6 as https://gerrit.libreoffice.org/3926
Comment 8 Commit Notification 2013-05-17 07:31:57 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=9cedf500f2299022641bb774549a6600e1af43d1&h=libreoffice-3-6

resolved fdo#56772 keep track of HTML ON/OFF tokens


It will be available in LibreOffice 3.6.7.

Comment 9 Commit Notification 2013-05-17 07:32:32 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=2d4e570f9d6e6464cf597a1a90e84ccb3b232b5a&h=libreoffice-4-0

resolved fdo#56772 keep track of HTML ON/OFF tokens


It will be available in LibreOffice 4.0.4.
