Bug 107677 - FILESAVE HTML: Comments are added after exporting to HTML
Summary: FILESAVE HTML: Comments are added after exporting to HTML
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 4.3.0.4 release (earliest affected)
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard: target:5.4.0
Keywords: bibisected, bisected, difficultyInteresting, regression
Depends on:
Blocks:
 
Reported: 2017-05-07 12:11 UTC by Telesto
Modified: 2017-05-10 15:51 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
Example file (8.96 KB, application/3dr)
2017-05-07 12:11 UTC, Telesto
Description Telesto 2017-05-07 12:11:38 UTC
Description:
LibO opens the exported HTML file with multiple comments. This is quite a small example.

Steps to Reproduce:
1. Open the attached file
2. Save as HTML
3. Open the HTML file

Actual Results:  
HTML file contains comments

Expected Results:
There should be no comments


Reproducible: Always

User Profile Reset: No

Additional Info:
Found in
Version: 5.4.0.0.alpha1+
Build ID: 274ecb49b70b3f01d47546e3b44317946c106042
CPU threads: 4; OS: Windows 6.2; UI render: default; 
TinderBox: Win-x86@62-TDF, Branch:MASTER, Time: 2017-05-05_22:45:07
Locale: nl-BE (nl_NL); Calc: single

and in
LibO4.3

but not in:
Version: 4.2.0.4
Build ID: 05dceb5d363845f2cf968344d7adab8dcfb2ba71


User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
Comment 1 Telesto 2017-05-07 12:11:58 UTC
Created attachment 133122
Example file
Comment 2 Buovjaga 2017-05-07 16:46:31 UTC
Reproduced.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.4.0.0.alpha1+
Build ID: 6e4cba99bb35e6697b94309eedd1a08ebea2dc68
CPU threads: 8; OS: Linux 4.10; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on May 5th 2016
Comment 3 Xisco Faulí 2017-05-10 11:34:48 UTC
Regression introduced by:

author	Jan Holesovsky <kendy@collabora.com>	2014-06-06 19:52:48 (GMT)
committer	Jan Holesovsky <kendy@collabora.com>	2014-06-06 19:55:06 (GMT)
commit f7d51f43deda5e28df63f1b8e168e84838d0d0b4 (patch)
tree 4741929dfee04031a3c986b6f1204d92c6aa8c45
parent c2034f3993ab0e6f4550f3059b3229bc319b6b56 (diff)
html export: More standard time specification in <meta/>.

Bisected with bibisect-44max.

Adding Cc: to Jan Holesovsky
Comment 4 Jan Holesovsky 2017-05-10 13:36:52 UTC
Heh, I don't think this is a regression :-)  The functionality that inserts a comment for meta tags was there previously, so you'd see it too for pages that specify the time correctly.

It is only that LibreOffice previously ignored its own malformed meta tags; once the export was fixed, they started appearing as comments on re-import.

Having said that - I can imagine that we could actually read the meta tags correctly, instead of showing them as comments :-)  Let's turn this into an easy hack:

The actual postit note is added in SwHTMLParser::ParseMoreMetaOptions() in sw/source/filter/html/swhtml.cxx, see SwPostItField aPostItField etc. at the very end of that method.

Instead of this, we want to have an 'if' for OOO_STRING_SVTOOLS_HTML_META_created and OOO_STRING_SVTOOLS_HTML_META_changed, and handle them similarly to how HtmlMeta::Created and HtmlMeta::Changed are handled in svtools/source/svhtml/parhtml.cxx, see HTMLParser::ParseMetaOptionsImpl().
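
To illustrate the idea only (this is not the LibreOffice code, which is C++), here is a rough standalone Python sketch: "created" and "changed" meta tags are parsed as dates, and only the remaining named meta tags fall through to become notes. The names parse_meta_datetime and MetaCollector are made up for this example, and the two accepted value formats (ISO 8601, and a legacy "YYYYMMDD;HHMMSSCC" form) are assumptions about what the export writes, not something taken from the sources.

```python
from datetime import datetime
from html.parser import HTMLParser


def parse_meta_datetime(content):
    """Return a datetime for an ISO 8601 or legacy 'date;time' value, else None."""
    content = content.strip()
    # Assumed legacy form: YYYYMMDD;HHMMSSCC (hypothetical, see the lead-in).
    if content.count(";") == 1:
        date_part, time_part = content.split(";")
        digits_ok = date_part.isdigit() and time_part.isdigit()
        if digits_ok and len(date_part) == 8 and len(time_part) >= 6:
            try:
                # Only hours, minutes and seconds are used; the rest is dropped.
                return datetime.strptime(date_part + time_part[:6], "%Y%m%d%H%M%S")
            except ValueError:
                return None
        return None
    # ISO 8601 form, e.g. 2014-06-06T19:52:48. Older Python versions accept
    # only a subset of ISO 8601 here.
    try:
        return datetime.fromisoformat(content)
    except ValueError:
        return None


class MetaCollector(HTMLParser):
    """Collect date properties and leftover notes from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.properties = {}  # "created" / "changed" -> datetime
        self.notes = []       # (name, content) pairs that would become comments

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content") or ""
        if name in ("created", "changed"):
            stamp = parse_meta_datetime(content)
            if stamp is not None:
                # Understood as a document property: no note is created.
                self.properties[name] = stamp
                return
        if name:
            # Anything not understood still ends up as a note.
            self.notes.append((name, content))
```

With something like this, feeding a page whose "created" tag holds an ISO 8601 value fills the properties and leaves the notes list empty for that tag, which is the behaviour we want on re-import.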
Comment 5 Jan Holesovsky 2017-05-10 13:43:06 UTC
Heh - now I read the HTMLParser::ParseMetaOptionsImpl better, and indeed it's a regression, sorry :-(
Comment 6 Commit Notification 2017-05-10 15:49:56 UTC
Jan Holesovsky committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=bb0543f6de926a2d89797162a974bb01772d890d

tdf#107677 html import: Import ISO8601 datetime in html meta tags too.

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Commit Notification 2017-05-10 15:50:32 UTC
Jan Holesovsky committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c43ee7161ab57a22b53655b7b8d6e20ce1a70daa

related tdf#107677 html import: Fix the legacy datetime format reading.

It will be available in 5.4.0.

Comment 8 Jan Holesovsky 2017-05-10 15:51:55 UTC
Fixed now :-)