| Summary: | A showcase of HTML import, editing and export bugs in an HTML5 era | ||
|---|---|---|---|
| Product: | LibreOffice | Reporter: | Eyal Rozenberg <eyalroz1> |
| Component: | Writer | Assignee: | Not Assigned <libreoffice-bugs> |
| Status: | REOPENED --- | ||
| Severity: | normal | CC: | heiko.tietze, michael.warner.ut+libreoffice, quikee, vmiklos, vsfoote |
| Priority: | medium | ||
| Version: | 7.0.0.0.beta1+ | ||
| Hardware: | All | ||
| OS: | All | ||
| See Also: | https://bugs.documentfoundation.org/show_bug.cgi?id=101179 https://bugs.documentfoundation.org/show_bug.cgi?id=95861 https://bugs.documentfoundation.org/show_bug.cgi?id=154434 | ||
| Whiteboard: | |||
| Crash report or crash signature: | Regression By: | ||
| Bug Depends on: | |||
| Bug Blocks: | 108799, 111951 | ||
| Attachments: | The page saved using "Save Page WE" to a single HTML; Screenshot 01 - HTML meta tags as comments; Screenshot 02 - Layout in LO Writer vs a browser; Screenshot 03 - Three copies of same image (zoom 80%) | ||
|
Description
Eyal Rozenberg
2020-07-29 15:12:42 UTC
Created attachment 163738
The page saved using "Save Page WE" to a single HTML
You can save this attachment instead of following instructions 1 through 5.

Created attachment 163739
Screenshot 01 - HTML meta tags as comments

Created attachment 163740
Screenshot 02 - Layout in LO Writer vs a browser

Created attachment 163741
Screenshot 03 - Three copies of same image (zoom 80%)
This is issue (9.). You'll note three different "copies" of the same image/chart, with different degrees of blurriness.
Note that it's not impossible that the three images exist within the .HTML file - but even if they do, they overlap, and some sort of JavaScript magic ensures that the crisp one shows and the others don't.
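
One way to check whether the saved file really contains several copies of the same image is to list its `<img>` elements and count repeated sources. The following is only a minimal sketch using Python's standard `html.parser`; the sample markup and file names are hypothetical and are not taken from the attachment. Note that "Save Page WE" may embed images as long `data:` URIs, so the printed source is truncated.

```python
from collections import Counter
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> element encountered."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <img ... /> tags.
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.sources.append(src)


def count_image_sources(html_text):
    """Return a Counter mapping each <img> src value to its number of uses."""
    collector = ImageCollector()
    collector.feed(html_text)
    collector.close()
    return Counter(collector.sources)


# Hypothetical sample markup, for illustration only.
sample = """
<p><img src="chart.png" width="300"><img src="chart.png" width="600"></p>
<p><img src="logo.png"></p>
"""

for src, uses in count_image_sources(sample).items():
    # Truncate long sources such as embedded data: URIs.
    print(uses, src[:60])
```

If the same source shows up more than once, the copies are present in the HTML itself and the question becomes why LibreOffice displays all of them; if not, the extra copies are produced during import.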
Some of the issues I brought up regard the LO UI, so adding needsUXEval

Work on the Writer Web module ended at HTML 4.0 Transitional--while not officially deprecated the feature is essentially abandoned.

Import and Export (save to HTML) works reasonably well for inline CSS2 HTML 4.0 markup--that is it.

The default import filter mode for opening a .HTML document with LibreOffice is into the Writer Web module, into its 'Web' (un-paged view). I can not confirm reported issue of import opening to Writer Web 'Normal' (i.e. page view).

Clear your user profile to defaults to resolve.

The CSS of the js based HTML5/CSS3 web page linked is simply not renderable, and excess content/meta is filter import captured as comments. The Writer Web mode allows those spurious (to HTML 4.0) comments to be toggled off--or better to simply delete in bulk from the HTML file.

Point is this is as good as it gets, and we have bug 95861 open to consider work to make the Writer Web module HTML5 and CSS3 aware if not functional.

With some devs opining it would be better to drop the Writer Web module completely and only filter import to Writer, and export to styled XHTML.

*** This bug has been marked as a duplicate of bug 95861 ***

(In reply to V Stuart Foote from comment #6)

You've made several points in your comment; but I'll begin by stressing that this bug is not a duplicate of 95861. That bug regards HTML5 and CSS3, like you yourself said; but this bug has nothing in particular to do with CSS3. While it's quite possible that the HTML I attached has some CSS3-specific selectors or attributes - most of the issues listed here have nothing to do with that. The appearance of the document may involve mis-handling or non-handling of CSS3, but I'm not even sure that's the case; and again - it's 2 out of 10 issues.

It's important, IMHO, not to "kill" this bug as a dupe exactly because it showcases many issues at once.

Oh, also - IIANM, the HTML itself in the attached document is plain-vanilla. Nothing beyond HTML 4.0 and probably earlier.

> Work on the Writer Web module ended at HTML 4.0 Transitional--while not
> officially deprecated the feature is essentially abandoned.

I'm not sure I see why this is relevant. Bugs are bugs. If the feature was experimental, or unavailable by default etc. then it might be argued that bugs should not be reported and addressed. I understand that nobody is springing into action to fix this, and that is ok (well, maybe).

> Import and Export (save to HTML) works reasonably well for inline CSS2 HTML
> 4.0 markup--that is it.

First note that this issue is not merely about the importation and the exportation but also about what LO does with what's been imported.

Having said that - import and export doesn't work reasonably well in some cases. There are significant issues - as I have demonstrated. That is another reason why it is inappropriate to close this bug.

> The default import filter mode for opening a .HTML document with LibreOffice
> is into the Writer Web module, into its 'Web' (un-paged view). I can not
> confirm reported issue of import opening to Writer Web 'Normal' (i.e. page
> view).

I'll try to get others to confirm.

> Clear your user profile to defaults to resolve.

I've never cleaned my LO user profile before. I'll try it and report the result.

> The CSS of the js based HTML5/CSS3 web page linked is simply not renderable,

The web page is not "JS-based"; and it is quite renderable. In fact, its script elements are mostly empty. The URIs are actually not in src= attributes but in data-savepage-src attributes. And if you delete the script tags, you still get basically the same rendering in a browser and the same mis-rendering in LibreOffice.

> and excess content/meta is filter import captured as comments.

... which is a bug, or several bugs, as I've described.

> Point is this is as good as it gets

With respect - that is unacceptable. That is, you are of course under no personal obligation to fix things, but LO's current handling of HTML documents is not nearly what it should be, and there is no reason to lower users' expectations to the current state of the implementation.

> , and we have bug 95861 open to consider
> work to make the Writer Web module HTML5 and CSS3 aware if not functional.

It's possible that work on that may help with some of the issues here, but probably at most the two issues which may be caused by the lack of CSS3 support. Possibly not even those.

> With some devs opining it would be better to drop the Writer Web module
> completely and only filter import to Writer, and export to styled XHTML.

Only 2 of the issues I've reported regard saving the edited file. And they too are valid issues, I believe, while writing HTML files is supported. Also, are you certain that saving this document to XHTML would yield reasonable output? I am somewhat doubtful.

Please don't forget to add the keyword needsUXEval when CC'ing libreoffice-ux-advise.

Would rather drop HTML support than putting effort in. After 10 (or 20 years) there are better suited tools and HTML/CSS develops so fast that we never catch up. But anyway, nothing to discuss for UX.

(In reply to Heiko Tietze from comment #9)

> But anyway, nothing to discuss for UX.

Actually, several points are UX/UI relevant:

* "Real-estate" distribution of comments - generally and for comments which are the result of an imported piece of text not placed in the body of the document.
* Named-author vs no-named-author comments - why should the latter say "no author" rather than not saying anything?
* Undated comments - why don't we have them?
* Possibility of hiding comment authors/dates, manually or automatically when we have many comments.
* The many-comments scrolling mechanism
* The in-comment-balloon scroll bar when it's super-small (see screenshot 1)

All of these issues are not really about HTML support.

> Would rather drop HTML support than putting effort in. After 10 (or 20
> years) there are better suited tools

If there's a tool which takes an HTML and produces an ODT, maybe we can just use it / its code for importing? :-|

> and HTML/CSS develops so fast that we never catch up.

That's fair enough, but this bug is really not about catching up to new fancy CSS. The example is not the ACID test...

We have several tickets about a large number of comments, eg. bug 38295.

(In reply to Eyal Rozenberg from comment #10)

> If there's a tool which takes an HTML and produces an ODT, maybe we can just
> use it / its code for importing? :-|

There is pandoc (https://pandoc.org/) which claims to support both html5 and odt, but I have not used it myself, so I have no idea how well it works.

(In reply to Michael Warner from comment #12)

> There is pandoc (https://pandoc.org/) which claims to support both html5 and
> odt, but I have not used it myself, so I have no idea how well it works.

Perhaps someone closer to LO development than I am might want to open an issue about exploring the possibility of integrating some pandoc import-filter+ODT-output-filter pairs into LO. |
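
For anyone who wants to try the pandoc route mentioned above outside of LibreOffice, a conversion can be scripted roughly as follows. This is only a sketch: it assumes pandoc is installed and on the PATH, the file names are hypothetical, and nobody in this thread has verified how well pandoc copes with the attached page.

```python
import shutil
import subprocess


def build_pandoc_command(html_path, odt_path):
    """Build the pandoc argument list for an HTML-to-ODT conversion."""
    return [
        "pandoc",
        "--from", "html",   # input format
        "--to", "odt",      # output format
        "--output", odt_path,
        html_path,
    ]


def convert_html_to_odt(html_path, odt_path):
    """Run pandoc; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(build_pandoc_command(html_path, odt_path), check=True)


# Hypothetical usage; "saved_page.html" stands in for the attached file:
# convert_html_to_odt("saved_page.html", "saved_page.odt")
```

Opening the resulting ODT in Writer and comparing it with the direct HTML import would give a first impression of whether such a filter pair is worth exploring.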