Created attachment 186269 [details] File with styles for image and text placement I open the attached file and Writer loses the formatting. The images are not properly placed or scaled and the text is not centered. I haven't checked whether the hyperlinks are intact.
Created attachment 186271 [details] Layout in browser This is how the HTML file displays in Firefox. The images are contained in ../Images, a sibling directory to the one containing the HTML.
Created attachment 186272 [details] Layout in Writer This is how the HTML file appears when I open it in LibreOffice.
Thank you for reporting the bug. One question: Is it an html-document from a website (can you paste address here) or is it an odt-file saved as html (could you please attach odt-file in this case)? => NEEDINFO
This is a hand-crafted HTML file that I wanted to convert to LibreOffice Writer. The three attachments are: 1. The HTML file. 2. A screenshot of Firefox viewing the file with the file schema. 3. A screenshot of LibreOffice Writer after importing the HTML file.
I confirm that LO opens the file incorrectly, but I'm not familiar enough with HTML to decide whether it is a bug in LO Writer or whether there is something wrong with the HTML code. I hope that somebody else can help.
LibreOffice is not a web browser. LibreOffice's document model can not be mapped to the layout capabilities of browsers.
(In reply to Buovjaga from comment #6) > LibreOffice is not a web browser. LibreOffice's document model can not be > mapped to the layout capabilities of browsers. I agree, but what does it mean for this report?
(In reply to Dieter from comment #7) > (In reply to Buovjaga from comment #6) > > LibreOffice is not a web browser. LibreOffice's document model can not be > > mapped to the layout capabilities of browsers. > > I agree, but what does it mean for this report? It means the request in this report can't happen unless we get the ambition to do web layout in Writer. I don't know how that would work, but everything is possible.
I can confirm that this behavior is still present in Version: 24.8.0.0.alpha0+ (X86_64) / LibreOffice Community Build ID: a2265e8faa099d9652efd12392c2877c2df1d1eb CPU threads: 8; OS: Windows 10.0 Build 19045; UI render: default; VCL: win Locale: en-US (en_US); UI: en-US Calc: threaded and Version: 24.2.1.2 (X86_64) / LibreOffice Community Build ID: db4def46b0453cc22e2d0305797cf981b68ef5ac CPU threads: 8; OS: Windows 10.0 Build 19045; UI render: default; VCL: win Locale: en-US (en_US); UI: en-US Calc: threaded
The behavior is confirmed. LO isn't designed for creating web pages or browsing through web pages. So this might be an enhancement request.
(In reply to Robert Großkopf from comment #10) > The behavior is confirmed. LO isn't designed for creating web pages or > browsing through web pages. So this might be an enhancement request. Then we should ask design team, if they think LibreOffice should include a web browser competitive with the major players and also to adapt our document model completely according to the requirements.
I think key is the anchoring of images that we do 'To Character', by default. At least after opening Writer with the document as parameter (loading via Ctrl+O stalls infinitely; and opening from the start center runs the document in Writer Web) and manually switching the anchor from 'As Character' as it always is (obviously ignoring tools > options > writer > formatting aids > image anchor) brings text next to the image. Ultimately we will not reach pixel-perfect representation, as browsers with different engines works differently. How about using a table in the HTML sources? (In reply to Buovjaga from comment #11) > Then we should ask design team, if they think LibreOffice should include a > web browser competitive with the major players and also to adapt our > document model completely according to the requirements. We shouldn't include a complete browser. But since we are proud to filter data from almost any source, we need to support HTML too. At least the basic features.
Review the see also list for discussion of what would be needed to move forward from the LO filter support of HTML4.0 transitional and CSS 1.1 styling, to make LO relevant for current W3C/WHATWG web standards. There are ongoing suggestions to remove the Writer Web module. And to instead improve the import filters for importing to Writer, Draw, Impress, or Calc as ODF only documents. With corresponding export filter work to render ODF back to web content. The utility of LibreOffice as an HTML4 editor continues to degrade. So there is nothing actionable here for the HTML of the OP (and here we simply don't handle the class= positioning of the embedded css on filter import). IMHO => NAB and => WF for any effort to address this single issue. Dev's with UX-advise agreement should decide what to do with the Writer Web module.
(In reply to Heiko Tietze from comment #12) > I think key is the anchoring of images that we do 'To Character', by > default. At least after opening Writer with the document as parameter > (loading via Ctrl+O stalls infinitely; and opening from the start center > runs the document in Writer Web) and manually switching the anchor from 'As > Character' as it always is (obviously ignoring tools > options > writer > > formatting aids > image anchor) brings text next to the image. > > Ultimately we will not reach pixel-perfect representation, as browsers with > different engines works differently. How about using a table in the HTML > sources? If you look at the source document, it uses position:absolute, floats, display:inline, percentage widths, max-width, margins. All of these are specified in the CSS standard to work and play together in a certain way.
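To check which of the layout features listed above a given HTML file relies on before attempting an import, something like the following minimal sketch could be used. The function name `find_layout_css` and the regular expressions are hypothetical and only approximate CSS parsing; whether Writer's import filter maps any individual property is not established by this report.

```python
import re

# Properties named in the comment above. The patterns are rough approximations,
# not a CSS parser; treat any hit as "worth a look", not as a diagnosis.
LAYOUT_PATTERNS = {
    "position:absolute": r"position\s*:\s*absolute",
    "float": r"float\s*:\s*(left|right)",
    "display:inline": r"display\s*:\s*inline(?![-\w])",
    "percentage width": r"(?<![-\w])width\s*:\s*\d+(\.\d+)?%",
    "max-width": r"max-width\s*:",
    "margin": r"margin(-\w+)?\s*:",
}

def find_layout_css(html: str) -> list[str]:
    """Return the names of the layout features whose pattern occurs in html."""
    found = []
    for name, pattern in LAYOUT_PATTERNS.items():
        if re.search(pattern, html, flags=re.IGNORECASE):
            found.append(name)
    return found

if __name__ == "__main__":
    sample = "<style>.a{position: absolute; float:left; max-width:80%}</style>"
    print(find_layout_css(sample))
```

A file that reports several of these features at once is probably a candidate for the simpler, table-based markup suggested earlier in the thread.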
(In reply to Buovjaga from comment #14) > If you look at the source document, it uses position:absolute, floats, > display:inline, percentage widths, max-width, margins. All of these are > specified in the CSS standard to work and play together in a certain way. Just to clarify as there was a misunderstanding in a chat channel, I think this report should be closed as wontfix due to being unrealistic.
WONTFIX also makes sense to me. > The utility of LibreOffice as an HTML4 editor continues to degrade. yes – > There are ongoing suggestions to remove the Writer Web module. > And to instead improve the import filters for importing to Writer, > Draw, Impress, or Calc as ODF only documents. Makes sense to me and seems to be easier to handle UX-wise, too, since it would be an import/export, just like the other formats.
All comments vote for WF. On the one hand we want to support as many formats as possible and do have support for HTML, but on the other we surely cannot catch up with Internet browsers. We might be able to improve in some areas, but likely not in the case of complex layouts. So the recommendation is to create a simpler document that LibreOffice can load rather than spending a lot of effort.
Writer cannot even import the plainest of plain HTML (see the attached), which is rendered perfectly OK, even with Word XP, dating back from 2001. However, what's worse, it actually opens the link, which in this case is harmless, but might, in other scenarios, pull in all kinds of nasty stuff. If you want to compete with this other office suite, you need to at least render such simple HTML correctly!
Created attachment 196061 [details] Writer cannot even render the simplest of simple HTML...
(In reply to robert from comment #18) > Writer cannot even import the plainest of plain HTML... Wouldn't call it simple. But let's check the details. The first line ends with "<snip...> <b>To</b> " followed by "| +-- <...snip>". I see no line break such as <BR>, <P>, or #13. What exactly should the HTML filter accept as line break (or IOW what is the standard here)?
(In reply to Heiko Tietze from comment #20) > (In reply to robert from comment #18) > > Writer cannot even import the plainest of plain HTML... > Wouldn't call it simple. But let's check the details. > > The first line ends with "<snip...> <b>To</b> " > followed by "| +-- <...snip>". I see no line break such as <BR>, <P>, or > #13. What exactly should the HTML filter accept as line break (or IOW what > is the standard here)? It seems the import filter doesn't know anything about "white-space". With the value "pre", all spaces are preserved and runs of spaces are not collapsed into one. A carriage return or \n is then also accepted as a line break. The import filter also ignores display:none.
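The difference between the two `white-space` values discussed here can be sketched as follows. This is a rough approximation of the CSS rules for `normal` and `pre` only, not a description of what Writer's filter or any browser engine actually executes; the function name `rendered_text` is made up for illustration.

```python
import re

def rendered_text(text: str, white_space: str = "normal") -> str:
    """Approximate the effect of CSS white-space for 'normal' and 'pre' only."""
    if white_space == "pre":
        # Spaces and line breaks are preserved; CRLF and CR are normalized to LF.
        return text.replace("\r\n", "\n").replace("\r", "\n")
    # 'normal': runs of spaces, tabs and line breaks collapse to a single space.
    return re.sub(r"[ \t\r\n]+", " ", text).strip()

if __name__ == "__main__":
    sample = "From    To\r\n|  +-- item"
    print(repr(rendered_text(sample, "normal")))
    print(repr(rendered_text(sample, "pre")))
```

With `normal`, the line break in the sample disappears and the text ends up on a single line, which matches what is seen in Writer after import.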
The behavior was confirmed, but this is a clear => WF, both for the OP and for the later comments, where attachment 196061 [details] needs to be edited for line ends prior to import. Other than WF here, there are two paths forward: either enhance the Writer Web module and its filters to parse current CSS and HTML5 and write out viable CSS/HTML5 content, or *dump* the Writer Web module and its HTML4 Transitional implementation completely and instead refactor the filters to import Web documents into the core modules, implicitly converting them to native ODF. Export as/print to HTML/XHTML would then handle publication, just like PDF and EPub.
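Editing the line ends prior to import could be scripted along these lines. This is a minimal sketch, assuming it is applied only to the text content that relies on `white-space: pre` and not to the whole file; `line_ends_to_br` is a hypothetical helper, and runs of spaces would still need separate handling.

```python
import re

def line_ends_to_br(text: str) -> str:
    """Replace each CRLF, CR, or LF with an explicit <br> followed by a newline."""
    return re.sub(r"\r\n|\r|\n", "<br>\n", text)

if __name__ == "__main__":
    block = "<b>From</b> <b>To</b>\r\n| +-- item"
    print(line_ends_to_br(block))
```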