Bug 81403 - FILESAVE: Problems with LibreOffice Writer's HTML output
Summary: FILESAVE: Problems with LibreOffice Writer's HTML output
Status: RESOLVED INVALID
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.2.5.2 release (earliest affected)
Hardware: Other Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: BSA
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-16 01:26 UTC by Tracy Chu
Modified: 2015-09-04 03:01 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
The spelling variation file (11.29 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2014-07-16 01:26 UTC, Tracy Chu
LibreOffice HTML (4.37 KB, text/html)
2014-12-09 20:42 UTC, Robinson Tryon (qubit)
Word HTML (5.75 KB, text/html)
2014-12-09 20:42 UTC, Robinson Tryon (qubit)

Description Tracy Chu 2014-07-16 01:26:12 UTC
Created attachment 102882 [details]
The spelling variation file

Problem description: 

LibreOffice's Times New Roman text looks more "squished" when it is pasted as HTML, while the tables in Writer are too wide and are difficult to change.


Steps to reproduce:
1. .... Copy.. and paste...
2. ....
3. ....


Example: 

On this page: http://legendofgalacticheroes.blogspot.com/p/name-variations.html

The top two paragraphs were copied and pasted from Word. The bottom of the page, where it says:

>I know the correct spelling of the last name should be Lohengrin, but it’s not a matter of changing one letter, and this is the name of the main character I grew up with, so I’m leaving it.

Is copied and pasted from LibreOffice. I tried making the spacing wider or the paragraph wider, but as you can see on the blog, it's still squished. 

Example 2: 

The tables shown on that same page are fairly wide. That's after I spent a lot of time modifying the height of the table. The tables copied and pasted from Word look like the ones on this page:

http://legendofgalacticheroes.blogspot.com/p/notes.html



Problem 3: 

When I opened up the Word files in LibreOffice, there were a lot of spaces that were "grey". I could go over and delete the grey marks and replace them with spaces, but it was weird.

Problem 4: 

The table I talked about in Example 2 was originally done in Word. When I opened up the file in LibreOffice, all the row heights were fixed at double-spaced, and copied and pasted as double-spaced. It was ridiculous. LibreOffice won't let me change the heights so I had to redo that entire table and delete the original table, and after struggling for hours, I ended up with what was shown in Example 2.


SORRY ABOUT ALL OF THIS! Thank you for your help. 

Operating System: Windows 8
Version: 4.2.5.2 release
Comment 1 Yousuf Philips (jay) (retired) 2014-07-16 03:20:45 UTC
Hi Tracy,

Thanks for submitting the bug. Unfortunately, it's only possible to deal with one bug per bug report, so let's focus on the first one.

So I opened the file you attached, copied its contents from Word 2013 into LibreOffice 4.2.5, and when I pasted it as RTF it came in perfectly. Then I saved the pasted data as an HTML file and opened that, and the text wasn't squashed.

If I pasted it as HTML without comments, I noticed the grey spaces you mentioned in problem 3. Those are non-breaking spaces, which you can create by pressing Ctrl+Shift+Space. They are displayed by default as a visual aid, but you can easily tell LibreOffice not to show them (go to Tools > Options > LibreOffice Writer > Formatting Aids and uncheck non-breaking spaces).
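
To check whether a piece of text contains non-breaking spaces outside of Writer, a small script can count them and swap them for ordinary spaces. This is only a rough sketch: the helper names `count_nbsp` and `replace_nbsp` and the sample string are made up for illustration and are not part of LibreOffice.

```python
# Unicode NO-BREAK SPACE (U+00A0); in HTML it is usually written as &nbsp;
NBSP = "\u00a0"


def count_nbsp(text: str) -> int:
    """Return how many non-breaking spaces the text contains."""
    return text.count(NBSP)


def replace_nbsp(text: str) -> str:
    """Return the text with every non-breaking space replaced by a plain space."""
    return text.replace(NBSP, " ")


# Hypothetical sample: three non-breaking spaces followed by one ordinary space.
sample = "Chapter One\u00a0\u00a0\u00a0 Eternal Night"
print(count_nbsp(sample))
print(replace_nbsp(sample))
```

Replacing the characters changes how the text can wrap at line ends, so it is probably safer to work on a copy of the text.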

Please let me know if this all works on your side.
Comment 2 Tracy Chu 2014-07-17 07:06:51 UTC
Not really. I had to edit Ch 1 of my blog today and when I tried to paste it back, the fonts and all the paragraphs were squished again, even though I did not touch any of the formatting, paragraph size, or line width. It was the same Word file that I saved from before. I simply changed a few words, but when it was pasted back, it was different. 


Here's what it looks like when it's copied and pasted from LibreOffice:

https://lh6.googleusercontent.com/-3SN_g4ah_Pc/U8d1XbQpsOI/AAAAAAAAEmw/b3QOSMi8mfQ/w964-h402-no/LibreOffice+Text.jpg

Here's the exact same text unadulterated by whatever LibreOffice was doing:

https://lh4.googleusercontent.com/-y3Vn56AHY0Q/U8d1XoSP-PI/AAAAAAAAEm0/_ThiXOuL0jk/w964-h508-no/Word+Text.jpg


As you can see, the two are absolutely not the same.
Comment 3 Tracy Chu 2014-07-17 07:07:38 UTC
In fact, I just noticed that my spaces before paragraphs were all gone too....
Comment 4 Tracy Chu 2014-07-17 07:24:04 UTC
If this helps, this is the HTML when copied and pasted from LibreOffice:

<div align="center" style="line-height: 100%; margin-bottom: 0in;">
<span style="font-family: Consolas, serif;"><span style="font-size: 10pt;"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">Chapter
One</span></span><span style="color: black;"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">
     </span></span></span><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">Eternal
Night</span></span></span></span></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<br /></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<br /></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<br /></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<span style="font-family: Consolas, serif;"><span style="font-size: 10pt;"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">	I</span></span></span></span></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<br /></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<span style="font-family: Consolas, serif;"><span style="font-size: 10pt;"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">	The
moment Galactic Imperial Fleet captain, Siegfried Kircheis, stepped
onto the bridge, he stopped thoughtfully. Countless specks of light
inlaid the abyss of the universe, and they enveloped Siegfried’s
body with an overwhelming sense of infinity. </span></span></span></span>
</div>
<div style="line-height: 100%; margin-bottom: 0in;">
<br /></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<span style="font-family: Consolas, serif;"><span style="font-size: 10pt;"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">	“………”</span></span></span></span></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<br /></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<span style="font-family: Consolas, serif;"><span style="font-size: 10pt;"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">	It
was as if his entire being was floating in the boundless darkness,
but this illusion disappeared quickly. The bridge of Flagship
</span></span><span style="color: blue;"><span lang="zxx"><u><a href="http://gineipaedia.com/wiki/Reinhard_von_M%C3%BCsel#The_Br.C3.BCnhild"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">Brünhild</span></span></a></u></span></span><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">
was shaped in a giant hemisphere. The hemisphere’s spherical part
was the bridge’s upper half, and it was covered with a single
screen resembling a transparent piece of glass that allowed one to
clearly observe the universe outside. </span></span></span></span>
</div>
<div style="line-height: 100%; margin-bottom: 0in;">
<span style="font-family: Consolas, serif;"><span style="font-size: 10pt;"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">	</span></span></span></span></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<span style="font-family: Consolas, serif;"><span style="font-size: 10pt;"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">	After
his momentary sensibility subsided, Kircheis re-inspected his
surroundings. Within this spacious room, the lighting system
controlled the brightness to produce a thin layer of darkness. 
Numerous screens both large and small, consoles, gauges, computers,
and communication devises, etc, were arranged in an orderly geometric
pattern. People walked back and forth, and the movements of their
heads, arms, and legs made it easy for one to imagine schools of fish
riding along with the currents. </span></span></span></span>
</div>
<div style="line-height: 100%; margin-bottom: 0in;">
<br /></div>
<div style="line-height: 100%; margin-bottom: 0in;">
<span style="font-family: Consolas, serif;"><span style="font-size: 10pt;"><span style="font-family: Times New Roman, serif;"><span style="font-size: 12pt;">	A
hint of odor stimulated Kircheis’s nostrils. It was the scent of
adrenaline produced by nervous people under a state of fight or
flight, mixed with the electronic odor that machines emitted in the
recycled oxygen. It was a scent that spacemen found to be most
familiar.&nbsp;</span></span></span></span>
</div>
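
For anyone comparing pasted markup like the above, the inline styles can be tallied with a short script. The following is a minimal sketch using Python's standard `html.parser` module; the names `StyleCollector` and `collect_styles` are illustrative, and the sample fragment is abridged from the markup above. It only lists the declarations, such as the `line-height: 100%` set on every `<div>`; whether any of them causes the squished appearance is not established here.

```python
from collections import Counter
from html.parser import HTMLParser


class StyleCollector(HTMLParser):
    """Count every `property: value` pair found in inline style attributes."""

    def __init__(self):
        super().__init__()
        self.declarations = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name != "style" or not value:
                continue
            # An inline style is a semicolon-separated list of declarations.
            for declaration in value.split(";"):
                if ":" not in declaration:
                    continue
                prop, _, val = declaration.partition(":")
                self.declarations[(prop.strip().lower(), val.strip())] += 1


def collect_styles(html: str) -> Counter:
    """Parse an HTML fragment and return its inline style declarations."""
    parser = StyleCollector()
    parser.feed(html)
    parser.close()
    return parser.declarations


# Abridged fragment in the shape of the pasted markup.
fragment = (
    '<div style="line-height: 100%; margin-bottom: 0in;">'
    '<span style="font-family: Consolas, serif;">'
    '<span style="font-size: 10pt;">I</span></span></div>'
)

for (prop, val), count in sorted(collect_styles(fragment).items()):
    print(f"{prop}: {val} (x{count})")
```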
Comment 5 Tracy Chu 2014-07-17 07:24:49 UTC
This is the HTML when it's copied and pasted from Word:

<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<div style="margin-bottom: .0001pt; margin: 0in;">
<i>Legend of Galactic Heroes,<span class="apple-converted-space">&nbsp;</span></i><i>Part 1 – Dawn<o:p></o:p></i></div>
<div class="MsoPlainText" style="text-indent: 22.5pt;">
<br /></div>
<div class="MsoPlainText" style="text-indent: 22.5pt;">
<br /></div>
<div class="MsoPlainText" style="text-indent: 22.5pt;">
<br /></div>
<div class="MsoPlainText" style="text-indent: 22.5pt;">
<br /></div>
<div class="MsoPlainText" style="text-indent: 22.5pt;">
<br /></div>
<div align="center" class="MsoPlainText" style="text-align: center;">
<span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">Chapter One&nbsp;&nbsp;&nbsp; Eternal Night<o:p></o:p></span></div>
<div class="MsoPlainText" style="text-indent: 22.5pt;">
<br /></div>
<div class="MsoPlainText">
<br /></div>
<div class="MsoPlainText">
<br /></div>
<div class="MsoPlainText">
<span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; I<o:p></o:p></span></div>
<div class="MsoPlainText">
<br /></div>
<div class="MsoPlainText">
<span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The
moment Galactic Imperial Fleet Captain Siegfried Kircheis stepped onto the
bridge, he stopped thoughtfully. Countless specks of light inlaid the abyss of
the universe, and they enveloped Siegfried’s body with an overwhelming sense of
infinity. <o:p></o:p></span></div>
<div class="MsoPlainText">
<br /></div>
<!--more--><br />
<div class="MsoPlainText">
<span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; “………”<o:p></o:p></span></div>
<div class="MsoPlainText">
<br /></div>
<div class="MsoPlainText">
<span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; It
was as if his entire being was floating in the boundless darkness, but this
illusion disappeared quickly. The bridge of Flagship </span><a href="http://gineipaedia.com/wiki/Reinhard_von_M%C3%BCsel#The_Br.C3.BCnhild"><span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">Brünhild</span></a><span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;"> was shaped in a giant hemisphere. The
hemisphere’s spherical part was the bridge’s upper half, and it was covered
with a single screen resembling a transparent piece of glass that allowed one
to clearly observe the universe outside. <o:p></o:p></span></div>
<div class="MsoPlainText">
<span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <o:p></o:p></span></div>
<div class="MsoPlainText">
<span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; After
his momentary sensibility subsided, Kircheis re-inspected his surroundings.
Within the spacious room, the lighting system controlled the brightness to
produce a thin layer of darkness. &nbsp;Numerous
screens both large and small, consoles, gauges, computers, and communication
devises, etc, were arranged in an orderly geometric pattern. People walked back
and forth, and the movement of their heads, arms, and legs made it easy for one
to imagine schools of fish riding along with the currents. <o:p></o:p></span></div>
<div class="MsoPlainText">
<br /></div>
<div class="MsoPlainText">
<span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A
hint of odor stimulated Kircheis’s nostrils. It was the scent of adrenaline
produced by nervous people under a state of fight or flight mixed with
electronic odor machines emitted in the recycled oxygen. It was a scent the spacemen
found most familiar. <o:p></o:p></span></div>
<div class="MsoPlainText">
<br /></div>
<div class="MsoPlainText">
<span style="font-family: &quot;Times New Roman&quot;,&quot;serif&quot;; font-size: 12.0pt; mso-fareast-font-family: MingLiU;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The
red haired young man strode toward the center of the bridge. Although he was
given the rank of Captain, Kircheis was not yet 21 years old. He, without his
military uniform, was still the “tall, handsome, red-haired chap” in the eyes
of the logistics women spacemen. Sometimes, he also felt uneasy that his age
was disproportionate with his rank. He still could not nonchalantly accept the
fact the way his superior did. <o:p></o:p></span></div>
<div class="MsoPlainText">
Comment 6 Yousuf Philips (jay) (retired) 2014-07-17 10:11:22 UTC
Hi Tracy,

I'm a bit confused about this. Are you copying the HTML code from your website and then loading it into LibreOffice, or are you loading a docx file in LibreOffice?
Comment 7 Tracy Chu 2014-07-17 16:08:18 UTC
Jay, I suggested no such thing. 

I typed up the entire file in Word. Copied and pasted it into blogger, and I get this, where everything is perfectly normal:

https://lh4.googleusercontent.com/-y3Vn56AHY0Q/U8d1XoSP-PI/AAAAAAAAEm0/_ThiXOuL0jk/w964-h508-no/Word+Text.jpg

I open up the exact same docx file in LibreOffice, change a few words, copy and paste it back into blogger, and I get this:

https://lh6.googleusercontent.com/-3SN_g4ah_Pc/U8d1XbQpsOI/AAAAAAAAEmw/b3QOSMi8mfQ/w964-h402-no/LibreOffice+Text.jpg

I just pasted the html code to show that even though it is the exact same document, when it's copied and pasted from Office, its html code is different from when it's copied and pasted from LibreOffice.
Comment 8 Buovjaga 2014-11-26 09:25:24 UTC
Should be UNCONFIRMED.
Comment 9 Robinson Tryon (qubit) 2014-12-09 20:40:07 UTC
(In reply to Tracy Chu from comment #7)
> I typed up the entire file in Word. Copied and pasted it into blogger, and I
> get this, where everything is perfectly normal:

So we're talking about copying from inside Word to the clipboard, then into some web-form on blogger.com, right?

> I just pasted the html code to show that even though it is the exact same
> document, when it's copied and pasted from Office, its html code is
> different from when it's copied and pasted from LibreOffice.

HTML export from LibreOffice is likely to be different from HTML export from Word, as they're two separate products. There are two pieces to consider here:
1) How closely does the HTML markup mirror the markup in the word processor?
2) How closely do we want to emulate Word's output?
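
Question 1 can be made a little more concrete by separating content from markup: if the visible text of both exports is identical, the remaining differences are purely in tags and styles. Below is a minimal sketch, assuming Python's standard `html.parser` module; `TextExtractor`, `visible_text`, and the two sample fragments (abridged from comment 4 and comment 5) are illustrative and not part of either product.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the character data of an HTML fragment, ignoring all tags."""

    def __init__(self):
        # convert_charrefs defaults to True, so &nbsp; arrives as U+00A0.
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the fragment's text with all whitespace runs collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts).replace("\u00a0", " ")
    return re.sub(r"\s+", " ", text).strip()


# Abridged from the Writer paste (comment 4).
writer_html = (
    '<div align="center" style="line-height: 100%; margin-bottom: 0in;">'
    '<span style="font-size: 12pt;">Chapter\nOne</span>'
    '<span style="color: black;">\n     </span>'
    '<span style="font-size: 12pt;">Eternal\nNight</span></div>'
)

# Abridged from the Word paste (comment 5).
word_html = (
    '<div align="center" class="MsoPlainText" style="text-align: center;">'
    '<span style="font-size: 12.0pt;">'
    "Chapter One&nbsp;&nbsp;&nbsp; Eternal Night<o:p></o:p></span></div>"
)

print(visible_text(writer_html) == visible_text(word_html))
```

Collapsing whitespace deliberately hides spacing differences such as the run of non-breaking spaces in the Word paste, so this comparison only says whether the words match, not whether the layout does.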
Comment 10 Robinson Tryon (qubit) 2014-12-09 20:41:36 UTC
(In reply to Tracy Chu from comment #2)
> Here's what it looks like when it's copied and pasted from LibreOffice:
> 
> https://lh6.googleusercontent.com/-3SN_g4ah_Pc/U8d1XbQpsOI/AAAAAAAAEmw/
> b3QOSMi8mfQ/w964-h402-no/LibreOffice+Text.jpg
> 
> Here's the exact same text unadulterated by whatever LibreOffice was doing:
> 
> https://lh4.googleusercontent.com/-y3Vn56AHY0Q/U8d1XoSP-PI/AAAAAAAAEm0/
> _ThiXOuL0jk/w964-h508-no/Word+Text.jpg

Those URLs don't work for me, so I'll attach HTML files containing the HTML provided in comment 4 and comment 5.
Comment 11 Robinson Tryon (qubit) 2014-12-09 20:42:29 UTC
Created attachment 110645 [details]
LibreOffice HTML
Comment 12 Robinson Tryon (qubit) 2014-12-09 20:42:44 UTC
Created attachment 110646 [details]
Word HTML
Comment 13 Robinson Tryon (qubit) 2015-01-17 10:10:54 UTC
(In reply to Robinson Tryon (qubit) from comment #9)
> HTML export from LibreOffice is likely to be different from HTML export from
> Word, as they're two separate products. There are two pieces to consider
> here:
> 1) How closely does the HTML markup mirror the markup in the word processor?
> 2) How closely do we want to emulate Word's output?

Tracy: How closely would you expect the HTML markup exported from Writer to match the markup exported from Word?

Status -> NEEDINFO

(please change status back to 'UNCONFIRMED' after you reply. Thanks!)
Comment 14 QA Administrators 2015-07-18 17:36:19 UTC
Dear Bug Submitter,

This bug has been in NEEDINFO status with no change for at least 6 months. Please provide the requested information as soon as possible and mark the bug as UNCONFIRMED. Due to regular bug tracker maintenance, if the bug is still in NEEDINFO status with no change in 30 days the QA team will close the bug as INVALID due to lack of needed information.

For more information about our NEEDINFO policy please read the wiki located here: 
https://wiki.documentfoundation.org/QA/FDO/NEEDINFO

If you have already provided the requested information, please mark the bug as UNCONFIRMED so that the QA team knows that the bug is ready to be confirmed.


Thank you for helping us make LibreOffice even better for everyone!


Warm Regards,
QA Team

This NEEDINFO message was generated on: 2015-07-18
Comment 15 QA Administrators 2015-09-04 03:01:37 UTC
Dear Bug Submitter,

Please read this message in its entirety before proceeding.

Your bug report is being closed as INVALID due to inactivity and a lack of information which is needed in order to accurately reproduce and confirm the problem. We encourage you to retest your bug against the latest release. If the issue is still present in the latest stable release, we need the following information (please ignore any that you've already provided):

a) Provide details of your system including your operating system and the latest version of LibreOffice in which you have confirmed the bug to be present

b) Provide easy to reproduce steps – the simpler the better

c) Provide any test case(s) which will help us confirm the problem

d) Provide screenshots of the problem if you think it might help

e) Read all comments and provide any requested information

Once all of this is done, please set the bug back to UNCONFIRMED and we will attempt to reproduce the issue. 
Please do not:
a) respond via email 
b) update the version field in the bug or any of the other details on the top section of FDO
Message generated on: 2015-09-03