Created attachment 81056
Starting HTML file

Problem description:

Starting from this HTML text:

<BODY LANG="en-US" DIR="LTR">
<P>This is Black: great.<BR><FONT COLOR="#b84747">This is Colored: why not.</FONT><BR>This is Black again: fine.</P>
</BODY>

changing some text to bold and some capital letters to lowercase produces this ugly code:

<BODY LANG="en-US" DIR="LTR">
<P>This is <B>b</B><B>lack</B>: great.<BR><FONT COLOR="#b84747">This is </FONT><FONT COLOR="#b84747"><B>c</B></FONT><FONT COLOR="#b84747"><B>olored</B></FONT><FONT COLOR="#b84747">: why not.</FONT><BR>This is Black again: fine.</P>
</BODY>

However, LO 3.5 (the best version so far, in my opinion) produced the expected output:

<BODY LANG="en-US" DIR="LTR">
<P><B>This is black</B>: great.<BR><FONT COLOR="#b84747"><B>This is colored</B></FONT><FONT COLOR="#b84747">: why not.</FONT><BR>This is Black again: fine.</P>
</BODY>

I will attach the simple starting HTML file.

Steps to reproduce:
1. Open the starting HTML file with LO Writer/Html.
2. Change "This is black" and "This is colored" to bold.
3. Change "Black" and "Colored" to lowercase.
4. Save to another HTML file.

Do you mind fixing it?

Operating System: All
Version: 4.0.0.3 release
Last worked in: 3.5.0 release
Confirmed under Windows XP x86 using LO 4.0.3.3. I also tested under LO 3.6.5, and the same excessive code is added. Editing under LO 3.5.7 produced clean HTML code, as expected.
This bug is pretty annoying. To avoid it I have to:
1) Remove formatting
2) Save
3) Reload
4) Change to bold
5) Save
6) Reload
7) Change colour
8) Save
or... I have to fix the HTML manually.

In addition, it also affects simpler scenarios (just changing a letter from lowercase to uppercase in a bold/italic section).

I am actually wondering how this bug was not detected before. It would be worth adding a non-regression test for it.
Created attachment 81103
Screenshot from Chrome

I think it's been fixed in LO 4.0.4.2 (Win7 32-bit). Please mark WORKSFORME if you agree.
LO 4.0.4.2 does not solve the problem.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=utf-8">
<TITLE></TITLE>
<META NAME="GENERATOR" CONTENT="LibreOffice 4.0.4.2 (Windows)">
<META NAME="CREATED" CONTENT="20130619;10490201">
<META NAME="CHANGED" CONTENT="20130620;14113793">
</HEAD>
<BODY LANG="en-US" DIR="LTR">
<P><B>This is </B><B>b</B><B>lack</B>: great.<BR><FONT COLOR="#b84747"><B>This is </B></FONT><FONT COLOR="#b84747"><B>c</B></FONT><FONT COLOR="#b84747"><B>olored</B></FONT><FONT COLOR="#b84747">: why not.</FONT><BR>This is black again: fine.</P>
</BODY>
</HTML>
you seriously care about what the HTML code produced by Writer looks like, as opposed to how the HTML document is rendered by Web browsers? and don't mind the many elements that it inserts that no application other than Writer can read? well the problem was obviously introduced by the RSIDs in 3.6 (commit 062eaeffe7cb986255063bb9b0a5f3fb3fc8e34c)
(This is an automated message.) It seems that the commit that caused this regression was identified. (Or at least a commit is suspected as the offending one.) Thus setting keyword "bisected".
(In reply to Michael Stahl from comment #5)
> you seriously care about what the HTML code produced by Writer looks like,
> as opposed to how the HTML document is rendered by Web browsers?

Absolutely. Bad code is bad code.

> and don't mind the many elements that it inserts that no application other
> than Writer can read?

When I save a copy of a Writer ODT file as HTML, this is by far the biggest problem I have with the output. Most of the other issues I have with that output pipeline amount to deleting the HTML > HEAD > STYLE element and tweaking the BODY tag.

Incidentally, this behavior persists in LO Writer 4.4.0.3, and I suspect that it's related to revision tracking.

Steps to replicate:
1. Create a new Writer document. (If a template comes up, select everything, apply the Text Body paragraph style, and delete everything.)
2. Type the following line as the document's only content: Take a look at this bug.
3. Highlight everything and use Ctrl-M to clear direct formatting, just to ensure a clean slate.
4. Select the word "this" and use Ctrl-I to italicize it.
5. Click in a random place to clear the selection, then replace the letters "is" in "this" with "at" - "Take a look at _that_ bug."
6. File > Save a copy > HTML format.

Open the file, and you'll see the paragraph rendered thus:

<p class="western">Take a look at <i>th</i><i>at</i> bug.</p>
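For anyone who wants to check exported files for this symptom automatically, here is a minimal sketch in Python. It is not LibreOffice code and not part of its test suite; the names `FragmentedRunFinder` and `find_fragmented_runs` are made up for illustration. It assumes "fragmented" means a closing tag immediately followed by an opening tag with the same name and the same attributes as the element that was just closed, with no text in between.

```python
from html.parser import HTMLParser


class FragmentedRunFinder(HTMLParser):
    """Records tags that are closed and immediately reopened unchanged."""

    def __init__(self):
        super().__init__()
        self.stack = []          # (tag, attrs) of elements currently open
        self.last_closed = None  # element closed by the previous event, if any
        self.hits = []           # tag names that were split into adjacent runs

    def handle_starttag(self, tag, attrs):
        elem = (tag, tuple(attrs))
        if elem == self.last_closed:
            # Same element reopened right after being closed: a split run.
            self.hits.append(tag)
        self.last_closed = None
        self.stack.append(elem)

    def handle_endtag(self, tag):
        self.last_closed = None
        # Find the nearest open element with this name; anything above it
        # (for example an unclosed <br>) is discarded.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                self.last_closed = self.stack[i]
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Any text between the tags means the runs are not adjacent.
        self.last_closed = None

    def handle_comment(self, data):
        self.last_closed = None


def find_fragmented_runs(html):
    """Return the list of tag names that appear as needlessly split runs."""
    finder = FragmentedRunFinder()
    finder.feed(html)
    finder.close()
    return finder.hits
```

Run against the paragraph above, `find_fragmented_runs` reports the split `<i>` element, and it reports nothing for the same paragraph with a single `<i>that</i>` run. Note that `html.parser` lowercases tag and attribute names, so `<B>` and `<b>` are treated as the same element.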
Comment 11 to bug 76021 appears to stem from the same issue: "Moreover, if I take exactly the same document and add some text, then all these classes change! Also note the strange duplication of classes that do exactly the same thing (.T13,.T14,.T15,.T18)" This is notable in that 76021 relates to "export as XHTML" whereas 65925 involves "save as HTML" - implying that the root problem is in Writer rather than either of those filters.
Migrating Whiteboard tags to Keywords: (bibisected) [NinjaEdit]
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present on a currently supported version of LibreOffice (5.4.1 or 5.3.6): https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System.

Please DO NOT:
Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3): http://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to "inherited from OOo";
4b. If the bug was not present in 3.3 - add "regression" to keyword

Feel free to come ask questions or to say hello in our QA chat: http://webchat.freenode.net/?channels=libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug-20170901
*** Bug 112563 has been marked as a duplicate of this bug. ***
In response to comment 10's request for a retest, I performed the steps described in comment 7 and got the same result; nothing has changed. This is on the Portable build of 5.4.1.2, under Windows 10 Home (32-bit, version 1703, build 15063.540). I have no reason to expect that the test would yield different results on other platforms.
The problem is still there. For information, I still use LO 4.3.7.2 when I need to edit HTML files. It is not perfect, but the code it generates is clearly better than that of newer versions.
Reproduced in 7.2 alpha1+ following the steps in comment 7.

A closing tag directly followed by an exactly equivalent opening tag should be removed from the HTML code.

Version: 7.2.0.0.alpha1+ / LibreOffice Community
Build ID: e9da22d3308557640e0edc45f72b1897f016d19b
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-05-21_07:07:00
Calc: threaded
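As an illustration of that rule (a closing tag directly followed by an identical opening tag gets dropped together with it), here is a rough Python sketch that post-processes an exported file. It is only a workaround idea, not the Writer export filter and not a proposed patch; `merge_adjacent_runs` and the helper names are hypothetical. It assumes tags can be split out with a simple regular expression and that "exactly equivalent" means the opening tag text is character-for-character identical to the one being closed.

```python
import re

# Splits markup into tags and text; good enough for simple exported HTML.
_TOKEN_RE = re.compile(r"(<[^>]+>)")
# Recognises ordinary opening/closing tags (not <!DOCTYPE ...> or comments).
_TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")
# Elements that never have a closing tag.
_VOID = {"br", "hr", "img", "meta", "link", "input", "col", "area", "base"}


def merge_adjacent_runs(html):
    """Remove every closing tag that is directly followed by an opening tag
    identical to the one it closes, e.g. '</B><B>' or '</i><i>'."""
    out = []    # (token, opening tag this token closes, or None)
    stack = []  # (lowercased name, raw opening tag) of open elements
    for tok in _TOKEN_RE.split(html):
        if not tok:
            continue
        m = _TAG_RE.fullmatch(tok)
        if m is None:
            out.append((tok, None))  # text, doctype or comment: pass through
            continue
        is_close, name = m.group(1) == "/", m.group(2).lower()
        if is_close:
            opener = None
            for i in range(len(stack) - 1, -1, -1):
                if stack[i][0] == name:
                    opener = stack[i][1]
                    del stack[i:]
                    break
            out.append((tok, opener))
        elif out and out[-1][1] == tok:
            # The previous token closed an identical element: drop that
            # closing tag, skip this opening tag, and treat it as still open.
            out.pop()
            stack.append((name, tok))
        else:
            out.append((tok, None))
            if name not in _VOID and not tok.endswith("/>"):
                stack.append((name, tok))
    return "".join(t for t, _ in out)
```

Because the check looks back at the output built so far, nested cases collapse in one pass: once `</FONT><FONT COLOR="#b84747">` is removed, the `</B><B>` pair it was hiding becomes adjacent and is removed as well. Runs separated by any text, even a single space, are left alone on purpose.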
*** Bug 148974 has been marked as a duplicate of this bug. ***
(In reply to Michael Warner from comment #17)
> *** Bug 148974 has been marked as a duplicate of this bug. ***

I'm the author of Bug 148974, and I'm surprised that this bug hasn't been fixed since 2013. Many users write articles in LibreOffice and then paste the formatted text into a website CMS editor. This bug generates bloated HTML code that can be penalized by search engines.