Description: Open the attached document. Export to XHTML. View the resulting .html file in a browser. Steps to Reproduce: See above. Actual Results: Duplicated "Hello" text. One to the right of the image, one below. Expected Results: Just one "Hello", below the image, as in the .odt. Reproducible: Always User Profile Reset: No Additional Info: .
Code pointer: filter/source/xslt/odf2xhtml/export/xhtml/body.xsl
Created attachment 176965 [details] Trivial sample document.
@Svante, can you perhaps immediately say what needs to be done in the XSLT to fix this?
*** Bug 146263 has been marked as a duplicate of this bug. ***
confirmed on Windows builds Version: 7.2.4.1 (x64) / LibreOffice Community Build ID: 27d75539669ac387bb498e35313b970b7fe9c4f9 CPU threads: 8; OS: Windows 10.0 Build 19044; UI render: Skia/Vulkan; VCL: win Locale: en-US (en_US); UI: en-US Calc: threaded For the exported XHTML the "Hello" in the UL is duplicated.
also reproduced in Version: 5.4.0.0.alpha1+ Build ID: 9feb7f7039a3b59974cbf266922177e961a52dd1 CPU threads: 4; OS: Linux 5.10; UI render: default; VCL: gtk3; Locale: en-US (en_US.UTF-8); Calc: group
and Version: 4.3.0.0.alpha1+ Build ID: c15927f20d4727c3b8de68497b6949e72f9e6e9e
Hej Tor, I am about to power down for the holiday season. I fixed something similar years ago, but I cannot remember the details. The input content.xml of your bug.odt shows:

ODF INPUT:
<text:list xml:id="list1078085969" text:style-name="L1">
  <text:list-item>
    <text:p text:style-name="P1">
      <draw:frame draw:style-name="fr1" draw:name="Image1" text:anchor-type="paragraph" svg:width="3.528cm" svg:height="3.528cm" draw:z-index="0">
        <draw:image xlink:href="Pictures/100000010000006400000064A2FB08F214CB5BE3.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" draw:mime-type="image/png"/>
      </draw:frame>
      <text:span text:style-name="T1">Hello</text:span>
    </text:p>
  </text:list-item>
</text:list>

HTML OUTPUT:
<ul>
  <li>
    <div class="P1" style="margin-left:0cm;"><span class="Bullet_20_Symbols" style="display:block;float:left;min-width:0.635cm;">•</span>
      <!--Next 'div' is emulating the top height of a draw:frame.-->
      <!--Next ' div' is a draw:frame. -->
      <div style="height:3.528cm;width:3.528cm; padding:0; float:left; position:relative; left:0cm; " class="fr1" id="Image1"><img style="height:3.528cm;width:3.528cm;" alt="" src="data:image/png;base64, <some-base64-data>" /> </div>
      <!--Next 'div' added for floating.-->
      <div style="display:inline; position:relative; left:0cm;">Hello</div>
      <div xmlns:loext="urn:org:documentfoundation:names:experimental:office:xmlns:loext:1.0" style="clear:both; line-height:0; width:0; height:0; margin:0; padding:0;"> </div>Hello<span xmlns:loext="urn:org:documentfoundation:names:experimental:office:xmlns:loext:1.0" class="odfLiEnd" />
    </div>
  </li>
</ul>

The ODF rendering shows the image first and, below it, the list bullet with the "Hello" text. As the comments in the HTML output suggest, the error is likely in xhtml/body.xsl. Obviously, the span content is being matched twice by the XSLT. I have not worked with XSLT for a while and am quite busy with other tasks, so I guess I have to pass on this one. Have a nice holiday season...
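For anyone who wants to check an export for this symptom without opening a browser, here is a minimal sketch in Python. It assumes a simplified, well-formed copy of the HTML output quoted above: most style attributes are dropped and the image source is replaced by the hypothetical placeholder image.png. The helper name count_text is also made up for illustration; this is not the unit test that was later added to LibreOffice.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the exported list item (assumption: style attributes
# removed, "image.png" is a placeholder for the embedded base64 image).
EXPORTED = """
<ul>
  <li>
    <div class="P1">
      <span class="Bullet_20_Symbols">•</span>
      <div class="fr1" id="Image1"><img alt="" src="image.png"/></div>
      <div style="display:inline;">Hello</div>
      <div style="clear:both;"> </div>Hello<span class="odfLiEnd"/>
    </div>
  </li>
</ul>
"""


def count_text(xhtml: str, needle: str) -> int:
    """Count how often `needle` occurs in the text content of `xhtml`."""
    root = ET.fromstring(xhtml)
    # itertext() yields both element text and tail text, so the stray
    # "Hello" that follows the clearing div is counted as well.
    return sum(chunk.count(needle) for chunk in root.itertext())


print(count_text(EXPORTED, "Hello"))  # prints 2 for the buggy output
```

With a corrected export the same check should report a single occurrence.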
Suggested patch in https://gerrit.libreoffice.org/c/core/+/127683
Hmm, Svante now sent me a slightly improved version of the body.xsl that seems to fix the problem, too, in a cleaner fashion, and he presumably actually understands how it works. Will resubmit the patch with that instead. Sadly, though, that patch is based on https://github.com/oasis-tcs/odf-tc/blob/master/src/test/resources/odf1.3/tools/odf2html/export/xhtml/body.xsl . Thus it lacks some (small) cleanups that have been done to the body.xsl in the LibreOffice sources. I had no idea that our XSLT for this filter apparently is just a copy of some more authoritative (?) XSLT from OASIS... Perhaps we then shouldn't have those files in git at all, but download them from upstream? No idea, and I don't really want to know.
Michael usually syncs these files: whenever we submit changes on the OASIS side, he moves the parts to the LO archive, and vice versa. I would assume the LO repo is the more authoritative repo, as long as we keep it synced. :-) I wanted to test the files with the spec and wrap my head around it more, but holiday time and family interrupted that aim, so fingers crossed we did not break something. I just saw the pattern that your new parameter was quite parallel to the existing one... Hopefully one of us finds a few quiet minutes to think this through... ;-)
But do we want to have the authoritative copy in LO, if the fact is that the only person who understands it works on the OASIS side? Isn't that counter-productive?
I spent this morning looking into this, but I fear there are regressions: in the ODF 1.3 part 4 formula spec, some content is missing in chapter 4.4: https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part4-formula/OpenDocument-v1.3-os-part4-formula.html#__RefHeading__1017896_715980110

I have tested the XSLT "directly" with Saxon out of the box, using a Maven build environment and pom.xml. Otherwise, with LO, there would be noise from automatic style names changing during load/save of the specs between tests. My test transformation can be triggered stand-alone via mvn install (activating all test documents in the pom.xml, see https://github.com/svanteschubert/odf-tc/blob/html-floating-fix/pom.xml). All test files and the indented HTML output can be found in my fork: https://github.com/svanteschubert/odf-tc/tree/html-floating-fix/docs/odf1.3/tmp-test-output

I can also add some background on what this part of the XSLT code does:
1. An ODF paragraph with a draw:frame as a child becomes an HTML div, not an HTML p as usual, so that the HTML stays valid.
2. A draw:frame with other elements on the same level (siblings that are more than ODF soft page breaks) becomes a "floating" div (CSS float left) embracing its following siblings within the new div, up to the next following sibling that is a draw:frame; that one in turn becomes a left-floating div embracing its own following siblings. This handling is tricky.
3. The variable we are extending, "stopAtFirstFrame", marks a mode for dealing with the content before the first draw:frame.

The problem that you, Tor, correctly fixed earlier (likely only partly), and that I now understand better, is that the above complex CSS floating routine for draw:frame (template mode="frameFloating") is triggered from two spots. Aside from <xsl:template match="text:p | draw:page">, the mode="frameFloating" is also entered from draw:frame, which is why you added the parameter, and also from the apply-templates in the list routine, where the duplication is triggered.

Sorry, I have to leave this issue now; my time-boxed morning is over and I have my own windmills to ride against. Please make sure you test the specification before/after for regressions (best after indenting the XML; e.g. I manually used the jEdit editor with the XML plugin). Good hunting, Tor! Svante
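To make points 2 and 3 above a little more concrete, here is a rough Python model of the grouping idea only. It is not a translation of body.xsl: the node representation, the function name group_by_frame, and the idea that the "before" list corresponds to what the stopAtFirstFrame mode handles are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical node model: (kind, value), where kind is "frame" or "text".
Node = Tuple[str, str]
Group = Tuple[Node, List[Node]]


def group_by_frame(children: List[Node]) -> Tuple[List[Node], List[Group]]:
    """Split paragraph children into the content before the first frame
    and one group per frame holding that frame's following siblings."""
    before: List[Node] = []
    groups: List[Group] = []
    for node in children:
        if node[0] == "frame":
            # Each frame opens a new floating group.
            groups.append((node, []))
        elif groups:
            # Siblings after a frame belong to the most recent frame.
            groups[-1][1].append(node)
        else:
            # Nothing seen yet: content before the first frame.
            before.append(node)
    return before, groups


paragraph = [("frame", "Image1"), ("text", "Hello")]
before, groups = group_by_frame(paragraph)
print(before)  # prints []
print(groups)  # prints [(('frame', 'Image1'), [('text', 'Hello')])]
```

In this model each sibling lands in exactly one place. The duplicated "Hello" corresponds to the followers of a frame being emitted a second time by another caller, which is the situation described above for the list routine.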
Tor Lillqvist committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/11d2a2f5d260bb27d0e67f90579ca761cb2250ea tdf#146264: Add a somewhat questionable hack to fix the issue It will be available in 7.4.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Will add a unit test for this bug fix, too. https://gerrit.libreoffice.org/c/core/+/127935
Tor Lillqvist committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/6bfeb2290c585e0e5fe982dde6ac57e4afca2e2f tdf#146264: Add unit test It will be available in 7.4.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
I have an update on this issue, as Michael and I have worked on the XSLT filter for the ODF spec (which is the same as the one for LO, but uses some JavaScript for MathML by default). We just finished a pull request: https://github.com/oasis-tcs/odf-tc/pull/47 It contains the same fix I did for https://bugs.documentfoundation.org/show_bug.cgi?id=154989 There was an erroneous recursion in the XSLT, which I removed, making the earlier fix (the previously mentioned 'ugly hack') obsolete, so I removed that again. I have now added some test files to the ODF TC git repo and will collect more, which Michael will add to the LO regression tests as well.

The major enhancement is that the alignment of images/frames is done by CSS position. It all depends on the ODF attribute @text:anchor-type:
1) if the anchor is 'as-char' (at the character), CSS position:static (the default) is used;
2) if the anchor exists but is not 'as-char', CSS position:relative with float:left is used;
3) if the anchor does not exist, CSS position:absolute positions the frame relative to its parent (mostly the body/page).
The latter case fixed https://bugs.documentfoundation.org/show_bug.cgi?id=154989

In addition, I added a pageHeight and a background color for ODF graphics (@draw:fill-color), and fixed obvious typos/bugs in the XSLT flow: https://github.com/oasis-tcs/odf-tc/pull/47/commits/f93dd81a5c6ba8f06a67e98f9f3dc4fd79ccab0c (note that in this commit I have not omitted the 'ugly hack' but renamed it; I removed it later and saw that it makes no difference).
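The three anchor rules above can be summarized in a few lines of Python. This is only a sketch of the decision as described in this comment, not the XSLT from the pull request; the function name css_position_for_anchor and the dictionary return shape are hypothetical.

```python
from typing import Dict, Optional


def css_position_for_anchor(anchor_type: Optional[str]) -> Dict[str, str]:
    """Map an ODF text:anchor-type value (None if the attribute is
    absent) to the CSS properties described in the comment above."""
    if anchor_type is None:
        # Rule 3: no anchor, position absolutely relative to the parent.
        return {"position": "absolute"}
    if anchor_type == "as-char":
        # Rule 1: anchored as character, stay in the normal flow.
        return {"position": "static"}
    # Rule 2: any other anchor value, e.g. "paragraph".
    return {"position": "relative", "float": "left"}


print(css_position_for_anchor("paragraph"))
# prints {'position': 'relative', 'float': 'left'}
```

The sample document in this bug uses text:anchor-type="paragraph", so it falls under rule 2.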
Michael will take this back to the LO sources. Thank you for that, Michael! PS: I forgot to mention that the doubled bullet was due to the existing @style:num-suffix, which is not rendered by LO. Michael and I decided to remove both @style:num-prefix and @style:num-suffix for the HTML rendering of bullets! PPS: Someone might be able to fix the HTML layout completely, matching the way LO renders it, by taking into account the image property @style:vertical-pos="top", found in the parent Graphics style in styles.xml. Due to style:vertical-pos="top" the image has to be shown on top of the paragraph, where the first list item is equal to the paragraph in LO. For this reason, the image comes before the list, and the label of the list item should be held back for images with such an attribute. But test documents should also cover the other values of the attribute: https://tdf.github.io/odftoolkit/odf1.3/OpenDocument-v1.3-reference.html#attribute_style:vertical-pos_1 "below" "bottom" "from-top" "middle" "top" The test documents also have to take into account that there might be multiple paragraphs before and/or after the image. This might become a follow-up issue for someone... I have already invested too much of my spare time in this, triggered by a hackfest from Thorsten Behrens and Michael Stahl's suggestion to work on these issues. Nice trick, Michael! ;-)
Svante Schubert committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/5178ade8a12cc52c02cd6288932e5a85dfbaea1b XHTML export: Removing bullet suffix, which is not viewed in LO - see tdf146264 It will be available in 7.6.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.