Exporting from .ODT to .EPUB format generates unnecessarily voluminous XHTML that is difficult to read and edit, because it contains vast numbers of redundant <span>s with identical properties.

For example, the following paragraph, which contains no formatting except for the superscript rendering of "82nd":

“One in particular, a Ranger, uh, force known as the 82nd Airborne, had a particular nickname, and a specific song that they took as their own, that invoked that nickname. The rhyme of the song does not work in Saamen, but the part I can remember of the song goes like this:

Results in the following XHTML:

<p class="para12"><span class="span15">“One in particular, </span><span class="span15">a Ranger, uh, force </span><span class="span15">known as </span><span class="span15">the 82</span><span class="span41">nd</span><span class="span15"> Airborne, had a particular nickname, and a specific song that they took as their own, that invoked that nickname. </span><span class="span15"> </span><span class="span15">The rhyme of the song does not work in Saamen, but the </span><span class="span15">part I can remember of the </span><span class="span15">song goes like this:</span></p>

When what it SHOULD produce is this:

<p class="para12"><span class="span15">“One in particular, a Ranger, uh, force known as the 82</span><span class="span41">nd</span><span class="span15"> Airborne, had a particular nickname, and a specific song that they took as their own, that invoked that nickname. The rhyme of the song does not work in Saamen, but the part I can remember of the song goes like this:</span></p>

No fewer than SEVEN TIMES in that one paragraph, LibreOffice *closes* a span of class span15 only to immediately begin a new span *also* of class span15. I can find no clear reason why it is generating so many redundant spans. My hypothesis is that the source .ODT document ITSELF contains many such redundant, duplicated formatting runs.
This is wasteful and unnecessary, and results in XHTML documents much larger than they need to be, which probably also take much longer to *render* than they need to. It should probably be considered malformed. LibreOffice should automatically collapse adjacent spans (and its own formatting regions) of the same type. Currently I have to use a custom Perl script to perform this cleanup. The resulting reduction in the uncompressed size of the XHTML files within the EPUB is as much as 30%.
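To illustrate the kind of cleanup being requested, here is a minimal Python sketch. It is not the Perl script mentioned above, and `collapse_spans` is a hypothetical helper name. It assumes the spans are flat (text only, no nested tags) and carry a single `class="..."` attribute, as in the sample paragraph; real exports may need a proper XML parser instead of a regular expression.

```python
import re

# Hypothetical helper, not the reporter's Perl script.
# Assumption: spans are flat (text only, no nested tags) and have
# exactly one attribute, class="...", as in the sample paragraph.
_ADJACENT = re.compile(r'(<span class="([^"]+)">[^<]*)</span><span class="\2">')

def collapse_spans(xhtml: str) -> str:
    """Merge adjacent <span> elements that share the same class."""
    while True:
        # Each pass drops the "</span><span class=X>" seam between two
        # neighbouring spans of the same class; repeat until none remain.
        xhtml, count = _ADJACENT.subn(r"\1", xhtml)
        if count == 0:
            return xhtml

sample = (
    '<p class="para12">'
    '<span class="span15">known as </span>'
    '<span class="span15">the 82</span>'
    '<span class="span41">nd</span>'
    '<span class="span15"> Airborne</span>'
    '</p>'
)
print(collapse_spans(sample))
```

In this shortened sample the first two span15 runs are merged, while the span41 run and the span15 run that follows it are left alone because their neighbours have a different class.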
Please attach a sample file, reduced in size as much as possible and free of private information, and paste the information from Menu/Help/About LibreOffice (there is a copy icon).
Created attachment 194494 [details]
Sample paragraph, ODT version

Version: 7.6.4.1 (X86_64) / LibreOffice Community
Build ID: 60(Build:1)
CPU threads: 12; OS: Linux 6.8; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Gentoo official package
Calc: threaded
Created attachment 194495 [details] EPUB export of same sample
*** This bug has been marked as a duplicate of bug 141187 ***