Description:
I noticed that LibreOffice 24.8 and earlier versions do not export well to EPUB when a document has more than one level of headings. I wrote a document with two heading levels and, because of this, had to split it into chapters by page breaks instead of by headings. The split itself worked well, but the generated table of contents for the EPUB was mangled, with headings cut off in the middle.
Checking further, I found that the text is saved and exported with too many HTML tags, most of them fully redundant. It looks like whenever I correct a typo or change even a single character, the editor adds </span><span> around the change, although the style and everything else stay the same. This inflates the file without adding any useful data and also gets in the way of the EPUB formatting.
I suspect that removing the redundant tags could improve the program's responsiveness and also help create smaller, more efficient EPUB files.

Steps to Reproduce:
1. Create a document with two or more levels of headings, each heading preceded by a page break.
2. Export to EPUB using 'divide by page breaks'.
3. Open the result in an EPUB viewer or editor and check the generated table of contents.
4. Use an EPUB editor to inspect the HTML in the text files.

Actual Results:
Some headings (not all) are truncated in a large document (more than 100 pages).

Expected Results:
I expected all headings to be shown in full, even if flattened.

Reproducible: Always

User Profile Reset: No

Additional Info:
I can send the actual files if needed. I noticed this in earlier versions as well but have no data on them.

Version: 24.8.2.1 (X86_64) / LibreOffice Community
Build ID: 0f794b6e29741098670a3b95d60478a65d05ef13
CPU threads: 8; OS: Windows 11 X86_64 (10.0 build 22631); UI render: Skia/Raster; VCL: win
Locale: en-GB (he_IL); UI: en-US
Calc: CL threaded
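To make the "redundant tags" point concrete, here is a minimal Python sketch of what collapsing them could look like on an exported XHTML fragment. It is only an illustration under stated assumptions, not LibreOffice code: the function name `merge_redundant_spans`, the class name `T1`, and the sample markup are all hypothetical, and the regular expression only handles adjacent spans whose content is plain text and whose attribute strings are byte-for-byte identical.

```python
import re

# Assumption: a "redundant" split is a span holding only text, closed and
# immediately reopened with exactly the same attribute string.
_SPLIT = re.compile(r'<span([^>]*)>([^<]*)</span><span\1>')


def merge_redundant_spans(xhtml: str) -> str:
    """Collapse adjacent spans with identical attributes (hypothetical helper)."""
    previous = None
    # One pass merges non-overlapping pairs only, so repeat until stable.
    while previous != xhtml:
        previous = xhtml
        xhtml = _SPLIT.sub(r'<span\1>\2', xhtml)
    return xhtml


# Hypothetical heading that was edited twice after it was first typed.
sample = (
    '<h1><span class="T1">Chap</span>'
    '<span class="T1">ter O</span>'
    '<span class="T1">ne</span></h1>'
)
print(merge_redundant_spans(sample))
```

Spans with different attributes are left alone, so genuine formatting changes inside a heading would survive this kind of clean-up.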
Created attachment 197561: Erroneous EPUB
Thank you for the report.
- The issue with headings other than h1 is tracked in bug 114164.
- The issue with truncated headings is mentioned in bug 121146 comment 7, but that bug is focused on export of a table of contents.
- The issue with messy HTML is tracked in bug 141187.

I suggest focusing this report on the truncated outline headings, confirmed by bug 121146 comment 7.
So if I am understanding correctly what you're saying here, fundamentally the reason why LibreOffice generates fragmented/truncated entries for EPUB embedded tables of contents is because it cannot correctly read its own mangled XHTML. TOC generation is broken by the very same redundant SPAN tags that it fills the generated XHTML text with. My prediction based on this assumption would be that it works for tables of contents built from headings that you have never edited, but the instant you edit a heading, boom, that TOC entry will now be broken.
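The hypothesis can be illustrated with a short Python sketch. This is not LibreOffice's export code and makes no claim about how it actually builds the TOC; it only shows how a reader that takes the text of the first span would truncate an edited heading while leaving an untouched one intact. The function names, the class name `T1`, and the sample headings are all hypothetical.

```python
import xml.etree.ElementTree as ET


def first_span_text(heading: ET.Element) -> str:
    """Naive extraction (assumed failure mode): text of the first span only."""
    first = heading.find("span")
    if first is None:
        return heading.text or ""
    return first.text or ""


def full_text(heading: ET.Element) -> str:
    """Robust extraction: concatenate the text of every descendant node."""
    return "".join(heading.itertext())


# Hypothetical headings: one never edited, one split by a later edit.
untouched = ET.fromstring('<h1><span class="T1">Chapter One</span></h1>')
edited = ET.fromstring(
    '<h1><span class="T1">Chap</span><span class="T1">ter One</span></h1>'
)

for name, heading in (("untouched", untouched), ("edited", edited)):
    print(name, "| naive:", first_span_text(heading), "| full:", full_text(heading))
```

If the prediction holds, comparing TOC entries for edited and never-edited headings in the attached EPUB should show the same pattern: only the fragmented headings come out truncated.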