Bug 166015 - EPUB output produces incorrect chapter titles
Summary: EPUB output produces incorrect chapter titles
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 25.2.2.2 release
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: EPUB-Export
Reported: 2025-04-03 02:17 UTC by Peyton R
Modified: 2025-04-03 11:42 UTC (History)
CC List: 0 users

See Also:
Crash report or crash signature:


Attachments
Export an EPUB of this ODT file. The chapter names in the TOC will be incorrect. (27.01 KB, application/vnd.oasis.opendocument.text)
2025-04-03 02:17 UTC, Peyton R

Description Peyton R 2025-04-03 02:17:08 UTC
Created attachment 200135

Attached is an ODT file with a few short chapters. The chapter names are styled as Heading 2 and consist of readable words. When the EPUB is viewed, the chapter names are only a few characters long, typically the initial letters of one or more words in the title. For example, the first chapter is Something New, but appears in the EPUB table of contents as simply "S".

Examining toc.xhtml inside the EPUB file, the chapter name has become <a href="sections/section0001.xhtml">S</a>

Inside section0001.xhtml, the "S" has been separated from the "omething new", as shown here:
<p class="para0"><span class="span0">S</span><span class="span0">omething new</span>

A user, while using Writer, will see this as "Something New", but the resulting EPUB table of contents will have only the letter "S" as the chapter name.
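
To illustrate the symptom, here is a minimal Python sketch using only the standard library. It parses the heading fragment quoted above (with a closing </p> added so that it parses, and without the XHTML namespace a real section file would carry) and compares the text of the first span with the text of the whole paragraph. The function names are hypothetical, and reading only the first span is an assumption that happens to reproduce the observed TOC entry. It is not a statement about how the exporter is actually implemented.

```python
import xml.etree.ElementTree as ET

# Heading paragraph as reported in section0001.xhtml. The closing </p> is
# added here only so that the fragment is well-formed.
FRAGMENT = (
    '<p class="para0">'
    '<span class="span0">S</span>'
    '<span class="span0">omething new</span>'
    '</p>'
)


def first_span_text(p):
    """Return the text of the first <span> only.

    Hypothetical: this mimics the truncated TOC entry seen in the report.
    """
    span = p.find("span")
    return span.text if span is not None and span.text else ""


def full_text(p):
    """Return the text of the whole paragraph, across all child spans."""
    return "".join(p.itertext())


p = ET.fromstring(FRAGMENT)
print(first_span_text(p))  # S
print(full_text(p))        # Something new
```

Collecting the text of every span in the heading gives the title the user sees in Writer, while stopping at the first span gives the single letter that ends up in toc.xhtml.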

The problem may have arisen as the Heading 2 text was edited and re-edited, causing Writer to split the title into more than one span.

I'm guessing that the ideal solution would be to join consecutive <span>s if they have the same attributes, but this would be a bigger fix than just EPUB export.
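
As a rough sketch of what joining consecutive spans could look like, the following Python function merges adjacent <span> children of an element when their attributes are identical. The function name merge_adjacent_spans is hypothetical, the fragment is parsed without the XHTML namespace for brevity, and this is a post-processing illustration under those assumptions, not the exporter's code or a proposed patch.

```python
import xml.etree.ElementTree as ET


def merge_adjacent_spans(parent):
    """Merge consecutive <span> children of `parent` with identical attributes.

    Kept deliberately simple: two spans are merged only when no text sits
    between them and neither span has child elements.
    """
    previous = None
    for child in list(parent):  # iterate over a copy, since children are removed
        mergeable = (
            previous is not None
            and previous.tag == "span"
            and child.tag == "span"
            and child.attrib == previous.attrib
            and not previous.tail                      # nothing between the spans
            and len(previous) == 0 and len(child) == 0  # no nested elements
        )
        if mergeable:
            previous.text = (previous.text or "") + (child.text or "")
            previous.tail = child.tail
            parent.remove(child)
        else:
            previous = child
    return parent


heading = ET.fromstring(
    '<p class="para0">'
    '<span class="span0">S</span>'
    '<span class="span0">omething new</span>'
    '</p>'
)
merge_adjacent_spans(heading)
print(ET.tostring(heading, encoding="unicode"))
# <p class="para0"><span class="span0">Something new</span></p>
```

Spans with different attributes, or with text between them, are left untouched, so only runs that render identically are merged.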
Comment 1 Olivier Hallot 2025-04-03 11:42:28 UTC
Confirmed.

Actually, the issue of having multiple <span>s also affects other applications, such as tokenizers for computer-aided translation tools.

To merge several <span>s, you can use "Clear Direct Formatting" (Ctrl+M) before EPUB export. Admittedly, it can be tedious.

Version: 25.2.2.2 (X86_64) / LibreOffice Community
Build ID: 7370d4be9e3cf6031a51beef54ff3bda878e3fac
CPU threads: 12; OS: Linux 6.11; UI render: default; VCL: kf5 (cairo+wayland)
Locale: en-US (pt_BR.UTF-8); UI: en-US
Calc: threaded