The current LibreOffice behavior produces XHTML exports whose HTML heading hierarchy does not strictly and semantically correspond to the heading hierarchy of the ODF document.

Concretely, LibreOffice produces XHTML exports with many <h1> headings, whereas a semantically correct (X)HTML document should have one and only one <h1>. The heading hierarchy should then follow with <h2>, <h3>, etc., down to <h6> at most.

The current LibreOffice behavior produces the following XHTML export of the given ODT file:

<p class="Title">Titre principal</p>
<h1 class="Heading_20_1"><a id="a__Titre_1"><span/></a>Titre 1</h1>
<p class="P1">para</p>
<h2 class="Heading_20_2"><a id="a__Titre_2"><span/></a>Titre 2</h2>
<p class="P1">para</p>
<h3 class="Heading_20_3"><a id="a__Titre_3"><span/></a>Titre 3</h3>
<p class="P1">para</p>

while we want to have the following:

<h1>Titre principal</h1>
<h2 class="whatever"><a id="a__Titre_1"><span/></a>Titre 1</h2>
<p class="P1">para</p>
<h3 class="whatever_else"><a id="a__Titre_2"><span/></a>Titre 2</h3>
<p class="P1">para</p>
<h4 class="whatever_again"><a id="a__Titre_3"><span/></a>Titre 3</h4>
<p class="P1">para</p>

Technical explanation of the current LibreOffice bug:

Title has child Heading1. Heading1 has child Heading2. Heading2 has child Heading3, etc. Title is the root of the ODF heading hierarchy. This gives:
Title→Heading1→Heading2→Heading3→etc.

As h1 is the root of the (X)HTML hierarchy, in XHTML the heading hierarchy is:
h1→h2→h3→h4→etc.

The current LibreOffice behavior is the following:
Title→Heading1→Heading2→Heading3→etc. => p→h1→h2→h3→etc.

while it should logically and semantically be:
h1→h2→h3→h4→etc.

This is a heading root mismatch between ODF and (X)HTML export.

Changing this behaviour could be implemented as a filterOption that would postprocess the current output.
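
To make the postprocessing idea concrete, here is a minimal sketch in Python. It is not LibreOffice code and not the proposed filterOption itself; the function name shift_headings and the title_class parameter are hypothetical, and it assumes the exported title paragraph can be recognised by class="Title" as in the output above. It pushes every h1..h5 one level down and promotes the first title paragraph to h1; h6 is left as is because there is no h7.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def _rename(elem, new_local):
    """Rename an element while keeping its namespace, if it has one."""
    ns = elem.tag[: elem.tag.rfind("}") + 1] if elem.tag.startswith("{") else ""
    elem.tag = ns + new_local


def shift_headings(root, title_class="Title"):
    """Hypothetical postprocessing step: Title -> h1, hN -> h(N+1)."""
    title = next(
        (e for e in root.iter()
         if _local(e.tag) == "p" and e.get("class") == title_class),
        None,
    )
    if title is None:
        return root  # no title paragraph: leave the hierarchy untouched
    # Shift the existing headings first, so the promoted title is not shifted too.
    for elem in root.iter():
        name = _local(elem.tag)
        if len(name) == 2 and name[0] == "h" and name[1] in "12345":
            _rename(elem, "h" + str(int(name[1]) + 1))
    _rename(title, "h1")
    return root


sample = (
    '<body>'
    '<p class="Title">Titre principal</p>'
    '<h1 class="Heading_20_1">Titre 1</h1>'
    '<p class="P1">para</p>'
    '<h2 class="Heading_20_2">Titre 2</h2>'
    '</body>'
)
root = shift_headings(ET.fromstring(sample))
print([e.tag for e in root])  # ['h1', 'h2', 'p', 'h3']
```

Only the first title paragraph is promoted, so the result never contains more than one h1 introduced by this step. Class attributes are left untouched; a real implementation inside the XSLT filter would more likely map text:outline-level to the shifted level directly.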
Created attachment 127740: odt source file
Created attachment 127741: (x)html output
Code pointer: filter/source/xslt/odf2xhtml/export/xhtml/body.xsl

From some tests, "Titre principal" goes through this part:
http://opengrok.libreoffice.org/xref/core/filter/source/xslt/odf2xhtml/export/xhtml/body.xsl#2803

The proper headings go through this one:
http://opengrok.libreoffice.org/xref/core/filter/source/xslt/odf2xhtml/export/xhtml/body.xsl#1204

Headings are created when matching text:h (line 1174):
<xsl:template match="text:h">

After unzipping the odt and reformatting its content, we get this:

<text:p text:style-name="Title">Titre principal</text:p>
<text:p text:style-name="P1" />
<text:h text:style-name="Heading_20_1" text:outline-level="1">Titre 1</text:h>
<text:p text:style-name="P1" />
<text:p text:style-name="P1">para</text:p>
<text:p text:style-name="P1" />
<text:h text:style-name="Heading_20_2" text:outline-level="2">Titre 2</text:h>
<text:p text:style-name="P1" />
<text:p text:style-name="P1">para</text:p>
...

Notice that "Titre principal" is a "text:p", not a "text:h", so it never reaches the heading template.

Just my 2 cents, because I know too little about XSL. Anyway, I confirm this on PC Debian x86-64 with master sources updated today.
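
For anyone who wants to repeat the "unzip and look" step on another document, a small sketch follows. It only uses the Python standard library; the function name list_paragraphs is made up for illustration, and it assumes the paragraphs of interest are text:p and text:h elements in content.xml, as in the excerpt above.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF "text" namespace, as declared in content.xml
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def list_paragraphs(odt_file):
    """Return (element, style-name, outline-level, text) for each text:p / text:h.

    odt_file may be a path or a file-like object. outline-level is None
    for elements that do not carry the attribute (plain paragraphs).
    """
    with zipfile.ZipFile(odt_file) as z:
        root = ET.fromstring(z.read("content.xml"))
    wanted = (f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h")
    rows = []
    for elem in root.iter():
        if elem.tag in wanted:
            rows.append((
                elem.tag.rsplit("}", 1)[-1],               # "p" or "h"
                elem.get(f"{{{TEXT_NS}}}style-name"),
                elem.get(f"{{{TEXT_NS}}}outline-level"),
                "".join(elem.itertext()),
            ))
    return rows
```

On the attached odt this should show the title as a "p" row with style "Title" and no outline level, next to "h" rows with levels 1, 2, 3. Note that in other documents the title may carry an automatic style name that merely inherits from "Title", which this sketch does not resolve.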
The desired behavior (improved semantics and accessibility) presented by Laurent Godard is backed by the following W3C document: "Using h1-h6 to identify headings" https://www.w3.org/TR/WCAG20-TECHS/H42.html
Superseded by: Bug 111492 and Bug 111493