Bug 85231 - Unclear difference between HTML export and save, html5 instead of html4
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): unspecified
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: (X)HTML-Export
Reported: 2014-10-20 10:59 UTC by lukasseon
Modified: 2022-05-16 18:47 UTC (History)
CC List: 6 users

See Also:
Crash report or crash signature:


Description lukasseon 2014-10-20 10:59:38 UTC
I noticed this when I exported LO's LICENSE document to HTML (from read-only mode). The doctype was:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

This is a very old standard, and exporting to it isn't the right approach. I think we should switch to HTML5 and add the option to select XHTML.

The current problem is that there's no clear way to save or export a document to HTML.
We have:
* save dialog - old HTML4 Transitional as the one and only option for saving docs as HTML
* export dialog - XHTML 1.1 plus MathML 2.0 when exporting

What is the difference between them? LO can edit HTML5 docs with no problems.

My proposal:
* make HTML5 the save format instead of HTML4 Transitional
* stop offering to save in HTML4 Transitional
* remove the HTML entry from the export dialog, since HTML can be edited in LO Writer, unlike PDF
* make two entries (split the current option) in the save dialog: HTML5 (.html) and XHTML (.html)
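
To make the difference between the two outputs easier to check, here is a rough Python sketch that reports which HTML flavour a file claims to be, based only on its doctype. The helper name classify_doctype is hypothetical and not part of LibreOffice. The matching relies on the identifier strings mentioned in this report ("HTML 4.0 Transitional" and "XHTML 1.1 plus MathML 2.0"); the exact doctype LibreOffice writes may differ between versions.

```python
import re

# Hypothetical helper, not part of LibreOffice: guess the HTML flavour of a
# file from its doctype declaration alone.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)


def classify_doctype(markup: str) -> str:
    match = DOCTYPE_RE.search(markup)
    if match is None:
        return "no doctype"
    # Collapse runs of whitespace so multi-line doctypes compare cleanly.
    body = " ".join(match.group(1).split())
    lowered = body.lower()
    if lowered == "html":
        return "HTML5"
    if "xhtml 1.1 plus mathml 2.0" in lowered:
        return "XHTML 1.1 plus MathML 2.0"
    if "dtd html 4" in lowered:
        return "HTML 4.x"
    return "other: " + body


if __name__ == "__main__":
    saved = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">'
    print(classify_doctype(saved))
    print(classify_doctype("<!DOCTYPE html><html></html>"))
```

Running this against a file from the save dialog and a file from the export dialog should show the two different flavours described above.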
Comment 1 Robinson Tryon (qubit) 2014-10-26 20:19:15 UTC
(In reply to Rezonansowy from comment #0)
> I've experienced this when I've exported LO's LICENSE document to HTML (from
> read-only mode). The encoding was:
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
> 
> It's very old technology and export to something like this isn't a right
> way. I think we should switch to HTML5 and add ability to select XHTML
> optionally.

(Marking as Enhancement)

> 
> The current problem is that there's no clear way to save/export document to
> HTML.
> We have:
> * save dialog - olde HTML4 Transitional as one and only option to save docs
> in HTML
> * export dialog - XHTML 1.1 plus MathML 2.0 in exporting
> 
> What is the difference between them?

Well, they're different export mechanisms. I think the HTML4 ("html") export is a somewhat older addition to the codebase.

> LO can edit HTML5 docs with no problems.
> 
> My proposal:
> * make HTML5 save format instead of HTML4 Transitional
> * stop offer to save in HTML4 Transitional
> * remove HTML entry from export dialog, HTML could be edited in LO Writer,
> unlike PDF
> * make two entries (split current option) in save dialog: HTML5 (.html) and
> XHTML (.html)

This seems mostly reasonable. IIRC, stuff in the Save(-As) dialog is mostly editable formats, and stuff in the Export dialog is more read-only formats; I'm not sure what the precise thinking was behind putting HTML in the Save dialog :-)

The one question I have is: Would we lose support with older browsers if we moved export to a newer version of HTML?

Status -> NEW
Comment 2 lukasseon 2014-10-26 22:33:57 UTC
(In reply to Robinson Tryon (qubit) from comment #1)
> The one question I have is: Would we lose support with older browsers if we
> moved export to a newer version of HTML?

Probably not. Most common HTML tags are the same, and the HTML5-specific ones, like <section> or <article>, are purely semantic and are safely ignored by older browsers. See https://en.wikipedia.org/wiki/HTML5#Error_handling
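
As a rough illustration only (not a browser test), the Python sketch below uses the standard library's html.parser, which has no built-in knowledge of <section> or <article>, to show that the text content comes through the same whether the newer tags or plain <div> elements wrap it. The class and function names are made up for this example, and it says nothing about how any particular browser styles unknown elements.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text content, treating every tag as a transparent wrapper."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-empty text runs; tags themselves are ignored.
        text = data.strip()
        if text:
            self.chunks.append(text)


def visible_text(markup):
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    html5 = "<section><h1>Title</h1><article><p>Body</p></article></section>"
    html4 = "<div><h1>Title</h1><div><p>Body</p></div></div>"
    print(visible_text(html5) == visible_text(html4))
```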
Comment 3 Robinson Tryon (qubit) 2015-02-02 22:16:34 UTC
On Mon, Feb 2, 2015 at 11:00 PM, <bugzilla-daemon@bugs.documentfoundation.org> wrote:
    Rezonansowy changed bug 85231
    What 	Removed 	Added
    Severity 	enhancement 	normal

If you don't consider this an enhancement, then you need to justify why it's a defect

Status -> NEEDINFO

It would also be helpful to have the version set, just in case someone makes changes to HTML export in the future.
Comment 4 lukasseon 2015-02-03 00:04:47 UTC
(In reply to Robinson Tryon (qubit) from comment #3)
> On Mon, Feb 2, 2015 at 11:00 PM,
> <bugzilla-daemon@bugs.documentfoundation.org> wrote:
>     Rezonansowy changed bug 85231
>     What 	Removed 	Added
>     Severity 	enhancement 	normal
> 
> If you don't consider this an enhancement, then you need to justify why it's
> a defect
> 
> Status -> NEEDINFO
> 
> It would also be helpful to have the version set, just in case someone makes
> changes to HTML export in the future.

See above:

> The current problem is that there's no clear way to save/export document to
> HTML.
> We have:
> * save dialog - olde HTML4 Transitional as one and only option to save docs in HTML
> * export dialog - XHTML 1.1 plus MathML 2.0 in exporting
Comment 5 lukasseon 2015-02-26 00:28:45 UTC
I think we have enough information.
Comment 6 lukasseon 2015-02-26 00:32:45 UTC
This bug applies to other LO components as well.
Comment 7 tommy27 2016-04-16 07:25:59 UTC Comment hidden (obsolete)
Comment 8 QA Administrators 2017-05-22 13:21:39 UTC Comment hidden (obsolete)
Comment 9 QA Administrators 2021-08-14 04:05:29 UTC Comment hidden (obsolete)
Comment 10 Christophe Strobbe 2022-05-16 18:47:10 UTC
I confirm this bug with LibreOffice 7.1.4.2 on OpenSUSE.

Version: 7.1.4.2 / LibreOffice Community
Build ID: 10(Build:2)
CPU threads: 8; OS: Linux 5.5; UI render: default; VCL: kf5
Locale: en-GB (en_GB.utf8); UI: en-GB
Calc: threaded

When this bug was submitted in October 2014, HTML5 was still relatively new, since it first became a W3C recommendation in that same month: https://www.w3.org/blog/news/archives/4167
HTML 5 has now been around long enough to be considered as a replacement for HTML 4.01, which has been marked as a "superseded recommendation" by the W3C since March 2018: https://www.w3.org/TR/html401/
LibreOffice's XHTML export function generates code with the identifier "XHTML 1.1 plus MathML 2.0". XHTML 1.1 has also been marked as a "superseded recommendation" since March 2018: https://www.w3.org/TR/xhtml11/ .

HTML5 has brought a large number of elements, attributes and improvements, such as adding the figure and figcaption elements (relevant to other LibO bugs), allowing the embedding of both MathML and SVG (which was probably the main use case for XHTML support) and adding support for WAI-ARIA for better accessibility (see https://html.spec.whatwg.org/multipage/dom.html#wai-aria ; accessibility is an area in which LibO also has a number of bugs).

I think this makes a strong use case for replacing both XHTML 1.1 and HTML 4 with HTML 5. (One might add an option to export HTML 5 with XML syntax.)
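
To sketch what "HTML 5 with XML syntax" could look like, here is a small hand-written sample (not LibreOffice output) that uses figure/figcaption and embeds SVG and MathML inline, together with a Python check that it is well-formed XML. The sample document and the helper namespaces_used are assumptions made for illustration; the namespace URIs are the standard ones for XHTML, SVG and MathML.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hand-written sample, not produced by LibreOffice.
DOCUMENT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
  <head><meta charset="utf-8"/><title>Sample</title></head>
  <body>
    <figure>
      <svg xmlns="http://www.w3.org/2000/svg" width="20" height="20">
        <circle cx="10" cy="10" r="8"/>
      </svg>
      <figcaption>A circle</figcaption>
    </figure>
    <p><math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi><mo>=</mo><mn>1</mn></math></p>
  </body>
</html>"""


def namespaces_used(markup):
    """Parse markup as XML and return the set of element namespaces found."""
    root = ET.fromstring(markup)  # raises ET.ParseError if not well-formed
    return {el.tag[1:].split("}")[0] for el in root.iter() if el.tag.startswith("{")}


if __name__ == "__main__":
    for ns in sorted(namespaces_used(DOCUMENT)):
        print(ns)
```

A document written in the ordinary HTML syntax with unclosed tags such as <br> would fail this check with a parse error, which is the practical difference an "XML syntax" export option would have to cover.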