Bug 155469 - Character set settings changed
Summary: Character set settings changed
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer Web
Version (earliest affected): 7.5.2.2 release
Hardware: All Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-05-24 06:32 UTC by Josep
Modified: 2023-05-29 12:14 UTC

See Also:
Crash report or crash signature:


Attachments
Original file displayed correctly (712 bytes, text/html)
2023-05-29 08:07 UTC, Josep
Second file wrongly displayed (557 bytes, text/html)
2023-05-29 08:10 UTC, Josep

Description Josep 2023-05-24 06:32:16 UTC
Description:
I've been using LibreOffice for years to edit static HTML pages. Since a few days ago, whenever I edit a document that contains special characters, all of those characters are changed when the document is saved. I also tried to reproduce the issue with a new document, and it happens there too.


Steps to Reproduce:
1. Create and save a new HTML document containing the text:
documentó


Possible cause:

The original (fine) document contains this meta tag:
<meta http-equiv="content-type" content="text/html; charset=windows-1252"/>

The new one (or the original one re-saved with the latest LibreOffice version) contains:
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>

Actual Results:
When displayed using web browser it appears as:
documentó

Expected Results:
It should show:
documentó
This is the content I see in LibreOffice before saving.


Reproducible: Always


User Profile Reset: Yes

Additional Info:
LibreOffice version used:
Version: 7.5.2.2 (X86_64) / LibreOffice Community
Build ID: 53bb9681a964705cf672590721dbc85eb4d0c3a2
CPU threads: 12; OS: Windows 10.0 Build 22621; UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: es-ES
Calc: CL threaded
Comment 1 Mike Kaganski 2023-05-24 08:13:58 UTC
In commit e4f53484d255f844169957c411dc3e872af7d3bb (tdf#148413: Drop HTML export encoding configuration; use UTF-8, 2022-04-06), we changed to UTF-8 unconditionally. This happened in LibreOffice 7.4.

But I cannot reproduce the problem that you describe:

> When displayed using web browser it appears as:
> documentó

The document created according to your directions displays fine in Chrome and in Firefox, because it is properly UTF-8-encoded HTML.

Could you please attach such a problematic HTML document?
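
For reference, the reported symptom matches what happens when UTF-8 bytes are decoded as windows-1252. A minimal Python sketch, assuming (this is not confirmed in the report) that the browser ends up reading the file with the wrong decoder:

```python
# Assumption: the file is saved as UTF-8 but something makes the
# browser decode it as windows-1252 (Python codec name: cp1252).
text = "documentó"

utf8_bytes = text.encode("utf-8")       # bytes as written by a UTF-8 export
misread = utf8_bytes.decode("cp1252")   # same bytes read as windows-1252
correct = utf8_bytes.decode("utf-8")    # same bytes read as UTF-8

print(utf8_bytes)  # "ó" is stored as the two bytes 0xC3 0xB3
print(misread)     # each of those bytes becomes its own character
print(correct)     # the original text
```

The two bytes of "ó" map to "Ã" and "³" in windows-1252, which is the text quoted under Actual Results.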
Comment 2 Josep 2023-05-29 08:07:59 UTC
Created attachment 187570 [details]
Original file displayed correctly

This file is a sample that is displayed correctly in the browser.
Comment 3 Josep 2023-05-29 08:10:30 UTC
Created attachment 187571 [details]
Second file wrongly displayed

This file is the same test1.html file but just opened and saved with LibreOffice Writer.
This new version appears wrong on browser.
Comment 4 Mike Kaganski 2023-05-29 09:10:47 UTC
(In reply to Josep from comment #3)
> Created attachment 187571 [details]
> Second file wrongly displayed
> 
> This file is the same test1.html file but just opened and saved with
> LibreOffice Writer.

This file is a perfectly correct UTF-8-encoded HTML file. The encoding is correctly reflected in the meta tag. Nothing is wrong, except

> This new version appears wrong on browser.

which means you have a problem with your browser (Wikipedia claims that "UTF-8 is the dominant encoding for the World Wide Web (and internet technologies), accounting for 97.9% of all web pages, over 99.0% of the top 10,000 pages, and up to 100% for many languages, as of 2023", so no real browser can avoid supporting UTF-8).
Comment 5 Josep 2023-05-29 11:46:27 UTC
(In reply to Mike Kaganski from comment #4)
> (In reply to Josep from comment #3)
> > Created attachment 187571 [details]
> > Second file wrongly displayed
> > 
> > This file is the same test1.html file but just opened and saved with
> > LibreOffice Writer.
> 
> This file is perfectly correct UTF-8-encoded HTML file. The encoding is
> correctly reflected in the meta. Nothing wrong, except
> 
> > This new version appears wrong on browser.
> 
> which means you have a problem with your browser (Wikipedia claims, that
> "UTF-8 is the dominant encoding for the World Wide Web (and internet
> technologies), accounting for 97.9% of all web pages, over 99.0% of the top
> 10,000 pages, and up to 100% for many languages, as of 2023" - so no real
> browser can avoid supporting UTF-8).

Hello and thanks for the response.
I'm currently using Firefox 113.01.

I also tested with Edge 113.0.1774.57 and got the same result.

Could the problem come from the Apache server that is serving these files?

Thanks.
Comment 6 Mike Kaganski 2023-05-29 12:14:42 UTC
(In reply to Josep from comment #5)
> Could it be possible that problem comes from apache server that is providing
> this files?

I am not an expert in web server configuration; however, if a web server advertises content with some fixed encoding (that could possibly be configured), then browsers would indeed have to use that incorrectly advertised encoding, instead of the one indicated in the file itself.
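
The precedence described above can be sketched in Python. This is a simplified model, not a browser's actual algorithm: the function names are hypothetical, the meta scan is a rough regular expression, and the `ISO-8859-1` header value is only an assumed example of what a server might send.

```python
import re
from email.message import Message

# Rough pattern for a charset declared in a <meta> tag (simplified on purpose).
META_CHARSET = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)

def header_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, None if absent

def meta_charset(html_bytes):
    """Return the charset named in a <meta> tag near the start of the file."""
    match = META_CHARSET.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None

def effective_charset(content_type, html_bytes):
    """The HTTP header wins; the in-file meta tag is only a fallback."""
    return header_charset(content_type) or meta_charset(html_bytes)

body = (
    '<meta http-equiv="content-type" content="text/html; charset=utf-8"/>'
    "<p>documentó</p>"
).encode("utf-8")

# Hypothetical server that forces a charset in the response header.
forced = effective_charset("text/html; charset=ISO-8859-1", body)
# Server that sends no charset, so the meta tag decides.
unforced = effective_charset("text/html", body)

print(forced, "->", body.decode(forced))
print(unforced, "->", body.decode(unforced))
```

If the server does send such a header, the response headers shown in the browser's network tools should reveal it. On Apache, the `AddDefaultCharset` directive is one setting that adds a charset to `text/html` responses; whether it is in effect here cannot be determined from this report.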