Bug 37615 - Make UTF-8 the default charset for saving as HTML
Status: RESOLVED DUPLICATE of bug 148413
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 3.5.0 RC1
Hardware: All Windows (All)
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: (X)HTML-Export
Reported: 2011-05-26 00:35 UTC by soshial
Modified: 2022-05-12 15:24 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
screenshot of Options - Load-Save - HTML Compatibility (16.54 KB, image/png)
2011-05-26 01:02 UTC, clio
an HTML file saved after fresh install of v4.1.1.2 on Win7 (15.17 KB, text/html)
2013-09-17 20:28 UTC, soshial

Description soshial 2011-05-26 00:35:35 UTC

    
Comment 1 Don't use this account, use tml@iki.fi 2011-05-26 00:51:12 UTC
So? Isn't that what the <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1252"> tag in the saved file says then? Do we document that it would save in UTF-8? I fail to see how this is a bug. Sure, it would be more "cool" and modern to save as UTF-8, but unless I miss something this is just a low priority enhancement request.
Comment 2 Don't use this account, use tml@iki.fi 2011-05-26 00:52:21 UTC
To avoid confusion, let me add that the initial title of this bug report was "HTML files are saved in local encoding (e.g. cp1251) instead of utf8"
Comment 3 clio 2011-05-26 01:02:21 UTC
Created attachment 47177
screenshot of Options - Load-Save - HTML Compatibility

You can choose any encoding in Options - Load-Save - HTML Compatibility settings.
Comment 4 Don't use this account, use tml@iki.fi 2011-05-26 01:08:00 UTC
OK, so clearly NOTABUG then.
Comment 5 soshial 2011-05-26 02:18:35 UTC
(In reply to comment #3)
> Created an attachment (id=47177)
> screenshot of Options - Load-Save - HTML Compatibility
> 
> You can choose any encoding in Options - Load-Save - HTML Compatibility
> settings.

Then why not enable UTF8 in that HTML Compatibility menu by default?
Can it be a small enhancement request?

Thanks.
Comment 6 Don't use this account, use tml@iki.fi 2011-05-26 03:17:58 UTC
OK, sure.
Comment 7 Björn Michaelsen 2011-12-23 12:04:07 UTC Comment hidden (noise)
Comment 8 soshial 2012-01-24 11:06:09 UTC
This bug persists in LO 3.5.0 RC1 on Windows XP SP3.
Comment 9 ign_christian 2013-06-25 08:18:03 UTC
I've just done a fresh install of LO 4.0.4.2 with a new user profile on PCLinuxOS KDE. Unicode (UTF-8) is applied as the default character set (same as in the screenshot).
Comment 10 soshial 2013-09-17 20:26:33 UTC
This bug happens only on the Windows platform. See the attachment for proof.
Comment 11 soshial 2013-09-17 20:28:25 UTC
Created attachment 86016
an HTML file saved after fresh install of v4.1.1.2 on Win7

proof of the bug
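
For anyone who wants to reproduce the check on their own saved file, here is a minimal Python sketch that reads the charset declared in the META tag and tests whether the bytes decode under a given encoding. The helper names (declared_charset, decodes_as) and the sample bytes are made up for illustration; they are not taken from the attachment or from LibreOffice.

```python
import re

# Matches charset=... inside a <meta> tag, for example:
# <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1252">
_META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def declared_charset(html_bytes):
    """Return the lowercased charset named in the first matching <meta> tag, or None."""
    match = _META_CHARSET.search(html_bytes[:4096])
    return match.group(1).decode("ascii").lower() if match else None


def decodes_as(html_bytes, charset):
    """Return True if the bytes decode cleanly with the given charset."""
    try:
        html_bytes.decode(charset)
        return True
    except (UnicodeDecodeError, LookupError):
        return False


# Illustrative sample only: a tiny document encoded as windows-1252.
sample = (
    '<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1252">'
    "<P>caf\u00e9</P>"
).encode("windows-1252")

print(declared_charset(sample))
print(decodes_as(sample, "utf-8"))
```

To test a real file, pass the result of open(path, "rb").read() instead of the sample bytes.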
Comment 12 Joel Madero 2014-11-04 03:37:04 UTC
Should stop calling it a bug - it works as expected. It's an enhancement request which might never be implemented if no developer wants to volunteer to tackle it.

Moving to NEW as REOPENED is incorrect.
Comment 13 Andreas Heinisch 2021-08-10 13:45:15 UTC
This can be solved in various ways, and at the moment I am not sure which is the best approach:

- The easiest way to solve this is to just change the encoding to UTF-8 for all platforms in https://opengrok.libreoffice.org/xref/core/cui/source/options/opthtml.cxx?r=da9bba7c#62 (m_xCharSetLB->SelectTextEncoding(RTL_TEXTENCODING_UTF8);)

- The best way, however, would be to check whether the functions SvxTextEncodingBox::FillWithMimeAndSelectBest and SvtSysLocale::GetBestMimeEncoding can be changed to always show the UTF-8 charset. However, these functions retrieve the charset from the Windows code page, and on Windows that yields the windows-1252 charset. You may change the charset on Windows to UTF-8 as well.

- Maybe SvxTextEncodingBox::FillWithMimeAndSelectBest could also be adjusted so that it always selects the UTF-8 charset on every platform.
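
To make the difference between the locale-derived default and a forced UTF-8 default concrete, here is a small Python model. It is only a sketch: default_html_charset is a hypothetical helper, not a LibreOffice function, and the fallback to UTF-8 when no locale charset is known is an assumption.

```python
def default_html_charset(locale_mime_charset, force_utf8):
    """Pick the default charset for HTML export (hypothetical model, not LibreOffice code)."""
    if force_utf8:
        # Proposed behaviour: always preselect UTF-8, on every platform.
        return "utf-8"
    # Locale-derived behaviour as described above: use the charset the system
    # locale maps to. Falling back to UTF-8 when none is known is an assumption.
    return locale_mime_charset or "utf-8"


print(default_html_charset("windows-1252", force_utf8=False))
print(default_html_charset("windows-1252", force_utf8=True))
```

With force_utf8 set, the system code page no longer influences the preselected value; the user could still pick another encoding in the options dialog.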

Opinions?
Comment 14 Andreas Heinisch 2022-05-12 15:24:22 UTC

*** This bug has been marked as a duplicate of bug 148413 ***