Bug 151446 - command line convert-to option codepage issue
Status: RESOLVED DUPLICATE of bug 98153
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 7.4.1.2 release
Hardware: All Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-10-10 09:47 UTC by Evgen
Modified: 2023-02-21 09:10 UTC

See Also:
Crash report or crash signature:


Description Evgen 2022-10-10 09:47:12 UTC
Description:
A command line like
soffice.exe --convert-to html Test.ods
produces an HTML file in UTF-8 encoding in the 7.4.1 release, while 7.3.6 produced it in the default codepage, CP1251.
Also, the options described in the documentation for "--convert-to" either do not work, or it is unclear what they do.

In addition, in most cases LibreOffice does nothing and gives no diagnostics, so I can't figure out what's wrong.

Steps to Reproduce:
1. Look at the --convert-to option at https://help.libreoffice.org/latest/ru/text/shared/guide/start_parameters.html?&DbPAR=SHARED&System=WIN
2. Create a test Calc and/or text file, test.ods or test.odt
3. Try to convert it to HTML from the command line
4. Look at the codepage of test.html
5. Next, try to convert to HTML with a non-UTF-8 codepage
6. Next, try to convert to txt
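
For step 4, one way to check the encoding of the output without relying on an editor's guess is a short script. This is only a rough sketch: the function names and the labels it returns are made up for illustration, and it can only tell "valid UTF-8" from "not UTF-8". It cannot prove that a file is CP1251 rather than some other single-byte codepage.

```python
from pathlib import Path


def guess_encoding(data: bytes) -> str:
    """Classify raw bytes as 'utf-8-bom', 'ascii', 'utf-8' or 'legacy'.

    'legacy' only means the bytes are not valid UTF-8; a single-byte
    codepage such as CP1251 is one possible explanation, not a certainty.
    """
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-bom"
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "legacy"
    # Pure ASCII decodes identically under UTF-8 and CP1251,
    # so the file itself carries no evidence either way.
    return "ascii" if text.isascii() else "utf-8"


def guess_file_encoding(path) -> str:
    """Read a file as bytes and classify it with guess_encoding()."""
    return guess_encoding(Path(path).read_bytes())
```

Calling guess_file_encoding("test.html") after each conversion would show whether the release in question wrote UTF-8 or something else. Note that a test file containing only Latin characters reports "ascii" in both cases, so the test document needs some Cyrillic text.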

Actual Results:
Here is a set of Windows command files (cmd), each preceded by the result it gave:
-----
REM no results
"C:\Program Files\LibreOffice\program\soffice.exe" --convert-to "txt:Text (encoded):UTF8" "Test.ods"
-----
REM no results
"C:\Program Files\LibreOffice\program\soffice.exe" --convert-to txt "Test.ods"
-----
REM test.html 1052 bytes long  with UTF8 coding
REM start with <!DOCTYPE html>
"C:\Program Files\LibreOffice\program\soffice.exe" --convert-to html "Test.ods"
-----
REM 7.4.1.1 creates test.html with UTF8 coding
REM 7.3.6.2 creates test.html with CP1251 coding
"C:\Program Files\LibreOffice\program\soffice.exe" --convert-to "html" "Test.ods"
--------
REM no results, no diagnostic
"C:\Program Files\LibreOffice\program\soffice.exe" --convert-to "html:" "Test.ods"
-----
REM test.html 0 bytes long created
"C:\Program Files\LibreOffice\program\soffice.exe" --convert-to "html:XHTML Writer File:UTF8"  "Test.ods"
-----
REM test.html with UTF8 coding created
"C:\Program Files\LibreOffice\program\soffice.exe" --convert-to "html:XHTML Writer File:UTF8"  "test.odt"
-----
REM no results, no diagnostic
"C:\Program Files\LibreOffice\program\soffice.exe" --convert-to "html:HTML Document (Writer):UTF8"  "test.odt"
-----
REM test.html 2415 bytes long  with UTF8 coding
REM start with <?xml version="1.0" encoding="UTF-8"?>
"C:\Program Files\LibreOffice\program\soffice.exe" --convert-to "html:XHTML Calc File:UTF8"  "test.ods"




Expected Results:
The documentation should not be confusing, and LibreOffice should work as expected


Reproducible: Always


User Profile Reset: No



Additional Info:
The documentation should contain clear instructions, with examples, on how to convert Calc files (e.g. .ods and .xls) to HTML with a required codepage
Comment 1 Mike Kaganski 2022-10-10 10:05:43 UTC
It was changed in tdf#148413 ( commit e4f53484d255f844169957c411dc3e872af7d3bb ), and now it simply uses UTF-8 unconditionally.

So now there is *no* way to convert anything to html "with required codepage" - what is the use case?
Comment 2 Evgen 2022-10-10 10:11:48 UTC
If there is *no* way to convert anything to html "with required codepage", then what do the incantations <<--convert-to "html:XHTML Writer File:UTF8">> and <<--convert-to "txt:Text (encoded):UTF8">> mean?
Comment 3 Mike Kaganski 2022-10-10 11:03:14 UTC
(In reply to Evgen from comment #2)

You are talking about different things.

1. When talking about HTML export, I already answered you.
2. You also talk about TXT and XHTML export, which have their own arguments.

Additionally, your original description is confusing, with all those "REM no results" which do not allow one to understand what you are talking about - does it mean it doesn't produce a file? or the resulting file is wrong? or that you expect some result on screen?

Talking about the latter (result on screen), there was a change in 6.3 [1], which made LibreOffice support the proper console mode. And in v.7.4, the respective help page was corrected [2] to point to the proper console binary that must be called instead of soffice.exe.

Also, there was some progress with documenting filter arguments in the recent releases; the start parameters help page that you mentioned got links to list of document filters, filter options for Lotus, dBase and Diff files, and filter options for CSV files. Indeed, there are other filters that still need documentation (e.g., Text (encoded) filter - bug 140781, which I filed myself, and which was resolved somehow, but without a dedicated help page similar to CSV one AFAICT - Olivier, do you have an idea?), and you are welcome to participate in implementing those help pages [3].

[1] https://wiki.documentfoundation.org/ReleaseNotes/6.3#Windows
[2] https://help.libreoffice.org/7.4/en-US/text/shared/guide/start_parameters.html?&DbPAR=SHARED&System=WIN
[3] https://www.libreoffice.org/community/docs-team/
Comment 4 Evgen 2022-10-12 07:45:48 UTC
Well, let's look at --convert-to help ( https://help.libreoffice.org/latest/ru/text/shared/guide/start_parameters.html?&DbPAR=SHARED&System=WIN ):

--convert-to OutputFileExtension
[:OutputFilterName
[:OutputFilterParams[,param]]]
[--outdir output_dir]

and "--convert-to "html:XHTML Writer File:UTF8" *.doc"

What is OutputFileExtension in this example? It is html.
What is OutputFilterName? It is "XHTML Writer File".

Look now at File Conversion Filter Names at https://help.libreoffice.org/latest/ru/text/shared/guide/convertfilters.html?&DbPAR=SHARED&System=WIN

Where is "XHTML Writer File"  in "Filter Name" column? 

So the help is confusing
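
To make the reading of the syntax above explicit, here is a minimal Python sketch that splits a --convert-to argument into the three documented fields. It only mirrors the syntax as quoted from the help page; it is not LibreOffice's own parser, the names ConvertTo and parse_convert_to are hypothetical, and how LibreOffice itself treats an empty filter name such as "html:" is not something this sketch can tell.

```python
from typing import NamedTuple, Optional


class ConvertTo(NamedTuple):
    extension: str                # OutputFileExtension
    filter_name: Optional[str]    # OutputFilterName, if given
    filter_params: Optional[str]  # OutputFilterParams, if given


def parse_convert_to(arg: str) -> ConvertTo:
    """Split 'ext[:FilterName[:FilterParams]]' on the first two colons.

    Assumption: anything after the second colon belongs to the filter
    parameters, so later colons are left inside that field.
    """
    parts = arg.split(":", 2)
    filter_name = parts[1] if len(parts) > 1 else None
    filter_params = parts[2] if len(parts) > 2 else None
    return ConvertTo(parts[0], filter_name, filter_params)
```

Under this reading, "html:XHTML Writer File:UTF8" gives the extension html, the filter name "XHTML Writer File" and the parameter string UTF8, while plain "html" leaves both optional fields unset.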
Comment 5 Evgen 2022-10-12 07:55:13 UTC
>>Additionally, your original description is confusing, with all those "REM no results" which do not allow one to understand what you are talking about

All those lines are plain cmd commands that you can run from the command line.
I was just trying to understand how it works and what is wrong, without any success.
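
A note like "no results" can be made unambiguous by recording, after each command, whether the output file exists and how large it is. A small Python sketch of that check follows; the function name describe_output and the wording of its messages are illustrative only.

```python
from pathlib import Path


def describe_output(path) -> str:
    """Report whether a conversion left a file behind and how big it is."""
    p = Path(path)
    if not p.exists():
        return "no file produced"
    size = p.stat().st_size
    if size == 0:
        return "empty file (0 bytes)"
    return f"{size} bytes"
```

Running describe_output("test.html") after each command separates the three cases seen above: nothing written, a 0-byte file, and a file with content.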
Comment 6 Mike Kaganski 2022-10-12 09:18:27 UTC
(In reply to Evgen from comment #4)
> So the help is confusing

Heh, see bug 98153 comment 10.
Comment 7 Buovjaga 2023-02-21 09:10:03 UTC
The discussion stopped and it seems we can close this as dupe of bug 98153

*** This bug has been marked as a duplicate of bug 98153 ***