Bug 160970 - Problem in command line file conversion (XLSX to DBF) with special character
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 7.6.6.3 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL: https://ask.libreoffice.org/t/change-...
Whiteboard:
Keywords:
Depends on:
Blocks: Commandline
Reported: 2024-05-07 08:02 UTC by joerg.goerner
Modified: 2024-11-21 13:49 UTC

See Also:
Crash report or crash signature:


Attachments
Address list as sample (8.62 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2024-05-07 08:06 UTC, joerg.goerner
Description joerg.goerner 2024-05-07 08:02:05 UTC
Description:
I convert files from the command line like this:
"C:\Program Files\LibreOffice\program\scalc.exe" --convert-to dbf Testlist.xlsx

If a cell contains a string with the Czech character 'š' (code 154 in Windows-1252; it is not an ASCII character), the conversion stops before that row. I have also tried different character sets.

Steps to Reproduce:
1. Create a simple address list in Excel, like this:
   PLZ	ORT	STRASSE
   14169	Berlin	Teltower Damm 1
   140 00	Praha	Antala Staška 2
   42781	Haan	Schallbruch 3
2. Save the Excel file.
3. Convert the Excel file from the command line.

Actual Results:
The DBF file ends after the first data record.

Expected Results:
The complete address list with all records.


Reproducible: Always


User Profile Reset: No

Additional Info:
Version: 7.6.6.3 (X86_64) / LibreOffice Community
Build ID: d97b2716a9a4a2ce1391dee1765565ea469b0ae7
CPU threads: 12; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: de-DE
Calc: CL threaded
Comment 1 joerg.goerner 2024-05-07 08:06:19 UTC
Created attachment 194012 [details]
Address list as sample
Comment 2 Stéphane Guillou (stragu) 2024-05-23 05:17:55 UTC
When using the GUI, the default character set is "Western Europe (DOS/OS2-850/International)", which results in this error message:

Error saving the document Testlist:
Write Error.
Cell SfxBaseModel::impl_store <file:///home/stragu/Downloads/Testlist.dbf>
failed: 0x40c03(Error Area:Sc Class:Write Code:3) arg1=C3 arg2=Western
Europe (DOS/OS2-850/International) at /home/tdf/lode/jenkins/workspace/
lo_gerrit/tb/src_master/sfx2/source/doc/sfxbasemodel.cxx:3304 contains
characters that are not representable in the selected target character set "$
(ARG2)".

The resulting file has only one address.

Using the command line, I get in the console:

warn:connectivity.drivers:151848:151848:connectivity/source/drivers/dbase/DTable.cxx:521: Parsing warning: 0 records claimed, recovering
warn:sc:151848:151848:sc/source/ui/docshell/docsh8.cxx:986: ScDocShell::DBaseExport com.sun.star.sdbc.SQLException message: "The string “Antala Staška 2” cannot be converted using the encoding “ibm850”. at /home/tdf/lode/jenkins/workspace/lo_gerrit/tb/src_master/connectivity/source/commontools/dbtools2.cxx:910" SQLState: 22018 ErrorCode: 22018
    wrapped: 
warn:sc:151848:151848:sc/source/ui/docshell/docsh8.cxx:1045: ScDocShell::DBaseExport encoding error, string with default replacements: ``Antala Staška 2''
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///home/stragu/Downloads/Testlist.dbf> failed: 0x40c03(Error Area:Sc Class:Write Code:3) arg1=C3 arg2=Western Europe (DOS/OS2-850/International) at /home/tdf/lode/jenkins/workspace/lo_gerrit/tb/src_master/sfx2/source/doc/sfxbasemodel.cxx:3304 at /home/tdf/lode/jenkins/workspace/lo_gerrit/tb/src_master/sfx2/source/doc/sfxbasemodel.cxx:1822)

Same result.

One would need to pick a suitable character set for it, see: https://help.libreoffice.org/latest/en-US/text/shared/guide/lotusdbasediff.html
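
The ibm850 failure in the log above can be reproduced outside LibreOffice. The following is a minimal sketch, assuming Python's built-in "cp850" codec matches the character set LibreOffice calls "Western Europe (DOS/OS2-850/International)". The helper name unencodable_cells is made up for illustration; the rows are the sample data from the description.

```python
# Sample rows from the bug description (PLZ, ORT, STRASSE).
rows = [
    ("14169", "Berlin", "Teltower Damm 1"),
    ("140 00", "Praha", "Antala Staška 2"),
    ("42781", "Haan", "Schallbruch 3"),
]


def unencodable_cells(rows, encoding):
    """Return (row_index, value) for every cell that `encoding` cannot represent."""
    bad = []
    for index, row in enumerate(rows):
        for value in row:
            try:
                value.encode(encoding)
            except UnicodeEncodeError:
                bad.append((index, value))
    return bad


# Code page 850 has no 'š', so the second data row is reported.
print(unencodable_cells(rows, "cp850"))
```

Running a check like this over a spreadsheet's text before exporting shows which rows a given target character set cannot hold.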

For example this works for me, using the encoding "Windows-1250/WinLatin 2 (Central European)":

soffice --headless --convert-to dbf:dBase:33 ./Testlist.xlsx

Does an equivalent command work for you?
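
For scripted use, the call above can be assembled programmatically. This is a sketch only: build_convert_command is a hypothetical helper, the filter string dbf:dBase:<number> is copied from the command above, and the meaning of each number should be checked against the linked help page rather than assumed.

```python
def build_convert_command(input_path, charset_id, soffice="soffice"):
    """Build the argument list for a headless XLSX-to-DBF conversion.

    `charset_id` is the number after "dbf:dBase:" (33 in the comment above).
    `soffice` may be replaced by a full path to the LibreOffice executable.
    """
    return [
        soffice,
        "--headless",
        "--convert-to",
        f"dbf:dBase:{charset_id}",
        input_path,
    ]


cmd = build_convert_command("./Testlist.xlsx", 33)
print(" ".join(cmd))

# To actually run it (requires LibreOffice to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```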
Comment 3 QA Administrators 2024-11-20 03:16:52 UTC Comment hidden (obsolete)
Comment 4 joerg.goerner 2024-11-21 13:49:39 UTC
Sorry for the delay!

The recommended character set dBase:33 causes some other problems.

The result for 'Antala Staška 2' is then 'Antala StaÜka 2'. That is not correct, but it would not be a big problem for me in this case.

The original file has some more records.

Unfortunately, with this setting all German umlauts such as 'ä', 'ö' and 'ü' are incorrect, and the conversion now stops before the record containing the term 'M³ Raum'.

I have tried a lot of different filter parameters, but none of them solved the problem.
Maybe it is not possible to convert a file that mixes records from several languages?

In my opinion it would be acceptable if not all characters were converted correctly, but silently skipping all following records could be dangerous.
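
One possible explanation for the symptoms in Comment 4, offered as a hypothesis and not confirmed in this thread: the file is written as Windows-1250 but then viewed as code page 850, and '³' does not exist in Windows-1250 at all. The sketch below uses Python's built-in codecs ("cp850", "cp1250", "cp1252") as stand-ins for LibreOffice's character sets. The helper names are made up for illustration.

```python
def read_as(text, written_with, read_with):
    """Encode `text` with one code page and decode the bytes with another."""
    return text.encode(written_with).decode(read_with)


def can_encode(text, encoding):
    """True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def covering_encodings(samples, candidates):
    """Return the candidate encodings that can represent all sample strings."""
    return [enc for enc in candidates
            if all(can_encode(s, enc) for s in samples)]


# Written as Windows-1250, viewed as code page 850: 'š' shows up as 'Ü'.
print(read_as("Antala Staška 2", "cp1250", "cp850"))

# The same mismatch also changes the German umlauts.
print(read_as("äöü", "cp1250", "cp850") == "äöü")   # False

# '³' is missing from Windows-1250, which would stop the export there.
print(can_encode("M³ Raum", "cp1250"))               # False

# Strings quoted in this report, checked against three single-byte code pages.
samples = ["Antala Staška 2", "M³ Raum", "äöü"]
print(covering_encodings(samples, ["cp850", "cp1250", "cp1252"]))
```

Of the three code pages tried here, only Windows-1252 can represent all strings quoted in this report. Whether the dBase export behaves the same way with that character set, and which filter number selects it, is not established by this thread. Other records in the original file may also contain characters outside it.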