Description:
When converting an ANSI-encoded CSV file to PDF from the command line with LibreOffice, special characters such as ™ and ® are replaced with question marks � in the resulting PDF file. The command used for the conversion is:

soffice.exe --headless --convert-to pdf "D:\MyCsvFile.csv" --outdir "D:\conversionResults"

The issue does not occur when the same file is opened in the LibreOffice GUI and then exported to PDF from Calc. Saving the CSV file in UTF-8 encoding with Notepad++ also lets the command-line conversion succeed, preserving all original content.

As a workaround, specifying the ANSI encoding with the flag --infilter="CSV:44,34,ANSI" on the command line converts ANSI-encoded files to PDF correctly.

These observations suggest a problem with identifying the encoding of the source file when it is loaded for conversion through the command-line interface. The issue is reproducible in all stable releases following version 7.4.7.2.

Steps to Reproduce:
1. Create a .csv file that contains special characters such as ™ and ® and save it using ANSI encoding, or take the .csv file from the attachments.
2. Install any of the affected versions of LibreOffice (any from 7.5.0.1 to 7.6.6).
3. Convert the .csv file to PDF using the following command line, replacing paths as necessary:
   soffice.exe --headless --convert-to pdf "D:\MyCsvFile.csv" --outdir "D:\conversionResults"
4. Inspect the resulting PDF document.

Actual Results:
Special characters such as ™ and ® from the source file are replaced with question marks � in the resulting PDF file.

Expected Results:
All content from the original file is preserved in the resulting PDF document without any unwanted replacements. All special characters should be kept.

Reproducible: Always

User Profile Reset: No

Additional Info:
Version: 7.6.5.2 (X86_64) / LibreOffice Community
Build ID: 38d5f62f85355c192ef5f1dd47c5c0c0c6d6598b
CPU threads: 8; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded
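
The reproduction steps above can be sketched as a small Python helper. This is only an illustration, not part of the original report: the function names `write_ansi_csv` and `build_commands` are hypothetical, "ANSI" is assumed to mean Windows-1252 (the usual case on an en-US Windows system), and the position of `--infilter` within the argument list is an assumption. The sketch only builds the argument lists; it does not start LibreOffice.

```python
from pathlib import Path


def write_ansi_csv(path: Path) -> bytes:
    """Write a small CSV encoded as Windows-1252 and return the raw bytes.

    Assumption: "ANSI" in the report means code page 1252.
    """
    # \u2122 is the trademark sign, \u00ae is the registered sign.
    text = "Name,Mark\r\nWidget\u2122,Brand\u00ae\r\n"
    data = text.encode("cp1252")  # trademark -> 0x99, registered -> 0xAE
    path.write_bytes(data)
    return data


def build_commands(csv_path: str, out_dir: str) -> tuple[list[str], list[str]]:
    """Return (plain, with_infilter) argument lists for soffice.

    The flags are the ones quoted in the report. In an argument list the
    shell quotes around the filter options are not needed.
    """
    plain = [
        "soffice.exe", "--headless",
        "--convert-to", "pdf",
        csv_path,
        "--outdir", out_dir,
    ]
    # Same command, with the import filter options from the report added.
    with_infilter = plain[:2] + ["--infilter=CSV:44,34,ANSI"] + plain[2:]
    return plain, with_infilter
```

Either list could then be passed to `subprocess.run` on a machine where `soffice.exe` is on the PATH, to compare the two resulting PDF files.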
Created attachment 193215
CSV file in ANSI encoding that contains special characters. Use it as the source file.
Created attachment 193216
Resulting PDF file that shows the problem with the replaced special symbols.
This may be related to https://bugs.documentfoundation.org/show_bug.cgi?id=150714: the default encoding is UTF-8, so the encoding has to be specified if it is different. CSV files are plain text and carry no encoding declaration.

>>To work around the issue, specifying the ANSI encoding with the flag --infilter="CSV:44,34,ANSI" in the command line enables successful conversion of ANSI-encoded files to PDF.
That is not a workaround; it is part of the command-line options.

I think this is not a bug.
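
A short Python sketch of why the characters come out as �, assuming (as this comment suggests) that the file is read as UTF-8 and that "ANSI" means Windows-1252. It illustrates the encoding mismatch only; it says nothing about how LibreOffice decodes files internally.

```python
# Bytes as an editor saving in Windows-1252 ("ANSI") would write them.
# \u2122 is the trademark sign, \u00ae is the registered sign.
original = "Widget\u2122,Brand\u00ae"
ansi_bytes = original.encode("cp1252")  # b'Widget\x99,Brand\xae'

# 0x99 and 0xAE are not valid on their own in UTF-8. With errors="replace"
# each invalid byte becomes U+FFFD, the replacement character.
as_utf8 = ansi_bytes.decode("utf-8", errors="replace")

# Decoding with the code page the file was written in restores the text.
as_cp1252 = ansi_bytes.decode("cp1252")

print(ascii(as_utf8))    # 'Widget\ufffd,Brand\ufffd'
print(ascii(as_cp1252))  # 'Widget\u2122,Brand\xae'
```

The same bytes are readable or broken depending only on which encoding the reader assumes, which is why naming the encoding in the filter options changes the result.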
(In reply to m_a_riosv from comment #3)
> Maybe in relation with
> https://bugs.documentfoundation.org/show_bug.cgi?id=150714
> default encoded is UTF-8-encoded
> So is needed to put the encoded if it is different. CSV files are plain text
> with no encoded definition.
>
> >>To work around the issue, specifying the ANSI encoding with the flag --infilter="CSV:44,34,ANSI" in the command line enables successful conversion of ANSI-encoded files to PDF.
> It is not a workaround, it is part of command line options.
>
> I think, not a bug.

Thanks for your reply.
Why does it work when I load my .csv file via the GUI, then? I assumed some logic automatically determines the encoding before loading the content (and that this logic could be broken).
Maybe it's a deluxe request, but I think it would be extremely useful if converting from CSV to PDF via the command line could identify the encoding automatically (similar to what happens when opening via the UI).
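
For scripts that call the converter, one way to approximate the automatic detection requested here is to check the file before building the command line. The sketch below is a common heuristic and an assumption on my part, not LibreOffice behavior: `guess_csv_encoding` is a hypothetical helper, and the fallback to Windows-1252 only makes sense for files known to come from a Western European Windows system.

```python
def guess_csv_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Guess a text encoding for raw CSV bytes.

    Heuristic only: a UTF-8 BOM or a clean strict UTF-8 decode means UTF-8;
    anything else is assumed to be the single-byte fallback code page.
    """
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"  # UTF-8 with byte order mark
    try:
        data.decode("utf-8")  # strict: raises on any invalid byte sequence
    except UnicodeDecodeError:
        return fallback
    return "utf-8"
```

A caller could add --infilter="CSV:44,34,ANSI" only when the guess is the fallback code page and leave the default in place otherwise. Pure ASCII files decode cleanly as UTF-8, so they are reported as "utf-8", which is harmless because ASCII reads the same in both encodings.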
@https://bugs.documentfoundation.org/show_bug.cgi?id=160289#c4
There is no detection logic on import via the GUI, except that the import __dialog__ »remembers« the last settings chosen by the user (and the user has the option to change them!).

IMHO it would be a bad idea to apply such a rule implicitly to command-line conversions.

My vote: RESOLVED ⇒ NOT A BUG