Bug 166211 - Calc any write to a csv file corrupts file 25.2.2.2
Status: RESOLVED DUPLICATE of bug 166208
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 25.2.2.2 release
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-04-16 17:58 UTC by dpkesling
Modified: 2025-04-28 15:58 UTC
CC List: 1 user

See Also:
Crash report or crash signature:


Attachments
representative data file (60.86 KB, application/vnd.ms-excel), 2025-04-16 18:01 UTC, dpkesling
the line ending changes from WINDOWS (cr-lf) to UNIX (lf) (398.98 KB, image/png), 2025-04-16 19:34 UTC, Mateusz Wlazłowski
R session screengrab (138.39 KB, image/png), 2025-04-18 02:38 UTC, dpkesling
untouched data file (1.11 MB, application/vnd.ms-excel), 2025-04-18 02:39 UTC, dpkesling
modified file saved by Calc showing corrupted fields (134.67 KB, image/png), 2025-04-18 02:40 UTC, dpkesling
Calc-saved modified data file (1.23 MB, application/vnd.ms-excel), 2025-04-18 02:42 UTC, dpkesling
Calc Options page grab (169.98 KB, image/png), 2025-04-18 02:43 UTC, dpkesling
my machine's tech deets (71.30 KB, image/png), 2025-04-18 02:43 UTC, dpkesling

Description dpkesling 2025-04-16 17:58:06 UTC
Description:
I download a CSV file generated by another program. I use an "R" language program to read it (data_raw <- read.csv("000-mislocates.csv", header = TRUE, stringsAsFactors = FALSE, na.strings = "N/A")). The program runs fine. Anything that causes Calc to re-write that CSV file causes it to scramble the data. This is NEW in 25.2.2.2. I downgraded to 24.8.6.2 and the problem disappeared. One of the columns of my data contains a comma, e.g. "August 26, 2023", and I presume this is the field that v.25 wrecks but v.24 handles just fine.

The sinister part is that FROM THE CALC SCREEN you do not see any problems; the table looks just fine. However, when you perform a "read.csv" from the "R" program, the columns are scrambled: the parse/placement into variables based upon column numbers doesn't interpret the columns correctly, and it seems to insert "phantom columns" with invented bad data. AGAIN, this problem is NOT apparent from the Calc screen presentation, where everything looks fine. It seems pretty obvious that the NEW version of Calc cannot deal with data containing a comma, whereas the OLD version could.
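For reference, a field containing a comma survives a CSV round trip as long as it is written inside double quotes. The following is a minimal Python sketch of that expectation, not a reproduction of what Calc or R does; the helper names `write_row` and `read_row` and the sample row are made up for illustration.

```python
import csv
import io


def write_row(fields):
    """Serialize one row with the csv module's default (minimal) quoting."""
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    return buf.getvalue()


def read_row(line):
    """Parse one CSV line back into a list of fields."""
    return next(csv.reader(io.StringIO(line)))


# Hypothetical row: the middle field contains a comma, like the dates in the report.
row = ["1", "August 26, 2023", "x"]
line = write_row(row)

print(repr(line))             # the comma-bearing field is wrapped in double quotes
print(read_row(line) == row)  # the round trip keeps three fields
```

If a saved file no longer parses this way, the quoting or the encoding of the quote characters is the first thing to check.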

Steps to Reproduce:
1. Download a good .csv file generated by another program, which contains a data field containing a comma ("June 11,2023", for example).
2. Perform any Calc operation which re-writes the file, such as "rename", "save", or "save as".

Actual Results:
Internal representation of the column fields is corrupted. Phantom columns are added, filled with a consistent pattern of junk data.

Expected Results:
I would expect a 100% correct transfer of data and column formatting when saving data. The copy should exactly replicate the original.


Reproducible: Always


User Profile Reset: No

Additional Info:
I have a sample data file which I will attach for your use.
NO. I did not reset my UserProfile as this problem is not related to that process.
Comment 1 dpkesling 2025-04-16 18:01:23 UTC
Created attachment 200362 [details]
representative data file
Comment 2 Mike Kaganski 2025-04-16 18:13:27 UTC
Please also provide the document after save. It is not clear, which options were selected in your import dialog, nor what options were set in the export filter settings - both are important here. Even if you didn't set them explicitly, their default values would depend on your locale, etc. So the exported data at least helps to see the end result on your system. And please do provide screenshots of your import and export filter option dialogs. Thanks.
Comment 3 Mateusz Wlazłowski 2025-04-16 19:34:08 UTC
Created attachment 200364 [details]
the line ending changes from WINDOWS (cr-lf) to UNIX (lf)

Are you by any chance on Linux? When I save the file as CSV, the content is the same; only the line endings change from Windows (CR-LF) to Unix (LF).
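A line-ending difference like this can be confirmed without a screenshot by counting the terminators in the raw bytes. A minimal Python sketch, assuming the file fits in memory; the function name `count_line_endings` and the sample bytes are illustrative only.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CR-LF) and bare Unix (LF) line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CR-LF also contains an LF, so subtract those to get bare LFs.
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf}


# Illustrative samples standing in for the original and the re-saved file.
original = b"a,b\r\nc,d\r\n"
resaved = b"a,b\nc,d\n"

print(count_line_endings(original))
print(count_line_endings(resaved))
```

To run it on real files, read them in binary mode, for example `open(path, "rb").read()`, so that Python does not translate the line endings first.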
Comment 4 dpkesling 2025-04-18 02:38:18 UTC
Created attachment 200382 [details]
R session screengrab

Requested data files: xraw2023.csv is the untouched download from the host process; yraw2023.csv was independently downloaded, then had two column heading texts changed, then was saved via "Save" with "Save as CSV" selected.
Comment 5 dpkesling 2025-04-18 02:39:11 UTC
Created attachment 200383 [details]
untouched data file

data file from host
Comment 6 dpkesling 2025-04-18 02:40:41 UTC
Created attachment 200384 [details]
modified file saved by Calc showing corrupted fields

compare this file to the "untouched" xraw data
Comment 7 dpkesling 2025-04-18 02:42:07 UTC
Created attachment 200385 [details]
Calc-saved modified data file

modified by Calc to change two column heading texts, then "Save"
Comment 8 dpkesling 2025-04-18 02:43:17 UTC
Created attachment 200386 [details]
Calc Options page grab
Comment 9 dpkesling 2025-04-18 02:43:54 UTC
Created attachment 200387 [details]
my machine's tech deets
Comment 10 dpkesling 2025-04-18 02:47:01 UTC
I have added a number of files to this Incident as per your request. xraw2023.csv is pristine, as downloaded. yraw2023.csv started as an identical, independent download; the only change was to edit two header fields, then "Save" in Calc using "Save as CSV" rather than ODS.
Comment 11 Mike Kaganski 2025-04-18 05:09:40 UTC
(In reply to dpkesling from comment #6)
> Created attachment 200384 [details]
> modified file saved by Calc showing corrupted fields

This means that in your *CSV export* settings, UTF-7 (not UTF-8!) encoding is selected. You likely chose it when saving some newly created file.

1. Open your original CSV file in LibreOffice Calc.
2. File->Save As.
3. In the File Picker dialog, check the "Edit filter settings" checkbox, and press OK.
4. Confirm file format (Use Text CSV Format), if asked.
5. In the "Character set" field of the Export Text File dialog, make sure to change "UTF-7" (the encoding creating those "+AC0-" things) to "UTF-8" (or the other wanted encoding).
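The "+AC0-" pattern from step 5 can also be checked outside Calc. In UTF-7, "+AC0-" is the base64 form of "-". The sketch below additionally assumes that a UTF-7 export writes the double quote as "+ACI-"; the report only confirms "+AC0-", so treat that part as a hypothesis. Under that assumption, a reader that treats the file as plain ASCII no longer sees the quotes around "August 26, 2023" and splits the field at its comma, which would produce the kind of extra columns described in comment 0. The sample bytes and the names `looks_like_utf7` and `parse_fields` are made up for illustration.

```python
import csv
import io
import re

# A "+", some base64 characters, then "-": the shape of a UTF-7 escaped run.
# This is only a heuristic; text such as "+1-800" would also match.
UTF7_RUN = re.compile(rb"\+[A-Za-z0-9+/]+-")


def looks_like_utf7(data: bytes) -> bool:
    """Return True if the bytes contain something shaped like a UTF-7 run."""
    return UTF7_RUN.search(data) is not None


def parse_fields(data: bytes, encoding: str) -> list:
    """Decode the bytes with the given encoding and parse the first CSV row."""
    return next(csv.reader(io.StringIO(data.decode(encoding))))


# Hypothetical line: quotes written as +ACI- (assumption), hyphens as +AC0-.
sample = b"1,+ACI-August 26, 2023+ACI-,2023+AC0-08+AC0-26\r\n"

print(looks_like_utf7(sample))
print(parse_fields(sample, "ascii"))  # escapes left in place, date split in two
print(parse_fields(sample, "utf-7"))  # quotes restored, date stays one field
```

Decoding with the right encoding restores the fields, but the practical fix remains the one in the steps above: re-export with UTF-8 selected.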

Curiously, bug 166208 was created ~simultaneously. If the UTF-7 encoding is selected automatically rather than through user error, then that would be a real bug; but we need to learn how to reproduce it.

*** This bug has been marked as a duplicate of bug 166208 ***
Comment 12 Mike Kaganski 2025-04-18 05:15:34 UTC
(In reply to dpkesling from comment #0)
> User Profile Reset: No
> 
> Additional Info:
> NO. I did not reset my UserProfile as this problem is not related to that
> process.

Note that *if* you *did* try resetting the user profile, it would fix your problem. It is very sad when users know what will work and what will not, without actually testing.