Description: Calc isn't consistent when displaying boolean values stored in .xlsx files. Gnumeric, Excel, and GoogleSheets represent the same spreadsheet file data differently than Calc. Near row 50, libreoffice-calc switches from "TRUE" or "FALSE" text representation to 1 or 0 integer value representations. Steps to Reproduce: 1. In Excel, start a new spreadsheet. 2. Leave cell A1 empty. 3. Put 0 in cell A2, 1 in cell A3, and pull down on the lower corner of the cell to fill 101 rows with increasing integers. 4. Put a title for column B in cell B1, like "column" or something 5. Put "TRUE" in B2 and "FALSE" in B3. 6. Select B2:B3 and then pull-down auto-fill, again, so that you have 100 rows of True/False alternating values. 7. Save the file as standard "Excel Workbook" (.xlsx) file which is the newer standard. I didn't test with "Strict OpenXML". 8. Now, save the same data as "Excel 97-2003" (.xls), too. 9. Open both export data files in Libre Office (calc) Actual Results: The .xls version opens fine, looks as expected. The .xlsx version opens, but data is not shown in a consistent representation. During my testing, the representation always starts out as "TRUE"/"FALSE", but changes to 0/1 at least once around row 50, but not always on the same row, and sometimes switches back and forth at least one more time within the first 100 rows. Expected Results: The same data type should have the same representation where identically formatted. Since the data wasn't manually formatted, it is the default, and, presumably, all the same in this example. Regardless, all other spreadsheet programs work as expected with this data, so... Reproducible: Always User Profile Reset: No Additional Info: Version: 7.5.0.3 (X86_64) Build ID: 50(Build:3) CPU threads: 2; OS: Linux 6.2; UI render: default; VCL: gtk3 Locale: en-US (en_US.UTF-8); UI: en-US Calc: threaded
Created attachment 185163 [details] screenshot of LO doing the wrong thing
Created attachment 185164 [details] screenshot of googlesheets with same file
Created attachment 185165 [details] screenshot of excell with same file
Please attach a sample .xlsx file to reproduce for the poor lucky souls who don't have Excel.
Created attachment 185193 [details] Example xlsx file Yes, of course; I meant to do that. Here is a file created via python pandas to test this.
Created attachment 185194 [details] Another example file Here is an example file I created starting with the previous one, but, in Excel, I used the method discussed in my reproduction instructions to created a second column of T/F data and saved to this file. So, this is an official Excel save file.
[Automated Action] NeedInfo-To-Unconfirmed
Confirm in Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community Build ID: b052ec2f2fbe0f3044ba824c064a280a5ee9cd7f CPU threads: 4; OS: Linux 5.15; UI render: default; VCL: gtk3 Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US Calc: threaded Works in Version 4.1.0.0.alpha0+ (Build ID: efca6f15609322f62a35619619a6d5fe5c9bd5a)
after this commit in LO 4.2 are all cells in range B2:C201 formatted as Boolean 0/1 add940c952b3dd382be62cb28a0842d81a21e53f is the first bad commit commit add940c952b3dd382be62cb28a0842d81a21e53f Author: Matthew Francis <mjay.francis@gmail.com> Date: Sat Sep 5 22:39:35 2015 +0800 source-hash-835fee82efb70b40b94f6babc2706ee1eb66dcf7 commit 835fee82efb70b40b94f6babc2706ee1eb66dcf7 Author: Kohei Yoshida <kohei.yoshida@collabora.com> AuthorDate: Thu Oct 31 16:25:32 2013 -0400 Commit: Kohei Yoshida <kohei.yoshida@collabora.com> CommitDate: Mon Nov 4 13:59:17 2013 -0500 Populate raw cell values using ScDocumentImport. Also fix incorrect const methods. Methods that populate the document model should not be marked const even if the compiler allows it. After this commit in 7.4 are cells formatted as TRUE/FALSE 72898afa9e57ff1f23eb82bff9f0cf0fbbe693c3 is the first bad commit commit 72898afa9e57ff1f23eb82bff9f0cf0fbbe693c3 Author: Jenkins Build User <tdf@pollux.tdf> Date: Wed Feb 23 16:58:49 2022 +0100 source 8f5d8669ca74fed8608e438a8436c173e35d43a6 https://git.libreoffice.org/core/+/8f5d8669ca74fed8608e438a8436c173e35d43a6
Created attachment 185237 [details] printscreen from excel
FWIW, I can reproduce this without Excel at all. Also, there must be something very specific in order to trigger this issue, because I'm getting mixed results. I was trying to reproduce this in LO 7.4.5, without starting from an imported file. I tried several different minor variations in the procedure: * Having a column label (or not). * Using AutoFilter (or not). * Using one unique value (either true or false, not both). * Dragging the mouse in order to fill the data, all in one step (i.e. one dragging movement), or instead making 2 movements (i.e. dragging once up to row 99, release the mouse, dragging again up to row 210). * Filling more (or less) rows. * Using TRUE() and FALSE(), or TRUE/FALSE, or combining these. I could go on with additional minor differences. Observations: Using AutoFilter, the first time I get many different values within the AutoFilter list itself (many FALSE, TRUE, 0 and 1s). I mean that AutoFilter itself lists these values repeated times (with its own checkbox, etc.). After I PageDown, PageUp, the list in AutoFilter now shows each value 1 time (i.e. up to 4 different items at most, depending on the values being displayed on the cells). To be clear, I didn't even select a specific value; the resulting list is not filtered, no action was performed other than scroll down/up, and yet the AutoFilter list is different. Since I haven't modified the values, why AutoFilter would recognize them differently first, but as equivalent values the second time? I kept trying different minor adjustments to the procedure. Using "Save" or using "Save As" (I'm not sure anymore), I ended up with 2 xlsx files with the same original content, but one file opens up showing TRUE/FALSE only, whereas the other displays TRUE/FALSE/0/1. In some cases (of all my attempts), I only closed the file, whereas in others I also closed LO entirely. Sometimes the 0/1 values are "in the middle" of the data rage; sometimes they are only seen in the bottom half of the rows. To be clear, I never introduced numbers in these attempts. Now I have 2 xlsx files created with LO 7.4.5, with the exact same original content, showing 2 different things, with 2 different size weights: * 6KB (incorrect, with 0/1 too) vs. * 17KB (with TRUE/FALSE only). I see problems with AutoFilter, and with how the values are displayed in the respective cells (and in which range). The only problem is that I cannot identify a step-by-step procedure that I could describe in order to reproduce exactly what I did so to obtain these files, or why I see the 0/1 values in different ranges (considering that I am opening the same exact file). Although I'm not able to describe a clear procedure ATM, I am sharing this experience here, just in case someone else sees some similar behavior.
apparently fixed in 24.2 with (believe it or not) commit d15c4caabaa21e0efe3a08ffbe145390e802bab9 Author: Justin Luth on Tue Sep 20 08:14:41 2022 -0400 (code, not commit date) tdf#123026 xlsx import: recalc optimal row height on import
(In reply to Justin L from comment #12) > apparently fixed in 24.2 with (believe it or not) FWIW, I can still see some inconsistencies, or at least differences, depending on version. I am not saying that the results are incorrect; just that the exporting/saving is slightly different depending on file format. On a "recent" (around 4 weeks old) LO Dev 24.2, saving/exporting as XLS and reopening: * shows all TRUE/FALSE values as 1 and 0 respectively, * cell content is (functions) =TRUE() and =FALSE() respectively, * with cell format as default (standard). On the same LO Dev version, saving/exporting as XLSX and reopening: * shows all TRUE/FALSE values as TRUE and FALSE respectively, * cell content is (functions) =TRUE() and =FALSE() respectively, * with cell format as Boolean. On the same LO Dev version, saving/exporting as ODS and reopening: * shows all TRUE/FALSE values as TRUE and FALSE respectively, * cell content is (values) TRUE and FALSE respectively, * with cell format as Boolean. Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community Build ID: cf8f7b91f41821b79495c0388359c4cb1156ea67 CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win Locale: en-US (es_AR); UI: en-US Calc: CL threaded This LO Dev build includes the commit mentioned in comment 12 by Justin.
(In reply to ady from comment #13) > FWIW, I can still see some inconsistencies, or at least differences, > depending on format. However, these really are rather disconnected from this particular bug, which is talking about xlsx format (almost exclusively). > I am not saying that the results are incorrect; just that the > exporting/saving is slightly different depending on file format. Generally you could say the results are incorrect. At least for something as "simple" as true/false verses 1/0 or cell format. > On a "recent" LO Dev 24.2, saving/exporting as XLS and reopening: > * shows all TRUE/FALSE values as 1 and 0 respectively, Sounds like the XLS bug equivalent of 8f5d8669ca74fed8608e438a8436c173e35d43a6 tdf#122098 xlsx im/export: let formulas guess their number format Reported as new bug 156861. > On the same LO Dev version, saving/exporting as ODS and reopening: > * cell content is (values) TRUE and FALSE respectively, That seems unlikely, and I can't reproduce. I still see formula =True().
(In reply to Justin L from comment #14) > > On the same LO Dev version, saving/exporting as ODS and reopening: > > * cell content is (values) TRUE and FALSE respectively, > That seems unlikely, and I can't reproduce. I still see formula =True(). I shall clarify my steps then: 1. New Calc worksheet (do not start from a previously-saved file). 2. Introduce the following string values, not as functions, not as numbers, and with the default standard cell format: A1: TRUE A2: FALSE 3. Save the file as ODS, close and reopen it. Result, as described in comment 13: * shows all TRUE/FALSE values as TRUE and FALSE respectively, * cell content is (values) TRUE and FALSE respectively, * with cell format as Boolean.
(In reply to ady from comment #15) > 3. Save the file as ODS, close and reopen it. > > Result, as described in comment 13: > * shows all TRUE/FALSE values as TRUE and FALSE respectively, > * cell content is (values) TRUE and FALSE respectively, > * with cell format as Boolean. To be clear, since LO considers those strings as meaning the same as the respective functions (while working in English), this is probably OK for ODS. Importing/exporting from other tools in ODS might have different results, but I have not tested such scenario.