gzip csv files generated in turns by pandas do not open correctly in LibreOffice
Python code:
import pandas as pd
a = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
b = pd.DataFrame({'a': [3, 4], 'b': [5, 6]})
a.to_csv('a.csv.gz', compression='gzip', header=True, mode='w')
b.to_csv('a.csv.gz', compression='gzip', header=False, mode='a')
The output file a.csv.gz gunzips correctly and opens correctly in less, but in LibreOffice only the header and the first 2 data rows appear when opening it:
libreoffice a.csv.gz
OK, it looks like gunzip and less can handle 2 concatenated gzip files and LibreOffice cannot. Another example:
$ gunzip -c z1.csv.gz
,a,b
0,1,3
1,2,4
$ gunzip -c z2.csv.gz
0,3,5
1,4,6
$ cat z1.csv.gz z2.csv.gz > z3.csv.gz
$ gunzip -c z3.csv.gz
,a,b
0,1,3
1,2,4
0,3,5
1,4,6
whereas LibreOffice, when it opens z3.csv.gz, shows only the z1 contents.
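To make the difference concrete, here is a minimal Python sketch of the same situation using only the standard library. It builds the equivalent of z3.csv.gz in memory and compares a reader that walks every gzip member with one that stops after the first. That LibreOffice behaves like the second reader is only an assumption based on the symptom described above, not something checked against its source.

```python
import gzip
import zlib

part1 = b",a,b\n0,1,3\n1,2,4\n"   # same content as z1.csv
part2 = b"0,3,5\n1,4,6\n"          # same content as z2.csv

# In-memory equivalent of `cat z1.csv.gz z2.csv.gz > z3.csv.gz`:
# two complete gzip members back to back.
combined = gzip.compress(part1) + gzip.compress(part2)

# gzip.decompress() keeps going through every member, like gunzip.
all_members = gzip.decompress(combined)

# A single decompressobj stops at the end of the first member and
# leaves the remaining bytes untouched in unused_data.
reader = zlib.decompressobj(wbits=31)  # 31 = gzip container, max window
first_only = reader.decompress(combined)
leftover = reader.unused_data

print(all_members.decode())
print(first_only.decode())
print(len(leftover), "bytes were never decompressed")
```

The second print shows only the header and the first two data rows, which matches what Calc displays.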
So the question is whether it works without the gzip/gunzip steps. Or is the issue in the concatenation? Does a concatenated CSV get fully parsed into the Calc CSV input dialog? If not, is an EOF or some other control character being inserted during the append?
The issue occurs only when gzipped files are concatenated. Everything is OK when the files are not gzipped, or are gzipped after concatenation. The only case that does not work is when the files are gzipped first and then concatenated. pandas, gzip, less, etc. handle a file that consists of several gzips concatenated together, and it would be good if Calc could too.
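Until Calc handles this, one possible workaround is to re-pack the file so that it contains a single gzip member, which is the "gzipped after concatenation" case reported as working. The sketch below relies on Python's gzip module reading through all members; the function name repack_single_member and the file names in the usage line are made up for illustration.

```python
import gzip
import shutil


def repack_single_member(src_path, dst_path):
    """Rewrite a (possibly multi-member) gzip file as one gzip member.

    Hypothetical helper: gzip.open() transparently reads every member of
    src_path, and the output file is written as a single member.
    """
    with gzip.open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

Usage would be something like repack_single_member('a.csv.gz', 'a_single.csv.gz'), then opening a_single.csv.gz in LibreOffice.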
OK then, I guess we are doing the correct thing, and this would be an enhancement to handle a stream of multiple gzip'd documents. The onus would be on the user to ensure the format is correct: a header on the first document, and only data in the subsequent ones, in a matching layout. Please post a couple of sample concatenated gzip .csv streams. But it is kind of a specialized workflow, so I am not sure it belongs as a core feature of the Calc import dialog. @Eike?
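As a rough illustration of what "matching layout" could mean in practice, the following sketch walks a concatenated stream member by member and reports the column counts seen in each. It is only one way a user might sanity-check such a file before importing it. The helper names iter_gzip_members and column_counts are invented here, and UTF-8 with comma separators is assumed.

```python
import csv
import gzip
import io
import zlib


def iter_gzip_members(data):
    """Yield the decompressed payload of each gzip member in data."""
    while data:
        member = zlib.decompressobj(wbits=31)  # 31 = gzip container
        payload = member.decompress(data) + member.flush()
        yield payload
        data = member.unused_data  # bytes following this member, if any


def column_counts(data):
    """Return, per member, the set of row lengths found in its CSV text."""
    counts = []
    for payload in iter_gzip_members(data):
        rows = csv.reader(io.StringIO(payload.decode("utf-8")))
        counts.append({len(row) for row in rows})
    return counts


# Three members: header plus data, an empty member, then data only.
sample = (
    gzip.compress(b",a,b\n0,1,3\n1,2,4\n")
    + gzip.compress(b"")
    + gzip.compress(b"0,3,5\n1,4,6\n")
)
print(column_counts(sample))
```

A stream is consistent in this sense when every non-empty member reports the same single column count.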
Created attachment 150513
Simple example - 2 csv.gz
Created attachment 150514
Simple example - 4 csv.gz
Created attachment 150515
Simple example - 4 csv.gz - 2 empty
Makes sense. Moving to NEW...