Created attachment 114824 [details] The .csv file to open Hello. I am trying to import a .csv file (see attachment). The file has 16069 rows and is not especially large (3.3 Mb). But when I open it with LO, it says "The data could not be loaded completely because the maximum number of characters per cell was exceeded." First problem : Calc should not have any problem loading a file with this number of rows. Second problem : After looking twice, I noticed that the last line of the Calc sheet does match the last line of the .csv file ( i.e. " La Taille de la Vente;37350;LA CELLE-GUENAND;INDRE-ET-LOIRE;CENTRE;491439.0;2217742.0;3811Z;Collecte des déchets non dangereux;;") but the row number does not match the one of the .csv file : - in the Calc sheet, the last row is number 15405 - in the .csv file, the last row is number 16069 So I suppose that some rows have been skipped during import.
On pc Debian x86-64 with master sources updated today, I could reproduce this. I noticed this on console: warn:legacy.osl:23655:1:sc/source/ui/docshell/impex.cxx:602: lcl_appendLineData: data overflow
** Please read this message in its entirety before responding ** To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year. There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present. If you have time, please do the following: Test to see if the bug is still present on a currently supported version of LibreOffice (5.0.5 or 5.1.2 https://www.libreoffice.org/download/ If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System Please DO NOT - Update the version field - Reply via email (please reply directly on the bug tracker) - Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case) If you want to do more to help you can test to see if your issue is a REGRESSION. To do so: 1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) http://downloadarchive.documentfoundation.org/libreoffice/old/ 2. Test your bug 3. Leave a comment with your results. 4a. If the bug was present with 3.3 - set version to "inherited from OOo"; 4b. If the bug was not present in 3.3 - add "regression" to keyword Feel free to come ask questions or to say hello in our QA chat: http://webchat.freenode.net/?channels=libreoffice-qa Thank you for your help! -- The LibreOffice QA Team This NEW Message was generated on: 2016-04-16
Tested today positively on LO 5.0.5.2 under Windows 7.
Thank you for your feedback, since it's positive, let's put this one to WFM.
Sorry if I was not clear : when I wrote "tested positively", I meant that the problem is still there.
Problem still exists in 5.1.3.2
Still present in LO Version: 5.1.6.2 Build ID: 1:5.1.6~rc2-0ubuntu1~xenial2 on Ubuntu 16.04 There are also similar bugs on the OOo bug tracker about this (see https://bz.apache.org/ooo/show_bug.cgi?id=123340 for instance) , so this might be inherited ( I am not runnig virtualbox anymore, so I can't easily check with the 3.3. version without wrecking my current install.)
Using LibreOffice Calc 5.4.0.3, I can open Attachment 114824 [details]. However, that version of Calc still chokes on delimited text files that are a bit larger.
The csv is invalid because it contains text values with line breaks that are indistinguishable from line breaks between 2 records. Any csv parser will fail with this.
(In reply to Andreas Säger from comment #9) > The csv is invalid because it contains text values with line breaks that are > indistinguishable from line breaks between 2 records. Any csv parser will > fail with this. Could you elaborate on this? True, the line breaks may be semantically wrong, but I don't see how they make the file syntactically incorrect or harder/impossible to parse.
** Please read this message in its entirety before responding ** To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year. There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present. If you have time, please do the following: Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/ If the bug is present, please leave a comment that includes the information from Help - About LibreOffice. If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice. Please DO NOT Update the version field Reply via email (please reply directly on the bug tracker) Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case) If you want to do more to help you can test to see if your issue is a REGRESSION. To do so: 1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/ 2. Test your bug 3. Leave a comment with your results. 4a. If the bug was present with 3.3 - set version to 'inherited from OOo'; 4b. If the bug was not present in 3.3 - add 'regression' to keyword Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa Thank you for helping us make LibreOffice even better for everyone! Warm Regards, QA Team MassPing-UntouchedBug
Hi This bug is still present in Version: 6.0.7.3 "The data could not be loaded completely because the maximum number of characters per cell was exceeded." The strange popup doesn't even say what line it is on. It should just continue after what ever issue it finds. I can put a $100 bug-bounty on a fix roll out. can't believe this PR was filed 4 years ago, and still broken.
Created attachment 153283 [details] bt with debug symbols On pc Debian x86-64 with master sources updated today, I could reproduce this. I attached a bt just after: warn:legacy.osl:21334:21334:sc/source/ui/docshell/impex.cxx:628: lcl_appendLineData: data overflow
(In reply to Mamoth from comment #0) > - in the Calc sheet, the last row is number 15405 > - in the .csv file, the last row is number 16069 Last row number in LO depends on CSV open dialog options. And in MSO Excel, which opens CSV without warning, the last row number is 16008. On the other side, if open and saved with LO or MSO, csv remains at 16069. This file is not correct. I guess that's what Andreas referred to. Reporter should have given minimal example so that we know what a problem in LO is, not that we look for a problem in his file.
Timur, you have closed this bug on the basis that the second problem reported by Mamoth is invalid. But you are ignoring the fact that the first problem he reported (the unhelpful error message) is valid. Even if there is a problem with the file, LibreOffice needs to inform the user in a way that allows them to understand and locate the problem. I suggest this bug be reopened to deal with this first problem. Incidentally, it is still reproducible for me with the following setup: Version: 6.3.1.1 Build ID: 30(Build:1) CPU threads: 12; OS: Linux 5.2; UI render: default; VCL: gtk3; Locale: en-GB (en_CA.UTF-8); UI-Language: en-US Calc: threaded
(In reply to Tristan Miller from comment #15) > Timur, you have closed this bug on the basis that the second problem > reported by Mamoth is invalid. But you are ignoring the fact that the first > problem he reported (the unhelpful error message) is valid. Even if there > is a problem with the file, LibreOffice needs to inform the user in a way > that allows them to understand and locate the problem. I suggest this bug > be reopened to deal with this first problem. Incidentally, it is still > reproducible for me with the following setup: > ... LO gives some hint, a cell contains too much characters. So it means something prevented from the good split on the data. LO does just csv file parsing to split at the right location depending on the entered parameters, it can't do a semantic analyze which would allow it to guess where a value should stop.
(In reply to Julien Nabet from comment #16) > LO gives some hint, a cell contains too much characters. So it means > something prevented from the good split on the data. > LO does just csv file parsing to split at the right location depending on > the entered parameters, it can't do a semantic analyze which would allow it > to guess where a value should stop. The least it could do is indicate at which cell or which row the parsing failed.
Tristan:the simplest thing would be to stop when there's a pb Eike: do you think we may add some throw exception or at least an assert in https://opengrok.libreoffice.org/xref/core/sc/source/ui/docshell/impex.cxx?r=f805e0b9#636 in else part? 626 static bool lcl_appendLineData( OUString& rField, const sal_Unicode* p1, const sal_Unicode* p2 ) 627 { 628 OSL_ENSURE( rField.getLength() + (p2 - p1) <= SAL_MAX_UINT16, "lcl_appendLineData: data overflow"); 629 if (rField.getLength() + (p2 - p1) <= SAL_MAX_UINT16) 630 { 631 rField += OUString( p1, sal::static_int_cast<sal_Int32>( p2 - p1 ) ); 632 return true; 633 } 634 else 635 { 636 rField += OUString( p1, SAL_MAX_UINT16 - rField.getLength() ); 637 return false; 638 } 639 } Indeed, putting more than SAL_MAX_UINT16 characters in a cell is more than suspect no ?
Bug still present @Version: 6.4.2.2
It's not a bug in Calc. It's a bug in the data generating application. Meanwhile the cell's string limit is not 64k anymore so theoretically upping that limit could be possible. BUT.. big BUT: The data is broken at several places, that's why it is tried to stuff so many characters into one cell resulting in less rows than the original. First broken field is in row 3385, after import go to D3385 and you'll see quite a few records crammed into that cell. Reason is that the record 052.01076;DECONS - Le Pian (Louens);40271311900012;"Louens" 1701 - Route de Soulac;33290;LE PIAN-MEDOC;GIRONDE;AQUITAINE;360329.91;1999074.6;3831Z;Démantèlement d'épaves;; is broken data in that the field content ;"Louens" 1701 - Route de Soulac; starts with a field opening " quote character but is not closed, so the CSV import correctly keeps running that field until a field closing " quote character is encountered in row 3415 of 052.01340;SITA SUD OUEST;70198020300148;Centre de Valorisation des Déchets "Graulin";33470;LE TEICH;GIRONDE;AQUITAINE;334690.0;1958026.0;3821Z;Traitement et élimination des déchets non dangereux;; (the Graulin";). The correct data in the wrong field instead should had been ;"""Louens"" 1701 - Route de Soulac"; There are a few of such cases, visually detectable by a tall row and usually the bad data in column D. Rows of raw data: 3385: 052.01076;DECONS - Le Pian (Louens);40271311900012;"Louens" 1701 - Route de Soulac;33290;LE PIAN-MEDOC;GIRONDE;AQUITAINE;360329.91;1999074.6;3831Z;Démantèlement d'épaves;; 3573: 052.02306;RIGHINI -Tonneins;72615004800023;"Rapetout" Route de Verteuil;47400;TONNEINS;LOT-ET-GARONNE;AQUITAINE;438289;1934226;1623Z;Fabrication de charpentes et d'autres menuiseries;; 3800: 052.06068;USSGETOM;25330430700015;"Lichon" - ZA de Coussères;33210;FARGUES;GIRONDE;AQUITAINE;392400.0;1951200.0;8412Z;Administration publique (tutelle) de la santé - de la formation - de la culture et des services sociaux - autre que sécurité sociale;; 3843: 052.06793;LA GRANDE JAUGUE;50818282100019;"Touban" Pôle environnemental;33160;SAINT-MEDARD-EN-JALLES;GIRONDE;AQUITAINE;359012;1992919;3821Z;Traitement et élimination des déchets non dangereux;; 9783: 064.00202;Usine de traitement d'Hugueneuve;05781313100026;"500 Route du Gros Cerveau;83190;OLLIOULES;VAR;PROVENCE-ALPES-COTE-D'AZUR;886494;1799096;3600Z;Captage - traitement et distribution d'eau;; 14296: 053.02584;SMIRTOM DE LA REGION DE L'AIGLE;25610217900015;"les champs rouges" déchètterie route de l'aigle à crulai BP 174;61300;L'AIGLE;ORNE;BASSE-NORMANDIE;457856.0;2416389.0;3811Z;Collecte des déchets non dangereux;; 16033: 100.02151;SODEC;42396801500010;"La Grande pièce" RD 90;18100;SAINT-HILAIRE-DE-COURT;CHER;CENTRE;574286.0;2245553.0;3821Z;Traitement et élimination des déchets non dangereux;5.(d);Décharges - à l'exception des décharges de déchets inertes et des décharges qui ont été définitivement fermées avant le 16 juillet 2001 ou dont la phase de gestion après désaffection requise par les autorités compétentes conformément à l'article 13 de la directive 1999/31/CE du Conseil du 26 avril 1999 concernant la mise en décharge des déchets s'est achevée recevant 10 tonnes par jour ou d'une capacité totale de 25 000 tonnes