Apache web server log files contain text data separated by spaces; data fields containing spaces are enclosed in double quotes. LibreOffice 3.4.x parses this data properly, but 3.6.0.4 does not: it only recognizes the first and last double quotes and dumps everything in between into one column instead of splitting it across columns. I have not tested 3.5.x, nor have I tried to confirm this on other platforms. Example data line (without wrapping): 66.249.73.206 - - [10/Aug/2012:10:03:45 +0200] "GET /news/tema/anteprima-istantanea HTTP/1.1" 500 7055 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
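For reference, the expected splitting can be reproduced outside LibreOffice. The following is a minimal sketch using Python's standard csv module with a space delimiter and a double quote as the text delimiter; the names LOG_LINE and split_log_line are made up for this illustration, and the snippet says nothing about how LibreOffice implements its own import.

```python
import csv

# Example data line from this report (a single line, no wrapping).
LOG_LINE = (
    '66.249.73.206 - - [10/Aug/2012:10:03:45 +0200] '
    '"GET /news/tema/anteprima-istantanea HTTP/1.1" 500 7055 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)


def split_log_line(line):
    """Split on spaces while keeping double-quoted fields intact."""
    return next(csv.reader([line], delimiter=" ", quotechar='"'))


fields = split_log_line(LOG_LINE)
for index, field in enumerate(fields):
    print(index, field)
```

With these settings the request, the referrer ("-") and the user agent each end up in a field of their own. The timestamp is split into two fields because square brackets are not text delimiters.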
I've verified that 3.5.6.2 does not have this bug - and I noticed that a lot of work on CSV import went into 3.6, so this appears to be a regression due to work in this area.
REPRODUCIBLE with LibreOffice 3.6.1.1 (Build ID: 4db6344), German langpack installed, on Mac OS X 10.6.8 (Intel). NOT reproducible with LibreOffice 3.5.6.2, therefore added keyword "regression". What I did to test: Using a text editor, I created a new text file containing just four copies of the example data line from comment #0 (line wrapping removed!). I saved this file as "Sample.csv". When I try to open this file with LibO, both LibO 3.5.6.2 and 3.6.1.1 recognize the file type correctly and show the "Text Import" dialog window. In the section "Separator Options", I select "Separated by" and check the check box "Space". In the field "Text delimiter", I leave the pre-selected " (double quote). Then I click "OK". LibO 3.5.6.2 handles the quoted items correctly, i.e. it puts "GET /news/tema/anteprima-istantanea HTTP/1.1" "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" each in a single cell of its own. LibO 3.6.1.1 puts all quoted items together, i.e.: GET /news/tema/anteprima-istantanea HTTP/1.1" 500 7055 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) in a single cell. To describe the problem using Regular Expression terminology, I would say that LibO 3.6.1.1 matches the " too greedily; i.e., it searches for the contents of a quoted item using "(.*)" instead of using "([^"]*)"
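The difference between the two patterns can be shown on the example data line. This is only a sketch of the regular expression analogy, written with Python's re module; it is not a claim about how the import code in 3.6 actually works, and the variable names are made up for the illustration.

```python
import re

# Example data line from comment #0 (a single line, no wrapping).
line = (
    '66.249.73.206 - - [10/Aug/2012:10:03:45 +0200] '
    '"GET /news/tema/anteprima-istantanea HTTP/1.1" 500 7055 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

# Greedy: .* runs from the first double quote to the last one on the line,
# which mirrors the behaviour observed in 3.6.1.1.
greedy = re.findall(r'"(.*)"', line)

# Negated character class: each match stops at the next double quote,
# which mirrors the behaviour of 3.5.6.2.
per_field = re.findall(r'"([^"]*)"', line)

print(len(greedy), greedy)
print(len(per_field), per_field)
```

The greedy pattern yields one match containing all three quoted items plus the unquoted 500 and 7055 between them; the second pattern yields three separate matches.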
@Calc Team: Hello Kohei, Markus, and Eike, please take a look at this nasty bug. It is a regression probably introduced during the work on CSV import for LibreOffice 3.6. I hope that it should be rather easy to fix this issue -- just a simple oversight here or there ... Thank you very much in advance!
Taking over.
Eike Rathke committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=b44a402d5a05dd32aa2e1ab80c9ea75b560dc3b9 resolved fdo#53325 CSV space delimiter and quoted field
(In reply to comment #5) > Eike Rathke committed a patch related to this issue. Hello Eike, thank you very much for fixing this issue so fast! If the patch works as intended, you will backport it to 3.6.x, won’t you? ;-) Thank you again!
I already submitted a call for review, see http://nabble.documentfoundation.org/REVIEW-3-6-resolved-fdo-53325-CSV-space-delimiter-and-quoted-field-td4002501.html
Eike Rathke committed a patch related to this issue. It has been pushed to "libreoffice-3-6": http://cgit.freedesktop.org/libreoffice/core/commit/?id=0e176a7411beced06ce27c5f059aa97e7de4212d&g=libreoffice-3-6 resolved fdo#53325 CSV space delimiter and quoted field It will be available in LibreOffice 3.6.2.
Eike Rathke committed a patch related to this issue. It has been pushed to "libreoffice-3-6-1": http://cgit.freedesktop.org/libreoffice/core/commit/?id=76ae3173bb16f5ce4899026bb2bed109ecee6ce4&g=libreoffice-3-6-1 resolved fdo#53325 CSV space delimiter and quoted field It will be available already in LibreOffice 3.6.1.
(In reply to comment #9) > It will be available already in LibreOffice 3.6.1. @Eike: Thank you again!