Created attachment 167635 [details] IP-Table When importing this: (Line 575) First occured at line 1, then 73 ..... "1.72.0.0","1.79.255.255","JP","Japan" "1.80.0.0","1.95.255.255","CN","China" "1.96.0.0","1.111.255.255","KR","Korea (Republic of)" "1.112.0.0","1.115.255.255","JP","Japan" "1.116.0.0","1.119.255.255","CN","China" "1.120.0.0","1.159.255.255","AU","Australia" "1.160.0.0","1.174.206.255","TW","Taiwan (Province of China)" "1.174.207.0","1.174.207.255","NO","Norway" "1.174.208.0","1.175.255.255","TW","Taiwan (Province of China)" ..... "1.208.0.0","1.255.255.255","KR","Korea (Republic of)" "2.0.0.0","2.16.0.255","FR","France" .... the result is this (misformated tab 'cause of copy&paste, that's not the fault): .... 1.72.0.0 1.79.255.255 JP Japan 1.80.0.0 1.95.255.255 CN China 1.96.0.0 1111255255 KR Korea (Republic of) 1.112.0.0 1115255255 JP Japan 1.116.0.0 1119255255 CN China 1.120.0.0 1159255255 AU Australia 1.160.0.0 1174206255 TW Taiwan (Province of China) 1.174.207.0 1174207255 NO Norway 1.174.208.0 1175255255 TW Taiwan (Province of China) .... 1.208.0.0 1255255255 KR Korea (Republic of) 2.0.0.0 2.16.0.255 FR France ..... But not only at the beginning, there are serveral faults like this after import. Anonther fault: Line 288: if your list is ,-seperated, this makes trouble: .... "34729984","34732031","IT","Italy" "34732032","34732287","UG","Uganda" "34732288","34732543","AM","Armenia" "34732544","34732799","TZ","Tanzania, United Republic of" "34732800","34733055","DE","Germany" "34733056","34733823","TR","Turkey" "34733824","34734079","US","United States of America" .... result in: 2.17.240.0 2.17.247.255 IT Italy 2.17.248.0 2.17.248.255 UG Uganda 2.17.249.0 2.17.249.255 AM Armenia 2.17.250.0 2.17.250.255 TZ Tanzania"2.17.251.0 2.17.251.255 DE Germany 2.17.252.0 2.17.254.255 TR Turkey 2.17.255.0 2.17.255.255 US United States of America 2.18.0.0 2.18.3.255 IT Italy At the preview, all looks fine Reproducible: Always Steps to Reproduce: 1. right click attached csv 2. select open with libreoffice 3. use standard setting (UTF-8, german, Seperator: , character delimeter: ") Version: 6.4.7.2 (x64) Build-ID: 639b8ac485750d5696d7590a72ef1b496725cfb5 CPU-Threads: 12; BS: Windows 10.0 Build 18363; UI-Render: GL; VCL: win; Gebietsschema: de-DE (de_DE); UI-Sprache: de-DE Calc: threaded
additional info to line 288: a faulty input 'causes this problem, but maybe it would be better, if a delimeted phrase (e.g. by ") is taken as ONE expression So that the input "foo","bar","foo,bar" results to: foo bar foo,bar and not to: foo bar foo bar"
Repro (German in import settings is important) with Version: 7.0.3.1 (x64) Build ID: d7547858d014d4cf69878db179d326fc3483e082 CPU threads: 12; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win Locale: ru-RU (ru_RU); UI: en-US Calc: CL
Repro (German in import settings is important) with Version: 6.4.7.2 (x64) 6.4.7.2 (x64) languages testet: German : fault German (Austria) : fault German (Swiss) : NO fault German (Luxembourd) : fault German (Liechtenstein): NO fault Frisian (german accent): fault Spanish (Argentinia) : fault Spanish (Bolivia) : fault Spanish (Spain) : fault English: (USA) : NO fault English: (Kanada) : NO fault Icelandic : fault French (Belgium) : fault French (Niger) : NO fault OK, so far from me
I suppose it's not a bug actually. The "problematic" parts have groups of exactly three digits, separated by the locale's *thousand separator*. It's normal that these strings are recognized as valid whole numbers. All cases where the groups were not three are recognized as not valid numbers, thus kept as is. This works as expected. You should mark the columns as string explicitly on import, and that would disable the number recognition.
I can't agree with your result. As you see, it's not only depending on "exactly three digits, separated by the locale's *thousand separator*", but with languages settings before import. AND: the values are bracketedby " with indicates, this IS TEXT AND: in the preview, it is imported correctly So, I take your solution as workaround. Tahnks, but, sorry, for me, it's REOPEND"
OK, depending on wikipedia: https://en.wikipedia.org/wiki/Decimal_separator it make more or less sense, but in my opinion, values in "-brackets should be conversion-proof. If I want to have numbers, I could create a file like this: 1.72.0.0,1.79.255.255,"JP","Japan" but I created this: "1.72.0.0","1.79.255.255","JP","Japan"
It is not "FIXED", it's NOTABUG. Fixed is where something (known) had been changed in the software to remove the pre-existing actual problem in it. (In reply to Frank B. from comment #5) > I can't agree with your result. > > As you see, it's not only depending on "exactly three digits, separated by > the locale's *thousand separator*", but with languages settings before > import. 1. Of course, as I wrote: "exactly three digits, separated by the locale's *thousand separator*". The phrase combined two factors: grouping of three digits, and *locale-specific* thousand separators. > AND: the values are bracketedby " with indicates, this IS TEXT No. In CSV, quotes are not for TEXT. CSV does not have notion of "data type" at all, ALL its content is text. It's up to application to try (or not) to interpret the textual elements of each record in CSV. Quotes in CSV are not to mark elements as "text", it's to mask some characters in elements that otherwise would have special meaning in CSV, like separators or newlines. That all is described in RFC 4180 [1]. > AND: in the preview, it is imported correctly Preview does not "imports" things (in the sense that it does not interpret values), it just shows you the content of N first lines, *separated* by the rules that you define, so that you have the idea what gets into which cell. The conversions are performed only at actual import. (In reply to Frank B. from comment #6) > in my opinion, values in "-brackets should be conversion-proof. > > If I want to have numbers, I could create a file like this: > > 1.72.0.0,1.79.255.255,"JP","Japan" > > but I created this: > > "1.72.0.0","1.79.255.255","JP","Japan" As described above, that's your misconception, and if you imply some special processing of quotes does not necessarily mean that it's what the format says about it. [1] https://tools.ietf.org/html/rfc4180
(In reply to Mike Kaganski from comment #7) > (In reply to Frank B. from comment #6) > > in my opinion, values in "-brackets should be conversion-proof. > > > > If I want to have numbers, I could create a file like this: > > > > 1.72.0.0,1.79.255.255,"JP","Japan" > > > > but I created this: > > > > "1.72.0.0","1.79.255.255","JP","Japan" > > As described above, that's your misconception, and if you imply some special > processing of quotes does not necessarily mean that it's what the format > says about it. And by the way, the import dialog does include the dedicated checkbox for processing like you want in Other Options: [ ] Format quoted field as text.
@Mike Kaganski: Thanks, after "studying" RFC 4180, I agree - your right, I'm wrong. (then there is an error in Excel ;-) Excel imports this parts as numbers, too - but it leaves the . between it... so that it looks like an IP, but it is a number, too
(In reply to Frank B. from comment #9) > (then there is an error in Excel ;-) > Excel imports this parts as numbers, too - but it leaves the . between it... > so that it looks like an IP, but it is a number, too Well, I'd not call it a bug in Excel: it simply does more work, and attempts to create a number formatting that would represent the converted number as it was represented in CSV. So seeing that a value string had thousand separators, it applies a number format with thousand separators to the cell after the import... :-) That is likely to result in very inconsistently formatted cells throughout the file, but OTOH it keeps visual representation ... both ways have upsides and downsides.