Description:
The summary alone does not describe the problem fully, so here are the details. When I
1. Choose File - Open.
2. Locate the *.dbf file that I want to import.
3. Click Open.
the Import dBASE files dialog is not shown as it was before (v5.4.x.x).
However, once this file has been opened in LO v5.4 and saved as a new file in dbf format, the new file can be opened by v6.0.

Steps to Reproduce:
1. Choose File - Open.
2. Locate the *.dbf file that you want to import.
3. Click Open.

Actual Results:
A read error message is shown.

Expected Results:
The Import dBASE files dialog pops up.

Reproducible: Always
User Profile Reset: Yes
OpenGL enabled: Yes

Additional Info:
The problem never happened on v5.4.x.x or any earlier version, going back to OOo v1.

User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0
Created attachment 140324 [details] sample files and description in zh-TW
(In reply to chichang4911 from comment #1)
> Created attachment 140324 [details]
> sample files and description in zh-TW

Additional information in the zip file:

addresses.dbf is the original file generated by Visual FoxPro v6.0. It cannot be opened by LibreOffice 6.0.2, which shows "Read Error. Impossible to connect to file."

addresses_v5.dbf was generated by opening addresses.dbf successfully with Calc v5.x and saving it as a new dbf file. It can be opened by Calc 6.0.2.

addresses_mad.dbf was generated by opening addresses.dbf successfully with madedit and saving it as a new dbf file. It can also be opened by Calc 6.0.2.
Created attachment 140325 [details] 5.4.0.2 failed to open the file Bisected: 5.4.0.1 could open (but with other problem), and in 5.4.0.2 it failed to open. Tested under Linux (Kubuntu 16.04).
Created attachment 140326 [details] 5.4.0.1 could open but the encoding is not correctly handled 5.4.0.1 could open the file, but choosing the encoding Big5 the Chinese characters weren't shown correctly.
Created attachment 140327 [details] In 5.4.0.0 beta2 it could open and show correctly In 5.4.0.0 beta2 it could be opened and shown correctly with encoding Big5.
Notice that my testing (and bisecting) is under Linux. The original reporter is using Windows. According to his report (to us) he could open it with 5.4.4 (Windows version) correctly.
(In reply to Franklin Weng from comment #6)
> Notice that my testing (and bisecting) is under Linux. The original
> reporter is using Windows. According to his report (to us) he could open it
> with 5.4.4 (Windows version) correctly.

Hi chichang4911,
please test it with a newer version and let us know if it works. Thank you.

http://www.libreoffice.org/download/libreoffice-fresh/
I can confirm with file addresses.dbf [Chinese Big5 encoding]

Version: 6.1.0.0.alpha0+
Build ID: 44b4ad7d210097fdaed7dd94c5746b03f43592d3
CPU threads: 4; OS: Linux 4.4; UI render: default; VCL: gtk3;

and

Version: 6.1.0.0.alpha0+
Build ID: e108a31a8fee09c2fa4031e45e45ed73bbdb7c6f
CPU threads: 1; OS: Windows 10.0; UI render: default;
TinderBox: Win-x86@42, Branch:master, Time: 2018-03-03_23:36:02

Regression, works in LO 5.4, Linux
On pc Debian x86-64 with master sources updated some days ago, I could reproduce this.

Just for information, the file can be opened from Base:
1) Launch Base
2) Choose "Connect to an existing DB" then "dBASE"
3) Select the path where the dbase file is.
4) Click "Finish"
5) Choose a file name for the odb file

I'll give it a try.
Just to be sure: I used hexdump to view the first 32 bytes of addresses.dbf and got this:

00000000  30 12 03 03 06 00 00 00  48 03 d8 01 00 00 00 00  |0.......H.......|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 03 78 00 00  |.............x..|

According to https://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_STRUCT, byte 29 (so the 30th byte, since we start counting from 0) holds the encoding, and here it is equal to hex 78 (120 in decimal).
In https://opengrok.libreoffice.org/xref/core/include/rtl/textenc.h, I don't see the value 120.

Could you indicate the precise encoding of the file?
Perhaps I'm wrong or missed something.
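The same check can be scripted rather than read off the hexdump by hand. Below is a minimal Python sketch, assuming the header layout described on the clicketyclick.dk page linked above (version at byte 0, last update at bytes 1-3, record count at bytes 4-7, header and record length at bytes 8-11, encoding byte at byte 29). The function name `read_dbf_header` and the dictionary keys are made up for illustration; this is not LibreOffice code.

```python
import struct


def read_dbf_header(data: bytes) -> dict:
    """Decode a few fields from the first 32 bytes of a DBF file.

    Offsets follow the DBF structure page cited in this report (assumption).
    """
    if len(data) < 32:
        raise ValueError("a DBF header is at least 32 bytes long")
    # Little-endian: 1 version byte, 3 date bytes, uint32 record count,
    # uint16 header length, uint16 record length.
    version, yy, mm, dd, n_records, header_len, record_len = struct.unpack_from(
        "<B3BIHH", data, 0
    )
    return {
        "version": version,
        "last_update": (yy, mm, dd),
        "records": n_records,
        "header_length": header_len,
        "record_length": record_len,
        "language_driver": data[29],  # the encoding byte discussed above
    }


# The 32 bytes shown in the hexdump of addresses.dbf.
HEADER = bytes.fromhex(
    "30120303060000004803d80100000000"
    "00000000000000000000000003780000"
)

if __name__ == "__main__":
    for name, value in read_dbf_header(HEADER).items():
        print(name, value)
```

For a real file, the bytes could be read with `open(path, "rb").read(32)` and passed to the same function.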
Created attachment 140330 [details]
bt from console log

On console, I noticed this log:
DBaseImport: dbtools::OCharsetMap doesn't know text encoding

So here's the bt from this point.
The first byte of a dbf file corresponds to the version (see https://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_STRUCT).
Mapping of version values: https://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_NOTE_1_TARGET (notice that it might differ slightly in other references).

addresses.dbf => 30 (so Visual FoxPro, as expected I suppose)
addresses-v5.dbf => 83 (File with DBT). If I remember well, it's the default version LO uses (DBT for MEMO fields which may be used).
addresses-mad.dbf => 03 (File without DBT)

But the problem is not the version, it's the unrecognized encoding "78".
In addresses-mad.dbf we also get 78, and in addresses-v5.dbf we get 0 (equivalent to "don't know").
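To make the comparison of the three sample files explicit, here is a small Python sketch. It only restates the values quoted in this comment (version bytes 30, 83 and 03, encoding bytes 78 and 0, all read as hexadecimal); the names `VERSION_NAMES`, `SAMPLES` and `describe` are hypothetical and the version table is deliberately limited to the three values seen here.

```python
# Version bytes and their meaning, as listed in this comment (not exhaustive).
VERSION_NAMES = {
    0x30: "Visual FoxPro",
    0x83: "File with DBT",
    0x03: "File without DBT",
}

# (version byte, encoding byte) for each sample file, values from this comment.
SAMPLES = {
    "addresses.dbf": (0x30, 0x78),
    "addresses-v5.dbf": (0x83, 0x00),
    "addresses-mad.dbf": (0x03, 0x78),
}


def describe(version: int, encoding_byte: int) -> str:
    """Return a one-line summary of the two header bytes."""
    name = VERSION_NAMES.get(version, "unknown version")
    if encoding_byte == 0:
        encoding = "not declared"  # 0 is treated as "don't know"
    else:
        encoding = f"0x{encoding_byte:02x}"
    return f"{name}, encoding byte {encoding}"


if __name__ == "__main__":
    for filename, (version, encoding_byte) in SAMPLES.items():
        print(f"{filename}: {describe(version, encoding_byte)}")
```

Laid out this way, the two files that carry encoding byte 78 have different version bytes, which fits the observation that the version is not what distinguishes them.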
(In reply to raal from comment #7)
> (In reply to Franklin Weng from comment #6)
> > Notice that my testing (and bisecting) is under Linux. The original
> > reporter is using Windows. According to his report (to us) he could open it
> > with 5.4.4 (Windows version) correctly.
>
> Hi chichang4911,
> please test it with newer version and let us know if it works. Thank you.
>
> http://www.libreoffice.org/download/libreoffice-fresh/

I will try to test more versions of LibreOffice.
Test OS: Windows 10 (64-bit)
Test files: addresses.dbf, addresses-mad.dbf, addresses-v5.dbf
Results (ok = opens, xx = fails):

            addresses   addresses-mad   addresses-v5
v5.1.6.2    ok          ok              ok
v5.2.7.2    ok          ok              ok
v5.3.7.2    ok          ok              ok
v5.4.0.1    ok          ok              ok
v5.4.0.2    xx          xx              ok
v5.4.1.2    xx          xx              ok
v5.4.4.2    xx          xx              ok
(In reply to chichang4911 from comment #13)
> ...
> I will try to test more versions of Libreoffice,
> ...

Thank you for your feedback, but above all, as asked in comment 6, could you indicate the precise encoding of the file? (addresses.dbf)
(In reply to Julien Nabet from comment #14)
> (In reply to chichang4911 from comment #13)
> > ...
> > I will try to test more versions of Libreoffice,
> > ...
> Thank you for your feedback, but above all, as asked in comment 6, could you
> indicate the precise encoding of the file? (addresses.dbf)

As mentioned in comment #4, the encoding is zh_TW.Big5. Version before 5.4.0.0 beta2 choosing Big5 as encoding could show Chinese correctly. But it broke in 5.4.0.1, though opening the file was okay.
Created attachment 140333 [details]
bt concerning charset mapping

So hex 78 is used here in dbfDecodeCharset and mapped to RTL_TEXTENCODING_MS_950.
See https://opengrok.libreoffice.org/xref/core/connectivity/source/commontools/dbtools.cxx#2020
See bt.

Quote from https://en.wikipedia.org/wiki/Code_page_950:
"Code page 950 is Microsoft's implementation of the de facto standard Big5"

So no problem in this part.
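As an illustration of what such a mapping amounts to, here is a minimal Python sketch. It is not the dbfDecodeCharset code from LibreOffice; the table `LANGUAGE_DRIVER_TO_CODEC` and the function `decode_field` are hypothetical, the table contains only the single value discussed in this report, and the `cp850` fallback is an assumption chosen for the example.

```python
# Hypothetical, deliberately tiny table: only the value seen in addresses.dbf.
LANGUAGE_DRIVER_TO_CODEC = {
    0x78: "cp950",  # code page 950, Microsoft's implementation of Big5
}


def decode_field(raw: bytes, language_driver: int, fallback: str = "cp850") -> str:
    """Decode one fixed-width character field using the header's encoding byte.

    Unknown encoding bytes (including 0, "don't know") use the fallback codec,
    which is an assumption of this sketch.
    """
    codec = LANGUAGE_DRIVER_TO_CODEC.get(language_driver, fallback)
    # Character fields are padded to their fixed width; drop the padding first.
    return raw.rstrip(b" \x00").decode(codec)


if __name__ == "__main__":
    sample = "台北市".encode("cp950") + b"   "
    print(decode_field(sample, 0x78))
```

The point of the sketch is only that byte 0x78 is enough to select the codec, so no user prompt is needed once the byte is recognised.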
I submitted a first patch to review here for the encoding part: https://gerrit.libreoffice.org/#/c/50731/ + another patch for dealing with timestamp and empty value: https://gerrit.libreoffice.org/#/c/50732/
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a77b493392ecdfe2e58bb0fcfa7363a8583dffe4

Related tdf#116171: don't try to convert empty value in timestamp

It will be available in 6.1.0.

The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
(In reply to chichang4911 from comment #13)
> (In reply to raal from comment #7)
> > (In reply to Franklin Weng from comment #6)
> > > Notice that my testing (and bisecting) is under Linux. The original
> > > reporter is using Windows. According to his report (to us) he could open it
> > > with 5.4.4 (Windows version) correctly.
> >
> > Hi chichang4911,
> > please test it with newer version and let us know if it works. Thank you.
> >
> > http://www.libreoffice.org/download/libreoffice-fresh/
>
> I will try to test more versions of Libreoffice,
> Test os: windows 10 (64bits)
> Test files: addresses.dbf, addresses-mad.dbf, addresses-v5.dbf
> Results
>             addresses   addresses-mad   addresses-v5
> v5.1.6.2    ok          ok              ok
> v5.2.7.2    ok          ok              ok
> v5.3.7.2    ok          ok              ok
> v5.4.0.1    ok          ok              ok
> v5.4.0.2    xx          xx              ok
> v5.4.1.2    xx          xx              ok
> v5.4.4.2    xx          xx              ok

  v6.0.2.1    xx          ok              ok
When I test "libo-60-64~2018-03-04_05.34.43_LibreOfficeDev_6.0.3.0.0_Win_x64.msi":

addresses.dbf still shows "read error impossible to connect to the file"
addresses-mad.dbf and addresses-v5.dbf can be imported
(In reply to chichang4911 from comment #20)
> When I test
> "libo-60-64~2018-03-04_05.34.43_LibreOfficeDev_6.0.3.0.0_Win_x64.msi"
>
> addresses.dbf still shows "read error impossible to connect to the file"
> addresses-mad.dbf, addresses-v5.dbf can be imported

Since I can reproduce this with master sources, it's expected.
No need for further tests for the moment.
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-6-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=bab7cef648025038055d3284773d33f102d42f13&h=libreoffice-6-0

Related tdf#116171: don't try to convert empty value in timestamp

It will be available in 6.0.3.
(In reply to Franklin Weng from comment #15)
> As mentioned in comment #4, the encoding is zh_TW.Big5. Version before
> 5.4.0.0 beta2 choosing Big5 as encoding could show Chinese correctly. But
> it broke in 5.4.0.1, though opening the file was okay.

Is it possible to bibisect this, to find out which commit exactly broke it?
Stephan: I may be wrong but I think it's due to some work done last year, see https://cgit.freedesktop.org/libreoffice/core/log/?qt=grep&q=dbase
chichang4911: just to be sure with 5.4.0.2 or before + with a brand new LO profile (see https://wiki.documentfoundation.org/UserProfile#Windows), does LO open addresses.dbf directly or does LO ask about encoding before opening it?
(In reply to Julien Nabet from comment #25)
> chichang4911: just to be sure with 5.4.0.2 or before + with a brand new LO
> profile (see https://wiki.documentfoundation.org/UserProfile#Windows), does
> LO open addresses.dbf directly or does LO ask about encoding before opening
> it?

Argh, I meant 5.4.0.1, but don't bother, I'll test this. I retrieved the archive.
I confirm I can't reproduce the problem with 5.4.0.1, since LO asks about the encoding.

The regression is due to https://cgit.freedesktop.org/libreoffice/core/commit/?id=7f1465a9599e9665159dd2d823a6e9064cca5703, but that patch fixes a broken situation and so reveals a bug.
Indeed, load_CharSet in sc/source/ui/unoobj/filtuno.cxx wasn't looking up the encoding of the file. It displayed a list of encodings for the user to choose from, with either the 850 encoding (the default) or the last one used at the top of the list.
See in particular https://cgit.freedesktop.org/libreoffice/core/diff/sc/source/ui/unoobj/filtuno.cxx?id=7f1465a9599e9665159dd2d823a6e9064cca5703
Let's remove targets since it's not fixed for the moment.
Yeah, whatever the claims that it worked in the past (with or without presenting a dialog first, asking the user to choose an appropriate text encoding from a list), I think I understand now what goes wrong on current master, and how to fix it. Gerrit change is forthcoming.
My patch was wrong, but fortunately Stephan proposed one here:
https://gerrit.libreoffice.org/#/c/50772/

(So now, even if I'm still interested in and concerned by this tracker, since this regression is partly due to me, let's not pretend I'm the guy who will fix this: I'm unassigning myself.)
Stephan Bergmann committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=5ad62544bce42396faaae2bc79c7517af6ff085b

tdf#116171: Tunnel arbitrary rtl_TextEncoding from sc to sdbc:dbase connection

It will be available in 6.1.0.
Stephan Bergmann committed a patch related to this issue.
It has been pushed to "libreoffice-6-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=db7dae40a2082d5d2b1ac22008d32ef9ebf86f4e&h=libreoffice-6-0

tdf#116171: Tunnel arbitrary rtl_TextEncoding from sc to sdbc:dbase connection

It will be available in 6.0.3.
Stephan Bergmann committed a patch related to this issue.
It has been pushed to "libreoffice-5-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8d96cdf9ac8dedd54620d31bafbccc76d75d7757&h=libreoffice-5-4

tdf#116171: Tunnel arbitrary rtl_TextEncoding from sc to sdbc:dbase connection

It will be available in 5.4.7.
Stephan Bergmann committed a patch related to this issue.
It has been pushed to "libreoffice-5-4-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=5598fa704df171544a913a8cfda62a183f1a1a66&h=libreoffice-5-4-6

tdf#116171: Tunnel arbitrary rtl_TextEncoding from sc to sdbc:dbase connection

It will be available in 5.4.6.