Created attachment 74864 [details] Incorrect XLSX file and screenshots in PDF Problem description: I have an XLSX-file which opens incorrectly. Absent text data in cells, numbers are displayed. MSO Excel 2007 opens the file correctly. I don't know the creator application. Screenshots are in PDF-file. Steps to reproduce: 1. Open test.xlsx Current behavior: See test.pdf page 2 Expected behavior: See test.pdf page 1 Operating System: All Version: 4.0.0.3 release
Right, when opening your original xlsx file with Libreoffice all texts are suppressed at import. MS Excel 2010 does fine. But: Open the file with Excel 2010 and write something in a cell. Save the document and open it in Libreoffice. Voila! All text ist properly imported. Maybe the original document is corrupted?! Tested with: Version 4.0.0.3 (Build ID: 7545bee9c2a0782548772a21bc84a9dcc583b89) on Windows 7.
OK, I checked it. But... Excel opens this file but LibO - not. I don't know document is corrupted or not (I don't know XLSX specification). Now I know what some file may be opened without some data. This is not a test without Excel. It's bug. Formatting errors, it is not a problem. Data loss is a problem.
The solution couldn't be more simple: In the xlsx file there is a file named sharedStrings.xml. In your file it is uppercase SharedStrings.xml, so libo doesn't recognize it. MS Office seems to be more generous with this, so opening it with Excel is no problem. I renamed the file to the lowercase variant and zipped the folder. Voila! It opens correctly with libo. And you're right, it may be a random case, but we should be as generous as Excel is. Kohei, Markus, what's your opinion to this?
Created attachment 75102 [details] Renamed to sharedStrings.xml - works fine
Сonfirm Verified with LibO 4.0.0.3 and OOOpro 3.2.1 Also I verified original file (without changing the text data to the word 'test').
Lets start with the easy part. The file is not valid according to OOXML. In xl/_rels/workbook.xml.rels it references sharedStrings.xml in contrast to the SharedStrings.xml that it actually contains. I checked that we respect the value in the file so we are conformant with the standard in this matter. Now to the question if we should support these broken documents. I think it would be relatively easy to check for files named any version of sharedstrings.xml when the import of the file mentioned in the relationship file fails. However I would give this a low priority for now and make it an easy hack. The place in the code where we try to load the shared strings table is: http://opengrok.libreoffice.org/xref/core/sc/source/filter/oox/workbookfragment.cxx#214 We would need to change the code that if we don't find the file in this case we check if any of variation of sharedStrings.xml is found. Anybody interested in this task can ask me for more details.
Hi, I want to work on this bug. I couldn't find which part of the code checks the path of "sharedStrings.xml". I assume the code checks the path case sensitive and we can make is case insensitive. But this is not a proper solution I think. Proper solution is checking variations only in case we couldn't find the "sharedStrings.xml" as Markus said. Where should i look at for the solution? Thanks and sorry for my possible grammar mistakes :)
Here's an entry point: http://opengrok.libreoffice.org/search?project=core&q=sharedStrings
(In reply to comment #7) > Hi, > > I want to work on this bug. Great. > > I couldn't find which part of the code checks the path of > "sharedStrings.xml". I assume the code checks the path case sensitive and we > can make is case insensitive. But this is not a proper solution I think. > Proper solution is checking variations only in case we couldn't find the > "sharedStrings.xml" as Markus said. > > Where should i look at for the solution? > > Thanks and sorry for my possible grammar mistakes :) So it is more difficult than that. The shared strings don't have to be saved into sharedStrings.xml or any other variation of that string. OOXML defines relationships for this and stores their the name of the file. As long as the name of the file and entry in the relation are equal the file is valid. What we are now trying to do is to accept that there are OOXML generators out there that produce invalid OOXML files which most likely use a variation sharedStrings.xml. So at the place that I mentioned in Comment 6 we use the path saved in the relations table to finally import the file and we need to check there if this file exists. So you should start by starting Libreoffice in a debugger and set a breakpoint at that place and inspect the stacktrace to see if there is a way to get a list of all the files. Please feel free to ask for more details if you need them.
Should we fix only the case of sharedStrings or all the relationships?
(In reply to comment #10) > Should we fix only the case of sharedStrings or all the relationships? Please start with sharedStrings. As long as we don't get any bug reports for other relationships I don't want to add more complexity for them.
*** Bug 49840 has been marked as a duplicate of this bug. ***
These ones come apparently from some accounting software like 1C:Enterprise.
Probably, incorrect XLSX file (attached) was created by 1C-Enterprise. 1C-Enterprise has a error when exporting to XLSX. I will try to inform the 1C about this error. Is there a progress in correction this bug in LibO?
(In reply to comment #6) > Lets start with the easy part. The file is not valid according to OOXML. > ... > > Now to the question if we should support these broken documents. I think it > would be relatively easy to check for files named any version of > sharedstrings.xml when the import of the file mentioned in the relationship > file fails. > ... Technically speaking, the software which generated an invalid string is at fault and should be fixed instead. But users just need to open such files, most of the times not knowing they are corrupted or which app generated them. That, of course, to a certain extent - no software should be asked to open garbage. A comprehensive - but hard - way of doing it would be to check for the correct string first. Then, if not found, search for obvious variations, maybe case-insensitive. if found, inform the user that the file is corrupt and ask what to do - abort, open anyway or fix it. This would ruin the "easy hack" game, I know, just wanted to add my 2¢. For now, "just open" should work.
*** Bug 67037 has been marked as a duplicate of this bug. ***
If the file is not according to OOXML, then why LO opens it without any messages? IMHO the best way is show the message while loading that file is corrupted and offer the user to fix it.
Just taking a wild guess here... feel free to shoot this comment down at will :). If the error is only in the case sensitivity (SharedStrings.xml vs sharedStrings.xml), could it be that case-sensitivity wasn't a design concern for OOXML? (Windows filesystems, the registry afaik are also case insensitive) In which case, there could be (and likely are) other implementations of OOXML that produce files incompatible w/ LO in this regard.
Created attachment 83494 [details] Initial patch Here's an initial patch that just fixes it for this particular case.
When LO will accept this patch?
Draga, rfw2nd: Please submit this patch to gerrit, so developers can recognize this. Within bugzilla it's doomed to get lost in the noise...
Hi, it's not lost in the noise yet ;-) I reproduced the bug and then applied the patch. I confirm that it fixes the issue. Could it be possible to review this patch ?
I've uploaded it to https://gerrit.libreoffice.org/#/c/5763/ rfw
rfw2nd: Please mail me if you want your real name in your patch (at gerrit). Please follow the license statement procedure at https://wiki.documentfoundation.org/Development/Developers so we can push that fix to master. Thanks!
Licence statement sent. :) On Mon, Sep 2, 2013 at 1:41 PM, <bugzilla-daemon@freedesktop.org> wrote: > *Comment # 24 <https://bugs.freedesktop.org/show_bug.cgi?id=60889#c24>on bug > 60889 <https://bugs.freedesktop.org/show_bug.cgi?id=60889> from Thomas > Arnhold <thomas@arnhold.org> * > > rfw2nd: Please mail me if you want your real name in your patch (at gerrit). > Please follow the license statement procedure athttps://wiki.documentfoundation.org/Development/Developers so we can push that > fix to master. > > Thanks! > > ------------------------------ > You are receiving this mail because: > > - You are on the CC list for the bug. > >
Raymond Wells committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=571f49ffed792095fd41e2d07dbe30befa99a5b8 fdo#60889: FILEOPEN: Incorrect opening XLSX file (sharedStrings.xml) The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
http://dev-builds.libreoffice.org/daily/libreoffice-4-1/Win-x86_9-Voreppe/current/libreoffice-4-1~2013-09-23_20.48.02_LibreOfficeDev_4.1.3.0.0_Win_x86.msi 24-Sep-2013 19:54 not patched
Draga, you tested the 4-1 branch. Works fine on master and should be included in the next 4.2 release. Works fine with Version: 4.2.0.0.alpha0+ Build ID: 164b6ce7b27c0a9ec19019e7b078b9f8f382007d TinderBox: Win-x86@39, Branch:master, Time: 2013-09-28_16:39:42 So marking this as fixed.
WinXp SP3 http://dev-builds.libreoffice.org/daily/master/Win-x86@39/current/ master~2013-09-30_01.41.28_LibreOfficeDev_4.2.0.0.alpha0_Win_x86.msi 30-Sep-2013 03:45 Start "C:\Program Files\LibreOfficeDev 4\program\soffice.exe" Error message: ... not found msvcr110.dll ... I installed the Visual C++ Runtime VS2012 (vcredist_x86.exe) Now new error message (my translate from Russian): soffice.exe - Entry point not found The procedure entry point GetTickCount64 could not be located in the DLL KERNEL32.DLL What else should I install?
WinXp SP3 http://dev-builds.libreoffice.org/daily/master/Win-x86@39/current/ master~2013-09-30_01.41.28_LibreOfficeDev_4.2.0.0.alpha0_Win_x86.msi 30-Sep-2013 03:45 Start "C:\Program Files\LibreOfficeDev 4\program\soffice.exe" Error message: ... not found msvcr110.dll ... I installed the Visual C++ Runtime VS2012 (vcredist_x86.exe) Now new error message (my translate from Russian): soffice.exe - Entry point not found The procedure entry point GetTickCount64 could not be located in the DLL KERNEL32.DLL What else should I install? http://msdn.microsoft.com/en-us/library/windows/desktop/ms724411%28v=vs.85%29.aspx MSDN about GetTickCount64: Minimum supported client - Windows Vista LO 4.2: Minimum supported client - Windows Vista ?
Master builds are not compatible with XP. That's OK.
Migrating Whiteboard tags to Keywords: (EasyHack DifficultyBeginner SkillCpp) [NinjaEdit]