Bug Hunting Session
Bug 85617 - formula with external reference garbled after save as .xlsx
Summary: formula with external reference garbled after save as .xlsx
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.3.2.2 release
Hardware: Other All
: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:4.5.0 target:4.4.1 target:4.3....
Keywords: bibisected, bisected
Depends on:
Blocks:
 
Reported: 2014-10-29 19:43 UTC by raal
Modified: 2018-02-06 22:38 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description raal 2014-10-29 19:43:48 UTC
VLOOKUP formula garbled after save as .xlsx
LO 43.2.2, win 7
Steps to reproduce:
 - open attachment from bug 85553 https://bugs.freedesktop.org/attachment.cgi?id=108570
 - The formula in column C is:  =VLOOKUP(A3;'file:///C:/temp/home/yar4e/Рабочий стол/home/finman/Рабочий стол/Файл 1.xls'#$Лист1.A3:B23;2;0)
 -  Save as  .xlsx
 - close, reopen
 
 Actual results: Formula garbled =VLOOKUP(A3;[1]$Лист1.A3:B23;2;0)
 Expected results:  formula works

Works in LO 4.2.6, regression
Comment 1 Buovjaga 2014-11-04 15:58:35 UTC
Confirmed.

Version: 4.3.2.2.0+
Build ID: 4.3.2.2 Arch Linux build-1
Comment 2 raal 2014-12-07 13:03:47 UTC
git bisect log
# bad: [0777cd085a7633a48e03d25948cc67fce87b7ac7] source-hash-b800d0b6ad74ce4a9adb23b865dd174d1eefa47b
# good: [812c4a492375ac47b3557fbb32f5637fc89d60d9] source-hash-dea4a3b9d7182700abeb4dc756a24a9e8dea8474
git bisect start 'latest' 'oldest'
# bad: [8677ba6e74a774fb44ec7831f14e53d8663f59ed] source-hash-eb213e490d9a366477b921d1a408d85c4638499e
git bisect bad 8677ba6e74a774fb44ec7831f14e53d8663f59ed
# bad: [42455e7bcd46db5523f7f8e1931bbd0bd4b51cfc] source-hash-b5608fd429790a3d1153341b2c86303b7090b15a
git bisect bad 42455e7bcd46db5523f7f8e1931bbd0bd4b51cfc
# good: [253e662b25e57508dbc46753f0f28af36d5f4e25] source-hash-f7d51f43deda5e28df63f1b8e168e84838d0d0b4
git bisect good 253e662b25e57508dbc46753f0f28af36d5f4e25
# good: [8461c942c2060278f64283368705e0183709c4d7] source-hash-c7d390bf21623c148ff5c3955561b903d9581da8
git bisect good 8461c942c2060278f64283368705e0183709c4d7
# good: [14dd07abf36b67545552130557c06b3ae1855ff0] source-hash-67c20d42b5ca06458b154356877f4ad5952736f4
git bisect good 14dd07abf36b67545552130557c06b3ae1855ff0
# good: [db39c7aad664b60c46f31289f79b3b58b0800a1d] source-hash-a7e1ffc248bed431693c6d50c02e7c936c67f360
git bisect good db39c7aad664b60c46f31289f79b3b58b0800a1d
# good: [4072c01f04fb6217345bb02ee0c7fb9e3f947662] source-hash-5ccb510ef7dd6688b86038b37563583f64107936
git bisect good 4072c01f04fb6217345bb02ee0c7fb9e3f947662
# first bad commit: [42455e7bcd46db5523f7f8e1931bbd0bd4b51cfc] source-hash-b5608fd429790a3d1153341b2c86303b7090b15a

http://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=5ccb510ef7dd6688b86038b37563583f64107936..b5608fd429790a3d1153341b2c86303b7090b15a
Comment 3 Matthew Francis 2014-12-25 13:43:40 UTC
The below appears to be the commit where the behaviour changed.

Adding Cc: to erack@redhat.com. Could you possibly take a look at this? Thanks


commit 8c23a767d926d8d08213f5e2f8e81775c653cbd7
Author: Eike Rathke <erack@redhat.com>
Date:   Tue Aug 5 22:51:01 2014 +0200

    write OOXML externalReferences, externalLinks, fdo#45286
    
    This for the first time writes external references (hopefully) correctly
    and adds the necessary relationship streams and the externalLink streams
    with sheetData. At least Excel 2013 loaded the result without
    complaining, so do we.
    
    Change-Id: I3d615490a60c5420ae13c0bfc6297642d86a07b9
Comment 4 Markus Mohrhard 2014-12-27 00:10:55 UTC
Looks related to Bug 79442. The new formula grammar for OOXML might not handle external links correctly. I'll look into it.
Comment 5 Markus Mohrhard 2014-12-27 05:41:00 UTC
It is purely an import issue.

The problem is in converting relative to absolute URLs which for some reason encodes all non-ascii unicode characters now. Not sure why this did not happen in the past.

There might also be an issue around parsing in ScCompiler that I have fixed through some of my code but I need to check if that one is really necessary or if the code in address.hxx for parsing external references doesn't handle it already.
Comment 6 Markus Mohrhard 2014-12-27 05:49:44 UTC
Ok, you need both fixes.

The first one is to teach the formula compiler how to parse the OOXML indexed external references. For that we just need to handle in the new OOXML grammar external references and skip similar to the XL_A1 grammar.

That imports the reference correctly but due to the mentioned unicode issue it imports two external references into the link manager and the information are stored in the wrong one. After changing the decode parameter so that unicode characters are kept if they fit into UTF-16 we handle everything correctly.

@Eike: You might want to have a look at my commits that I did not miss something especially in the handling of external references in the OOXML grammar.
Comment 7 Markus Mohrhard 2014-12-27 06:00:32 UTC
I'd need a test document with some special conditions. If you are able to generate a test document with MSO on Windows that contains some letters like russian or any other exotic language please write me a mail.
Comment 8 Commit Notification 2014-12-27 06:03:07 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=808b0bf831d4a4fdb60c01865d0a7dd36cf9d521

NO_DECODE breaks non-ascii urls, fdo#85617

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 9 Commit Notification 2014-12-27 06:03:11 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=18cccd62fb5b730319878df6fac748d5cdf73f1f

handle index based external refs in formulas in ooxml import, fdo#85617

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 10 Eike Rathke 2015-01-08 22:44:18 UTC
The internal handling of file names of external references needs to be reworked, there is no consistency in the case of files imported from Excel file formats. Internally URLs should be fully encoded and only decoded for display purposes.

Additionally the external file names when written to .xlsx need at least to be IURI encoded, which seems to be what Excel does, and not fully decoded as in the .xls format.

Worth to be mentioned, once the original .xls file is saved as .ods and reloaded, the formulas even deliver a result and not just #N/A

Btw, this isn't exactly a regression, 4.2.x didn't work either, it just failed differently and Excel couldn't read the external references that we wrote at all. Export of external references as Excel expects them is only implemented since 4.3.2 with the commit indicated in comment 3. Import currently fails for the non-ascii characters, but does the same for a file converted with Excel.

I'll work on this.
Comment 11 Commit Notification 2015-01-09 15:12:00 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=19feb49bd1b232425b7be19b55b950cbbc1bff67

Revert "NO_DECODE breaks non-ascii urls, fdo#85617"

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 12 Commit Notification 2015-01-09 15:12:04 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=7eb5e135422f1a5830a44d129300bc3fafb4627d

write externalLink Relationship Target IURI encoded, fdo#85617 related

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 13 Commit Notification 2015-01-09 15:12:07 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b6339617b1cc3136f9e527acd0746d712cd21643

do not drop entire external reference, fdo#85617 related

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 14 Commit Notification 2015-01-09 15:12:11 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=97a8b3ed5e5bd42e213d3230fa764b0f5d10f0f2

Resolves: fdo#85617 always store fully encoded external document name

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 16 raal 2015-01-09 18:20:10 UTC
(In reply to Markus Mohrhard from comment #7)
> I'd need a test document with some special conditions. If you are able to
> generate a test document with MSO on Windows that contains some letters like
> russian or any other exotic language please write me a mail.

I can create test document.
Comment 17 Commit Notification 2015-01-10 16:11:34 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6c5748de1826f9a712a9a5795b1bc81cb7c63d1d&h=libreoffice-4-4

handle index based external refs in formulas in ooxml import, fdo#85617

It will be available in 4.4.1.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 18 Commit Notification 2015-01-10 16:11:48 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b460414bcd1dfd2f260696e306c49c56585c6464&h=libreoffice-4-4

Resolves: fdo#85617 always store fully encoded external document name

It will be available in 4.4.1.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 19 Commit Notification 2015-01-10 16:13:17 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "libreoffice-4-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=90bbc9e6f51a2281f7cf06577252791e62a9189b&h=libreoffice-4-3

handle index based external refs in formulas in ooxml import, fdo#85617

It will be available in 4.3.6.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 20 Commit Notification 2015-01-10 16:13:24 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=80bbe4a4dcbb0286f472ca3ee0e21746995fe4a6&h=libreoffice-4-3

Resolves: fdo#85617 always store fully encoded external document name

It will be available in 4.3.6.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 21 Commit Notification 2015-01-13 14:48:12 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=0e4df45891035e9f18070173a02c5bdc2eda9a49&h=libreoffice-4-4-0

Resolves: fdo#85617 always store fully encoded external document name

It will be available in 4.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 22 Commit Notification 2015-01-13 14:48:16 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "libreoffice-4-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6d26deaa4c40f554fb6bc036047564d466b361fd&h=libreoffice-4-4-0

handle index based external refs in formulas in ooxml import, fdo#85617

It will be available in 4.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 23 Robinson Tryon (qubit) 2015-12-17 08:38:15 UTC
Migrating Whiteboard tags to Keywords: (bibisected)
[NinjaEdit]