Bug 148942 - FILESAVE Calc corrupts external file references in XLSX files making Excel unable to open
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 7.2.0.0.alpha0+
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:xlsx, implementationError
Depends on:
Blocks: XLSX-Corrupted
Reported: 2022-05-05 03:23 UTC by Tim B
Modified: 2023-02-20 08:42 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
XLSX from MSO (8.92 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2022-05-05 14:27 UTC, Timur
Sample MS XLSX files (54.35 KB, application/gzip)
2022-05-06 04:03 UTC, Tim B
An external link file created by MS Office taken from inside an unzipped XLSX file. (380 bytes, application/xml)
2022-10-11 01:00 UTC, Tim B
A version of the same external link file but having been resaved by LO 7.4.2 (370 bytes, application/xml)
2022-10-11 01:01 UTC, Tim B

Description Tim B 2022-05-05 03:23:21 UTC
Description:
We get sent XLSX files by a supplier (an order form). These contain links to external data sources that we do not have access to. The path varies, but inevitably, when we send it back with our changes, it doesn't open in Excel (2016 is what they're using, but we have a computer with 365 kept up to date that does the same).

I have done some basic testing (Ubuntu 22.04, LO 7.3.2.1) and found that the issue appears to be LibreOffice changing the "Target=" value in "xl/externalLinks/_rels/externalLink<n>.xml.rels". Specifically, on Linux, "file://C:/path/to/file.xls" in that Target is replaced with "../../../../C:/path/to/file.xls". On Windows (Win10, LO 6.4.4.2), it instead seems to strip out the part before a space in the filename (the filename has multiple spaces; the break is at the first one).
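To check which Target values a given XLSX file carries without unzipping it by hand, something like the following sketch could be used. It assumes the external link relationship parts follow the "xl/externalLinks/_rels/externalLink<n>.xml.rels" naming described above; the function name `external_link_targets` is made up for illustration and is not part of LibreOffice or any library.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship parts (.rels files).
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Assumed part naming, taken from the path quoted in this report.
RELS_PATTERN = re.compile(r"xl/externalLinks/_rels/externalLink\d+\.xml\.rels$")


def external_link_targets(xlsx):
    """Return {rels part name: [Target, ...]} for external relationships.

    `xlsx` may be a file path or a binary file-like object.
    """
    targets = {}
    with zipfile.ZipFile(xlsx) as zf:
        for name in zf.namelist():
            if not RELS_PATTERN.match(name):
                continue
            root = ET.fromstring(zf.read(name))
            targets[name] = [
                rel.get("Target")
                for rel in root.iter(f"{{{REL_NS}}}Relationship")
                if rel.get("TargetMode") == "External"
            ]
    return targets
```

Running it on the file received from the supplier and again on the copy resaved by LO should show whether the Target values differ.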

I have manually changed the broken Target= value in one of the Linux created test files back to what the source file had originally, and Excel is then happy to open it.
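The manual repair described above could be scripted roughly as follows. This is only a workaround sketch, not a fix for the export filter: it copies the external link ".rels" parts from the original file into a copy of the resaved file, and it assumes that both files number their external links identically, which has not been verified here. The function name `restore_external_rels` is hypothetical.

```python
import re
import zipfile

# Assumed part naming, taken from the path quoted in this report.
RELS_PATTERN = re.compile(r"xl/externalLinks/_rels/externalLink\d+\.xml\.rels$")


def restore_external_rels(original, resaved, output):
    """Write `output` as a copy of `resaved`, taking each external link
    .rels part from `original` when a part of the same name exists there.

    Returns the list of part names that were replaced.
    """
    with zipfile.ZipFile(original) as src:
        originals = {
            name: src.read(name)
            for name in src.namelist()
            if RELS_PATTERN.match(name)
        }

    replaced = []
    with zipfile.ZipFile(resaved) as zin, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename in originals:
                # Swap in the relationship part as the original file had it.
                data = originals[item.filename]
                replaced.append(item.filename)
            zout.writestr(item.filename, data)
    return replaced
```

All three arguments can be file paths; the resaved file itself is left untouched and the repaired copy is written to `output`.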

I have only done basic testing on one specific file, as I don't have access to Excel myself, but the issue has been across dozens of files for some time.

Steps to Reproduce:
1. Create xlsx file with Excel (2016) that contains links to external data sources by full path
2. Open xlsx file in LO
3. Save as new file still in XLSX format
4. Attempt to open in Excel

Actual Results:
Excel reports corruption and asks to recover. 

Expected Results:
File should open without error. LO should not change the target path for an external link.


Reproducible: Always


User Profile Reset: No



Additional Info:
Version: 7.3.2.1 / LibreOffice Community
Build ID: 30(Build:1)
CPU threads: 4; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Ubuntu package version: 1:7.3.2~rc1-0ubuntu2
Calc: threaded

NOTE: We have also had this issue in older Ubuntu builds, such as whatever is in Ubuntu 20.10, as well as on Windows with LO 6.x, and it is occurring on multiple machines for multiple users.


I can supply example files if necessary.
Comment 1 Xisco Faulí 2022-05-05 09:40:29 UTC Comment hidden (obsolete)
Comment 2 Timur 2022-05-05 14:27:08 UTC
Created attachment 179946 [details]
XLSX from MSO
Comment 3 Tim B 2022-05-06 04:03:52 UTC
Created attachment 179958 [details]
Sample MS XLSX files

Tar containing an example XLSX file from MS Office, the same file opened and resaved with LO (7.x on Ubuntu), and a PNG image of the result of recovery in MS Excel.
Comment 4 Timur 2022-09-06 12:22:24 UTC
When I confirmed this, LO also had bug 148835, which was later resolved.
So let's go back now to the reporter. 

Tim, please confirm the steps, because I can't reproduce the bug:
Open Hasbro_STORE\ FORM_August\ Cat\ Feat_All\ Store.xlsx 
Save as XLSX
Open in Excel 2016. 

Where, in which cell, is that external link?
Comment 5 Tim B 2022-10-11 00:53:15 UTC
Sorry for the delay... The steps are correct; however, changing a cell might be what triggers the breakage. I can't remember.


The issue has come up again today, now with a different supplier. We're now on LO 7.3.6-0ubuntu0.22.04.1. I'm not sure what version of MS Office this supplier uses, but the issue is the same: "../../../" is added to paths in links.

I see that possibly the same bug is fixed after 7.4, so I'm downloading 7.4.2 to see whether it still has the issue, and I'll report back in the next hour or two with a yes or no.
Comment 6 Tim B 2022-10-11 01:00:56 UTC
Created attachment 182960 [details]
An external link file created by MS Office taken from inside an unzipped XLSX file.
Comment 7 Tim B 2022-10-11 01:01:51 UTC
Created attachment 182961 [details]
A version of the same external link file but having been resaved by LO 7.4.2
Comment 8 Tim B 2022-10-11 01:05:18 UTC
7.4.2 still has the same issue. I haven't tried opening my 7.4.2 edit in Excel (I don't have a copy available), but I assume it will still be broken, since the attached files still show a bad path.
Comment 9 Buovjaga 2023-02-20 08:42:20 UTC
I bibisected the behaviour change with the linux-64-7.2 repo to commit 107a20ee079ae852b3b33412f234aab2dc35168f
The behaviour changed from

./xl/externalLinks/_rels/externalLink1.xml.rels:
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/externalLinkPath" Target="C:/TOYS/CATALOGUES%202007/All%20Stores/All%20store%20allocation.xls" TargetMode="External"/>

to

./xl/externalLinks/_rels/externalLink1.xml.rels:
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/externalLinkPath" Target="../../../../../../../C:/TOYS/CATALOGUES%202007/All%20Stores/All%20store%20allocation.xls" TargetMode="External"/>

However, even the old behaviour changed the Target path from the original XLSX.
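For spotting affected files, the regressed form can be recognised by its leading "../" segments in front of a drive letter. The sketch below is a detection aid under that assumption only. Stripping the prefix yields the pre-regression form, which already differs from the original XLSX, so this does not restore what Excel wrote. The names `BROKEN_TARGET` and `strip_relative_prefix` are made up for illustration.

```python
import re

# One or more "../" segments directly followed by a Windows drive letter,
# e.g. "../../C:/dir/file.xls". Plain relative paths do not match.
BROKEN_TARGET = re.compile(r"^(?:\.\./)+(?=[A-Za-z]:/)")


def is_broken_target(target):
    """True if the Target looks like a drive path made relative by mistake."""
    return BROKEN_TARGET.match(target) is not None


def strip_relative_prefix(target):
    """Drop leading '../' segments when they precede a drive-letter path."""
    return BROKEN_TARGET.sub("", target)
```

Applied to the second Target value quoted above, `strip_relative_prefix` gives back the first one; a genuinely relative target such as "../other/book.xlsx" is left unchanged.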

Maybe Attila has an idea.