Bug 76047 - FILESAVE: Links to external data not preserved for XLSX
Summary: FILESAVE: Links to external data not preserved for XLSX
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.1.6.2 release
Hardware: All All
: high major
Assignee: Attila Szűcs
URL:
Whiteboard: target:7.1.0
Keywords: dataLoss, filter:xlsx
: 100999 120845 (view as bug list)
Depends on:
Blocks: Calc-External-Datalink
  Show dependency treegraph
 
Reported: 2014-03-11 23:21 UTC by wickles
Modified: 2021-02-26 07:34 UTC (History)
16 users (show)

See Also:
Crash report or crash signature:


Attachments
zip archive containing various files demonstrating bug (447.13 KB, application/zip)
2014-03-11 23:21 UTC, wickles
Details
zip archive containing various files demonstrating bug (447.13 KB, multipart/zip)
2014-03-11 23:25 UTC, wickles
Details
zip archive containing various files demonstrating bug (447.13 KB, multipart/zip)
2014-03-11 23:26 UTC, wickles
Details
ZIP containing ODS, XLS, XLSX created under LOv4162, LOv4262, and LOv4400 (730.93 KB, application/zip)
2014-08-23 07:54 UTC, Owen Genat (retired)
Details
Sample file with the links (19.99 KB, application/vnd.oasis.opendocument.spreadsheet)
2018-04-18 23:19 UTC, m_a_riosv
Details
Source file for links (19.06 KB, application/vnd.oasis.opendocument.spreadsheet)
2018-04-18 23:20 UTC, m_a_riosv
Details
Much simpler example files from LO 7.1alpha (24.25 KB, application/x-zip-compressed)
2020-10-29 10:29 UTC, NISZ LibreOffice Team
Details
The reproducer ods in Calc with Edit Links dialog (54.78 KB, image/png)
2020-10-29 10:31 UTC, NISZ LibreOffice Team
Details

Note You need to log in before you can comment on or make changes to this bug.
Description wickles 2014-03-11 23:21:08 UTC
Created attachment 95625 [details]
zip archive containing various files demonstrating bug

.ods and .sxc files retain external data links but the "update every [n seconds]" setting either gets reset or corrupted. .xls and .xlsx files do not retain external data links. 


To reproduce:

1. Create a new document
2. Insert external data of your choice with some low value for the "update every [n seconds]" setting, to verify that the data is in fact updating every n seconds. 
3. Save as one of the specified file types and close the document.

Then:

Case 1 (.ods, .sxc)

Open the file again and it will retain the link to external data, but will have corrupted or reset the "update every [n seconds]" setting. Open up the Edit->Links window, hit modify, and the setting will have changed from n seconds to some other value. In my tests the results were that with n=10 it would change to 9,999, and with n=60 it would change to 60,000. So it seems this value is becoming corrupted somehow. 

Case 2 (.xls, .xlsx)

Open the file again and it will not retain the link to external data, at all. 


These are the only file types I've tested, and I'm not sure if external data is only supposed to work with certain formats and not others. Some test files are included in the attached zip archive. 

I also have a .xls file that I have been working with for a while (since 4.1 I think) that DOES work as expected. I have included a stripped down, but still working, version of this file in the attached archive. 

As another test, I tried converting the working .xls file to a .ods file and it resets just like a new .ods file does. This is also included in the archive. 


I found a related bug (https://www.libreoffice.org/bugzilla/show_bug.cgi?id=57330) that I will comment on as well.
Comment 1 wickles 2014-03-11 23:25:15 UTC Comment hidden (obsolete)
Comment 2 wickles 2014-03-11 23:26:34 UTC
Created attachment 95628 [details]
zip archive containing various files demonstrating bug
Comment 3 Owen Genat (retired) 2014-08-23 07:54:03 UTC
Created attachment 105136 [details]
ZIP containing ODS, XLS, XLSX created under LOv4162, LOv4262, and LOv4400

TL;DR Summary
-------------
Case 1:
- Provided files do (mostly) have very high update frequency values.
- Change-in-update-frequency behaviour reported (10 sec to 9999 sec and 60 sec to 60,000 sec) may have been related to v4.2.1.1 under which the provided files were created. I cannot confirm this behaviour using the versions indicated below.

Case 2:
- Saving to XLS is OK however there is strange Import Options dialog behaviour when attempting to edit links. Refer note [1] below.
- Saving to XLSX seems to completely break external links. 

Testing overview
----------------
There appear to be a number of issues highlighted by this bug. Attachment 95628 [details] contains these files:

a) coinmarketcap-new-broken.xls
b) coinmarketcap-new-broken.xlsx
c) coinmarketcap-new-resets-10-9999.ods
d) coinmarketcap-new-resets-10-9999.sxc 
e) coinmarketcap-new-resets-60-60000.ods
f) coinmarketcap-new-resets-60-60000.sxc
g) coinmarketcap-ods2xls-broken.xls
h) coinmarketcap-old-working.xls
i) coinmarketcap-old_xls2ods-resets.ods

I have initially tested the links in the above files by opening each under these versions:

1) v4.1.6.2 Build ID: 40ff705089295be5be0aae9b15123f687c05b0a
2) v4.2.6.2 Build ID: 185f2ce4dcc34af9bd97dec29e6d42c39557298f
3) v4.3.0.4 Build ID: 62ad5818884a2fc2e5780dd45466868d41009ec0
4) v4.4.0.0.alpha0+ Build ID: e379401618268ed7f7f5885a36b90e1f4f6cd4af TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:master, Time: 2014-08-18_05:51:03

Intial test
-----------
      Ask to update?   Are links updated?   Update every N sec?

a) 1) N                -                    -
   2) N                -                    -
   3) N                -                    -
   4) N                -                    -
b) 1) N                -                    -
   2) N                -                    -
   3) N                -                    -
   4) N                -                    -
c) 1) Y                Y                    99,999
   2) Y                Y                    99,999
   3) Y                Y                    99,999
   4) Y                Y                    99,999
d) 1) Y                Y                    99,999
   2) Y                Y                    99,999
   3) Y                Y                    99,999
   4) Y                Y                    99,999
e) 1) Y                Y                    60,000
   2) Y                Y                    60,000
   3) Y                Y                    60,000
   4) Y                Y                    60,000
f) 1) Y                Y                    60,000
   2) Y                Y                    60,000
   3) Y                Y                    60,000
   4) Y                Y                    60,000
g) 1) N                -                    -
   2) N                -                    -
   3) N                -                    -
   4) N                -                    -
h) 1) Y                Y                    60[1]
   2) Y                Y                    60[1]
   3) Y                Y                    60[1]
   4) Y                Y                    60[1]
i) 1) Y                Y                    99,999[1]
   2) Y                Y                    99,999[1]
   3) Y                Y                    99,999[1]
   4) Y                Y                    99,999[1]

[1] Edit > Links... > Modify... displays the Text Import dialog. Clicking Cancel then displays the update time.

Summary:
- Behaviour for each file appears consistent across versions.
- Only new XLS/XLSX lose the ability to prompt for update.
- Update frequency for nearly all files that update is set to a very high value. This may be the result of v4.2.1.1 under which the files appear to have been created (refer New File Test below).

Fix/Break Test
--------------
Using v4.4.0.0 (4) in a simple attempt to try and repair / break some of the provided files:

- Open (a). Save as XLS (new name).  Close. Reopen. LINKS (STILL) BROKEN.
- Open (b). Save as XLSX (new name). Close. Reopen. LINKS (STILL) BROKEN.
- Open (c). Save as XLS (new name).  Close. Reopen. LINKS BROKEN.
- Open (c). Save as XLSX (new name). Close. Reopen. LINKS BROKEN.
- Open (e). Save as XLS (new name).  Close. Reopen. LINKS BROKEN.
- Open (e). Save as XLSX (new name). Close. Reopen. LINKS BROKEN.
- Open (g). Save as XLS (new name).  Close. Reopen. LINKS (STILL) BROKEN.
- Open (g). Save as XLSX (new name). Close. Reopen. LINKS (STILL) BROKEN.
- Open (h). Save as XLS (new name).  Close. Reopen. OK.
- Open (h). Save as XLSX (new name). Close. Reopen. LINKS BROKEN.

It appears in simple terms that once the links are broken they cannot be repaired by re-saving in an up-to-date version. Furthermore links in the "old-working" XLS stop working after being saved as XLSX.

New File Test
-------------
Create a new file using the indicated file formats / versions with:
- http://coinmarketcap.com/currencies/views/all/
- Select table HTML__currencies-all
- Update frequency of 60 sec

ODS:
(1) Save. Close. Reopen. LINKS UPDATE. FREQ PRESERVED.
(2) Save. Close. Reopen. LINKS UPDATE. FREQ PRESERVED.
(3) LINK FAIL ON ENTRY. This appears to be bug 49043 again.
(4) Save. Close. Reopen. LINKS UPDATE. FREQ PRESERVED.

XLS:
(1) Save. Close. Reopen. LINKS UPDATE. FREQ PRESERVED[1].
(2) Save. Close. Reopen. LINKS UPDATE. FREQ PRESERVED[1].
(3) N/A see above.
(4) Save. Close. Reopen. LINKS UPDATE. FREQ PRESERVED[1].

XLSX:
(1) Save. Close. Reopen. LINKS BROKEN.
(2) Save. Close. Reopen. LINKS BROKEN.
(3) N/A see above.
(4) Save. Close. Reopen. LINKS BROKEN.

Refer attached for files created from this last series of tests.
Comment 4 Owen Genat (retired) 2014-08-23 08:05:54 UTC
As a result of comment 3 I am confirming this bug, as there are problems, but not as many as originally indicated. The main issue is with the XLSX file format not preserving links to external data (HTML tables). Summary amended for clarity. Status set to NEW. Version set to v4.1.6.2 (but may be older). Operating System set to All. Bug 57330 removed from See Also list as it relates to XLS.

If the update frequency issue (or a problem specific to the XLS format that is not covered by bug 57330) can be demonstrated in a clear and simple manner, please open a new bug for this as the policy is: one bug, one issue. Thanks.
Comment 5 wickles 2014-08-27 21:08:44 UTC
I can confirm that I am no longer experiencing the update frequency issue in version 4.3.0.4. XLS files also now retain the external data link, and the only problem remaining is with XLSX, as explained in comments 3 and 4.
Comment 6 QA Administrators 2015-09-04 02:47:34 UTC Comment hidden (obsolete)
Comment 7 Buovjaga 2015-11-18 10:09:57 UTC
XLSX still losing links.

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: b216cc1b8096eb60c27f67e8c27b7cd756c75e38
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2015-11-12_00:06:20
Locale: fi-FI (fi_FI)
Comment 8 m_a_riosv 2016-07-19 21:44:28 UTC
*** Bug 100999 has been marked as a duplicate of this bug. ***
Comment 9 Bartosz 2016-10-05 23:08:06 UTC
It is working for me with LibreOffice 5.2.2 (Ubuntu 16.04
Comment 10 Andrew 2016-10-06 22:37:40 UTC
The problem I identified in Bug 100999, which was redirected to be a duplicate of this bug (whether it is or not I don't know), still exists.

That is:

1. save links in .ods file, works fine
2. save links in .xls file, works fine
3. save links in .xlsx file, fails

LibreOffice 5.1.5.2
Windows 10
Comment 11 Buovjaga 2016-10-07 04:41:35 UTC Comment hidden (obsolete)
Comment 12 Andrew 2016-10-07 06:31:04 UTC Comment hidden (obsolete)
Comment 13 Buovjaga 2016-10-07 06:49:35 UTC Comment hidden (obsolete)
Comment 14 Bartosz 2016-10-07 10:11:01 UTC
Test scenario:
1. I have opened coinmarketcap-new-update_60s_LOv4162.xlsx, coinmarketcap-new-update_60s_LOv4262.xlsx and coinmarketcap-new-update_60s_LOv4400_2014-08-18.xlsx filew with LO 5.2.2
2. All links are working. After clicking it with CTRL + Right Mouse Buttton it is moving to corrsponding site.

So in my opinion the original bug report was resolved.

There are still missing message after .xlsx run: "This file contatins links to other files. Should they be updated?"
Comment 15 Andrew 2016-10-07 10:58:06 UTC Comment hidden (obsolete)
Comment 16 Andrew 2016-10-07 11:03:20 UTC
It may be relevant that Bartosz is running on Ubuntu, and I on Windows 10.

I do get the "This file contains links to other files. Should they be updated?" message.
Comment 17 Buovjaga 2016-10-12 10:47:25 UTC
Bartosz: please test with coinmarketcap-old-working.xls, saving it to .xlsx. I can repro the losing of links with that.
The file is in attachment 95628 [details]

Win 7 Pro 64-bit Version: 5.3.0.0.alpha0+
Build ID: e2f6c7f0d0cc14f851d7028ff846c5dc658a81c6
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2016-10-10_23:08:02
Locale: fi-FI (fi_FI); Calc: group
Comment 18 Andrew 2018-01-25 07:18:07 UTC
I have tried this again with Version 5.3.7.2 (x64)
With .xlsx I am still having many problems.
Saving all the files as .xls and they seem to be working OK.
Comment 19 m_a_riosv 2018-04-18 23:18:47 UTC
It is on
Version: 6.1.0.0.alpha0+
Build ID: c8c74a0b4ca6f3a3619f423b6548c80c52392ae0
CPU threads: 4; OS: Windows 10.0; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2018-04-14_22:59:27
Locale: es-ES (es_ES); Calc: group

Links that are broken they are arrays, not single cells references.

Attached a target.ods and a source01.ods

Saving the "target.ods" as "target.xlsx" and reopen/reload shows the issue, file name is changed for the array links, substituted by a number.

This issue causes a data lost, set up as critical.
Comment 20 m_a_riosv 2018-04-18 23:19:20 UTC
Created attachment 141477 [details]
Sample file with the links
Comment 21 m_a_riosv 2018-04-18 23:20:04 UTC
Created attachment 141478 [details]
Source file for links
Comment 22 Timur 2018-10-26 16:59:44 UTC
*** Bug 120845 has been marked as a duplicate of this bug. ***
Comment 23 m_a_riosv 2018-11-16 23:16:10 UTC
*** Bug 116331 has been marked as a duplicate of this bug. ***
Comment 24 m_a_riosv 2018-11-16 23:20:25 UTC
*** Bug 121472 has been marked as a duplicate of this bug. ***
Comment 25 m_a_riosv 2019-08-11 11:59:36 UTC
*** Bug 126821 has been marked as a duplicate of this bug. ***
Comment 26 Glen A. 2019-08-11 12:11:45 UTC
This is a critical data loss issue reported over 5 years ago – when can a fix be prioritized?
Comment 27 m_a_riosv 2019-10-25 21:03:39 UTC
Issue still
Version: 6.4.0.0.alpha1 (x86)
Build ID: cc57df8f942f239d29cb575ea5a7cb01405db787
CPU threads: 4; OS: Windows 10.0 Build 18362; UI render: GL; VCL: win; 
Locale: es-ES (es_ES); UI-Language: en-US Calc:
Comment 28 m_a_riosv 2020-03-31 21:19:52 UTC
*** Bug 122015 has been marked as a duplicate of this bug. ***
Comment 29 Julien Nabet 2020-06-04 16:07:56 UTC
(In reply to m.a.riosv from comment #21)
> Created attachment 141478 [details]
> Source file for links

I gave a try on pc Debian x86-64 with master sources updated today.
When saving ods target file into xlsx, I noticed these logs:
warn:svl.numbers:42845:42845:svl/source/numbers/zforlist.cxx:895: SvNumberFormatter::GetFormatStringForExcel - format not found: 4294967295
warn:legacy.osl:42845:42845:sc/source/core/tool/address.cxx:1601: ScRange::ExtendTo - cannot extend to invalid range
warn:legacy.osl:42845:42845:sc/source/core/tool/address.cxx:1601: ScRange::ExtendTo - cannot extend to invalid range
...

About Coinmarketcap, I noticed that tdf#117905 has been closed since public API is gone (see https://bugs.documentfoundation.org/show_bug.cgi?id=117905#c11)
Comment 30 Julien Nabet 2020-06-04 16:42:52 UTC
The 30 "warn:legacy.osl:42845:42845:sc/source/core/tool/address.cxx:1601: ScRange::ExtendTo - cannot extend to invalid range" I got correspond to the 30 REF pbs.

In xlsx, instead of having:
='file:///tmp/Source01.ods'#$SheetA.A1:E3

There's:
='file:///tmp/1'#$SheetA.A1:E3

Eike: thought you might be interested in this one.
Comment 31 egc 2020-10-01 21:37:00 UTC Comment hidden (no-value)
Comment 32 Timur 2020-10-02 07:30:52 UTC Comment hidden (obsolete)
Comment 33 egc 2020-10-03 23:21:55 UTC Comment hidden (off-topic)
Comment 34 Julien Nabet 2020-10-04 07:08:49 UTC Comment hidden (off-topic)
Comment 35 egc 2020-10-04 11:49:52 UTC Comment hidden (off-topic)
Comment 36 Buovjaga 2020-10-04 11:54:33 UTC Comment hidden (off-topic)
Comment 37 Julien Nabet 2020-10-04 12:06:49 UTC Comment hidden (off-topic)
Comment 38 egc 2020-10-04 12:26:31 UTC Comment hidden (off-topic)
Comment 39 egc 2020-10-04 12:41:38 UTC Comment hidden (off-topic)
Comment 40 Julien Nabet 2020-10-04 13:17:47 UTC Comment hidden (off-topic)
Comment 41 egc 2020-10-04 13:53:17 UTC Comment hidden (off-topic)
Comment 42 Julien Nabet 2020-10-04 15:33:10 UTC Comment hidden (off-topic)
Comment 43 egc 2020-10-04 16:29:43 UTC Comment hidden (off-topic)
Comment 44 Julien Nabet 2020-10-04 17:00:36 UTC Comment hidden (off-topic)
Comment 45 egc 2020-10-04 20:02:28 UTC Comment hidden (off-topic)
Comment 46 Eike Rathke 2020-10-05 10:41:59 UTC Comment hidden (off-topic)
Comment 47 egc 2020-10-05 12:41:06 UTC Comment hidden (off-topic)
Comment 48 egc 2020-10-05 19:18:18 UTC Comment hidden (off-topic)
Comment 49 Commit Notification 2020-10-14 07:28:43 UTC
Attila Szűcs committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/0d193c12a673fade8ece9d84cc4024fafdf52c9b

tdf#76047 XLSX import: fix links to external data

It will be available in 7.1.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 50 egc 2020-10-16 08:49:59 UTC
Hi Attila, and thanks for taking care of this problem!

I don't know if i did the right thing ... I downloaded the daily package
Linux-rpm_deb-x86_64@86-TDF-dbg 	2020-10-16 06:42:15 	df74aef
from
https://dev-builds.libreoffice.org/daily/master/current.html
but in this the problem still shows up. After putting the right link to the external sheet, then saving the xlsx file and closing and reopening the document the external link is doubled again and looks like this:

file:///home/user/Software/Libreoffice7xx/home/user/Software/Libreoffice7xx/source.ods

Thanks!

Best,
egc
Comment 51 NISZ LibreOffice Team 2020-10-29 10:29:53 UTC
Created attachment 166840 [details]
Much simpler example files from LO 7.1alpha

This contains a very simple reproducer file:

Externaldata.ods - with two data connections to a CSV and an XLSX file, as seen in Edit - Links to External Files.

Externaldata-LO7.xlsx - the same example file saved to XLSX. Edit - Links to External Files is disabled when this is opened in Calc and Excels Data - Connections (ribbon) - Connections (button) also opens an empty dialog. 
Calc exports absolutely nothing of the connection information to XLSX format.

EDATAE13.csv - Data source file
EDATAE13.xlsx- Data source file
Comment 52 NISZ LibreOffice Team 2020-10-29 10:31:06 UTC
Created attachment 166841 [details]
The reproducer ods in Calc with Edit Links dialog

Version: 7.1.0.0.alpha1+ (x64)
Build ID: f27c4ec5c864395f4cdaec32d7e95ff24e4f43c8
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: default; VCL: win
Locale: hu-HU (hu_HU); UI: en-US
Calc: CL
Comment 53 egc 2020-11-14 00:31:17 UTC Comment hidden (obsolete)
Comment 54 egc 2020-12-10 17:05:15 UTC
In the following version the bug is still there:

Version: 7.1.0.0.beta1
Build ID: 828a45a14a0b954e0e539f5a9a10ca31c81d8f53
OS: Linux 5.7; UI render: default; VCL: kf5
Calc: CL
Comment 55 egc 2021-02-26 07:34:17 UTC
It works in 7.1.1 :-)
Thanks a lot!!