With LibreOffice 22.214.171.124 on Mint 19.1 Cinnamon
I have tested with Writer, Calc and Impress, and the problem is present with each program.
When one opens an existing document, upon saving (whether or not changes have been made) instead of updating the content and modification date of the file, a new file is created (new inode number) and the creation time of the document is changed to the saved/modification time.
Steps to Reproduce:
1. Open an existing document
2. Save it as is
3. Check the timestamps in the file explorer
check the inode and timestamps in command line using:
1- `stat \path\to\document
2- `sudo debugfs -R 'stat /path/to/document' /dev/sdXn | grep crtime
The inode number is modified.
The creation time is updated.
The inode number should stay the same.
The creation time should not be updated.
User Profile Reset: No
Confirmed in 126.96.36.199.alpha1+ (38ac0586448d4f07811b139f62f62686b029feba) / Ubuntu 18.04.
No issue in 188.8.131.52.
Bibisected to the following commit using repo bibisect-linux-64-6.1. I only checked whether the inode number stays the same in the stat command output. Adding Cc: to Miklos Vajna, please take a look.
author Miklos Vajna <email@example.com> 2018-01-08 16:49:25 +0100
committer Miklos Vajna <firstname.lastname@example.org> 2018-01-09 09:09:27 +0100
sfx2 store: try rename before copying
I'm not sure what to fix here. My understanding is that ~all software has this best practice, that if you overwrite a large file, then first you write to a tmp file next to the final target, and once you're done, you delete the original and move the tmp file to the final destination.
Sure, if really wanted we can keep this for Windows only where we have ReplaceFileW(), but I would rather close this as wontfix.
- https://lwn.net/Articles/682721/ "the standard POSIX atomic-overwrite trick (write to temporary file in same directory, rename over the top)"
- GLib's g_file_set_contents() does the same
How is it done in Windows? From the link to LWN it seems it uses the tunnel effect... is that correct? Else, couldn't it be implemented the same way in Linux?
Maybe a work around in Linux (to at least preserve the creation time) would be, upon creation of the temporary copy, to make the creation time of this temporary copy the same as the original file.
Windows provides <https://docs.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-replacefilea>, the closest what we have is already happening on Linux, as far as I see. That's why I lean towards just closing this as won't fix.
Just checked: next to glib, rsync doesn't have any code to tweak ctimes.
Some additional thoughts.
Reading about the g_file_set_contents() function it points to the fact that changing the inode number also breaks existing hard links to the original file (wow!).
I don't know how it is done exactly at the code level, but to solve both the 'creation time' (crtime) and 'hard link' issues, here is another suggestion: why not overwrite the content of the original file with the content of the temporary file rather than overwriting the file itself?
To avoid the risk of losing any information in case of crash (whether the original data or the new data), just before the overwriting of the content an additional temporary copy of the original file could be created. This would preserve the content of that file in case of crash in the process of writing.
Added to note: mind that ctime (change time) and crtime (creation time) are different attributes.
We already detect the situation when the hard link count is >1 and go back to the old slow way. I just don't see a similar way for the creation time part, it seems the rest of the world just don't care about that. ;-)
Ah. Good that the hard link problem is taken care of. :)
Not sure that the rest of the world just don't care, see for instance this recent thread here:
I admit it is a different case-scenario so I don't try to lean heavily on it, but the thread is still an interesting read.
Reflecting on this read, I just noticed that for the file replacement, instead of doing:
$ mv SOURCE DEST
(which preserves neither the creation time nor the modification time)
one could do:
$ cp -p SOURCE DEST
(which preserve both).
Wouldn't it be simple/easy to implement?
Bonus: this method might also eliminate the need to take care of hard links. :)
Thanks for the 'cp -p' example, I'll take a look what APIs it uses.
I don't know if it can be useful but I also came across the case of rsync, which has an -X (--xattr) option to preserve extended attributes.
I also found some additional information in this Q&A:
By the way, thanks for having a look into this.
Ah, no. I just realized that timestamps attributes are independent of extended attributes.
This was a false lead, sorry.
(In reply to Miklos Vajna from comment #4)
I figured the following regarding rsync.
In case the file DEST already exists, the commands
$ rsync -t --inplace SOURCE DEST
preserves the crtime of DEST and assigns to DEST the mtime of SOURCE.
That is, the result regarding those timestamps is the same as with
$ cp -p SOURCE DEST
I got back to this and I'm afraid 'cp -a' can't copy the creation time of the destination document after all.
Here is what I tried:
1) Save test.odt using Writer.
2) Check creation time:
$ sudo debugfs -R "stat $PWD/test.odt" /dev/mapper/system-root | grep crtime
crtime: 0x5d246d4a:24aeed68 -- Tue Jul 9 12:32:42 2019
3) 'cp -a test.odt test2.odt'
4) Check creation time of test2.odt:
$ sudo debugfs -R "stat $PWD/test2.odt" /dev/mapper/system-root | grep crtime
crtime: 0x5d246eaf:7a4c63cc -- Tue Jul 9 12:38:39 2019
Based on this, I still think we can't do much here -> closing, sorry. The document has meta.xml BTW, which always contains the creation / modification time, and that is correct.
If it is discovered that there is some sane API we can use, similar to stat() and S_ISLNK(), then it would make sense to reopen this.
(Marking as "not our bug", given that this works fine on Windows, it's a shorcoming of the APIs we have available inside #ifdef UNX, I would say.)
(In reply to Miklos Vajna from comment #14)
> I got back to this and I'm afraid 'cp -a' can't copy the creation time of
> the destination document after all.
> Here is what I tried:
> 1) Save test.odt using Writer.
> 2) Check creation time:
> $ sudo debugfs -R "stat $PWD/test.odt" /dev/mapper/system-root | grep crtime
> crtime: 0x5d246d4a:24aeed68 -- Tue Jul 9 12:32:42 2019
> 3) 'cp -a test.odt test2.odt'
> 4) Check creation time of test2.odt:
> $ sudo debugfs -R "stat $PWD/test2.odt" /dev/mapper/system-root | grep crtime
> crtime: 0x5d246eaf:7a4c63cc -- Tue Jul 9 12:38:39 2019
> Based on this, I still think we can't do much here -> closing, sorry. The
> document has meta.xml BTW, which always contains the creation / modification
> time, and that is correct.
> If it is discovered that there is some sane API we can use, similar to
> stat() and S_ISLNK(), then it would make sense to reopen this.
> (Marking as "not our bug", given that this works fine on Windows, it's a
> shorcoming of the APIs we have available inside #ifdef UNX, I would say.)
Hello Miklos and thank you for looking into this.
Following your explanation, I fear you did not get the point.
Of course if you copy a file using 'cp -a SOURCE DEST' to a NEW (non existing) destination file DEST, the new destination will have an updated creation time (since it is created at the time the 'cp' command is called.
However, the command 'cp -a SOURCE DEST' applied to an already existing DEST file preserves the creation time of DEST.
This is what could be used to preserve the creation time of any document being edited, where SOURCE would be the cached and currently being edited version, and DEST the initial file from which the edition resumed.
Could you consider reassessing the scenario and therefore maybe reopening the bug?
PS: I realize I was maybe not very clear either in my initial explanation, sorry about that...
I get what you say, but it goes against the whole idea described comment 2, and comment 4 already mentions how we have an OS-provided API to do this on Windows and how on Linux we manually move the temp file to the final destination.
Given that create time is only observable via some debug API as root, I think the cost of supporting this scenario is way larger than the benefit of not copying over data from a temp file byte by byte to the final destination, especially for e.g. larger spreadsheets.
If some other developer find a good API we can use here for the UNX case, I think it makes sense to re-open this.