Bug 70224 - LibreOffice Shell extension 1-byte disk reads
Status: RESOLVED DUPLICATE of bug 56007
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: framework
Version (earliest affected): unspecified
Hardware: All Windows (All)
Importance: medium major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-10-07 12:38 UTC by stefan
Modified: 2022-08-11 08:44 UTC

See Also:
Crash report or crash signature:


Description stefan 2013-10-07 12:38:10 UTC
When opening an ODT or ODS file (with LibreOffice) in Windows Explorer by double-clicking it, the Explorer will actually freeze for some time, before anything happens. 

On virtual (and sometimes even physical) machines, this frozen state can persist so long that Dokan-based file systems crash (Dokan on Windows is similar to FUSE on Linux; crashing seems, for some reason, to be Dokan's default behaviour when a low-level file operation takes too long to complete).

I need those Dokan-based file systems to work, so I checked why a 12K ODS file with just some numbers in it (no scripts, no images, just plain text) takes considerably more than 10 seconds to load in LibreOffice. As it turns out, LibreOffice issues a large number of reads (read() system calls) that each advance the file position by only 1 byte while requesting a full 4K buffer.

These 1-byte-step reads start at some byte N and continue for sometimes a few dozen, but mostly a few thousand, bytes. This leads to roughly the following sequence (N may be any byte in the file; most of the time, but not always, it lies within about the last 2000-4000 bytes; M is observed to be between about 20 and 4000):

seek(N)
read(buffer, 4096)
seek(N+1)
read(buffer, 4096)
.
.
.
seek(N+M)
read(buffer, 4096)

LibreOffice will SOMETIMES do this anywhere in the file, even several times in different places, and even repeatedly in the same "spot", but it will ALWAYS do this with the last 2000-4000 bytes. As a result, opening a 12K file produces around the same number of file operations that an 8-16 MB file would require if loaded with a sequence of single, non-redundant 4K reads. The main impact seems to come from the number of system calls, NOT the amount of data returned (I tried actually returning only 1 byte per read() call, and the file took roughly the same time to load).
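The arithmetic behind that comparison can be checked with a small simulation. This is only a sketch: `CountingFile`, `byte_stepping_scan` and `sequential_scan` are hypothetical helpers invented for illustration, and the start offset and step count are assumed values taken from the upper end of the ranges reported above, not measurements.

```python
import io


class CountingFile:
    """Hypothetical wrapper that counts seek/read calls on a binary file object."""

    def __init__(self, raw):
        self.raw = raw
        self.seeks = 0
        self.reads = 0

    def seek(self, pos):
        self.seeks += 1
        return self.raw.seek(pos)

    def read(self, size):
        self.reads += 1
        return self.raw.read(size)


def byte_stepping_scan(f, start, steps, bufsize=4096):
    """Reported pattern: move forward one byte per call, request 4K every time."""
    for offset in range(start, start + steps + 1):
        f.seek(offset)
        f.read(bufsize)


def sequential_scan(f, size, bufsize=4096):
    """Baseline: read the file once in non-overlapping 4K blocks."""
    for offset in range(0, size, bufsize):
        f.seek(offset)
        f.read(bufsize)


FILE_SIZE = 12 * 1024          # the 12K file from the report
data = bytes(FILE_SIZE)        # dummy contents; only the call counts matter

stepping = CountingFile(io.BytesIO(data))
# Assumed worst case: scan the last 4000 bytes one byte at a time.
byte_stepping_scan(stepping, start=FILE_SIZE - 4000, steps=3999)

sequential = CountingFile(io.BytesIO(data))
sequential_scan(sequential, FILE_SIZE)

# Size of a file that would need the same number of non-redundant 4K reads.
equivalent_bytes = stepping.reads * 4096

print("byte-stepping reads:", stepping.reads)
print("sequential reads:   ", sequential.reads)
print("equivalent file size in bytes:", equivalent_bytes)
```

With these assumed values the byte-stepping scan issues 4000 reads where a sequential load needs 3, which matches the call count of a file of roughly 16 MB read in 4K blocks.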

I confirmed this behaviour when opening files (both ODS and ODT) with LibreOffice (versions 3 and 4) on Win7 Pro (64bit) and Win XP (32bit), both on physical machines (Quadcore i5 with 4 GB RAM, SSD disks) and on virtual machines (KVM 1.2 on top of Gentoo Linux 64bit, virtio drivers installed in guests). I did NOT check the Linux binaries, but since LibreOffice's file operations are about as slow on Linux as on Windows, my assumption is that, at least to some degree, the same or a similar situation applies to Linux installations.
Comment 1 Urmas 2013-10-07 14:09:11 UTC
I cannot reproduce it. The ODT file, the <TEMP>\.tmp file and the <PROFILE>\.tmp file are all read/written in 4K blocks.
Can it be reproduced with a clean profile, without JWM, etc?
Comment 2 stefan 2013-10-08 08:06:38 UTC
I forgot to mention: DOC and XLS files are loading fine.
Comment 3 stefan 2013-10-08 08:59:31 UTC
Yes, it doesn't matter whether profiles are new or existing, what software is installed, what Windows is installed, etc. The 1-byte reads however only have their adverse effect on Dokan based filesystems, NOT on native filesystems (NTFS, ...). 

My educated guess is the 1-byte reads always happen, but some NTFS caching mechanism or OS voodoo keeps them from mattering too much, while Dokan is simply not built to handle this large amount of micro reads and becomes very inefficient. After all, every single read() has to be rerouted from kernel space to user space, handled there by Dokan, then the Dokan based filesystem, and then rerouted to kernel space, where the OS completes the call.
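To illustrate that guess, here is a toy read-through block cache. It is a sketch only: `BlockCache` is a hypothetical class, the real Windows cache manager is far more involved, and nothing in this report confirms that this is the mechanism that protects NTFS. It merely shows how a cache in front of the backend can absorb thousands of overlapping reads, which is the layer a Dokan round trip would bypass under this assumption.

```python
import io


class BlockCache:
    """Hypothetical read-through cache serving reads from 4K-aligned blocks."""

    def __init__(self, raw, block_size=4096):
        self.raw = raw
        self.block_size = block_size
        self.blocks = {}
        self.backend_reads = 0   # reads that actually reach the backend

    def _block(self, index):
        # Fetch a block from the backend only the first time it is needed.
        if index not in self.blocks:
            self.raw.seek(index * self.block_size)
            self.blocks[index] = self.raw.read(self.block_size)
            self.backend_reads += 1
        return self.blocks[index]

    def read_at(self, offset, size):
        """Return up to `size` bytes starting at `offset`, using cached blocks."""
        out = bytearray()
        end = offset + size
        while offset < end:
            index, start = divmod(offset, self.block_size)
            chunk = self._block(index)[start:start + (end - offset)]
            if not chunk:        # reached end of file
                break
            out += chunk
            offset += len(chunk)
        return bytes(out)


data = bytes(range(256)) * 48            # 12288 bytes of dummy content
cache = BlockCache(io.BytesIO(data))

calls = 0
for off in range(len(data) - 4000, len(data)):
    # Same pattern as in the description: step 1 byte, ask for 4K.
    assert cache.read_at(off, 4096) == data[off:off + 4096]
    calls += 1

print("application reads:", calls)
print("backend reads:    ", cache.backend_reads)
```

In this toy setup only two requests ever reach the backend (the last data block plus one probe past the end of the file), no matter how many overlapping reads the application issues.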

Dokan comes with a mirrorFS example, best way would be to install it and start from there.

For now, we will convert all files from ODT/ODS to DOC/XLS, these are working fine, but I definitely *do* want to use ODF formats in the future for obvious reasons, so if anyone could fix this, I would appreciate it.

UPDATE: The problem does NOT occur with OpenOffice 3.3 (found it on some older workstations and servers where people were not complaining :-) on SBS2003 and XP (didn't try Win7 yet).
Comment 4 Urmas 2013-10-08 16:18:21 UTC
What exact steps are needed to reproduce ReadFile API calls of 1 byte?
What are "'read' system calls" and how they are related to LO? Which file is read this way?
Comment 5 stefan 2013-10-08 19:55:22 UTC
> What exact steps are needed to reproduce ReadFile API calls of 1 byte?

Two ways I know of, I didn't investigate any further:

A) In Windows Explorer, click once on (i.e., select) any ODS or ODT file stored in a Dokan based filesystem (I suggest mirrorFS, it comes as an example with Dokan). I guess some sort of Explorer plugin loads parts of the file for meta information like author etc.; some 1-byte reads here. Normally (i.e., on NTFS or network shares), this doesn't hurt, but with Dokan it takes a LONG time to finish and bad things (like the filesystem software crashing) sometimes happen, but not always.
B) Double-click on any ODS or ODT file stored in a Dokan based filesystem in order to load it, LOTS of 1-byte reads here while the file is being loaded, the filesystem virtually ALWAYS crashes.

> What are "'read' system calls" and how they are related to LO?

Ok, you got me there, normally I'm more of a POSIX programmer; from my perspective, "read" under POSIX does more or less exactly what "ReadFile" under Windows does, so I accidentally used the wrong name here, sorry for that, my bad. Of course I meant "ReadFile system call". As to how they are related to LibreOffice, I can only imagine that LibreOffice will at some point through some means (probably "hidden" in some libraries) SPECIFIC TO LibreOffice and ODF files (because the problem ONLY appears with LibreOffice and ONLY with ODF files, see further down) do a series of ReadFile calls in order to read and interpret the ODS and ODT files.

> Which file is read this way?

Pick *any* ODT or ODS (and, though I didn't try to confirm, I just guess we may include ODP) file ANYWHERE inside a Dokan based filesystem. No matter how large or small, what contents, it (by double-clicking in Explorer) always leads to huge amounts of 1-byte reads.

All other file types I tried (doc, xls, txt, pdf, png, jpg, avi... I just clicked EVERYTHING I found) are working fine.

In the meantime, I tested OpenOffice 4.0.1 in the same environment (i.e., I uninstalled LibreOffice and installed OpenOffice 4.0.1 on one of the machines - for now). I got no problems loading exactly the same files with OpenOffice that crashed Dokan with LibreOffice, so I think we can safely assume that the problem is specific to LibreOffice and ODF files.

BTW, I am aware that just confirming this issue already means an extra lot of work while bringing next to no benefit, because of the special environment required (Dokan) and the comparably small number of people that are likely to be ever concerned by it.
Comment 6 Urmas 2013-10-08 21:48:10 UTC
You could have saved some time by saying that it was Explorer reading that file.

Confirmed.
Comment 7 Urmas 2013-10-08 23:43:04 UTC
The probable culprit is a creative algorithm at core/shell/source/win32/zipfile/zipfile.cxx:253
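For context: ODF files are ZIP archives, and a ZIP reader has to locate the end-of-central-directory (EOCD) record, which starts with the signature `PK\x05\x06` and sits near the end of the file. Whether the code at that line is searching for exactly this record is an assumption here; the report does not say. Under that assumption, the sketch below shows how such a search can be done with a single read of the file tail instead of one read per byte. `find_eocd_offset` is a hypothetical helper, not the function in zipfile.cxx.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"   # end-of-central-directory signature
EOCD_MIN_SIZE = 22               # fixed part of the EOCD record
MAX_COMMENT = 0xFFFF             # the archive comment is at most 65535 bytes


def find_eocd_offset(f):
    """Locate the EOCD record with one read of the file tail.

    Returns the absolute offset of the signature, or None if absent.
    Simplification: takes the last occurrence of the signature without
    validating the record, so a signature-like comment could fool it.
    """
    f.seek(0, io.SEEK_END)
    size = f.tell()
    tail_len = min(size, EOCD_MIN_SIZE + MAX_COMMENT)
    f.seek(size - tail_len)
    tail = f.read(tail_len)               # one read instead of thousands
    pos = tail.rfind(EOCD_SIGNATURE)      # search in memory, backwards
    if pos < 0:
        return None
    return size - tail_len + pos


# Build a small archive in memory to stand in for an ODT/ODS file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("content.xml", "<doc/>")
    z.writestr("meta.xml", "<meta/>")

offset = find_eocd_offset(buf)
buf.seek(offset)
record = buf.read(EOCD_MIN_SIZE)
(sig, disk, cd_disk, n_disk, n_total,
 cd_size, cd_offset, comment_len) = struct.unpack("<4s4H2LH", record)

print("EOCD offset:", offset)
print("entries in central directory:", n_total)
```

The point of the design is that the backward search happens in memory, so the filesystem sees one seek and one read for the tail, however far back the signature lies.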
Comment 8 Urmas 2013-10-11 18:02:21 UTC

*** This bug has been marked as a duplicate of bug 56007 ***