Description: after crash, LO restarts and all open documents have to be recovered if anything already opened. Steps to Reproduce: 1.setup SMB Server smb v1 or v2 or v3 with oplocking and smb2leases (tested against Samba Version 4.4.16 synology) 2. Mount shared folder for 2 users eg. sudo mount -t cifs -o "vers=3.1.1,uid=1000,gid=1000,username=test,file_mode=0600,dir_mode=0700" //NASIP/SHAREDFOLDER /mnt 2. User1 starts to edit file on shared Folder 3. at the same time User2 opens same file Actual Results: User2 LO crashes, "loosing" all open documents. Expected Results: User2 should get warning message, when trying to access a locked file on server. Reproducible: Always User Profile Reset: Yes Additional Info: clientside WORKAROUND: Set all LibreOffice clients to UseDocumentSystemFileLocking=false
Created attachment 154302 [details] libreoffice 6.3.1~rc2-0ubuntu0.19.04.1~lo1 crashes immediatly after accessing a locked file
Could you provide LO version? Indeed, you put 5.1 whereas the screenshot shows 6.3 Is it 6.3.0, 6.3.1? Also, if you installed last LO version (6.3.1), would it be possible you provide a stacktrace? (see https://wiki.documentfoundation.org/QA/BugReport/Debug_Information#GNU.2FLinux:_How_to_get_a_backtrace)
Created attachment 154332 [details] gdbtrace-6.0.7-0ubuntu0.18.04.9.log I followed this guide to get all ubuntu debug symbols for Build-ID: 1:6.0.7-0ubuntu0.18.04.9: https://wiki.ubuntu.com/Debug%20Symbol%20Packages sudo apt install libreoffice-*-dbgsym Then i opened ODS file by 2nd user soffice --backtrace /mnt/locked\ for\ crash.ods
Would it be possible you upgrade to 6.3.1 by using LO ppa and provide a backtrace with this version? Indeed, investigating on an obsolete version is not very relevant.
Created attachment 154334 [details] backtrace 6.4.0 alpha 2019-09-20_10.11.28 could not firgure out how to get ppa:libreoffice debug symbols. I hope this helps.
Thank you for your feedback. Even if there are not argument values passed to the functions, your bt helps. I noticed this function: 117 sal_Int32 SAL_CALL 118 XInputStream_impl::readBytes( 119 uno::Sequence< sal_Int8 >& aData, 120 sal_Int32 nBytesToRead ) 121 { 122 if( ! m_nIsOpen ) throw io::IOException( THROW_WHERE ); 123 124 aData.realloc(nBytesToRead); 125 //TODO! translate memory exhaustion (if it were detectable...) into 126 // io::BufferSizeExceededException 127 128 sal_uInt64 nrc(0); 129 if(m_aFile.read( aData.getArray(),sal_uInt64(nBytesToRead),nrc ) 130 != osl::FileBase::E_None) 131 throw io::IOException( THROW_WHERE ); 132 133 // Shrink aData in case we read less than nBytesToRead (XInputStream 134 // documentation does not tell whether this is required, and I do not know 135 // if any code relies on this, so be conservative---SB): 136 if (sal::static_int_cast<sal_Int32>(nrc) != nBytesToRead) 137 aData.realloc(sal_Int32(nrc)); 138 return static_cast<sal_Int32>(nrc); 139 } If nBytesToRead > max sal_Int32, we may get a negative value for nrc. Noel/Stephan: considering git history of ucb/source/ucp/file/filinpstr.cxx, thought you might be interested in this one. It may be a dup of tdf#113099
Created attachment 154344 [details] this shows LO on Linux when file is locked by LO in Windows Thanks a lot, it seems my samba blocks access to LO-documents when they are opened/locked by LO on Linux. Access to this Document is denied and not even readable by other programs like commandline cp. When i open/lock the same file on the same SAMBA server by LO on windows, the file is read only and 2nd LO on Linux client works as expected without crash.(see screenshot) Locking procedure on Linux should be checked ? Anyway LO on Linux should not crash even when access is denied. Summary: open/lock file by LO on Linux -> SAMBA denies access -> 2nd LO on linux crashs -> 2nd LO on Windows does not crash but says "read error" open/lock file by LO on Windows -> SAMBA allows read only access -> 2nd LO on linux shows correct dialog (see png screenshot) -> 2nd LO on Windows shows correct dialog
(In reply to Julien Nabet from comment #6) > I noticed this function: (in ucb/source/ucp/file/filinpstr.cxx) > 117 sal_Int32 SAL_CALL > 118 XInputStream_impl::readBytes( > 119 uno::Sequence< sal_Int8 >& aData, > 120 sal_Int32 nBytesToRead ) > 121 { > 122 if( ! m_nIsOpen ) throw io::IOException( THROW_WHERE ); > 123 > 124 aData.realloc(nBytesToRead); > 125 //TODO! translate memory exhaustion (if it were > detectable...) into > 126 // io::BufferSizeExceededException > 127 > 128 sal_uInt64 nrc(0); > 129 if(m_aFile.read( aData.getArray(),sal_uInt64(nBytesToRead),nrc ) The data provided so far in this issue seems to imply that m_aFile.read unexpectedly returned nrc > nBytesToRead (and large enough to overflow to a negative value with the below cast to sal_Int32). (XInputStream_impl::readBytes being called with a negative nBytesToRead, which could presumably also lead to trouble, is ruled out by the fact that the above aData.realloc(nBytesToRead) didn't fire the "### new size must be at least 0!" assert, which only the below aData.realloc(sal_Int32(nrc)); fires.) > 130 != osl::FileBase::E_None) > 131 throw io::IOException( THROW_WHERE ); > 132 > 133 // Shrink aData in case we read less than nBytesToRead > (XInputStream > 134 // documentation does not tell whether this is required, and I > do not know > 135 // if any code relies on this, so be conservative---SB): > 136 if (sal::static_int_cast<sal_Int32>(nrc) != nBytesToRead) > 137 aData.realloc(sal_Int32(nrc)); > 138 return static_cast<sal_Int32>(nrc); > 139 } > > If nBytesToRead > max sal_Int32, we may get a negative value for nrc.
Unfortunately, I cannot reproduce this issue. If somebody who can reproduce it has a debug build, it would be interesting to learn the values of nBytesToRead and nrc in the crashing call to XInputStream_impl::readBytes (in ucb/source/ucp/file/filinpstr.cxx).
(In reply to Stephan Bergmann from comment #9) > Unfortunately, I cannot reproduce this issue. If somebody who can reproduce > it has a debug build, it would be interesting to learn the values of > nBytesToRead and nrc in the crashing call to XInputStream_impl::readBytes > (in ucb/source/ucp/file/filinpstr.cxx). we are having this issue all the time. How can we send you the info you need? what can we do?
(In reply to Pasaiako Udala from comment #10) > we are having this issue all the time. How can we send you the info you > need? what can we do? Do a build of LO (see <https://www.libreoffice.org/about-us/source-code/>) configured with --enable-debug, then run the built LO in the debugger (`gdb instdir/program/soffice.bin`) and provide the requested information.
Created attachment 155436 [details] debug build backtrace
Comment on attachment 155436 [details] debug build backtrace i managed to compile a debug build, any new info in this backtrace? How to get the values of nBytesToRead and nrc in the crashing call?
(In reply to Jan from comment #13) > Comment on attachment 155436 [details] > debug build backtrace > > i managed to compile a debug build, any new info in this backtrace? > How to get the values of nBytesToRead and nrc in the crashing call? Do `thread 1` and `frame 6` to get to the fileaccess::XInputStream_impl::readBytes call, then `print nBytesToRead` and `print nrc` to get the values.
Created attachment 155452 [details] gdb output of nBytesToRead and nrc
(In reply to Jan from comment #15) > Created attachment 155452 [details] > gdb output of nBytesToRead and nrc (gdb) print nrc $2 = 4294967283 is clearly bogus. Maybe an strace log helps pinpoint what exactly is going wrong (otherwise, I'll need to prepare a patch that logs the relevant places in the code, and which you would need to include in your LO build): With strace installed on your machine, please run instdir/program/soffice --strace and reproduce the issue. This should leave a (large) strace.log file in the current working dir, which you please attach here.
Created attachment 155456 [details] strace with debug build
(In reply to Jan from comment #17) > Created attachment 155456 [details] > strace with debug build So that's a kernel bug: [...] > 2085 20:13:46.936734 openat(AT_FDCWD, "/mnt/locktest.odt", O_RDONLY) = 22 > 2085 20:13:46.938557 fstat(22, {st_mode=S_IFREG|0600, st_size=8305, ...}) = 0 [...] > 2085 20:13:46.942594 pread64(22, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 0) = 4294967283 [...] So /mnt/locktest.odt is reported to have a size of 8305 bytes, and when we want to read 4096 bytes from the start, pread64 claims it read 4294967283 bytes. Whatever filesystem is used by that mount point appears to have a bug (smells like an EACESS = 13 is internally reported as -13 as a 32 bit value (i.e., 4294967283 = 0xFFFFFFF3) and then erroneously interpreted as a positive 64-bit value, and propagated out from the kernel as that).
(In reply to Stephan Bergmann from comment #18) > (In reply to Jan from comment #17) > > Created attachment 155456 [details] > > strace with debug build > > So that's a kernel bug: Great investigation ! All right, so you mean it is a client side kernel bug of cifs kernel module ? Or Server side bug samba4 / btrfs?
Worth reporting to the problematic project (which?). Does that value reach LO code? In which case, it could be worth a workaround checking if the reported read size is greater than asked size?
Looking at attachment 154334 [details], it might be https://opengrok.libreoffice.org/xref/core/ucb/source/ucp/file/filinpstr.cxx?r=c920592d#92
(In reply to Jan from comment #19) > All right, so you mean it is a client side kernel bug of cifs kernel module ? > Or Server side bug samba4 / btrfs? Hard to tell. I would start by reporting it to the relevant client-side kernel part. (In reply to Mike Kaganski from comment #20) > Does that value reach LO code? In which case, it could be worth a workaround > checking if the reported read size is greater than asked size? I would only want to do that as a very last resort, in case the the assumed bug outside of LO can't be fixed for whatever reason. Why not wait for an analysis of the assumed kernel bug first? (Jan, I assume you pass this on as appropriate? In which case, thank you, and would be great if you could add a link here or report back any findings.)
someone already sent it into linux kernel bugzilla?
(In reply to Stephan Bergmann from comment #22) > (Jan, I assume you pass this on as appropriate? In which case, thank you, > and would be great if you could add a link here or report back any findings.) Did not report to bugzilla.kernel.org yet. I am building a testbed with own samba for logfiles... I noticed -1 EACCES (Permission denied) in my strace. Meanwhile maybe its possible to tell Libreoffice for Linux to lock files as read only, just like LibreOffice for Windows does?
Thanks Jan for your work, can you post the bugzilla url to track it when you report it? We are very interested to know when this bug will be fixed
I am wondering if anyone reported this to bugzilla.kernel.org or one of the developers directly yet? Couldn't find anything related on bugzilla.
I created a sample c program to test if the samba server or client reports something wrong when a file is open: #include <stdio.h> #include <stdlib.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <errno.h> #include <string.h> #include <unistd.h> void printerr(void) { printf("%s\n", strerror(errno)); exit(errno); } int main(void) { int fd; struct stat statbuf; int res; char buf[4096]; fd = openat(AT_FDCWD, "/mnt/disk/cv.doc", O_RDONLY); if (fd == -1) printerr(); res = fstat(fd, &statbuf); if (res == -1) printerr(); printf("st_mode=%d\nst_size=%ld\n", statbuf.st_mode, statbuf.st_size); res = pread64(fd, &buf, 4096, 0); printf("%d\n", res); close(fd); exit(0); } I have created a simple public samba share with just the cv.doc file and mounted the share in /mnt/disk with the following command: mount -t cifs //pc/public /mnt/disk -o guest,rw,_netdev,uid=user,vers=1.0 If libreoffice is closed and I run the above program, I get: st_mode=33261 st_size=11398 4096 where 4096 is the number of bytes read. If I open cv.doc with libreoffice and rerun the program I get: st_mode=33261 st_size=11398 -13 The error is reported successfully as it seems. nrc is an unsigned int as it is declared as an sal_uInt64 so the 4294967283 value is justified. Is it really not a bug of libreoffice? Unfortunately I cannot fully understand the libreoffice code to detect the problem :-(
(In reply to Theofilos Intzoglou from comment #27) > ... > The error is reported successfully as it seems. nrc is an unsigned int as it > is declared as an sal_uInt64 so the 4294967283 value is justified. Is it > really not a bug of libreoffice? Unfortunately I cannot fully understand the > libreoffice code to detect the problem :-( Please read from https://bugs.documentfoundation.org/show_bug.cgi?id=127648#c16 to https://bugs.documentfoundation.org/show_bug.cgi?id=127648#c18 For even more details, read the whole bugtracker. A lot of debug tests have been made.
(In reply to Theofilos Intzoglou from comment #27) > int main(void) { > int fd; > struct stat statbuf; > int res; > char buf[4096]; > > fd = openat(AT_FDCWD, "/mnt/disk/cv.doc", O_RDONLY); > if (fd == -1) > printerr(); > > res = fstat(fd, &statbuf); > if (res == -1) > printerr(); > printf("st_mode=%d\nst_size=%ld\n", statbuf.st_mode, statbuf.st_size); > res = pread64(fd, &buf, 4096, 0); > printf("%d\n", res); > close(fd); > exit(0); > } The return type of pread64 is ssize_t, not int.
(In reply to Stephan Bergmann from comment #29) > (In reply to Theofilos Intzoglou from comment #27) > > int main(void) { > > int fd; > > struct stat statbuf; > > int res; > > char buf[4096]; > > > > fd = openat(AT_FDCWD, "/mnt/disk/cv.doc", O_RDONLY); > > if (fd == -1) > > printerr(); > > > > res = fstat(fd, &statbuf); > > if (res == -1) > > printerr(); > > printf("st_mode=%d\nst_size=%ld\n", statbuf.st_mode, statbuf.st_size); > > res = pread64(fd, &buf, 4096, 0); > > printf("%d\n", res); > > close(fd); > > exit(0); > > } > > The return type of pread64 is ssize_t, not int. Indeed but it gives the same result. Anyway I think that this bug is a big deal for many offices as it renders libreoffice nearly unusable if you work with multiple files opened at once. One mistake and libreoffice crashes leaving some of its processes open which makes it difficult for people without much experience with dealing with processes in a state that libreoffice can't be started again. I am not the one to decide but if a simple check could solve this problem in libreoffice while we wait for the kernel module to be fixed then it would be much appreciated. Thank you for your time spend investigating this.
(In reply to Theofilos Intzoglou from comment #30) > (In reply to Stephan Bergmann from comment #29) > > The return type of pread64 is ssize_t, not int. > > Indeed but it gives the same result. No, it does not, in general. > Anyway I think that this bug is a big > deal for many offices as it renders libreoffice nearly unusable if you work > with multiple files opened at once. If you are affected by this issue, please get in touch with the maintainers of whatever software component is causing the error for you.
(In reply to Theofilos Intzoglou from comment #30) > (In reply to Stephan Bergmann from comment #29) > > (In reply to Theofilos Intzoglou from comment #27) > > > int main(void) { > > > int fd; > > > struct stat statbuf; > > > int res; > > > char buf[4096]; > > > > > > fd = openat(AT_FDCWD, "/mnt/disk/cv.doc", O_RDONLY); > > > if (fd == -1) > > > printerr(); > > > > > > res = fstat(fd, &statbuf); > > > if (res == -1) > > > printerr(); > > > printf("st_mode=%d\nst_size=%ld\n", statbuf.st_mode, statbuf.st_size); > > > res = pread64(fd, &buf, 4096, 0); > > > printf("%d\n", res); > > > close(fd); > > > exit(0); > > > } > > > > The return type of pread64 is ssize_t, not int. > > Indeed but it gives the same result. Indeed, because even if you declare your res to be ssize_t, if you print it using `printf("%d\n", res)`, you convert the ssize_t to int again here, since "%d" (without "l" length modifier) treats the value as int [1]. The same wrong handling of integer types seems to be the reason of this problem in the kernel. [1] https://linux.die.net/man/3/printf
FTR: https://gerrit.libreoffice.org/c/core/+/81931 is a patch to workaround this, which was abandoned because of course this should be fixed in the kernel.
Might this kernel bug be related? bugzilla.kernel.org/show_bug.cgi?id=201893