Bug 102616 - EDITING: Compare documents on near-identical files flags 99.9% of the contents as different
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.1.4.2 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: haveBacktrace
Depends on:
Blocks: Document-Comparison
Reported: 2016-09-27 03:15 UTC by Luke Kendall
Modified: 2020-04-07 08:32 UTC (History)
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
Two sample documents, almost identical (737.27 KB, application/zip)
2016-09-27 03:15 UTC, Luke Kendall
bt with debug symbols (7.39 KB, text/plain)
2020-04-06 10:56 UTC, Julien Nabet

Description Luke Kendall 2016-09-27 03:15:22 UTC
Created attachment 127653 [details]
Two sample documents, almost identical

Often, when I compare two subtly different versions of my novels (e.g. the version for ebook preparation vs. the version for print production), Writer flags essentially the entire body of the document as different.
The feature therefore fails completely: I have to try to remember what I changed, hunt for and check each change manually, and hope I've remembered them all.  It's an awful situation to be in.

I will attach a zip file with obfuscated versions of an example. (I can't provide unobfuscated copies as Amazon have exclusive distribution rights for the electronic version.)

Steps to reproduce:

1. Open HarshLessons-CS-KDP-obfus-noimgs.odt
2. Edit->TrackChanges->Compare Documents
3. Choose HarshLessons-CS-obfus.odt and click OK

Basically the entire document is identified as different.

The sections with the ISBN are genuinely different; the KDP version also has a ToC; chapter titles are coloured blue in the KDP version; and URLs are not "spelled out" in the KDP version.

There should be few, if any, other differences.  Instead, LO identifies about 6 sets of changes, which basically encompass the entire document.
Comment 1 Buovjaga 2016-10-11 18:49:52 UTC
Repro.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.3.0.0.alpha0+
Build ID: 65f2d6b1cc40b4b90f8987e8ea14d24b5f38f950
CPU Threads: 8; OS Version: Linux 4.7; UI Render: default; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on October 10th 2016
Comment 2 Luke Kendall 2017-03-28 07:06:12 UTC
Another use case, which would require additional UI design, is comparing two documents that only differ in things like the page dimensions, font sizes, etc.
Comment 3 Boaz Dodin 2017-07-16 06:02:48 UTC
I am wondering if this is the same issue: when I compare two .odt documents whose only difference is a smiley added at the start of a paragraph in one of them, the whole paragraph is marked as changed.

Version: 5.3.1.2
Build ID: 1:5.3.1-0ubuntu2
CPU Threads: 4; OS Version: Linux 4.10; UI Render: default; VCL: kde4; Layout Engine: new; 
Locale: en-US (en_IL); Calc: group
Comment 4 Cathy Crumbley 2018-03-08 16:37:42 UTC
I posted to forums for help with compare documents for several months and finally concluded that it does not work properly, no matter the document content.  My only solution has been to compare documents using Word at the local library.  But not everyone can do that.

From what I can gather, it is not easy to fix this problem. It would be helpful to know if anyone is willing or able to take this on.
Comment 5 QA Administrators 2019-03-09 03:42:18 UTC Comment hidden (obsolete)
Comment 6 Babak Razmjoo 2019-03-27 13:28:58 UTC
Cathy, this is NOT a bug, nor unexpected behaviour, once you understand the basics of the Open Document Format (ODF) and the way LibreOffice uses it to save your documents.

A document saved in ODF is a set of XML files, possibly along with some graphics, stored in a directory hierarchy and zipped into a single file. If you open an ODF (e.g. ODT) file with an archive manager, you can see them.

Using ZIP or any other compression format to compress some files means that even the least modification in the source files (here XML) will cause a huge difference in the resulting compressed file (see https://computer.howstuffworks.com/file-compression.htm).

Hope this was useful
Comment 7 Babak Razmjoo 2019-03-27 14:27:09 UTC
Sorry, I misunderstood you. I thought you meant comparing files outside LibreOffice.
Comment 8 Luke Kendall 2019-03-27 14:48:04 UTC
Seeing Babak's earlier comment, I started replying, but abandoned it when I saw his later one.
However, it did encourage me to try a quick check of the original bug.
It seems like the bug may be partly fixed: not all identical text shows as different, only some does.

However, LO did crash twice while running two of those tests.
That would be a separate bug though.

Anyway, I'll try to make time to revisit this report next week, after my current deadline is past.
Comment 9 Luke Kendall 2019-03-28 03:07:26 UTC
This is just an interim note.
It seems that compare documents for me now reproducibly crashes LO.
After recovery, the merged document has a combination of:
- genuine changes marked, 
- identical text marked as changed, and 
- a large amount of ruined formatting and paragraph style corruption, to the extent that the merged document for comparison is only of marginal utility.

In short, Compare Documents is now of some utility, but it's certainly unsafe for use.
Comment 10 Buovjaga 2019-03-28 06:24:14 UTC
I could not reproduce the crash with the attached documents or my own very simple test.
Before creating a new report for the crash, please test with the latest version and master.
https://dev-builds.libreoffice.org/daily/master/Linux-rpm_deb-x86_64@86-TDF/current/

Arch Linux 64-bit
Version: 6.2.2.2
Build ID: 6.2.2-2
CPU threads: 8; OS: Linux 5.0; UI render: default; VCL: kde5; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded

Arch Linux 64-bit
Version: 6.3.0.0.alpha0+
Build ID: 9852f09b467e3c7f8058b931010b91f447905051
CPU threads: 8; OS: Linux 5.0; UI render: default; VCL: gtk3; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 27 March 2019
Comment 11 Luke Kendall 2019-05-23 12:23:15 UTC
This bug is as bad or worse in Writer 6.2.3.2.

Because I was preparing 4 editions of my book, I have 4 versions of the MS.
I'll need to report several bugs in Writer for this.

Anyway, because of these other bugs, it was important to compare the different MSes. Writer never was able to provide a useful comparison.  In the end I had to use MS Office365 to compare .docx versions of the MSes saved from Writer.

This was doing a merge comparison.

The main problems I had with Writer were (in order of seriousness):

1. As reported, basically the entire document is flagged as changed.
   This applied even when the documents used the same paragraph styles (except for chapter title paragraph style), same page format, and same body text paragraph style.
2. Undo of a file comparison is not just incomplete: applying Undo can introduce unexpected changes.  On one occasion, the Undo changed the final body paragraphs by applying an All Capitals attribute.  My recollection is that it also left some Chapter Title paragraphs in a strange state, as well as losing some page breaks.
3. Writer would often crash as soon as the comparison started.
4. After recovery from said crash, the Manage Changes dialog would be open for every recovered document, and have to be closed separately.

I also noticed that the .docx files produced by Writer are difficult for other word processors to compare. This was because regardless of how little editing I had done to my MS, for every real change I had made there would be 10 or 20 "null differences" found by the other word processor.  These usually but not always appeared as a space character at the end of paragraphs, but sometimes between words in a paragraph.

I don't know if it's indicative of an underlying problem, but ONLY Office365 could produce useful comparisons of Writer-generated .docx files of the same root document.  The others I tried tended to produce very large swathes of changes that were marked as different even though painstaking visual comparison showed them to be identical.

So I suspect one issue with the Writer file compare algorithm is that it may be comparing the underlying data structures, not the visible data (detectable by a user) - by which I mean, for any run of text, the paragraph, character, page styles, and the literal characters.
Comment 12 Luke Kendall 2019-06-14 07:40:44 UTC
Just a note that I compared two versions of a short (800-word) doc using 6.2.3.2, and it instantly crashed.

Writer 6.2.3.2 also never progressed in the document recovery when it reached the compared-doc: it was using 30%-50% CPU time, but after 20 mins hadn't managed to make any progress on the 800 word doc I'd been comparing.

I installed 6.2.4.2 and ran that - it quickly recovered the documents (20 secs or so).  So this note is mainly to say the crash problem may be fixed.

Writer's file compare is still woefully bad compared to MS Office. Even on an 800 word file it marks whole paragraphs as changed when you change a few characters or a few words.
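
For illustration only, here is a minimal C++17 sketch of the finer-grained result the comment above is asking for: a word-level longest-common-subsequence comparison within a paragraph. It is not Writer's algorithm; `split_words`, `lcs_length` and `changed_words` are hypothetical names invented for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a paragraph into whitespace-separated words.
std::vector<std::string> split_words(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> words;
    for (std::string word; in >> word;)
        words.push_back(word);
    return words;
}

// Length of the longest common subsequence of two word lists
// (classic dynamic-programming table, O(n*m) time and space).
std::size_t lcs_length(const std::vector<std::string>& a,
                       const std::vector<std::string>& b)
{
    std::vector<std::vector<std::size_t>> table(
        a.size() + 1, std::vector<std::size_t>(b.size() + 1, 0));
    for (std::size_t i = 1; i <= a.size(); ++i)
        for (std::size_t j = 1; j <= b.size(); ++j)
            table[i][j] = (a[i - 1] == b[j - 1])
                              ? table[i - 1][j - 1] + 1
                              : std::max(table[i - 1][j], table[i][j - 1]);
    return table[a.size()][b.size()];
}

// Number of words a word-level diff would mark as deleted plus inserted.
std::size_t changed_words(const std::string& before, const std::string& after)
{
    const std::vector<std::string> a = split_words(before);
    const std::vector<std::string> b = split_words(after);
    const std::size_t common = lcs_length(a, b);
    assert(common <= a.size() && common <= b.size());
    return (a.size() - common) + (b.size() - common);
}
```

With this approach, replacing one word in a five-word paragraph counts as two changed words (one deleted, one inserted) rather than as a whole changed paragraph.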
Comment 13 Luke Kendall 2020-04-06 10:14:42 UTC
Just a note that in 6.3.2, I could compare some documents usefully (after it crashed and the files were recovered; see https://bugs.documentfoundation.org/show_bug.cgi?id=131924).

In 6.4.2.2, however, the compare doesn't crash; instead the whole document is marked as different, so the Compare Documents feature becomes useless.

Version: 6.4.2.2
Build ID: 4e471d8c02c9c90f512f7f9ead8875b57fcb1ec3
CPU threads: 4; OS: Linux 4.4; UI render: default; VCL: x11; 
Locale: en-GB (en_AU.UTF-8); UI-Language: en-US
Calc: threaded
Comment 14 Julien Nabet 2020-04-06 10:56:29 UTC
Created attachment 159359 [details]
bt with debug symbols

On pc Debian x86-64 with master sources updated today, I got an assert.
Comment 15 Julien Nabet 2020-04-06 11:09:52 UTC
Stephan: noticing https://cgit.freedesktop.org/libreoffice/core/commit/?id=aef7feb3e695ecf6d411f0777196dcc4281e201a, thought you might be interested in this one.
Any idea how to fix the assertion? (Should I create a new bug report?)
Comment 16 Stephan Bergmann 2020-04-06 16:01:03 UTC
(In reply to Julien Nabet from comment #14)
> Created attachment 159359 [details]
> bt with debug symbols
> 
> On pc Debian x86-64 with master sources updated today, I got an assert.

I do not know whether the assert from comment 14 and my below UBSan+ASan findings are related to the original issue.  However, I'll keep their discussion in this issue for now instead of spawning a new issue.

So <https://git.libreoffice.org/core/+diff/aef7feb3e695ecf6d411f0777196dcc4281e201a%5E!> "New loplugin:unsignedcompare", which introduced the firing assert, was a kind of a gamble.  It assumed that those casts from `long` to `unsigned long` would only be done with non-negative values (to silence warnings about mixed signed/unsigned comparisons), rather than to wrap around negative values to large positive values on purpose.  The commit may well have guessed wrongly, of course.
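
To make the signed/unsigned point concrete, the following C++17 sketch contrasts a plain cast with a conversion that asserts non-negativity. `checked_make_unsigned` and `plain_cast` are hypothetical names for this illustration and only approximate the idea behind `o3tl::make_unsigned`; the real helper may differ in detail.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical checked conversion: asserts that the value is non-negative
// before converting it to the corresponding unsigned type.
template <typename T>
constexpr std::make_unsigned_t<T> checked_make_unsigned(T value)
{
    static_assert(std::is_signed_v<T>, "expects a signed integer type");
    assert(value >= 0);
    return static_cast<std::make_unsigned_t<T>>(value);
}

// Plain cast: well-defined modular conversion that never traps.
constexpr unsigned long plain_cast(long value)
{
    return static_cast<unsigned long>(value);
}

// A plain cast turns -1 into the largest unsigned long ...
static_assert(plain_cast(-1) == std::numeric_limits<unsigned long>::max(),
              "negative values wrap around");
// ... so a lower-bound check of the form "cast(x) > 0" passes for x == -1.
static_assert(plain_cast(-1) > 0UL, "wrapped value passes a lower-bound check");
// For non-negative input both conversions agree.
static_assert(checked_make_unsigned(5L) == plain_cast(5L),
              "no difference for valid values");
```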

However, locally reverting the commit with

> diff --git a/compilerplugins/clang/unsignedcompare.cxx b/compilerplugins/clang/unsignedcompare.cxx
> index d9b8f144ca77..a929d219f205 100644
> --- a/compilerplugins/clang/unsignedcompare.cxx
> +++ b/compilerplugins/clang/unsignedcompare.cxx
> @@ -225,7 +225,7 @@ private:
>      }
>  };
>  
> -loplugin::Plugin::Registration<UnsignedCompare> unsignedcompare("unsignedcompare");
> +loplugin::Plugin::Registration<UnsignedCompare> unsignedcompare("unsignedcompare", false);
>  }
>  
>  /* vim:set shiftwidth=4 softtabstop=4 expandtab cinoptions=b1,g0,N-s cinkeys+=0=break: */
> diff --git a/sw/source/core/doc/doccomp.cxx b/sw/source/core/doc/doccomp.cxx
> index 21a79453985e..5566beeb48ff 100644
> --- a/sw/source/core/doc/doccomp.cxx
> +++ b/sw/source/core/doc/doccomp.cxx
> @@ -19,7 +19,6 @@
>  
>  #include <sal/config.h>
>  
> -#include <o3tl/safeint.hxx>
>  #include <osl/diagnose.h>
>  #include <rtl/character.hxx>
>  #include <swmodule.hxx>
> @@ -882,7 +881,7 @@ sal_uLong Compare::CompareSequence::CheckDiag( sal_uLong nStt1, sal_uLong nEnd1,
>              else
>                  x = thi;
>              y = x - d;
> -            while( o3tl::make_unsigned(x) < nEnd1 && o3tl::make_unsigned(y) < nEnd2 &&
> +            while( sal_uLong(x) < nEnd1 && sal_uLong(y) < nEnd2 &&
>                  m_rMoved1.GetIndex( x ) == m_rMoved2.GetIndex( y ))
>              {
>                  ++x;
> @@ -914,7 +913,7 @@ sal_uLong Compare::CompareSequence::CheckDiag( sal_uLong nStt1, sal_uLong nEnd1,
>              else
>                  x = thi - 1;
>              y = x - d;
> -            while( o3tl::make_unsigned(x) > nStt1 && o3tl::make_unsigned(y) > nStt2 &&
> +            while( sal_uLong(x) > nStt1 && sal_uLong(y) > nStt2 &&
>                  m_rMoved1.GetIndex( x - 1 ) == m_rMoved2.GetIndex( y - 1 ))
>              {
>                  --x;

makes the scenario from comment 0 then fail with (an undefined pointer addition and a resulting) heap-buffer-overflow in my ASan+UBSan build, which appears to be a strong indicator that the above guess may have been well-founded after all, and the assert was just another legitimate symptom of whatever underlying issue.

> include/c++/10.0.1/bits/unique_ptr.h:656:9: runtime error: addition of unsigned offset to 0x610000272440 overflowed to 0x610000272430
>  #0 in std::unique_ptr<unsigned long [], std::default_delete<unsigned long []> >::operator[](unsigned long) const at include/c++/10.0.1/bits/unique_ptr.h:656:9
>  #1 in (anonymous namespace)::Compare::MovedData::GetIndex(unsigned long) const at sw/source/core/doc/doccomp.cxx:214:58
>  #2 in (anonymous namespace)::Compare::CompareSequence::CheckDiag(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long*) at sw/source/core/doc/doccomp.cxx:917:27
>  #3 in (anonymous namespace)::Compare::CompareSequence::Compare(unsigned long, unsigned long, unsigned long, unsigned long) at sw/source/core/doc/doccomp.cxx:828:13
>  #4 in (anonymous namespace)::Compare::CompareSequence::CompareSequence((anonymous namespace)::CompareData&, (anonymous namespace)::CompareData&, (anonymous namespace)::Compare::MovedData const&, (anonymous namespace)::Compare::MovedData const&) at sw/source/core/doc/doccomp.cxx:791:5
>  #5 in (anonymous namespace)::Compare::Compare(unsigned long, (anonymous namespace)::CompareData&, (anonymous namespace)::CompareData&) at sw/source/core/doc/doccomp.cxx:602:25
>  #6 in (anonymous namespace)::CompareData::CompareLines((anonymous namespace)::CompareData&) at sw/source/core/doc/doccomp.cxx:437:17
>  #7 in SwDoc::CompareDoc(SwDoc const&) at sw/source/core/doc/doccomp.cxx:1866:13
>  #8 in SwEditShell::CompareDoc(SwDoc const&) at sw/source/core/edit/editsh.cxx:876:27
>  #9 in SwView::InsertMedium(unsigned short, std::unique_ptr<SfxMedium, std::default_delete<SfxMedium> >, short) at sw/source/uibase/uiview/view2.cxx:2300:39
>  #10 in SwView::DialogClosedHdl(sfx2::FileDialogHelper*) at sw/source/uibase/uiview/view2.cxx:2491:19
>  #11 in SwView::LinkStubDialogClosedHdl(void*, sfx2::FileDialogHelper*) at sw/source/uibase/uiview/view2.cxx:2481:1
>  #12 in Link<sfx2::FileDialogHelper*, void>::Call(sfx2::FileDialogHelper*) const at include/tools/link.hxx:111:45
>  #13 in sfx2::DocumentInserter::DialogClosedHdl(sfx2::FileDialogHelper*) at sfx2/source/doc/docinsert.cxx:285:25
>  #14 in sfx2::DocumentInserter::LinkStubDialogClosedHdl(void*, sfx2::FileDialogHelper*) at sfx2/source/doc/docinsert.cxx:184:1
>  #15 in Link<sfx2::FileDialogHelper*, void>::Call(sfx2::FileDialogHelper*) const at include/tools/link.hxx:111:45
>  #16 in sfx2::FileDialogHelper::ExecuteSystemFilePicker(void*) at sfx2/source/dialog/filedlghelper.cxx:2356:25
>  #17 in sfx2::FileDialogHelper::LinkStubExecuteSystemFilePicker(void*, void*) at sfx2/source/dialog/filedlghelper.cxx:2353:1
>  #18 in Link<void*, void>::Call(void*) const at include/tools/link.hxx:111:45
>  #19 in ImplHandleUserEvent(ImplSVEvent*) at vcl/source/window/winproc.cxx:2009:30
>  #20 in ImplWindowFrameProc(vcl::Window*, SalEvent, void const*) at vcl/source/window/winproc.cxx:2562:13
>  #21 in SalFrame::CallCallback(SalEvent, void const*) const at vcl/inc/salframe.hxx:306:29
>  #22 in SalGenericDisplay::ProcessEvent(SalUserEventList::SalUserEvent) at vcl/unx/generic/app/gendisp.cxx:66:22
>  #23 in SalUserEventList::DispatchUserEvents(bool) at vcl/source/app/salusereventlist.cxx:108:17
>  #24 in SalGenericDisplay::DispatchInternalEvent(bool) at vcl/unx/generic/app/gendisp.cxx:51:12
>  #25 in call_userEventFn(void*) at vcl/unx/gtk3/gtk3gtkdata.cxx:707:27
>  #26  at <null> (/lib64/libglib-2.0.so.0 +0x4de8a)
>  #27 in g_main_context_dispatch at <null> (/lib64/libglib-2.0.so.0 +0x5156f)
>  #28  at <null> (/lib64/libglib-2.0.so.0 +0x518ff)
>  #29 in g_main_context_iteration at <null> (/lib64/libglib-2.0.so.0 +0x519a2)
>  #30 in GtkSalData::Yield(bool, bool) at vcl/unx/gtk3/gtk3gtkdata.cxx:382:31
>  #31 in GtkInstance::DoYield(bool, bool) at vcl/unx/gtk3/gtk3gtkinst.cxx:384:29
>  #32 in ImplYield(bool, bool) at vcl/source/app/svapp.cxx:454:48
>  #33 in Application::Yield() at vcl/source/app/svapp.cxx:518:5
>  #34 in Application::Execute() at vcl/source/app/svapp.cxx:433:9
>  #35 in desktop::Desktop::Main() at desktop/source/app/app.cxx:1602:17
>  #36 in ImplSVMain() at vcl/source/app/svmain.cxx:196:35
>  #37 in SVMain() at vcl/source/app/svmain.cxx:228:12
>  #38 in soffice_main at desktop/source/app/sofficemain.cxx:107:12
>  #39 in sal_main at desktop/source/app/main.c:48:15
>  #40 in main at desktop/source/app/main.c:47:1
>  #41 in __libc_start_main at /usr/src/debug/glibc-2.30-34-g994e529a37/csu/../csu/libc-start.c:308:16
>  #42 in _start at <null>
> 
> SUMMARY: UndefinedBehaviorSanitizer: pointer-overflow include/c++/10.0.1/bits/unique_ptr.h:656:9 in 
> =================================================================
> ==2406221==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x610000272430 at pc 0x7f0db2c2790d bp 0x7ffcfd1c0870 sp 0x7ffcfd1c0868
> READ of size 8 at 0x610000272430 thread T0
>  #0 in (anonymous namespace)::Compare::MovedData::GetIndex(unsigned long) const at sw/source/core/doc/doccomp.cxx:214:58
>  #1 in (anonymous namespace)::Compare::CompareSequence::CheckDiag(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long*) at sw/source/core/doc/doccomp.cxx:917:27
>  #2 in (anonymous namespace)::Compare::CompareSequence::Compare(unsigned long, unsigned long, unsigned long, unsigned long) at sw/source/core/doc/doccomp.cxx:828:13
>  #3 in (anonymous namespace)::Compare::CompareSequence::CompareSequence((anonymous namespace)::CompareData&, (anonymous namespace)::CompareData&, (anonymous namespace)::Compare::MovedData const&, (anonymous namespace)::Compare::MovedData const&) at sw/source/core/doc/doccomp.cxx:791:5
>  #4 in (anonymous namespace)::Compare::Compare(unsigned long, (anonymous namespace)::CompareData&, (anonymous namespace)::CompareData&) at sw/source/core/doc/doccomp.cxx:602:25
>  #5 in (anonymous namespace)::CompareData::CompareLines((anonymous namespace)::CompareData&) at sw/source/core/doc/doccomp.cxx:437:17
>  #6 in SwDoc::CompareDoc(SwDoc const&) at sw/source/core/doc/doccomp.cxx:1866:13
>  #7 in SwEditShell::CompareDoc(SwDoc const&) at sw/source/core/edit/editsh.cxx:876:27
>  #8 in SwView::InsertMedium(unsigned short, std::unique_ptr<SfxMedium, std::default_delete<SfxMedium> >, short) at sw/source/uibase/uiview/view2.cxx:2300:39
>  #9 in SwView::DialogClosedHdl(sfx2::FileDialogHelper*) at sw/source/uibase/uiview/view2.cxx:2491:19
>  #10 in SwView::LinkStubDialogClosedHdl(void*, sfx2::FileDialogHelper*) at sw/source/uibase/uiview/view2.cxx:2481:1
>  #11 in Link<sfx2::FileDialogHelper*, void>::Call(sfx2::FileDialogHelper*) const at include/tools/link.hxx:111:45
>  #12 in sfx2::DocumentInserter::DialogClosedHdl(sfx2::FileDialogHelper*) at sfx2/source/doc/docinsert.cxx:285:25
>  #13 in sfx2::DocumentInserter::LinkStubDialogClosedHdl(void*, sfx2::FileDialogHelper*) at sfx2/source/doc/docinsert.cxx:184:1
>  #14 in Link<sfx2::FileDialogHelper*, void>::Call(sfx2::FileDialogHelper*) const at include/tools/link.hxx:111:45
>  #15 in sfx2::FileDialogHelper::ExecuteSystemFilePicker(void*) at sfx2/source/dialog/filedlghelper.cxx:2356:25
>  #16 in sfx2::FileDialogHelper::LinkStubExecuteSystemFilePicker(void*, void*) at sfx2/source/dialog/filedlghelper.cxx:2353:1
>  #17 in Link<void*, void>::Call(void*) const at include/tools/link.hxx:111:45
>  #18 in ImplHandleUserEvent(ImplSVEvent*) at vcl/source/window/winproc.cxx:2009:30
>  #19 in ImplWindowFrameProc(vcl::Window*, SalEvent, void const*) at vcl/source/window/winproc.cxx:2562:13
>  #20 in SalFrame::CallCallback(SalEvent, void const*) const at vcl/inc/salframe.hxx:306:29
>  #21 in SalGenericDisplay::ProcessEvent(SalUserEventList::SalUserEvent) at vcl/unx/generic/app/gendisp.cxx:66:22
>  #22 in SalUserEventList::DispatchUserEvents(bool) at vcl/source/app/salusereventlist.cxx:108:17
>  #23 in SalGenericDisplay::DispatchInternalEvent(bool) at vcl/unx/generic/app/gendisp.cxx:51:12
>  #24 in call_userEventFn(void*) at vcl/unx/gtk3/gtk3gtkdata.cxx:707:27
>  #25  at <null> (/lib64/libglib-2.0.so.0 +0x4de8a)
>  #26 in g_main_context_dispatch at <null> (/lib64/libglib-2.0.so.0 +0x5156f)
>  #27  at <null> (/lib64/libglib-2.0.so.0 +0x518ff)
>  #28 in g_main_context_iteration at <null> (/lib64/libglib-2.0.so.0 +0x519a2)
>  #29 in GtkSalData::Yield(bool, bool) at vcl/unx/gtk3/gtk3gtkdata.cxx:382:31
>  #30 in GtkInstance::DoYield(bool, bool) at vcl/unx/gtk3/gtk3gtkinst.cxx:384:29
>  #31 in ImplYield(bool, bool) at vcl/source/app/svapp.cxx:454:48
>  #32 in Application::Yield() at vcl/source/app/svapp.cxx:518:5
>  #33 in Application::Execute() at vcl/source/app/svapp.cxx:433:9
>  #34 in desktop::Desktop::Main() at desktop/source/app/app.cxx:1602:17
>  #35 in ImplSVMain() at vcl/source/app/svmain.cxx:196:35
>  #36 in SVMain() at vcl/source/app/svmain.cxx:228:12
>  #37 in soffice_main at desktop/source/app/sofficemain.cxx:107:12
>  #38 in sal_main at desktop/source/app/main.c:48:15
>  #39 in main at desktop/source/app/main.c:47:1
>  #40 in __libc_start_main at /usr/src/debug/glibc-2.30-34-g994e529a37/csu/../csu/libc-start.c:308:16
>  #41 in _start at <null>
> 
> 0x610000272430 is located 16 bytes to the left of 192-byte region [0x610000272440,0x610000272500)
> allocated by thread T0 here:
>  #0 in operator new[](unsigned long) at compiler-rt/lib/asan/asan_new_delete.cpp:102:3
>  #1 in (anonymous namespace)::Compare::MovedData::MovedData((anonymous namespace)::CompareData&, char const*) at sw/source/core/doc/doccomp.cxx:768:25
>  #2 in (anonymous namespace)::Compare::Compare(unsigned long, (anonymous namespace)::CompareData&, (anonymous namespace)::CompareData&) at sw/source/core/doc/doccomp.cxx:597:24
>  #3 in (anonymous namespace)::CompareData::CompareLines((anonymous namespace)::CompareData&) at sw/source/core/doc/doccomp.cxx:437:17
>  #4 in SwDoc::CompareDoc(SwDoc const&) at sw/source/core/doc/doccomp.cxx:1866:13
>  #5 in SwEditShell::CompareDoc(SwDoc const&) at sw/source/core/edit/editsh.cxx:876:27
>  #6 in SwView::InsertMedium(unsigned short, std::unique_ptr<SfxMedium, std::default_delete<SfxMedium> >, short) at sw/source/uibase/uiview/view2.cxx:2300:39
>  #7 in SwView::DialogClosedHdl(sfx2::FileDialogHelper*) at sw/source/uibase/uiview/view2.cxx:2491:19
>  #8 in SwView::LinkStubDialogClosedHdl(void*, sfx2::FileDialogHelper*) at sw/source/uibase/uiview/view2.cxx:2481:1
>  #9 in Link<sfx2::FileDialogHelper*, void>::Call(sfx2::FileDialogHelper*) const at include/tools/link.hxx:111:45
> 
> SUMMARY: AddressSanitizer: heap-buffer-overflow sw/source/core/doc/doccomp.cxx:214:58 in (anonymous namespace)::Compare::MovedData::GetIndex(unsigned long) const
> Shadow bytes around the buggy address:
>   0x0c2080046430: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
>   0x0c2080046440: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
>   0x0c2080046450: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
>   0x0c2080046460: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c2080046470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> =>0x0c2080046480: fa fa fa fa fa fa[fa]fa 00 00 00 00 00 00 00 00
>   0x0c2080046490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x0c20800464a0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
>   0x0c20800464b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>   0x0c20800464c0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
>   0x0c20800464d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> Shadow byte legend (one shadow byte represents 8 application bytes):
>   Addressable:           00
>   Partially addressable: 01 02 03 04 05 06 07 
>   Heap left redzone:       fa
>   Freed heap region:       fd
>   Stack left redzone:      f1
>   Stack mid redzone:       f2
>   Stack right redzone:     f3
>   Stack after return:      f5
>   Stack use after scope:   f8
>   Global redzone:          f9
>   Global init order:       f6
>   Poisoned by user:        f7
>   Container overflow:      fc
>   Array cookie:            ac
>   Intra object redzone:    bb
>   ASan internal:           fe
>   Left alloca redzone:     ca
>   Right alloca redzone:    cb
>   Shadow gap:              cc
> ==2406221==ABORTING

I've put on CC some people who know the Writer code in general and have touched sw/source/core/doc/doccomp.cxx in the past, in the hope that one of them might have an idea what's going on here.
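
As a reading aid for the AddressSanitizer report above, the addresses it prints can be turned into an element index with a little arithmetic. The sketch below assumes 8-byte array elements (suggested by "READ of size 8" and the `unsigned long []` allocation on x86-64 Linux); the constant and function names are made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Addresses copied from the AddressSanitizer report.
constexpr std::uint64_t kRegionStart = 0x610000272440;
constexpr std::uint64_t kRegionEnd   = 0x610000272500;
constexpr std::uint64_t kBadAddress  = 0x610000272430;

// Assumption: each array element is 8 bytes wide.
constexpr std::uint64_t kElementSize = 8;

// Number of elements in the allocated region.
constexpr std::uint64_t element_count(std::uint64_t start, std::uint64_t end)
{
    return (end - start) / kElementSize;
}

// Signed element index of an address relative to the start of the array.
constexpr std::int64_t element_index(std::uint64_t address, std::uint64_t start)
{
    return (static_cast<std::int64_t>(address) - static_cast<std::int64_t>(start))
           / static_cast<std::int64_t>(kElementSize);
}

static_assert(kRegionEnd - kRegionStart == 192, "192-byte region, as reported");
static_assert(element_count(kRegionStart, kRegionEnd) == 24, "24 elements");
static_assert(element_index(kBadAddress, kRegionStart) == -2,
              "the read is two elements before the start");
```

Under that assumption the read lands two elements before the start of a 24-element array, which would be consistent with an index of -2 being passed to `GetIndex` as an unsigned value, for example `x - 1` with `x` equal to -1. Whether `x` or `y` was the culprit cannot be told from the report alone.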
Comment 17 Miklos Vajna 2020-04-07 07:45:33 UTC
Most of this code is simply unchanged since the initial import. I believe that the x and y in question represent character indexes within a paragraph, so if they ever go below 0, that's something to fix.
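
A hedged sketch of what guarding against negative indexes could look like, using a simplified model of a backward scan rather than the actual `CheckDiag` code. `scan_backward` and its parameters are hypothetical; the point is only that comparing `x` and `y` as signed values stops the loop before a negative index is used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified, hypothetical model of a backward scan: step x and y back while
// the elements just before them match. Returns the number of steps taken.
long scan_backward(const std::vector<unsigned long>& a,
                   const std::vector<unsigned long>& b,
                   long x, long y, long lower1, long lower2)
{
    assert(lower1 >= 0 && lower2 >= 0);
    long steps = 0;
    // Signed comparisons: a negative or out-of-range x or y ends the loop
    // before any element is read.
    while (x > lower1 && y > lower2
           && x <= static_cast<long>(a.size())
           && y <= static_cast<long>(b.size())
           && a[static_cast<std::size_t>(x - 1)] == b[static_cast<std::size_t>(y - 1)])
    {
        --x;
        --y;
        ++steps;
    }
    return steps;
}
```

Such a guard would only prevent the out-of-bounds read; it would not explain why the indexes become negative in the first place.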
Comment 18 Julien Nabet 2020-04-07 08:32:14 UTC
(In reply to Miklos Vajna from comment #17)
> Most of this code is simply unchanged since the initial import. I believe
> that the x and y in question represent character indexes within a paragraph,
> so if they ever go below 0, that's something to fix.

Argh, another buggy, unmaintained feature :-( (See the other bug reports about this, above all the fact that tables can only be compared as a whole and not cell by cell.)