Bug 148703 - Writer crash on undo after paste
Summary: Writer crash on undo after paste
Status: ASSIGNED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: high critical
Assignee: David Hashe
URL:
Whiteboard: target:26.2.0
Keywords: haveBacktrace
Depends on:
Blocks: Paste Crash-Assert Undo-Redo Crash-BigPtrArray
  Show dependency treegraph
 
Reported: 2022-04-21 08:06 UTC by Timur
Modified: 2025-10-31 07:15 UTC (History)
6 users (show)

See Also:
Crash report or crash signature: ["BigPtrArray::Index2Block(long)","BigPtrArray::Index2Block(int) const"]


Attachments
bt with debug symbols (7.08 KB, text/plain)
2022-04-21 09:16 UTC, Julien Nabet
Details
gdb session with destination document SwNodes (4.26 KB, text/plain)
2025-05-27 03:35 UTC, David Hashe
Details
simple reproduction (16.94 KB, application/zip)
2025-05-27 03:40 UTC, David Hashe
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Timur 2022-04-21 08:06:41 UTC
Open ODT attachment 179694 [details], select all and copy.
Open ODT attachment 179695 [details], select all and paste.
Undo > Crash 
Repro OO, 4.1, 5.2 and 7.4+ Win and Lin.
Comment 1 Timur 2022-04-21 08:11:59 UTC Comment hidden (obsolete)
Comment 2 Telesto 2022-04-21 09:05:23 UTC
Repro
Version: 7.4.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 4659fc2f0a7223a89446edff0b77e58758b5edf5
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: en-US (nl_NL); UI: en-US
Calc: CL Jumbo
Comment 3 Julien Nabet 2022-04-21 09:16:16 UTC
Created attachment 179705 [details]
bt with debug symbols

I got an assertion on pc Debian x86-64 with master sources updated today.

Considering the number of crashes due to undo, I'm not sure it worths it to keep on testing undoing for the moment.
Comment 4 Telesto 2022-04-21 09:37:49 UTC
Looks like bug 147726 (for the assert part)
Comment 5 Timur 2022-04-21 09:46:55 UTC
Other undo bugs are mostly regressions, this one is InNherited. 
Also, unlike some bugs with "copy 5x, paste 3x.." this one is simple copy+paste.
Comment 6 Timur 2022-04-21 14:37:00 UTC
A mystery for me is why there are 266 crash reports for LO 7.3 Win when I see this crash all the way to OO including Lin.
Comment 7 Stéphane Guillou (stragu) 2023-04-06 13:58:13 UTC
Repro with recent debug build:

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 1b463f697405e64a03378fb38a32172c4d3c25e6
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

Getting the same assert:

warn:legacy.osl:389415:389415:sw/source/core/access/accmap.cxx:1074: invalid event combination
soffice.bin: /home/tdf/lode/jenkins/workspace/lo_gerrit/tb/src_master/sw/source/core/undo/undel.cxx:915: virtual void SwUndoDelete::UndoImpl(sw::UndoRedoContext&): Assertion `pTextNd' failed.

In 7.5.2.2 (not debug), I get this crash report: https://crashreport.libreoffice.org/stats/crash_details/8ffc1cee-0081-4dc4-a92d-b55fcf3ebc12

With the extra signature "BigPtrArray::Index2Block(int) const".
Comment 8 Tex2002ans 2024-02-16 22:14:06 UTC
Yep, followed comment 0 exactly.

CRASH:

- https://crashreport.libreoffice.org/stats/crash_details/67812785-33aa-434c-b7c5-12dd5941ca3b

in:

Version: 24.2.0.3 (X86_64) / LibreOffice Community
Build ID: da48488a73ddd66ea24cf16bbc4f7b9c08e9bea1
CPU threads: 8; OS: Windows 10.0 Build 22631; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL threaded
Comment 9 Stéphane Guillou (stragu) 2024-04-16 12:59:29 UTC
Still crashing in:

Version: 24.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: bdf3b5ce49b0e4ee1b4525d344cfb037ef473059
CPU threads: 8; OS: Linux 6.5; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: CL threaded
Comment 10 David Hashe 2025-05-27 03:30:02 UTC
I think that I understand what's going on here.

Reproduction from empty documents:

1. Create a new document.
2. View > Styles. Click the "page styles" icon in the styles pane.
3. Right click an item and choose "new".
4. Apply, then OK.
5. Double click the newly created style in the styles pane to apply it.
6. Create a footer.
7. Insert some text into the footer (optional, but makes it clearer that the footer is copied).
8. Enter at least two paragraphs (or more generally nodes) in the main body of the first document.
9. Create a second new document.
10. Enter some text in the main body of the second document.
11. Ctrl-A then Ctrl-C in the first document.
12. Ctrl-A then Ctrl-V in the second document.
13. Ctrl-Z in the second document.

The source document must 1) have a non-default page style applied, and 2) have a header and/or footer. The selection from the source document must also 3) include at least two nodes. There also must 4) be a non-empty selection in the destination document that is overwritten by the copy/paste, and 5) the destination document can't already contain a page style with the same name as the non-default page style from the source document selection.

When these conditions are met, the copy will include the header and/or footer from the source document.

(See here, where the page desc, which exists if there is a page style applied, is copied, which also copies the header/footer: https://opengrok.libreoffice.org/xref/core/sw/source/core/attr/swatrset.cxx?r=ac3b217a43e21b00f434fa796a0c966b9ddfd9df#423 )

Pasting into the destination document is a two stage process, and then undoing the paste is another two stage process:

1. Pasting into the destination document.
   a. Deleting the selection.
   b. Inserting the pasted content.
2. Undoing the paste.
   a. Deleting the pasted content.
   b. Re-inserting the selection.

In step 1a, the undo information saves the index from the document SwNodes where it will re-insert the selection later.

(There's a comment about how undo information stores document indices here: https://opengrok.libreoffice.org/xref/core/sw/source/core/undo/untblk.cxx?r=ac3b217a43e21b00f434fa796a0c966b9ddfd9df#239 )

By step 2b, something is wrong. The header/footer nodes are still in the document, and it messes up the indexing, so the re-insertion point, which should've been a text node, is actually a start node, which causes the cryptic error.

The solution is to modify SwDoc::CopyPageDescHeaderFooterImpl to record the copying of the header/footer nodes so that they are correctly undone and don't mess up the indexing.

I am working on a small fix and regression test.
Comment 11 David Hashe 2025-05-27 03:35:06 UTC
Created attachment 200968 [details]
gdb session with destination document SwNodes

I've attached some snippets from my gdb session showing the contents of the destination document SwNodes after steps 1a, 1b, and 2a.

The problem is that the first and the third versions look different, even though the logical difference between them is inserting the pasted text and then removing it. The re-insertion point for the deleted selection was recorded as 9, but now that point has shifted to 12 because of the extra 3 nodes representing the leftover footer.
Comment 12 David Hashe 2025-05-27 03:40:50 UTC
Created attachment 200969 [details]
simple reproduction

I've done steps 1-10 from my previous comment in the two .odt attachments for a simpler reproduction.
Comment 13 David Hashe 2025-05-28 04:13:42 UTC
Fix is here: https://gerrit.libreoffice.org/c/core/+/185940
Comment 14 Commit Notification 2025-10-07 06:19:47 UTC
David Hashe committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/434cf190c7dcf0e8b99d473eaff2acfd70b6e82e

tdf#148703 fix follow links for recreated page descs

It will be available in 26.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 David Hashe 2025-10-27 17:51:03 UTC
Here's my current understanding and plan for finishing this:

1. The header/footer content nodes are copied into the document and left there, which messes up the indexing in the undo nodes and causes a crash (what I've talked about before).
2. We fix that by saving the content section into the undo nodes, so that the nodes are present in the doc nodes only when they should be.
3. The obvious way to save the section deletes and then recreates the section start node, but the page desc holds a pointer to the start node (actually, multiple pointers in its various SwFrameFormats), so we have to update these pointers (actually, SwNodeIndexes) to point to the new section start node.
4. Now the page description itself is successfully recreated on undo/redo, but there's still a problem: the page description doesn't get re-applied on redo.
   
(This is where we are at after applying https://gerrit.libreoffice.org/c/core/+/185940 )

I plan to make another changeset to address the problem with page desc's not getting re-applied on redo. For now, the workaround is just to manually apply the page style to the page.

Why is the page desc not being re-applied?

Whenever the creation of a page desc is undone, it broadcasts its deletion to all nodes that used it, and they immediately drop it. This also applies to undo nodes, that were created after the page description was created and will only be redone after the page desc is redone. But when the page desc is recreated, it doesn't have any list of the nodes that it was previously applied to, and so it is simply never re-applied.

There are actually two situations that need to be fixed:
1) redo'ing the creation of a new page desc (doesn't remember which nodes in the undo nodes it was applied to, if you keep redoing to bring back nodes that it was applied to).
2) undo'ing the deletion of an old page desc (doesn't remember which nodes in the doc nodes it was applied to before it was deleted)

These involve separate classes but the fix is similar for both; we just need the relevant SwUndo* classes to do a linear search through the nodes (the undo nodes for (1) and the doc nodes for (2)) and record which nodes that page desc applies to at the time of deletion.

After that last changeset, we should be able to close this as completed.
Comment 16 Mike Kaganski 2025-10-29 07:52:07 UTC
(In reply to David Hashe from comment #15)
> Why is the page desc not being re-applied?
> 
> Whenever the creation of a page desc is undone, it broadcasts its deletion
> to all nodes that used it, and they immediately drop it. This also applies
> to undo nodes, that were created after the page description was created and
> will only be redone after the page desc is redone. But when the page desc is
> recreated, it doesn't have any list of the nodes that it was previously
> applied to, and so it is simply never re-applied.

Shouldn't we instead just avoid deletion of the page desc? Store it in an undo object, and avoid its destruction and broadcasting?
Comment 17 David Hashe 2025-10-30 19:45:12 UTC
> Shouldn't we instead just avoid deletion of the page desc? Store it in an undo object, and avoid its destruction and broadcasting?

I've given that idea some thought and I don't think that it would work in all cases.

(also, I should say that this bit is independent of the current patch https://gerrit.libreoffice.org/c/core/+/185940 . I do not expect that patch to need to be updated based on the decision that we make here, and it can be merged independently)

Consider case "(2) undo'ing the deletion of an old page desc".

The concern with that case is that the page desc may be present on some nodes within the document at the time that it is deleted, and then when we undo the deletion we will need to re-apply the page desc to those nodes.

Here is a series of steps which will delete a page desc that is currently in use:

1. Create a custom page style with (style sidebar) > Page Styles icon > (right click context menu) > New...
2. Apply that custom page style to the current page by double clicking on it.
3. Select the custom page style in the style sidebar and press the delete key.
4. A popup window appears: "One or more of the selected styles is in use in this document. If you will delete it, text or objects using these styles will revert to the parent style. Do you still wish to delete these styles? Styles in use: Untitled1"
5. Answer "Yes".
6. The style is deleted, despite there being nodes which use it.
   
So then, the deletion must be propagated to the nodes in the document in order to correctly process the delete, but the undo information must somehow maintain a list of affected nodes in order to correctly process an undo of that delete.

For the other case "(1) redo'ing the creation of a new page desc", we might be able to avoid the destruction/broadcasting, because we know that the page desc will only be present on nodes in the undo nodes, but it might be more confusing to handle the two closely related situations with different approaches.

I can play around with it and see if it simplifies the implementation enough to justify using a second approach, but my intuition right now is that it will be better to handle them both the same way.
Comment 18 Mike Kaganski 2025-10-31 04:58:03 UTC
(In reply to David Hashe from comment #15)
> But when the page desc is recreated, it doesn't have any list of the nodes
> that it was previously applied to, and so it is simply never re-applied.

Having a list of indices in the undo object itself could be relatively straightforward - it should be a passive vector if sal_Int32, not of active SwNodeIndexes. The state of the document at the time of redoing is expected to be the same, so the indexes should match?
Comment 19 David Hashe 2025-10-31 06:08:02 UTC
(In reply to Mike Kaganski from comment #18)
> (In reply to David Hashe from comment #15)
> > But when the page desc is recreated, it doesn't have any list of the nodes
> > that it was previously applied to, and so it is simply never re-applied.
> 
> Having a list of indices in the undo object itself could be relatively
> straightforward - it should be a passive vector if sal_Int32, not of active
> SwNodeIndexes. The state of the document at the time of redoing is expected
> to be the same, so the indexes should match?

Yes, I agree :) I am thinking to do something like that, by adding a member to SwUndoPageDescDelete.

The current way that SwUndoPageDescDelete works is by saving a copy-constructed version of the document's page desc, and then using that version as a template in SwUndoPageDescDelete::UndoImpl to recreate the page desc on the document.

So, the identity of the document's page desc changes after undo/redo (because it has been deleted and recreated), and the document's page desc is removed from all nodes in the document when it is deleted.

I think that it is easiest to save the list of indices in the undo object during the constructor by doing a linear scan* of the document, and then use that list during the undo to apply the re-created page desc to the nodes at those indices.

And I don't think that avoiding the deletion of the document's page desc saves us anything, because then we need to also manually remove the page desc from the nodes that it is applied to as well as add it back to those nodes during undo.

*I think that we can use SwDoc::GetAllLayouts() to iterate over the SwPageFrames, so this would be linear in the number of pages and not linear in the number of nodes.
Comment 20 Mike Kaganski 2025-10-31 07:15:27 UTC
(In reply to David Hashe from comment #19)

Nice!
I suggest to make that change in the frame of https://gerrit.libreoffice.org/c/core/+/185940.