Bug 170602 - FILESAVE DOCX: corrupt document reported by MS Word when certain content controls (w:sdt) contain a bookmark
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:docx
Depends on:
Blocks: DOCX-Corrupted
Reported: 2026-02-04 15:51 UTC by Justin L
Modified: 2026-02-04 19:31 UTC (History)

See Also:
Crash report or crash signature:


Attachments
bookmarkCorrupts_runSdt.docx: MS Word already considers this document to be corrupt (13.18 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2026-02-04 15:51 UTC, Justin L
bookmarkCorrupts_blockSdt.docx: MS Word already considers this document to be corrupt (13.17 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2026-02-04 16:46 UTC, Justin L
forum-mso-en-4699.docx: example document that is corrupt after round-tripped by LO (21.89 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2026-02-04 16:50 UTC, Justin L

Description Justin L 2026-02-04 15:51:41 UTC
Created attachment 205360 [details]
bookmarkCorrupts_runSdt.docx: MS Word already considers this document to be corrupt

This first comment actually describes a situation where LO currently works well. It even takes this corrupt document and round-trips it as a valid document...

MS Word readily reports a document as corrupt if a content control ends in a bookmark.

A w:sdt is very particular about where a bookmarkEnd is placed. (bookmarkStart doesn't seem to be a problem - it can start anywhere...).
Interestingly, bookmarkEnd can be placed OK in lots of strange places (inside sdtPr, rPr), but it cannot be placed as the last item inside of the content control.

Note that this is NOT universally true. Any position is fine for richText and group content controls, but the document tests as corrupt for w:text and w:checkbox.

<w:p>
  <w:sdt>
    <w:sdtContent>
      <w:r>
        <w:t> some content </w:t>
      </w:r>
      <w:r>
                 w:bookmarkEnd is OK here
        <w:t> ending content </w:t>
                 w:bookmarkEnd MUST NOT be here
      </w:r>
                 w:bookmarkEnd MUST NOT be here
    </w:sdtContent>
                 w:bookmarkEnd MUST NOT be here
  </w:sdt>
w:bookmarkEnd is OK here
</w:p>

At the moment, we seem to be OK with these paragraph-wrapped runSdt's. I think we are outputting an empty w:r for the end marker, so our bookmark gets written before the empty w:r. Or, if the bookmarkEnd was originally after the /w:sdt, it gets written back to that spot again.
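
The placement rule sketched in the XML above can be approximated with a small checker. This is only an illustrative sketch, not anything LO or MS Word actually runs: the function name `trailing_bookmark_end_ids` is hypothetical, and it assumes that "last item" means "any w:bookmarkEnd after the start of w:sdtContent that is not followed by a w:t inside the same w:sdt". It does not look at the content control type and does not handle nested content controls.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
SDT_CONTENT = f"{{{W}}}sdtContent"
TEXT = f"{{{W}}}t"
BOOKMARK_END = f"{{{W}}}bookmarkEnd"
ID_ATTR = f"{{{W}}}id"


def trailing_bookmark_end_ids(sdt):
    """Return the w:id of each bookmarkEnd that follows the last w:t in a run-level w:sdt.

    Assumption: a bookmarkEnd inside w:sdtPr is harmless, so nothing is
    recorded until w:sdtContent has been entered.
    """
    in_content = False
    trailing = []
    for node in sdt.iter():  # pre-order walk, i.e. document order
        if node.tag == SDT_CONTENT:
            in_content = True
        elif node.tag == TEXT:
            trailing = []  # text follows, so earlier bookmarkEnds are not "last"
        elif node.tag == BOOKMARK_END and in_content:
            trailing.append(node.get(ID_ATTR))
    return trailing
```

An empty result means no bookmarkEnd sits in one of the "MUST NOT be here" positions under the assumption above; a non-empty result lists the bookmark ids that would need to move after the closing w:sdt tag.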
Comment 1 Justin L 2026-02-04 16:46:17 UTC
Created attachment 205361 [details]
bookmarkCorrupts_blockSdt.docx: MS Word already considers this document to be corrupt

This comment actually describes a situation where LO currently works well. It even takes this second corrupt document and round-trips it as a valid document...

Surprisingly, content controls that contain a paragraph (blockSdt) are more restrictive in this aspect than are runSdts.

<w:sdt>
  <w:sdtContent>
    <w:p>
                 w:bookmarkEnd OK up to this point
      <w:r>
                 w:bookmarkEnd MUST not occur from this point onwards
        <w:rPr/>
        <w:t> some content </w:t>
      </w:r>
      <w:r>
        <w:t> ending content </w:t>
      </w:r>
    </w:p>
                 w:bookmarkEnd OK from here onwards
  </w:sdtContent>
</w:sdt>

P.S. Where the bookmark starts doesn't matter. For example, even if the bookmark starts in the Sdt, it still is not allowed to end in the Sdt. (Microsoft's UI blocks the 'Insert - Bookmark' option while inside a content control).

P.P.S. LO tends to avoid this situation often, because it converts blockSdt's into runSdt's (which are more lenient).

Once again, these restrictions do not apply to richText or w:group.
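
The blockSdt rule from the XML above could be checked along these lines. This is a hedged sketch only: `late_bookmark_end_ids` is a hypothetical helper, it ignores the content control type, and it assumes that each paragraph directly inside w:sdtContent is judged on its own (the sample only covers a single paragraph, so behaviour with several paragraphs is a guess).

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
SDT_CONTENT = f"{{{W}}}sdtContent"
PARA = f"{{{W}}}p"
RUN = f"{{{W}}}r"
BOOKMARK_END = f"{{{W}}}bookmarkEnd"
ID_ATTR = f"{{{W}}}id"


def late_bookmark_end_ids(block_sdt):
    """Return the w:id of each bookmarkEnd placed at or after the first w:r of a paragraph.

    A bookmarkEnd before the first run, or after the closing w:p (still
    inside w:sdtContent), is treated as acceptable.
    """
    offending = []
    content = block_sdt.find(SDT_CONTENT)
    if content is None:
        return offending
    for para in content.findall(PARA):
        seen_run = False
        for node in para.iter():  # document order within this paragraph
            if node.tag == RUN:
                seen_run = True
            elif node.tag == BOOKMARK_END and seen_run:
                offending.append(node.get(ID_ATTR))
    return offending
```

Under these assumptions, a writer could move every reported bookmarkEnd to just after the closing w:p, which is one of the positions marked OK above.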
Comment 2 Justin L 2026-02-04 16:50:20 UTC
Created attachment 205362 [details]
forum-mso-en-4699.docx: example document that is corrupt after round-tripped by LO

I suppose by this point you would like to have a file that DOESN'T work in LO. Here you go.

Steps to reproduce:
1.) Open forum-mso-en-4699.docx in LO and save as DOCX.
2.) Try to open the file in MS Word.

MS Word reports the file as corrupt. In this case, during import the checkbox is not imported as a content control. On export, grabBag properties restore it as a blockSdt - along with the offending bookmarks inside the paragraph.
Comment 3 Justin L 2026-02-04 17:38:18 UTC
Restriction also applies to w:comboBox, w:dropDownList, w:picture, and w:date.
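
To triage a saved file against the list of affected control types, a rough scan like the following may help. It is only a sketch under stated assumptions: `suspicious_sdts` and `restricted_kind` are hypothetical names, the type is detected by the local name of an element inside w:sdtPr (so the namespace prefix of the checkbox element does not matter), and any bookmarkEnd inside w:sdtContent is merely counted as a candidate, without applying the finer position rules from the earlier comments.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Control types reported in this bug as restricted; richText and group are not.
RESTRICTED = {"text", "checkbox", "comboBox", "dropDownList", "picture", "date"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def restricted_kind(sdt):
    """Return the restricted type name found in w:sdtPr, or None."""
    props = sdt.find(f"{{{W}}}sdtPr")
    if props is None:
        return None
    for child in props:
        if local_name(child.tag) in RESTRICTED:
            return local_name(child.tag)
    return None


def suspicious_sdts(docx):
    """List (kind, bookmarkEnd count) for restricted controls containing a bookmarkEnd.

    `docx` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    found = []
    for sdt in root.iter(f"{{{W}}}sdt"):
        kind = restricted_kind(sdt)
        content = sdt.find(f"{{{W}}}sdtContent")
        if kind is None or content is None:
            continue
        ends = sum(1 for _ in content.iter(f"{{{W}}}bookmarkEnd"))
        if ends:
            found.append((kind, ends))
    return found
```

Calling `suspicious_sdts("some-roundtripped.docx")` (file name is a placeholder) returns an empty list when no restricted control contains a bookmarkEnd; a non-empty list only marks controls worth inspecting by hand.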
Comment 4 Justin L 2026-02-04 19:31:08 UTC
(In reply to Justin L from comment #0)
> we are outputting an empty w:r because of the bookmarkEnd attr, so our bookmark
> gets written before the empty w:r.
I'm pretty sure this was just lucky. Having a bookmark at the end of the paragraph stops the run at the content control end marker. So we get an extra run containing only the end marker - which of course results in an empty w:r. (Otherwise the last run also contains the end marker.)

OK - take that as a lucky win.