Created attachment 199158 forum-mso-de-138781.docx: example document with date in footer
Visually, the problem is that the date in the footer was previously round-tripped unchanged (as seen by MS Word). Now MS Word reports the save date as all zeros. The underlying cause probably has to do with the document properties: with the change below, the last modified date is no longer seen in MS Word. (However, looking at core.xml, I'm not noticing any significant differences.)
This started (when saving interactively; the convert-to change was backported to 25.2) with 25.8:
commit d97085cc6cd2bdc3b6723d1960d0ec5fa0a48165
Author: Justin Luth on Sat Dec 7 11:42:39 2024 -0500
tdf#164201 docx import: compat14+ cannot be ECMA_376_1ST_EDITION
Steps to reproduce:
1.) Open and resave forum-mso-de-138781.docx.
2.) Open the round-tripped file in MS Word (probably 2010 or higher).
Notice that the date field in the footer (on page 2) is 00-00-0000. It should be the date that you saved it...
Found by Collabora's mso-test
Created attachment 199159 165207_modifiedDate.odt: simple date field content
To take any previous DOCX-isms out of the picture, I tested using ODT->DOCX. That led me to 24.2's
commit ed0476b0625c4361df5ff040a6661a9634588cea
Author: Michael Stahl on Fri Feb 17 12:25:30 2023 +0100
tdf#137883 filter: rename DOCX filters to be less confusing
Rename misleading "Word 2007–365" filter which corresponds to the slightly incompatible first pre-ISO version of OOXML (ECMA-376 1st edition) and is actually very specifically for Word 2007. Stop confusing users with standardese like "Office Open XML Text Document (Transitional)" and instead use the name of the application that the format is intended for, "Word 2010-365". Hopefully users will now pick the latter filter over the former.
And I'm getting the same results all the way back to 3.6 when saving to "Office Open XML Text Document (Transitional)". (Even "Word 2007 DOCX" only seems to round-trip the date, and not update it.) So somehow this has never worked for this format.
The problem appears to be "officedocument/2006/relationships/metadata/core-properties" instead of "package/2006/relationships/metadata/core-properties" although that is exactly what we use to distinguish Word 2007 on import...
See oox/source/core/xmlfilterbase.cxx WriteCoreProperties. So now the question is, when should we be following the spec?
// The lowercase "officedocument" is intentional and according to the spec
// (although most other places are written "officeDocument")
sValue = "http://schemas.openxmlformats.org/officedocument/2006/relationships/metadata/core-properties";
The Internet says:
2.1.30 Part 1 Section 15.2.12.1, Core File Properties Part
a. The standard specifies a source relationship for the Core File Properties part as follows: http://schemas.openxmlformats.org/officedocument/2006/relationships/metadata/core-properties.
Office uses the following source relationship for the Core File Properties part: http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties.
I take that to mean that Office ignores the spec and does something else. We, on the other hand, have always coded as if MS is following their own spec.
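
To make the two relationship types easier to compare, here is a minimal standalone sketch. It is not the code in xmlfilterbase.cxx: the enum, the constant names, and both functions are hypothetical and exist only for illustration. The two URIs are the ones quoted above, and the export choice mirrors what the eventual fix is titled ("always use errata uri in docProps/core.xml").

```cpp
#include <string_view>

// Hypothetical names for illustration only; not LibreOffice's actual API.
enum class CorePropsRelKind { SpecOfficeDocument, ErrataPackage, Unknown };

// URI written in the standard (note the all-lowercase "officedocument").
constexpr std::string_view kSpecUri =
    "http://schemas.openxmlformats.org/officedocument/2006/relationships/metadata/core-properties";

// URI that Office actually writes, per the implementer note quoted above.
constexpr std::string_view kErrataUri =
    "http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties";

// Classify the Type attribute of the core-properties relationship.
CorePropsRelKind classifyCorePropsRel(std::string_view type)
{
    if (type == kSpecUri)
        return CorePropsRelKind::SpecOfficeDocument;
    if (type == kErrataUri)
        return CorePropsRelKind::ErrataPackage;
    return CorePropsRelKind::Unknown;
}

// On export, prefer the URI that Word itself uses.
constexpr std::string_view corePropsRelForExport()
{
    return kErrataUri;
}
```

A reader that wants to be tolerant would accept both known kinds; only the writer has to pick one.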
https://gerrit.libreoffice.org/c/core/+/178048 has a comment
> perhaps the "best" way to do this would be to parse the settings.xml
> in the type detection to get the version number?
which is probably the route we should go, and then just forget that the spec even exists...
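
As a rough illustration of what "parse the settings.xml to get the version number" could look like, here is a self-contained sketch. It is an assumption-laden toy, not the type-detection code: the function names are hypothetical, it scans the XML text with regular expressions instead of a real XML parser, and it assumes the usual `w:` prefix. It looks for the `compatSetting` element named `compatibilityMode` in word/settings.xml and falls back to 12 (Word 2007) when no such entry exists.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: returns the w:val of the compatibilityMode
// compatSetting, or std::nullopt when settings.xml has no such entry.
// Text scan only; a real implementation would use an XML parser and
// resolve namespaces instead of assuming the "w:" prefix.
std::optional<int> findCompatibilityMode(const std::string& settingsXml)
{
    static const std::regex elem(R"(<w:compatSetting\b[^>]*>)");
    static const std::regex name(R"re(w:name\s*=\s*"compatibilityMode")re");
    static const std::regex val(R"re(w:val\s*=\s*"(\d{1,4})")re");

    for (std::sregex_iterator it(settingsXml.begin(), settingsXml.end(), elem), end;
         it != end; ++it)
    {
        const std::string tag = it->str();
        std::smatch m;
        if (std::regex_search(tag, name) && std::regex_search(tag, m, val))
            return std::stoi(m[1].str());
    }
    return std::nullopt;
}

// No compatibilityMode entry is treated as compat12 (Word 2007).
int effectiveCompatMode(const std::string& settingsXml)
{
    return findCompatibilityMode(settingsXml).value_or(12);
}
```

With a result like this in hand, type detection could choose the Word 2007 filter only for mode 12 and the Word 2010-365 filter for 14 and above, without looking at the core-properties relationship at all.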
Something strange though. In my personal testing (Word 2010 and Word 2019) I always get zeroed-out dates. But in mso-test, Word 2019 makes a PDF with the correct date.... Oh, I just answered my own question. While the SCREEN displays zeros and xxx's, the PDF contains the date (because Word "modifies" the document before creating the PDF and thereby also updates the SCREEN to that moment in time).
(In reply to Justin L from comment #1)
> (Even "Word 2007 DOCX" only seems to round-trip the date, and not update it).
That is because we don't treat a simple save as a "modification". For us, it is a modified field, not a save date. Of course, the date is changed if there is actually a modification to the document instead of a simple round-trip or convert-to.
Let me document this here, because it almost certainly will come back to bite me. My fix affects uiwriter4.cxx's testTdf72942. The main file is fdo72942.docx (it has settings.xml but no compatSetting entries, and thus is treated as compat12 / Word 2007). testTdf72942 does an Insert - Text from file with fdo72942-insert.docx (which is compat15 / Word 2013). So, previously the logic told it to treat both as Word 2007 formats, and thus in SwView::InsertMedium, StartConvertFrom found a pRead which ends up calling SwDOCXReader::Read (from 5.4/6.0 bug 112025), which first adds an empty paragraph and then imports the contents. Now, the logic correctly inserts the file as "Office Open XML Text", which is a different filter and thus has no pRead, so it follows a different code path, which does NOT first add an empty paragraph before inserting (like what happened prior to 5.4).
(In reply to Justin L from comment #7)
> So, previously the logic told it to treat both as Word 2007 formats, and
> thus in SwView::InsertMedium, StartConvertFrom found a pRead which ends up
Nah, it has nothing to do with "the same filter". MS_WORD_2007_XML.xcu has <prop oor:name="UserData"><value>OXML</value></prop> while Word 2010 has no value specified.
Justin Luth committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/e4b629c1eecf8cd46007fb064179d765d55fd26b tdf#165207 tdf#164201 docx: always use errata uri in docProps/core.xml It will be available in 25.8.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.