Bug 101753 - ww8 import filter not correctly positioning Text Frame objects (holding "marginalia" annotation) on import
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:doc
Depends on:
Blocks: DOC
Reported: 2016-08-27 11:39 UTC by Zenaan Harkness
Modified: 2024-08-28 18:07 UTC

See Also:
Crash report or crash signature:


Attachments
input or original test document (22.50 KB, application/msword)
2016-08-27 11:39 UTC, Zenaan Harkness
correct (MS Windows) pdf print output (11.28 KB, application/pdf)
2016-08-27 11:41 UTC, Zenaan Harkness
incorrect output - LO pdf print on Debian (65.16 KB, application/pdf)
2016-08-27 11:42 UTC, Zenaan Harkness
example -correct- odt file/ transformation - works in LO 5.2.0.4 (24.64 KB, application/vnd.oasis.opendocument.text)
2016-08-27 12:54 UTC, Zenaan Harkness
DOCX conversion by (MS) Office 365 (14.48 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2024-08-28 18:05 UTC, László Németh

Description Zenaan Harkness 2016-08-27 11:39:36 UTC
Created attachment 127046 [details]
input or original test document

As the attached files show, the per-paragraph annotation paragraphs ("marginalia") appear at the top of each page, rather than in the correct margin (left or right, with mirrored margins) beside each respective paragraph.

LibreOffice frame styles are capable of correct placement (I've tested this and can supply an example test file if needed - LO marginalia examples are also available on the web).

Only one attachment allowed, so I'll attach the respective PDF "prints" separately.

Please note: the input file (.doc/ WordXP/ Word 2002) page size is custom (13cm x 21cm) and the PDF print on Debian does not preserve this, and provides (on my XFCE4 desktop) no option to set correct paper size - this used to work, but no longer; I shall file a separate bug report somewhere about this.
Comment 1 Zenaan Harkness 2016-08-27 11:41:39 UTC
Created attachment 127047 [details]
correct (MS Windows) pdf print output
Comment 2 Zenaan Harkness 2016-08-27 11:42:28 UTC
Created attachment 127048 [details]
incorrect output - LO pdf print on Debian
Comment 3 Zenaan Harkness 2016-08-27 11:45:34 UTC
Note that the "incorrect" output, although I call it "output", is actually an incorrect conversion of the input document; the incorrect conversion is visible within LibreOffice Writer when the (converted) document is displayed/edited in LO.

Happy to run tests, as well as test latest versions/ updates - I have a local libreoffice-core.git installation which I compile as needed; but the example output pdf files make the required result obvious anyway...
Comment 4 V Stuart Foote 2016-08-27 12:03:10 UTC
Please go ahead and post up a mockup in Writer .ODT of the layout working correctly in ODF.
Comment 5 Zenaan Harkness 2016-08-27 12:54:37 UTC
Created attachment 127049 [details]
example -correct- odt file/ transformation - works in LO 5.2.0.4

This works in LO 5.2.0.4.

The transform should probably create a frame style in LO which has the same name as the corresponding para style in MSWord. That would be logical of course.

As you can see, this .odt file looks very similar to the correct / successful (Word pdf print) output/ pdf print file.

So, we can see that the LO frame engine can correctly support this marginalia style.

I note that, although it works, this is recent functionality - the following is my attempt to load this (fixed) .odt file in the Debian (stable) system LO installation:

$ /usr/bin/soffice --version
LibreOffice 4.3.3.2 430m0(Build:2)

$ /usr/bin/soffice --writer submission-1-re-human-rights-fixed.odt
error
xsltParseStylesheetFile : cannot parse 
I/O warning : failed to load external entity ""
error
xsltParseStylesheetFile : cannot parse 
error
xsltParseStylesheetFile : cannot parse 
I/O warning : failed to load external entity ""
error
xsltParseStylesheetFile : cannot parse 
error
xsltParseStylesheetFile : cannot parse 
I/O warning : failed to load external entity ""
error
xsltParseStylesheetFile : cannot parse 
error
xsltParseStylesheetFile : cannot parse 
I/O warning : failed to load external entity ""
error
xsltParseStylesheetFile : cannot parse 
error
xsltParseStylesheetFile : cannot parse 
I/O warning : failed to load external entity ""
error
xsltParseStylesheetFile : cannot parse 
terminate called after throwing an instance of 'com::sun::star::uno::DeploymentException'


I will be very happy if this can be fixed in the current/ latest versions of LO, I have no attachment at all to the older version(s).  :)
Comment 6 Zenaan Harkness 2016-08-27 12:58:00 UTC
I hope this is a correct "mockup" - it uses frame style ("zenaanFrame") to do the job, which is the built in, if recent (to LO) functionality to get this type of text document layout working properly.

I say this because when I hear the term "mockup" I think "totally fake Gimp image" or something :)

It would be amusing to say the least, if a working LO writer file were not sufficient :D
Comment 7 Zenaan Harkness 2016-08-27 13:09:43 UTC
Actually, the LO 4.3 failure seems to be that I can no longer run LO 4.3 at all - any idea how I can get it to run?

I have Debian stable's LO 4.3 installed (/usr/...), LibreOffice 5.2.0.4 .deb distribution installed to /opt/libreoffice5.2 and moved to /opt/l/libreoffice5.2 , and LO 5.3 beta via .git compile, installed in $HOME/dev/locore.git/... and symlinked to /opt/l/libreoffice5.3/...

$ echo $PATH
/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin
$ /usr/bin/soffice 
terminate called after throwing an instance of 'com::sun::star::uno::DeploymentException'
Comment 8 Zenaan Harkness 2016-08-27 13:13:38 UTC
I can confirm that submission-1-re-human-rights-fixed.odt can be loaded successfully in LO 5.3.0 beta as well.

Enjoy :)
Comment 9 V Stuart Foote 2016-08-27 14:04:49 UTC
(In reply to Zenaan Harkness from comment #7)
> Actually, the LO 4.3 failure seems to be that I can no longer run LO 4.3 at
> all - any idea how I can get it to run?
> 
> I have Debian stable's LO 4.3 installed (/usr/...), LibreOffice 5.2.0.4 .deb
> distribution installed to /opt/libreoffice5.2 and moved to
> /opt/l/libreoffice5.2 , and LO 5.3 beta via .git compile, installed in
> $HOME/dev/locore.git/... and symlinked to /opt/l/libreoffice5.3/...
> 
> $ echo $PATH
> /usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin
> $ /usr/bin/soffice 
> terminate called after throwing an instance of
> 'com::sun::star::uno::DeploymentException'

Have a read of the installing in parallel Wiki [1], you probably would want to separate the user profile location of the two installations.

=-ref-=
1. https://wiki.documentfoundation.org/Installing_in_parallel/Linux
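For anyone reproducing this with several installations side by side, here is a minimal sketch of launching one of them with its own user profile. It assumes the `-env:UserInstallation=` bootstrap switch described on the wiki page above; the helper name `soffice_command` and the profile directory `/tmp/lo43-profile` are hypothetical and only serve the illustration.

```python
from pathlib import Path


def soffice_command(binary, profile_dir, *args):
    """Build an argv list that starts soffice with a dedicated user profile.

    profile_dir is a hypothetical location; any writable absolute path works.
    """
    # LibreOffice expects the profile location as a file:// URI.
    profile_uri = Path(profile_dir).absolute().as_uri()
    return [binary, f"-env:UserInstallation={profile_uri}", *args]


# Example: the Debian-packaged binary with a throwaway profile.
cmd = soffice_command("/usr/bin/soffice", "/tmp/lo43-profile", "--version")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)`. Giving each installation its own profile directory keeps the 4.3, 5.2 and 5.3 builds from sharing configuration.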
Comment 10 V Stuart Foote 2016-08-27 14:43:37 UTC
Thanks for posting the sample document in ODF. I only asked for a mock-up to see the means being used for structuring the "marginalia", having the complete document is perfect.

So, yes confirming that using styled Text Frames anchored to paragraph allows this formatting in LO 5.2 and master (5.3)--however it is not imported from a MS Word 97-2003 binary file. Nor does Word 2007 correctly export them to ODF Text document .ODT format.

More important is how this formatting behaves round trip from LibreOffice using export filter(s)--unfortunately we mishandle it as badly as does Word 2007.

The Text Frames (objects with the zenaanFrame style) are anchored to paragraph--but their placement (0.08" from the Frame object's Position Horizontal "to Outer Paragraph border" w/mirror on even pages) and the AutoSize attribute are not written by the export filter into the resulting 97-2003 MS Word binary, nor into any of the XML-based Word documents.

On filter import back into LibreOffice, formatting for the Text Boxes is lost round trip.
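To make the expected placement concrete, here is a rough sketch of the geometry an importer would need to reproduce. This is not the ww8 filter's code. It assumes that odd pages are right-hand pages, that the paragraph area spans the full text area (no indents), and it uses hypothetical margin and frame sizes; only the 13cm page width and the roughly 2 mm (0.08") gap come from this report.

```python
def marginalia_left_edge_mm(page_number, page_width_mm, outer_margin_mm,
                            frame_width_mm, gap_mm):
    """Return the left edge (mm from the page's left edge) of a frame that
    sits in the outer margin, gap_mm away from the paragraph area."""
    if page_number % 2 == 1:
        # Assumed right-hand page: the outer margin is on the right, so the
        # frame starts just past the right edge of the text area.
        return (page_width_mm - outer_margin_mm) + gap_mm
    # Assumed left-hand page: mirrored, so the frame ends just before the
    # left edge of the text area.
    return outer_margin_mm - gap_mm - frame_width_mm


# Hypothetical layout: 130 mm page, 30 mm outer margin, 25 mm wide frame.
for page in (1, 2):
    print(page, marginalia_left_edge_mm(page, 130, 30, 25, 2))
```

If the mirroring is lost, the same offset gets applied on every page, which would put the frames on the wrong side on every other page.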
Comment 11 V Stuart Foote 2016-08-27 14:52:05 UTC
(In reply to V Stuart Foote from comment #10)
> The placement of the Text frames (objects with zenaanFrame style) are
> anchored to paragraph--but the placement (0.08" from the Frame objects
> Position Horizontal "to Outer Paragraph border" w/mirror on even pages) and
> its AutoSize attribute is not filter parsed into the resulting 97-2003 MS
> Word binary, nor any of the XML based Word documents.
> 
> On filter import back into LibreOffice, formatting for the Text Boxes is
> lost round trip.

s/Text Boxes/Text Frames/g

Of course, the layout of these "marginalia" deals with Writer's "Text frame" objects, and not the "Text box" shape Drawing objects.
Comment 12 Zenaan Harkness 2016-08-27 16:06:43 UTC
This bug relates also to the following bug, to which I've added a fairly detailed description of an enhancement for LO which would help bring it up to par with WordXP functionality, and then further to take LO to a new level of functionality beyond WordXP.
https://bugs.documentfoundation.org/show_bug.cgi?id=62071
Comment 13 V Stuart Foote 2016-08-28 19:01:12 UTC
Moving to LibreOffice filters & storage (ww8 import filter) rather than Document Liberation Project
Comment 14 V Stuart Foote 2016-08-28 19:05:59 UTC
The anchor of the Text Frame object holding the "marginalia" annotation/reference is correctly placed on import--but the position of the Frame relative to its anchor, as defined in the MS Word binary or OOXML document, is not retained.

The Text Frame is imported at its anchor location.
Comment 15 Xisco Faulí 2017-09-29 08:48:03 UTC Comment hidden (obsolete)
Comment 16 QA Administrators 2019-12-03 13:52:14 UTC Comment hidden (obsolete)
Comment 17 QA Administrators 2021-12-03 04:20:49 UTC Comment hidden (obsolete, spam)
Comment 18 Justin L 2022-09-06 16:53:30 UTC
repro 7.5+
Comment 19 Justin L 2023-05-25 01:42:11 UTC
repro 7.6+
anchored to "outside" of the page (i.e. right side, 0.2cm from the start of the right margin).
Comment 20 László Németh 2024-08-28 18:05:10 UTC
Created attachment 196074 [details]
DOCX conversion by (MS) Office 365

much better import in LibreOffice, but frames are on the opposite margin
Comment 21 László Németh 2024-08-28 18:07:55 UTC
(In reply to László Németh from comment #20)
> Created attachment 196074 [details]
> DOCX conversion by (MS) Office 365
> 
> much better import in LibreOffice, but frames are on the opposite margin