Bug 93675 - FILESAVE DOCX Grouped shapes text in Word is attached to different shapes than in Writer
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage (show other bugs)
Version: (earliest affected) 4.4.5.2 release
Hardware: All All
Importance: low normal
Assignee: Not Assigned
URL:
Whiteboard: target:5.1.0
Keywords: filter:docx
Depends on:
Blocks: Matters-to-Caolan DOCX-Grouped-Shapes Shape-ODF-OOXML-export
Reported: 2015-08-26 08:35 UTC by Dr. David Alan Gilbert
Modified: 2023-06-09 01:34 UTC (History)
6 users (show)

See Also:
Crash report or crash signature:


Attachments
document containing diagram (266.32 KB, application/vnd.oasis.opendocument.text-flat-xml)
2015-08-26 08:35 UTC, Dr. David Alan Gilbert
Details
corrupt docx produced from the fodt (15.67 KB, application/download)
2015-08-26 08:35 UTC, Dr. David Alan Gilbert
Details
docx (left) and fodt (right) worksforme (300.92 KB, image/jpeg)
2015-08-27 16:40 UTC, steve
Details
Screenshot showing corruption (332.56 KB, image/jpeg)
2015-08-27 17:52 UTC, Dr. David Alan Gilbert
Details
example of rotated element causing trouble on export (19.15 KB, text/odt)
2015-10-14 16:06 UTC, Caolán McNamara
Details
document compared in LO and MSO (184.39 KB, image/png)
2020-09-04 13:14 UTC, Timur
Details
Rendering on head unmodified (72.66 KB, image/png)
2020-09-27 19:10 UTC, Dave Gilbert
Details
Rendering on head after ungroup (65.11 KB, image/png)
2020-09-27 19:12 UTC, Dave Gilbert
Details
The example document and its docx version in Writer (171.21 KB, image/png)
2021-07-26 10:18 UTC, NISZ LibreOffice Team
Details
The example document and its Writer-saved version in Word (186.53 KB, image/png)
2021-07-26 10:51 UTC, NISZ LibreOffice Team
Details
document compared in LO 7.3+ and MSO 2016 (136.24 KB, image/png)
2021-07-26 11:05 UTC, Timur
Details
Current state, showing loss of rotation on imported docx (261.57 KB, image/png)
2023-06-07 00:37 UTC, Dave Gilbert
Details

Description Dr. David Alan Gilbert 2015-08-26 08:35:19 UTC
Created attachment 118188 [details]
document containing diagram

(This happens on both the 5.0.1.1-1 upstream packages and the 4.4.5.2-1 fedora downstream packages; both x86-64 Linux)

The attached diagram is very heavily corrupted when exported as docx (see the docx attachment as well). Reading the docx file in MS Word also shows a very corrupt result, but one that is very different (chunks of text swapped) from LO's attempt.
Comment 1 Dr. David Alan Gilbert 2015-08-26 08:35:58 UTC Comment hidden (no-value)
Comment 2 steve 2015-08-27 16:40:06 UTC
WORKSFORME with LO Version: 5.1.0.0.alpha1+
Build ID: b2363e98af7b0281279617e43b8fec5b898b9120
TinderBox: MacOSX-x86_64@49-TDF, Branch:master, Time: 2015-08-25_23:42:33
Locale: de-DE (de.UTF-8)

Screenshot of fodt file and docx (exported to that) both opened with LibreOffice.

The docx file does indeed look different when opened with Word. But that might just as well be a glitch in Word.

Please correct me if I missed anything. I'll also ping another QA team member to look at this.
Comment 3 steve 2015-08-27 16:40:35 UTC
Created attachment 118228 [details]
docx (left) and fodt (right) worksforme
Comment 4 Joel Madero 2015-08-27 16:46:16 UTC Comment hidden (obsolete)
Comment 5 Joel Madero 2015-08-27 16:52:37 UTC Comment hidden (obsolete)
Comment 6 Dr. David Alan Gilbert 2015-08-27 17:43:12 UTC
(In reply to Joel Madero from comment #5)
> Ubuntu 15.04 x64
> LibreOffice 4.4.5.2, 5.0.0.5, master (Build ID:
> b103e7d786f5b7ec6cfe4f53f2ca317f06ceabc5)
> 
> 
> Setting as:
> NEW
> Normal - can prevent high quality work;
> Low - fodt is not popular to begin with, then saving that as a docx with a
> complexish chart . . . not going to impact many users at all.
> 

The 'fodt' is not part of the problem; it fails in the same way for me with a normal odt as well. I just tend to save as fodt before posting so I can check that nothing confidential is left in it.
 
> Would be nice to find out if this ever worked (by testing older versions).
> If you have time please do so and report back:
> http://downloadarchive.documentfoundation.org/libreoffice/old/
> 
> Thanks for reporting
Comment 7 Dr. David Alan Gilbert 2015-08-27 17:52:30 UTC
Created attachment 118230 [details]
Screenshot showing corruption

Hi Steve__,
  As you can see from this screenshot, on 5.0.1.1 it's, well, VERY corrupt (and the same on the 4.x I have).  So if we're lucky someone just fixed it in 5.1.

When you say it looks different in Word, can you describe how?
When I tried it, chunks of the text were swapped around compared to where LO had it.  To my mind the docx has to look 'right' on Word or there isn't much point in exporting to it.

But IMHO this isn't a 'low' - trying to collaborate with other people using Word is unfortunate but necessary for me.

(I'm out for a few days, so might not respond quickly).
Comment 8 Joel Madero 2015-08-27 18:08:05 UTC Comment hidden (obsolete)
Comment 9 Commit Notification 2015-09-07 13:13:27 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=54f9576aa43e3d6d687469aa0b2ea56ce0bbaca3

Related: tdf#93675 'new' ms-alike numbering has same problem as old numbering

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 10 Commit Notification 2015-09-10 16:47:47 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=afe53855a221a3c767e8eb06adfc3d1090d13bfb

fix crash on rightclicking image in tdf#93675 and pressing esc

It will be available in 5.1.0.

Comment 11 Commit Notification 2015-09-15 09:44:27 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1ed0f437679d702b633e381eaf6f6d6f9aecdd9b

Related: tdf#93675 wrong font used in drawings in exported .docx

It will be available in 5.1.0.

Comment 12 Caolán McNamara 2015-10-14 16:06:31 UTC
Created attachment 119612 [details]
example of rotated element causing trouble on export

To cut this problem down here's a single rotated shape from the original which when exported is mangled.
Comment 13 Andras Timar 2016-01-29 16:01:08 UTC
(In reply to Caolán McNamara from comment #12)
> Created attachment 119612 [details]
> example of rotated element causing trouble on export
> 
> To cut this problem down here's a single rotated shape from the original
> which when exported is mangled.

I fixed this one with https://gerrit.libreoffice.org/#/c/21905/
Comment 14 Xisco Faulí 2016-09-15 22:28:56 UTC Comment hidden (obsolete)
Comment 15 Dr. David Alan Gilbert 2016-09-16 16:57:07 UTC
(In reply to Xisco Faulí from comment #14)
> Hello,
> Is this bug fixed?
> If so, could you please close it as RESOLVED FIXED?

Not fixed yet; but it's certainly a heck of a lot better than when I first reported it (thanks!).
We're still missing the orange bar arrows on the right, and the curved arrows on the left are missing (or perhaps the odd arrows on the right are the remains of them). There's also some odd text placement. (Tested on 5.1.5.2-5.fc24)
Comment 16 QA Administrators 2017-12-10 16:41:22 UTC Comment hidden (obsolete)
Comment 17 Dr. David Alan Gilbert 2017-12-11 12:19:27 UTC
Still partially present in 5.4.3.2 (Build 5.4.3.2-1.fc27 on Linux x86-64 on Fedora 27)

There are three different views of this diagram:
   a) The view LO renders from its fodt, which is fine
   b) The view exported from LO as docx and loaded back into LO
       That has misplaced some lines/boxes (in particular the orange arrow sets and the curvy lines on the left)
   c) Then there's the view Word 10 sees loading the same docx; the lines are in about the same place as (b) but the text is more corrupt. It looks like the wrong piece of text is in the wrong box in some cases, e.g. the 'Data going around bend' is in the box that should be 'a', whereas the 'Stuff going this way' is where the 'Data going around bend' should be.

So it looks like there are errors in the docx export which produce things that are wrong consistently between Word and LO, and there are other errors that LO and Word interpret differently when loaded.
Comment 18 QA Administrators 2018-12-12 03:42:34 UTC Comment hidden (obsolete)
Comment 19 Dr. David Alan Gilbert 2018-12-12 14:47:25 UTC
6.1.2.1 (6.2.1-3.fc29 on Fedora 29) seems about the same as the 5.4 I described this time last year.
Comment 20 Chris Sherlock 2019-11-19 08:53:47 UTC
From what I can see, the only issue now is the border rectangle is slightly too small...
Comment 21 Dr. David Alan Gilbert 2019-11-26 18:52:52 UTC
It's still more broken than that for me; just loading the fodt and resaving as docx and then loading it back; the orange arrows are way off (vertically); as are the curved arrows.
It also looks like a lot of the text is associated with the wrong line/shape - e.g. 'Stuff going this way' is in about the right place for 'Data going around bend' when seen in Word loading the docx; and the 'a' in the box at the top left is replaced by 'Data going around bend' - so it's almost like an offset.

(Libreoffice 6.3.3.2.0+ on fedora 31)
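
For anyone who wants to repeat the "load the fodt and resave as docx" step without clicking through the GUI, here is a minimal sketch. It assumes a `soffice` binary is on the PATH and uses LibreOffice's `--headless --convert-to` command-line conversion; the function names and file names are made up for illustration, and a headless conversion may not match a GUI "Save As" in every detail.

```python
import subprocess
import sys
from pathlib import Path


def build_convert_command(source, outdir, soffice="soffice"):
    """Build the command line for a headless conversion to docx.

    'soffice' being on the PATH is an assumption; pass a full path otherwise.
    """
    return [
        soffice,
        "--headless",
        "--convert-to", "docx",
        "--outdir", str(outdir),
        str(source),
    ]


def convert_to_docx(source, outdir, soffice="soffice"):
    """Run the conversion and return the path where the docx is expected."""
    subprocess.run(build_convert_command(source, outdir, soffice), check=True)
    return Path(outdir) / (Path(source).stem + ".docx")


if __name__ == "__main__" and len(sys.argv) > 2:
    # Example: python convert.py diagram.fodt out/
    print(convert_to_docx(sys.argv[1], sys.argv[2]))
```

The resulting docx can then be opened in both LO and Word to compare the two renderings.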
Comment 22 Timur 2020-09-04 13:14:50 UTC
Created attachment 165148 [details]
document compared in LO and MSO

Screenshot in 7.1+
Comment 23 Dr. David Alan Gilbert 2020-09-08 08:39:20 UTC Comment hidden (obsolete)
Comment 24 Dave Gilbert 2020-09-27 19:10:38 UTC
Created attachment 165894 [details]
Rendering on head unmodified
Comment 25 Dave Gilbert 2020-09-27 19:12:37 UTC
Created attachment 165895 [details]
Rendering on head after ungroup

There's something going on with grouping here.
Much of the diagram is in a group; ungrouping before saving gives something which in many ways is closer to the original.
The pink vertical arrow ends up touching the pink boxes, and boxes are wide enough to hold their text.
Heck, even a couple of the dotted lines end up in the right place.
Comment 26 Dave Gilbert 2020-11-22 16:20:12 UTC
I think the grouping problems might be shared with ./sw/qa/extras/ooxmlexport/data/kde216114-1.odt - the test that uses it only checks that it doesn't explode; the actual docx output is very wrong.
Comment 27 NISZ LibreOffice Team 2021-07-26 10:18:29 UTC
Created attachment 173847 [details]
The example document and its docx version in Writer

This got a lot better after 

https://git.libreoffice.org/core/+/b33634a5c07c8f7032967d8e939100a50e0152ae

author	Regina Henschel <rb.henschel@t-online.de>	Sun Jul 11 15:31:58 2021 +0200
committer	Regina Henschel <rb.henschel@t-online.de>	Tue Jul 13 10:56:31 2021 +0200

tdf#141786 correct position of child elements in group

Some text that belongs to the pink arrows looks a bit out of place.
Comment 28 Dr. David Alan Gilbert 2021-07-26 10:26:31 UTC
Oh yes, that screenshot looks like a great improvement; thanks NISZ and Regina
(How does it look when reloaded back into LO?)
Comment 29 NISZ LibreOffice Team 2021-07-26 10:51:40 UTC
Created attachment 173849 [details]
The example document and its Writer-saved version in Word

In Word it looks like something is not quite right with the text and the shape they are attached to:
- The cyan rectangle on the left loses the "a" and gets the "Data going around bend" text
- The "Stuff going this way" arrow gets the "Da da da de de de yabba yabba yabba
Yabba yabba yabba" from the vertical pink arrow
- The arrow connecting the "c" box to the "b" box loses the "abcdefghijklm" text. It appears as the vertical pink arrow's text, after resizing it a lot.
- The "Data going around bend" arrow gets the "Stuff going this way" text

Let's refocus this bug on these issues.

I don't see a way in Word to show the pink arrows' text outside the arrows, so that might not be fixable.
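
To check which text the exported file itself attaches to which shape, independent of how Word or Writer renders it, the following sketch may help. It walks every `wps:wsp` element in `word/document.xml` and reports the preset geometry, the offset and the text found under `wps:txbx`. The function name is invented for illustration, and it assumes the text lives directly in `wps:txbx`; linked text boxes or other structures would need more handling.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "wps": "http://schemas.microsoft.com/office/word/2010/wordprocessingShape",
}
WSP = "{%s}wsp" % NS["wps"]


def shape_text_pairs(root):
    """Return (geometry, (x, y) offset, text) for each wps:wsp, in file order."""
    pairs = []
    for shape in root.iter(WSP):
        geom = shape.find("wps:spPr/a:prstGeom", NS)
        kind = geom.get("prst") if geom is not None else "custom"
        off = shape.find("wps:spPr/a:xfrm/a:off", NS)
        position = (int(off.get("x")), int(off.get("y"))) if off is not None else None
        # Join all runs of text inside this shape's text box, if it has one.
        text = "".join(t.text or "" for t in shape.iterfind("wps:txbx//w:t", NS))
        pairs.append((kind, position, text))
    return pairs


if __name__ == "__main__" and len(sys.argv) > 1:
    with zipfile.ZipFile(sys.argv[1]) as docx:
        document = ET.fromstring(docx.read("word/document.xml"))
    for kind, position, text in shape_text_pairs(document):
        print(kind, position, repr(text))
```

If the pairs printed here are correct but Word shows them swapped, the binding is going wrong on the reading side or because of something else in the file, rather than in the text content of each shape.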
Comment 30 Timur 2021-07-26 11:05:26 UTC
Created attachment 173851 [details]
document compared in LO 7.3+ and MSO 2016

There are some differences on DOCX reopen, mainly captions:
1. pink arrow text "Networking.." is different wrap in LO and missing in MSO
2. (left) arrow text "abcd.." is missing in MSO
3. (left) line text "Stuff going..." wrong position in MSO (like mirrored) 
4. (left) line text "Data going..." wrong position in MSO (in square) 
5. "a" in square wrong in MSO ("Data going..")
6. text "da da da.." in MSO.

7. drawing caption is on another page in MSO (different bug)
Comment 31 Timur 2022-04-25 15:01:00 UTC
This regresses again in 7.4+.
source 2951cbdf3a6e2b62461665546b47e1d253fcb834
author	Attila Bakos (NISZ) <bakos.attilakaroly@nisz.hu>	2021-11-10
tdf#143574 OOXML export/import of textboxes in group shapes
But that's already mentioned in bug 147245 so I'll add this example.
Comment 32 Dave Gilbert 2023-06-07 00:35:13 UTC
I'm seeing another regression (currently on 52f70f04bdc586a0721 head)
where the docx loaded back into LO has lost all rotation on the text;  the docx loaded in onedrive gets the rotation right (even if it's the wrong text in some cases!) so the rotation seems to be being exported.
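
One way to confirm that rotation really is present in the exported file is to read the attributes directly. This is a rough sketch: it assumes shape rotation and flips are stored on `a:xfrm` (`rot`, `flipH`, `flipV`) and text direction on `wps:bodyPr` (`rot`, `vert`), and reports whichever of those are present. DrawingML angles are in 60000ths of a degree. The function name is made up for illustration.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "wps": "http://schemas.microsoft.com/office/word/2010/wordprocessingShape",
}
WSP = "{%s}wsp" % NS["wps"]


def rotation_report(root):
    """List rotation-related attributes for each wps:wsp, in file order."""
    rows = []
    for shape in root.iter(WSP):
        xfrm = shape.find("wps:spPr/a:xfrm", NS)
        body = shape.find("wps:bodyPr", NS)
        xfrm_attrs = xfrm.attrib if xfrm is not None else {}
        body_attrs = body.attrib if body is not None else {}
        rows.append({
            # Angles are stored in 60000ths of a degree.
            "shape_degrees": int(xfrm_attrs.get("rot", "0")) / 60000,
            "flipH": xfrm_attrs.get("flipH", "0"),
            "flipV": xfrm_attrs.get("flipV", "0"),
            "text_degrees": int(body_attrs.get("rot", "0")) / 60000,
            "vert": body_attrs.get("vert", "horz"),
        })
    return rows


if __name__ == "__main__" and len(sys.argv) > 1:
    with zipfile.ZipFile(sys.argv[1]) as docx:
        document = ET.fromstring(docx.read("word/document.xml"))
    for row in rotation_report(document):
        print(row)
```

Non-default values in this report would support the idea that the export writes the rotation and the loss happens on import.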
Comment 33 Dave Gilbert 2023-06-07 00:37:16 UTC
Created attachment 187762 [details]
Current state, showing loss of rotation on imported docx

LO loading its own exported docx on the left, showing loss of rotation.
onedrive on the right, getting all the layout pretty close - except for the wrong text binding
Comment 34 Dave Gilbert 2023-06-07 21:07:25 UTC
observation on the wrong text:

  a) I confirmed that both LO and Onedrive are showing the DML/non-fallback version of the drawing in the file

  b) It's an 'after this point' error in the output file; that is, if you look at the order of items in the output file we have:

...
                       <w:t>Host 2</w:t>
                       <w:t>More hosts</w:t>
                       <w:t>b</w:t>
FIRST MISPLACE         <w:t>abcdefghijklm</w:t>
                       <w:t>Da da da de de de yabba yabba yabba</w:t>
                       <w:t>Yabba yabba yabba</w:t>
                       <w:t>Stuff going this way</w:t>
                       <w:t>Data going around bend</w:t>
                       <w:t>a</w:t>

All the items up to the 'b' end up in the right place, but everything after that ends up wrong.
It's the stuff that should be vertical text on the thin line between 'b' and 'c'
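
A listing like the one above can be produced with a short script. This is a sketch, not the method used for the listing in this comment: it prints every `w:t` in `word/document.xml` in file order and skips anything under `mc:Fallback`, on the assumption that the fallback branch holds the VML copy and the DML version is what LO and OneDrive display. The function names are invented for illustration.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"
W_T = "{%s}t" % W_NS
MC_FALLBACK = "{%s}Fallback" % MC_NS


def dml_texts(elem):
    """Yield the text of every w:t under elem, in document order,
    skipping anything inside an mc:Fallback element."""
    if elem.tag == MC_FALLBACK:
        return
    if elem.tag == W_T:
        yield elem.text or ""
    for child in elem:
        yield from dml_texts(child)


def texts_in_docx(path):
    """Read word/document.xml from the docx at path and list its texts."""
    with zipfile.ZipFile(path) as docx:
        root = ET.fromstring(docx.read("word/document.xml"))
    return list(dml_texts(root))


if __name__ == "__main__" and len(sys.argv) > 1:
    for number, text in enumerate(texts_in_docx(sys.argv[1]), start=1):
        print("%3d  %s" % (number, text))
```

Comparing this order with what Word shows in each shape makes it easier to see where the first misplaced text occurs.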
Comment 35 Dave Gilbert 2023-06-07 23:53:42 UTC
Deleting the thin line from c->b and then save-as docx gives me a file that's displayed
with all the right text in the right place in OneDrive/word.
Comment 36 Dave Gilbert 2023-06-08 00:03:46 UTC
Here's a diff of the (xmllint'd) word/document.xml from the unzipped docx. Other than a one-pixel shift in x coords for some weird reason, the diff is pretty much just the removal of that vertical line, so something in there must be confusing the heck out of Word:

[dg@dalek unpacked-diagram-2015-08-noline]$ diff ../unpacked-mod/word/document-linted.xml word/document-linted.xml
1090c1090
<                                   <a:pt x="1674" y="220"/>
---
>                                   <a:pt x="1673" y="220"/>
1094,1096c1094,1096
<                                   <a:pt x="-222" y="2318"/>
<                                   <a:pt x="942" y="5910"/>
<                                   <a:pt x="976" y="8706"/>
---
>                                   <a:pt x="-223" y="2318"/>
>                                   <a:pt x="941" y="5910"/>
>                                   <a:pt x="975" y="8706"/>
1100c1100
<                                   <a:pt x="134" y="9101"/>
---
>                                   <a:pt x="133" y="9101"/>
1461c1461
<                                   <a:pt x="0" y="0"/>
---
>                                   <a:pt x="-1" y="0"/>
1464c1464
<                                   <a:pt x="1397" y="1498"/>
---
>                                   <a:pt x="1396" y="1498"/>
1484,1527d1483
<                         <wps:cNvSpPr/>
<                         <wps:spPr>
<                           <a:xfrm flipV="1">
<                             <a:off x="892080" y="2342520"/>
<                             <a:ext cx="0" cy="1438920"/>
<                           </a:xfrm>
<                           <a:prstGeom prst="line">
<                             <a:avLst/>
<                           </a:prstGeom>
<                           <a:ln w="0">
<                             <a:solidFill>
<                               <a:srgbClr val="3465a4"/>
<                             </a:solidFill>
<                             <a:tailEnd len="med" type="triangle" w="med"/>
<                           </a:ln>
<                         </wps:spPr>
<                         <wps:style>
<                           <a:lnRef idx="0"/>
<                           <a:fillRef idx="0"/>
<                           <a:effectRef idx="0"/>
<                           <a:fontRef idx="minor"/>
<                         </wps:style>
<                         <wps:txbx>
<                           <w:txbxContent>
<                             <w:p>
<                               <w:pPr>
<                                 <w:jc w:val="center"/>
<                                 <w:rPr/>
<                               </w:pPr>
<                               <w:r>
<                                 <w:rPr>
<                                   <w:rFonts w:ascii="Liberation Serif" w:hAnsi="Liberation Serif" w:eastAsia="SimSun" w:cs="Lucida Sans Unicode"/>
<                                   <w:lang w:val="en-US" w:bidi="he-IL"/>
<                                 </w:rPr>
<                                 <w:t>WOabcdefghijklm</w:t>
<                               </w:r>
<                             </w:p>
<                           </w:txbxContent>
<                         </wps:txbx>
<                         <wps:bodyPr lIns="0" rIns="0" tIns="144000" bIns="0" anchor="ctr" anchorCtr="1">
<                           <a:noAutofit/>
<                         </wps:bodyPr>
<                       </wps:wsp>
<                       <wps:wsp>
1724c1680
<                                   <a:pt x="0" y="4002"/>
---
>                                   <a:pt x="-1" y="4002"/>
1727c1683
<                                   <a:pt x="99" y="1708"/>
---
>                                   <a:pt x="98" y="1708"/>
2235c2191
<                 <v:shape id="shape_0" coordsize="2794,9381" path="m2793,99c2195,0,1896,300,1397,500c0,2398,1164,5990,1198,8786c1197,9185,356,9181,222,9380e" stroked="t" o:allowincell="f" style="position:absolute;left:1729;top:2894;width:1457;height:5272">
---
>                 <v:shape id="shape_0" coordsize="2795,9381" path="m2794,99c2196,0,1896,300,1398,500c0,2398,1164,5990,1198,8786c1198,9185,356,9181,223,9380e" stroked="t" o:allowincell="f" style="position:absolute;left:1729;top:2894;width:1457;height:5272">
2387c2343
<                 <v:shape id="shape_0" coordsize="8347,1499" path="m0,0c1397,1498,2824,1280,2824,1280l8346,1289e" stroked="t" o:allowincell="f" style="position:absolute;left:1373;top:5743;width:4731;height:731">
---
>                 <v:shape id="shape_0" coordsize="8348,1499" path="m0,0c1397,1498,2825,1280,2825,1280l8347,1289e" stroked="t" o:allowincell="f" style="position:absolute;left:1373;top:5743;width:4731;height:731">
2392,2413d2347
<                 <v:line id="shape_0" from="864,5743" to="864,8008" stroked="t" o:allowincell="f" style="position:absolute;flip:y">
<                   <v:textbox>
<                     <w:txbxContent>
<                       <w:p>
<                         <w:pPr>
<                           <w:jc w:val="center"/>
<                           <w:rPr/>
<                         </w:pPr>
<                         <w:r>
<                           <w:rPr>
<                             <w:rFonts w:ascii="Liberation Serif" w:hAnsi="Liberation Serif" w:eastAsia="SimSun" w:cs="Lucida Sans Unicode"/>
<                             <w:lang w:val="en-US" w:bidi="he-IL"/>
<                           </w:rPr>
<                           <w:t>abcdefghijklm</w:t>
<                         </w:r>
<                       </w:p>
<                     </w:txbxContent>
<                   </v:textbox>
<                   <v:stroke color="#3465a4" endarrow="block" endarrowwidth="medium" endarrowlength="medium" joinstyle="round" endcap="flat"/>
<                   <v:fill o:detectmouseclick="t" on="false"/>
<                   <w10:wrap type="square"/>
<                 </v:line>
2515c2449
<                 <v:shape id="shape_0" coordsize="8623,4004" path="m0,4003c99,1709,1996,1,3099,0l8622,10e" stroked="t" o:allowincell="f" style="position:absolute;left:1157;top:2670;width:4887;height:2268">
---
>                 <v:shape id="shape_0" coordsize="8624,4004" path="m0,4003c99,1709,1997,1,3100,0l8623,10e" stroked="t" o:allowincell="f" style="position:absolute;left:1157;top:2670;width:4887;height:2268">
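
The same kind of comparison can be done without unzipping and linting by hand. The sketch below is one way to do it with the Python standard library, not the commands used above: it reads `word/document.xml` straight from each docx, pretty-prints it with `xml.dom.minidom` and produces a unified diff. The function names are made up for illustration, and the indentation will not match xmllint's output exactly.

```python
import difflib
import sys
import zipfile
import xml.dom.minidom


def read_document_xml(docx_path):
    """Return the raw bytes of word/document.xml from a docx file."""
    with zipfile.ZipFile(docx_path) as docx:
        return docx.read("word/document.xml")


def pretty_lines(raw_xml):
    """Indent the XML so that each element sits on its own line."""
    return xml.dom.minidom.parseString(raw_xml).toprettyxml(indent="  ").splitlines()


def diff_xml(old_raw, new_raw, old_name="old", new_name="new"):
    """Unified diff of two XML documents after pretty-printing."""
    return list(difflib.unified_diff(
        pretty_lines(old_raw), pretty_lines(new_raw),
        fromfile=old_name, tofile=new_name, lineterm=""))


if __name__ == "__main__" and len(sys.argv) > 2:
    old_path, new_path = sys.argv[1], sys.argv[2]
    for line in diff_xml(read_document_xml(old_path), read_document_xml(new_path),
                         old_path, new_path):
        print(line)
```

Run it with the original export first and the export without the line second to get a diff in the same direction as the one above.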
Comment 37 Regina Henschel 2023-06-08 00:26:01 UTC
The reason for the rotation trouble is that, on import, a shape gets a frame attached so that it can contain complex text such as character-anchored images and tables. These frames are not able to rotate.

At the point where this happens, there is no simple way to detect whether the contained content is complex text or simple text.

You will likely find other bug reports about the missing ability of frames to rotate.
Comment 38 Dave Gilbert 2023-06-08 01:09:47 UTC
(In reply to Regina Henschel from comment #37)
> The reason of the rotation trouble is, that a shape gets attached a frame on
> import to be able to contain complex texts like char anchored images and
> tables. And these frames are not able to rotate.
> 
> At that time, when this happens, there is no simple way to detect, whether
> the contained content is a complex text or a simple text.
> 
> You will likely find other bug reports about the missing ability of frames
> to rotate.

Hi Regina,
  Thanks for the reply.  So is that a new issue? - I thought the rotation used to work, and the screenshot from 2021-07-26 seems to suggest it did.
Comment 39 Regina Henschel 2023-06-08 07:48:06 UTC
(In reply to Dave Gilbert from comment #38)
> Hi Regina,
>   Thanks for the reply.  So is that a new issue? - I thought the rotation
> used to work, and the screenshot from 2021-07-26 seems to suggest it did.

At some point between 7.3.0 and 7.4.0 this "attach a frame" step was also applied to shapes in groups. If you have an administrative installation of LO 7.3.0 and you ungroup the drawing and then save it to docx, you will see the problem in LO 7.3.0 already.
Comment 40 Dave Gilbert 2023-06-09 01:34:29 UTC
export/text problem:
I carried on playing with that line between b/c without success; while removing it fixed the incorrect text problem, nothing else I did to it has helped. I tried unflipping it, resetting all the formats etc., and rotating it a bit, with no progress.

One thing I did notice; loading the same docx into G.Docs also loses the text on this line, but doesn't make the rest of the text screwed up; still that suggests there's something about this line they both don't like.