Bug 34722 - FILEOPEN: Text boxes are not shown in generated .doc documents
Summary: FILEOPEN: Text boxes are not shown in generated .doc documents
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:doc
Duplicates: 70831
Depends on:
Blocks: DOC-Textbox
Reported: 2011-02-25 07:24 UTC by Aurimas Fišeras
Modified: 2019-12-06 12:20 UTC
CC List: 8 users

See Also:
Crash report or crash signature:


Attachments
Document with now text boxes shown (53.50 KB, application/msword)
2011-02-25 07:24 UTC, Aurimas Fišeras
example file where LO not display text boxes (76.00 KB, application/msword)
2012-06-30 10:45 UTC, Franta Hanzlik
Comparison on Word 2010 and LibreOffice 4.1.1.2 (362.07 KB, image/jpeg)
2013-09-30 08:47 UTC, Zeki Bildirici
Document with now text boxes shown - PDF from MS Office 2010 (234.30 KB, application/x-pdf)
2016-05-16 13:01 UTC, Timur
example document saved by OOo2.4.3 (21.47 KB, application/vnd.oasis.opendocument.text)
2019-12-06 11:11 UTC, Regina Henschel
Pdf result of first attachment (44.15 KB, application/pdf)
2019-12-06 12:16 UTC, Julien Nabet

Description Aurimas Fišeras 2011-02-25 07:24:22 UTC
Created attachment 43806 [details]
Document with now text boxes shown

The attached document was generated by an accounting program entirely from text boxes. It shows no content when opened with LibO 3.3.1 or OOo 3.x on Linux or Windows.

Office XP and later, and Word Viewer 2003 and later, open this file fine.
KOffice 2.3.1 also opens this document and shows all the text.
Comment 1 Björn Michaelsen 2011-12-23 11:51:40 UTC Comment hidden (obsolete)
Comment 2 Aurimas Fišeras 2011-12-24 01:33:24 UTC Comment hidden (obsolete)
Comment 3 Franta Hanzlik 2012-06-30 10:45:01 UTC
Created attachment 63648 [details]
example file where LO not display text boxes

This very annoying bug still persists in LO 3.5.2. I have attached another file where LO fails.
Comment 4 Aurimas Fišeras 2012-08-14 08:18:13 UTC Comment hidden (obsolete)
Comment 5 Franta Hanzlik 2012-09-01 13:55:48 UTC Comment hidden (obsolete)
Comment 6 Franta Hanzlik 2013-03-30 16:41:38 UTC Comment hidden (obsolete)
Comment 7 bfoman (inactive) 2013-08-15 15:02:12 UTC
Confirmed with:
Version: 4.2.0.0.alpha0+
Build ID: 087a610fcd5c0c354a9ed6bfccd3451b667d62a3
TinderBox: Win-x86@6-debug, Branch:master, Time: 2013-08-04_21:41:24
Windows 8.1 Enterprise Preview 64 bit

No text content in both files. All good in Word 2013.
Comment 8 Zeki Bildirici 2013-09-30 08:47:20 UTC Comment hidden (obsolete)
Comment 9 Zeki Bildirici 2013-09-30 08:51:50 UTC Comment hidden (obsolete)
Comment 10 Franta Hanzlik 2014-05-20 08:44:47 UTC
Just tried it in LO 4.2.4.2 and, more than three years after the bug was submitted, nothing has changed: the text is still missing. It seems that (at least some) reported bugs are ignored.
Of the other programs I tried, only Calligra Words gives some text output, and it is not ideal. But unlike LO, at least some text is displayed and printed, although slightly corrupted.
Comment 11 retired 2014-05-22 13:32:54 UTC
Confirmed:4.4.0.0a0+:OSX

OSX 10.9.3 and LO Version: 4.4.0.0.alpha0+
Build ID: 2c61edfdf57dabbd86ecc440444b6b00443f916a
TinderBox: MacOSX-x86@49-TDF, Branch:master, Time: 2014-05-22_01:02:26
Comment 12 Julien Nabet 2014-09-03 21:15:09 UTC
I noticed this console log repeated several times:
warn:legacy.osl:24511:1:sw/source/filter/ww8/ww8graf.cxx:2441: Where is the Shape ?
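To put a number on "several times", the warning can be counted in a saved console log. The following is only a minimal sketch: the function names countWarnings and countWarningsInText are made up for illustration and are not part of LibreOffice, and it assumes the console output was captured as plain text.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts the log lines that contain `needle`.
std::size_t countWarnings(std::istream& log, const std::string& needle)
{
    // An empty needle would match every line, so reject it up front.
    assert(!needle.empty());

    std::size_t count = 0;
    std::string line;
    while (std::getline(log, line))
    {
        if (line.find(needle) != std::string::npos)
            ++count;
    }
    return count;
}

// Convenience wrapper for a log that is already held in memory.
std::size_t countWarningsInText(const std::string& text, const std::string& needle)
{
    std::istringstream log(text);
    return countWarnings(log, needle);
}
```

Called with the captured log and the needle "Where is the Shape ?", this returns the number of matching lines. Whether each line corresponds to exactly one dropped shape is an assumption, not something verified here.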
Comment 13 Julien Nabet 2014-09-03 21:17:20 UTC
Miklos: thought you might be interested in this tracker.

(I could reproduce this with master sources updated today and noticed:
warn:legacy.osl:24511:1:sw/source/filter/ww8/ww8graf.cxx:2441: Where is the Shape ?
as I put in my previous comment)
Comment 14 Timur 2016-05-16 13:01:56 UTC
Created attachment 125092 [details]
Document with now text boxes shown - PDF from MS Office 2010
Comment 15 Timur 2016-09-24 07:51:30 UTC
*** Bug 70831 has been marked as a duplicate of this bug. ***
Comment 16 Timur 2016-09-24 07:56:26 UTC
Should also be tested with attachment 88070 [details] from Bug 70831, which currently looks like attachment 108380 [details] and should look like attachment 118501 [details].
Comment 17 Telesto 2016-11-18 20:23:08 UTC
Confirming with:
Version: 5.3.0.0.alpha1+
Build ID: 43b5ca69aa545cf93eded55258d92d651917815f
CPU Threads: 4; OS Version: Windows 6.2; UI Render: default; Layout Engine: new; 
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2016-11-18_05:27:05
Locale: nl-NL (nl_NL); Calc: CL
Comment 18 QA Administrators 2018-10-23 02:48:28 UTC Comment hidden (obsolete)
Comment 19 Aurimas Fišeras 2018-10-23 06:11:15 UTC
Reproducible in:
Version: 6.1.2.1
Build ID: 1:6.1.2-0ubuntu1
CPU threads: 4; OS: Linux 4.18; UI render: default; VCL: gtk3; 
Locale: lt-LT (lt_LT.UTF-8); Calc: group threaded

And:
Version: 6.2.0.0.alpha0+
Build ID: f209524965641596cea16a0ee7780fffe176235a
CPU threads: 4; OS: Linux 4.18; UI render: default; VCL: gtk3; 
Locale: lt-LT (lt_LT.UTF-8); Calc: threaded
Comment 20 Chen-Ku 2018-11-29 08:36:16 UTC
Still exists in version:
Version: 6.3.0.0.alpha0+ (x64)
Build ID: 0f25a3c36f27fd51453b9a9115f236b83c143684
CPU threads: 12; OS: Windows 10.0; UI render: GL; VCL: win; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2018-11-27_20:06:55
Locale: zh-TW (zh_TW); UI-Language: en-US
Calc: threaded
Comment 21 a170811 2018-11-29 08:42:42 UTC
Still exists in version:
Version: 6.0.6.2
Build ID: 1:6.0.6-0ubuntu0.18.04.1
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: zh-TW (zh_TW.UTF-8);Calc: group
Comment 22 QA Administrators 2019-11-30 03:38:59 UTC Comment hidden (obsolete)
Comment 23 Aurimas Fišeras 2019-11-30 06:25:47 UTC
Bug is still present in:
Version: 6.3.3.2
Build ID: 1:6.3.3-0ubuntu2
CPU threads: 8; OS: Linux 5.3; UI render: default; VCL: gtk3; 
Locale: lt-LT (lt_LT.UTF-8); UI-Language: lt-LT

And:
Version: 6.5.0.0.alpha0+
Build ID: 63634738dd03cc74806ce6843c16ff5e51a371a0
CPU threads: 8; OS: Linux 5.3; UI render: Skia/Vulkan; VCL: gtk3; 
Locale: lt-LT (lt_LT.UTF-8); UI-Language: lt-LT
Comment 24 Julien Nabet 2019-12-05 21:03:12 UTC
Regina: taking a look at this one, I noticed that the log
warn:legacy.osl:271686:271686:sw/source/filter/ww8/ww8graf.cxx:2596: Where is the Shape ?
appeared first with mso_sptNotchedCircularArrow

bt:
#0  0x00007fffd7a717c7 in SvxMSDffManager::ImportShape(DffRecordHeader const&, SvStream&, SvxMSDffClientData&, tools::Rectangle&, tools::Rectangle const&, int, int*)
    (this=0x55555b1a6720, rHd=..., rSt=..., rClientData=..., rClientRect=..., rGlobalChildRect=..., nCalledByGroup=0, pShapeId=0x0) at /home/julien/lo/libreoffice/filter/source/msfilter/msdffimp.cxx:4925
#1  0x00007fffd7a6c725 in SvxMSDffManager::ImportObj(SvStream&, SvxMSDffClientData&, tools::Rectangle&, tools::Rectangle const&, int, int*)
    (this=0x55555b1a6720, rSt=..., rClientData=..., rClientRect=..., rGlobalChildRect=..., nCalledByGroup=0, pShapeId=0x0) at /home/julien/lo/libreoffice/filter/source/msfilter/msdffimp.cxx:4119
#2  0x00007fffd7a77b21 in SvxMSDffManager::GetShape(unsigned long, SdrObject*&, SvxMSDffImportData&) (this=0x55555b1a6720, nId=1157, rpShape=@0x7ffffffef498: 0x0, rData=...)
    at /home/julien/lo/libreoffice/filter/source/msfilter/msdffimp.cxx:6425
#3  0x00007fffd65e2c25 in SwWW8ImplReader::Read_GrafLayer(long) (this=0x55555bcdf210, nGrafAnchorCp=1) at /home/julien/lo/libreoffice/sw/source/filter/ww8/ww8graf.cxx:2590
#4  0x00007fffd6611f2d in SwWW8ImplReader::ReadChar(long, long) (this=0x55555bcdf210, nPosCp=1, nCpOfs=0) at /home/julien/lo/libreoffice/sw/source/filter/ww8/ww8par.cxx:3740
#5  0x00007fffd66112af in SwWW8ImplReader::ReadChars(int&, int, long, long) (this=0x55555bcdf210, rPos=@0x7ffffffefc88: 1, nNextAttr=121, nTextEnd=122, nCpOfs=0)
    at /home/julien/lo/libreoffice/sw/source/filter/ww8/ww8par.cxx:3527
#6  0x00007fffd66137c4 in SwWW8ImplReader::ReadText(int, int, ManTypes) (this=0x55555bcdf210, nStartCp=0, nTextLen=122, nType=MAN_MAINTEXT) at /home/julien/lo/libreoffice/sw/source/filter/ww8/ww8par.cxx:4088
#7  0x00007fffd661aff3 in SwWW8ImplReader::CoreLoad(WW8Glossary const*) (this=0x55555bcdf210, pGloss=0x0) at /home/julien/lo/libreoffice/sw/source/filter/ww8/ww8par.cxx:5277
#8  0x00007fffd661ea8a in SwWW8ImplReader::LoadThroughDecryption(WW8Glossary*) (this=0x55555bcdf210, pGloss=0x0) at /home/julien/lo/libreoffice/sw/source/filter/ww8/ww8par.cxx:5941
#9  0x00007fffd6620402 in SwWW8ImplReader::LoadDoc(WW8Glossary*) (this=0x55555bcdf210, pGloss=0x0) at /home/julien/lo/libreoffice/sw/source/filter/ww8/ww8par.cxx:6245
#10 0x00007fffd662100c in WW8Reader::Read(SwDoc&, rtl::OUString const&, SwPaM&, rtl::OUString const&) (this=0x55555b239620, rDoc=..., rBaseURL="file:///tmp/no_text_boxes_shown.doc", rPaM=SwPaM = {...})
    at /home/julien/lo/libreoffice/sw/source/filter/ww8/ww8par.cxx:6396

pRet is false because "GetCustomShapeContent( aObjData.eShapeType )" at line 4433
 (see https://opengrok.libreoffice.org/xref/core/filter/source/msfilter/msdffimp.cxx?r=be3a8183#4433) returns nothing.

Indeed, this function doesn't handle the case "mso_sptNotchedCircularArrow"
(see https://opengrok.libreoffice.org/xref/core/svx/source/customshapes/EnhancedCustomShapeGeometry.cxx?r=d2702aea#8346),
but looking at how the other shapes are implemented, it seems quite complex.
I don't claim I could succeed, but is there any documentation on how to implement them?
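To make the failure mode described above concrete, here is a minimal, self-contained sketch of a lookup that has no entry for one shape type, so the import yields no object. Every name in it (ShapeType, Geometry, Shape, lookupGeometry, importShape) is a hypothetical stand-in, and the table contents are made up; this is not the code in msdffimp.cxx or EnhancedCustomShapeGeometry.cxx. Only the two numeric IDs, 100 and 202, are the values reported in this bug.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins, NOT the LibreOffice types.
enum ShapeType { NotchedCircularArrow = 100, TextBox = 202 };

struct Geometry { std::string name; };
struct Shape { const Geometry* geometry; };

// Toy table of predefined geometries. Which types are listed is an
// assumption for illustration; the point is that 100 has no entry.
const Geometry* lookupGeometry(int shapeType)
{
    static const std::map<int, Geometry> table = {
        { TextBox, Geometry{ "rectangle" } },
    };
    const auto it = table.find(shapeType);
    return it == table.end() ? nullptr : &it->second;
}

// Returns nullptr when no geometry is known, so the caller ends up
// without a shape and can only log a warning.
std::unique_ptr<Shape> importShape(int shapeType)
{
    assert(shapeType >= 0); // shape type IDs are non-negative

    const Geometry* geometry = lookupGeometry(shapeType);
    if (!geometry)
    {
        std::fprintf(stderr, "warn: Where is the Shape ? (type %d)\n", shapeType);
        return nullptr;
    }
    return std::unique_ptr<Shape>(new Shape{ geometry });
}
```

In this toy model, importShape(TextBox) returns an object, while importShape(NotchedCircularArrow) returns a null pointer and prints the warning. That mirrors the symptom, not necessarily the real control flow.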
Comment 25 Regina Henschel 2019-12-05 23:24:28 UTC
(In reply to Julien Nabet from comment #24)
> Regina: taking a look at this one, I noticed that the log
> warn:legacy.osl:271686:271686:sw/source/filter/ww8/ww8graf.cxx:2596: Where is the Shape ?
> appeared first with mso_sptNotchedCircularArrow
> 
... 
> pRet is false because "GetCustomShapeContent( aObjData.eShapeType )" from
> line 4433
>  (see https://opengrok.libreoffice.org/xref/core/filter/source/msfilter/msdffimp.cxx?r=be3a8183#4433) returns nothing.
> 
> Indeed this function doesn't deal with case "mso_sptNotchedCircularArrow"
> (see https://opengrok.libreoffice.org/xref/core/svx/source/customshapes/EnhancedCustomShapeGeometry.cxx?r=d2702aea#8346)
> but when taking a look at how the other ones are implemented, it seems it's
> quite complex.
> I don't pretend I could succeed but are there some doc about how to
> implement them?

I don't know what you are looking for, and I think I cannot really help you.

The specification of the binary format is in [MS-ODRAW]: Office Drawing Binary File Format
https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-offfflp/6ae2fd93-51fc-4e75-a54a-1b175c627b51

The shapes in question are legacy "text box" shapes in MS Office, "Textfeld" in German. Those do not exist in modern docx.

OOo2.4.3 opens the file without problems, OOo3.2 crashes. The texts are legacy rectangles there, not custom shapes and not frames. They are anchored "at character". I see only three custom shapes, one for the rounded rectangle at the top, one for the rectangle with text UNIT PLUS in the middle, and one for the last rectangle at the bottom. The other lines are single lines.
Comment 26 Regina Henschel 2019-12-06 11:11:49 UTC
Created attachment 156351 [details]
example document saved by OOo2.4.3

I was wrong about "frame". OOo2.4.3 creates a group with an empty rectangle and a frame that holds the text. And indeed, with a breakpoint at msdffimp.cxx#4299, I see alternating mso_sptNotchedCircularArrow(100) and mso_sptTextBox(202).

BTW, at the time of OOo2.4.3 the filter was in svx/source/msfilter.
Comment 27 Julien Nabet 2019-12-06 12:16:55 UTC
Created attachment 156353 [details]
Pdf result of first attachment

I opened the first attachment of this bug report in Word and exported it to PDF with PDF Creator so we can compare more easily.
Comment 28 Julien Nabet 2019-12-06 12:20:38 UTC
(In reply to Regina Henschel from comment #26)
> Created attachment 156351 [details]
> example document saved by OOo2.4.3
> 
> I was wrong about "frame". OOo2.4.3 creates a group with an empty rectangle
> and a frame, that holds the text. And indeed, which a break point in
> msdffimp.cxx#4299, I see alternating mso_sptNotchedCircularArrow(100) and
> mso_sptTextBox(202).
> 
> BTW, at time of OOo2.4.3 the filter was in svx/source/msfilter.

Regarding the second attachment, which you resaved with OOo2.4.3: it does indeed work with LO 6.3.3.
But compared with the original second attachment opened in LO 6.3.3, we've got a big regression here: there's no text at all!