Bug 93635 - FILEOPEN: Embedded Word file slowly loaded
Summary: FILEOPEN: Embedded Word file slowly loaded
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.4.0.3 release
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:docx, perf
Duplicates: 106700
Depends on:
Blocks: DOCX OLE-Objects
 
Reported: 2015-08-24 21:16 UTC by Matthew Holloway
Modified: 2020-02-17 12:59 UTC
CC List: 8 users

See Also:
Crash report or crash signature:


Attachments
Source document 'word inside word.docx' (45.40 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2015-08-24 21:16 UTC, Matthew Holloway
Rendering in LibreOffice 5.0.0.5 (25.39 KB, application/pdf)
2015-08-24 21:17 UTC, Matthew Holloway
Rendering in LibreOffice 4.2.8.2 (41.75 KB, application/pdf)
2015-08-24 21:18 UTC, Matthew Holloway
Document exported with LibO 6.0.0.2.0+ (15.38 KB, application/pdf)
2018-01-29 08:45 UTC, Marina Latini (SUSE)

Description Matthew Holloway 2015-08-24 21:16:55 UTC
Created attachment 118134 [details]
Source document 'word inside word.docx'

This occurs in 5.0.0.5 (Windows 7), and appears to be a regression from LibreOffice 4.2.8.2 (Ubuntu 14.04).

The bug is that on page 2 the embedded office file is scaled incorrectly: only a single large line of text is shown at the bottom of the second page, when it should render multiple lines. I'll attach PDFs showing this.
Comment 1 Matthew Holloway 2015-08-24 21:17:41 UTC
Created attachment 118135 [details]
Rendering in LibreOffice 5.0.0.5

The rendering that appears to be scaled incorrectly.
Comment 2 Matthew Holloway 2015-08-24 21:18:45 UTC
Created attachment 118136 [details]
Rendering in LibreOffice 4.2.8.2

The expected rendering, from LibreOffice 4.2.8.2
Comment 3 tommy27 2015-08-25 13:40:23 UTC
tested under Win8.1 x64
works fine in LibO 4.3.1 and bug starts in LibO 4.3.2

so the guilty commit is among these:
https://wiki.documentfoundation.org/Releases/4.3.2/RC1
https://wiki.documentfoundation.org/Releases/4.3.2/RC2

apart from the incorrect scaling there's also much slower loading of the file.

status NEW. version 4.3.2.2. regression. bibisectRequest.
Comment 4 Michael Weghorn 2015-08-25 16:01:03 UTC
I (bi)bisected this bug.
The two regressions (slow loading of the document and incorrect scaling) actually do start at the same commit.

result:

cc44ace348bc71b8e0411f3c4a3dbcec4852c8a5 is the first bad commit
commit cc44ace348bc71b8e0411f3c4a3dbcec4852c8a5
Author: Matthew Francis <mjay.francis@gmail.com>
Date:   Sun Mar 15 01:57:21 2015 +0800

    source-hash-41aa970b3120837ca9cadb12997a53ad322145a4
    
    commit 41aa970b3120837ca9cadb12997a53ad322145a4
    Author:     Miklos Vajna <vmiklos@collabora.co.uk>
    AuthorDate: Wed Aug 27 15:24:37 2014 +0200
    Commit:     Miklos Vajna <vmiklos@collabora.co.uk>
    CommitDate: Wed Aug 27 15:34:41 2014 +0200
    
        DOCX import: fix handling of embedded DOCX files
    
        The problem was that SwXTextEmbeddedObject::getEmbeddedObject() returned
        an empty reference for those embedded objects, so the HTML filter
        couldn't extract their content when it wanted to do so.
    
        It turns out the reason for this was that the DOCX importer only handled
        the replacement image + raw native data for the object. Fix this by
        creating the embedded object with the correct CLSID and import the
        raw data into the empty embedded document model.
    
        This is similar to what is done for XLSX-in-PPTX in
        oox::drawingml::ShapeExport::WriteOLE2Shape(), just for the import part.
    
        Change-Id: Ieb1dcb1774d2d4da00117e3a35160053066c78aa

:040000 040000 e8cb4e3b985c04b5016f70818a5e2706a479eac8 5e02d229804ec183756cea153845f8c59530ea74 M	opt


$ git bisect log
# bad: [cf6ea17155fabb2a120ba07c150735591ac861d7] source-hash-3f94c9e9ddfd807b449f3bb9b232cf2041fa12d2
# good: [fc71ac001f16209654d15ef8c1c4018aa55769f5] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e
git bisect start 'latest' 'oldest'
# good: [8cf60cc706948588e2f33a6d98b7c55d454e362a] source-hash-f340f0454627939f1830826fb5cc53a90e6c62a4
git bisect good 8cf60cc706948588e2f33a6d98b7c55d454e362a
# bad: [7beddf3808dadd525d7e55c00a5a90a2b44c23d3] source-hash-2f10386ce577f52e139aa23d41bc787d8e0b4d59
git bisect bad 7beddf3808dadd525d7e55c00a5a90a2b44c23d3
# bad: [7d319609d8266af06aa3256fd3773d052b9150dc] source-hash-1fec67aab152e0c0ad6dd85082c50f1beff7d520
git bisect bad 7d319609d8266af06aa3256fd3773d052b9150dc
# bad: [ff24df9a7aadef7aaf721b131c9e06f19fa9239a] source-hash-653025e6f10d07d0a95f7b75d56ff457f1902e82
git bisect bad ff24df9a7aadef7aaf721b131c9e06f19fa9239a
# good: [9460f8d13abf06281723950db84607788db19966] source-hash-2a93ed09240c6e9871593641dabbb7502af87986
git bisect good 9460f8d13abf06281723950db84607788db19966
# bad: [f44a1fe93fe524dedbabd854b038fc047b1d38f4] source-hash-4e96f7ffdb5d7b84ea70888626523dcdc5dfe0ac
git bisect bad f44a1fe93fe524dedbabd854b038fc047b1d38f4
# bad: [62faa37985c66e2f50919f9392257d209d21520a] source-hash-7db1ac59128ecc175ec1fd943ee77d469dcb0ea1
git bisect bad 62faa37985c66e2f50919f9392257d209d21520a
# good: [bdcbaae61e7c235354288d519d1b594f07fcece0] source-hash-aebcabd54cc5587f3856c48db0a4c4fc0f3f8ce8
git bisect good bdcbaae61e7c235354288d519d1b594f07fcece0
# good: [8a57560ef0c2aea5599c42505850ffbf4cbcb97b] source-hash-2fb876d85ddbfea0e6b6a38f71135e3dbe4233bb
git bisect good 8a57560ef0c2aea5599c42505850ffbf4cbcb97b
# bad: [50a379bc8660b76fcefac17317f4c1602db662a6] source-hash-56c9850145faa9ac04c3f09633e56b6c8c22c6c4
git bisect bad 50a379bc8660b76fcefac17317f4c1602db662a6
# good: [031648eed9b726609afbdac0f32b9fd4b0abda71] source-hash-b77bf9759a74454391fa5d2f4a6ec4594d6d3e89
git bisect good 031648eed9b726609afbdac0f32b9fd4b0abda71
# good: [961df52bc82aa28f9afaeac1878343cc25c47d62] source-hash-be84c0e8752cff050fbf8056848fa47a56be6b03
git bisect good 961df52bc82aa28f9afaeac1878343cc25c47d62
# bad: [ee2d6fb217add1dfeb44a80bfa2071d93f310309] source-hash-804d60d2ee4c099f685a6e42438fa0de15ca29be
git bisect bad ee2d6fb217add1dfeb44a80bfa2071d93f310309
# bad: [cc44ace348bc71b8e0411f3c4a3dbcec4852c8a5] source-hash-41aa970b3120837ca9cadb12997a53ad322145a4
git bisect bad cc44ace348bc71b8e0411f3c4a3dbcec4852c8a5
# first bad commit: [cc44ace348bc71b8e0411f3c4a3dbcec4852c8a5] source-hash-41aa970b3120837ca9cadb12997a53ad322145a4
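
For anyone re-checking the result, a log in the format above can be summarized mechanically. The following is only a minimal sketch, not part of the bibisect tooling: the function name `summarize_bisect_log` is hypothetical, and it assumes the log uses exactly the `# good:`, `# bad:` and `# first bad commit:` comment lines shown above.

```python
import re

# Matches the comment lines git writes into a bisect log, e.g.
#   "# bad: [<sha>] <label>" or "# first bad commit: [<sha>] <label>".
LINE_RE = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]+)\] (.*)$")


def summarize_bisect_log(text):
    """Return (first_bad, goods, bads) for a `git bisect log` dump.

    first_bad is a (sha, label) tuple, or None if the log has no
    "# first bad commit" line. goods and bads are lists of SHAs in
    the order they appear in the log.
    """
    first_bad = None
    goods, bads = [], []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip the "git bisect ..." command lines
        kind, sha, label = match.groups()
        if kind == "good":
            goods.append(sha)
        elif kind == "bad":
            bads.append(sha)
        else:
            first_bad = (sha, label)
    return first_bad, goods, bads
```

In a bibisect repository the label of the first bad commit (`source-hash-...`) carries the hash of the corresponding source commit, so the returned label is what points back to the change in the core repository.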
Comment 5 Michael Weghorn 2015-08-25 16:02:15 UTC Comment hidden (obsolete)
Comment 6 Robinson Tryon (qubit) 2015-12-13 11:13:03 UTC Comment hidden (obsolete)
Comment 7 Xisco Faulí 2016-09-26 09:26:14 UTC Comment hidden (obsolete)
Comment 8 Miklos Vajna 2016-10-21 20:01:25 UTC
I'm not sure this is a regression. The relevant option is Tools -> Options -> Load/Save -> Microsoft Office -> "WinWord to Writer or reverse".

If that's enabled (and that's the default), we load the embedded DOCX, and the underlying SwViewShell::PrtOle2() call updates the preview of the document, which takes a lot of time. This is something to fix, but it has been like this forever.

This is now more visible as the option is now respected in the DOCX import filter, but the root cause is not new at all.
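
To compare the slowdown between builds, or with the option toggled, one rough approach is to time a headless conversion of the attached document. The sketch below is an assumption-laden illustration, not a method used in this report: the helper names `build_convert_command` and `time_conversion` are hypothetical, the path to the `soffice` binary must be supplied by the caller, and the measured time includes application startup and PDF export, not only the import.

```python
import subprocess
import tempfile
import time


def build_convert_command(soffice, document, outdir):
    """Build a headless DOCX -> PDF conversion command line."""
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", outdir,
        document,
    ]


def time_conversion(soffice, document):
    """Run one headless conversion and return the wall-clock seconds.

    The result covers startup, import and export together, so it is
    only meaningful when comparing runs against each other.
    """
    with tempfile.TemporaryDirectory() as outdir:
        command = build_convert_command(soffice, document, outdir)
        start = time.perf_counter()
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return time.perf_counter() - start
```

Calling `time_conversion` a few times per build on 'word inside word.docx' and comparing the numbers gives a relative figure; a profiler is still needed to confirm where the time is actually spent.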
Comment 9 Xisco Faulí 2016-11-16 23:43:33 UTC
Hello Miklos,
I found another document that regressed with the same commit: attachment 101597 [details].
Comment 10 QA Administrators 2018-01-19 03:33:23 UTC Comment hidden (obsolete)
Comment 11 Marina Latini (SUSE) 2018-01-29 08:43:38 UTC
confirmed on:

Version: 6.0.0.2.0+
Build ID: 00m0(Build:2)
CPU threads: 4; OS: Linux 4.14; UI render: default; VCL: gtk3; 
Locale: it-IT (it_IT.UTF-8); Calc: group
Comment 12 Marina Latini (SUSE) 2018-01-29 08:45:07 UTC
Created attachment 139423 [details]
Document exported with LibO 6.0.0.2.0+
Comment 13 Miklos Vajna 2018-01-30 08:33:15 UTC
FWIW the above commit unconditionally enabled the actual load of embedded Word files, but then commit 4034d96aeee07f069a6fb9e0445436577a6065e3 (writerfilter: respect WinWordToWriter config setting, 2014-08-28) made this conditional. Based on that, I'm leaving this bug open for the performance part, but I'm removing the regression flag; comment 8 explains that this problem was already there before the above commit.
Comment 14 László Németh 2018-12-19 21:54:30 UTC
The scaling problem of this issue has been fixed; see Bug 99631.
Comment 15 Timur 2020-01-20 11:17:11 UTC
*** Bug 106700 has been marked as a duplicate of this bug. ***