Bug 65865 - FILEOPEN: Do NOT load the default values of the styles in doc import filter
Status: RESOLVED DUPLICATE of bug 95576
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.1.0.0.beta2 (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, filter:doc
Depends on:
Blocks:
 
Reported: 2013-06-17 15:39 UTC by Luke
Modified: 2016-05-11 02:50 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Paragraph Format Import Bug (84.00 KB, application/msword)
2013-06-17 15:39 UTC, Luke
.docx version working in LO, broken in OO (46.35 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-06-26 18:08 UTC, Luke
Writer vs Word (210.37 KB, image/png)
2015-09-09 08:24 UTC, Luke

Description Luke 2013-06-17 15:39:22 UTC
Created attachment 80957
Paragraph Format Import Bug

This bug was found in OO 3.4 and LibreOffice 4.1 beta2 running on Windows 7 and Linux. It appears that OpenOffice loses formatting when importing MS Word 2003/2007 .doc files with advanced styles. I have attached an example.

Steps to reproduce the bug:
1. Open attached ParagraphCenterBug.doc document in Writer
2. Open attached ParagraphCenterBug.doc document in Word2003/2007/2010
3. Compare the documents. 

Notice the indent of the second line (www.xpnet.com). The import filter seems to incorrectly indent every line that follows a line break in the centered text. New paragraphs are centered correctly.

Note 1: I believe this problem comes from the paragraph getting a hanging indent via numbering position of the outline numbering level 5. 

Note 2: The line break between the two lines is stored as a "VT" (vertical tab) character. If you copy and paste that line-break character from Word into LO, LO still shows the misaligned text. However, if you simply press Enter to start a new paragraph in LO, the correct alignment is used.
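
To make Note 2 concrete: in Word's binary .doc text stream a paragraph mark is stored as a carriage return (0x0D) and a manual line break as a vertical tab (0x0B), so both lines belong to one paragraph and share its indent settings. Below is a minimal Python sketch of that split. The sample string is made up for illustration (only www.xpnet.com comes from the attachment), and `split_paragraphs` is a hypothetical helper, not part of any import filter.

```python
PARAGRAPH_MARK = "\r"   # 0x0D: ends a paragraph in Word's text stream
LINE_BREAK = "\x0b"     # 0x0B (VT): manual line break inside a paragraph


def split_paragraphs(text):
    """Return a list of paragraphs, each given as a list of its visual lines."""
    paragraphs = [p for p in text.split(PARAGRAPH_MARK) if p]
    return [p.split(LINE_BREAK) for p in paragraphs]


# Hypothetical sample: one paragraph containing a VT break, then a second paragraph.
sample = "Company name\x0b(www.xpnet.com)\rNext paragraph\r"
for number, lines in enumerate(split_paragraphs(sample), start=1):
    print(number, lines)
```

The first paragraph comes back as two lines, which is the case that is misaligned here; the second paragraph is a single line, matching the observation that new paragraphs are centered correctly.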
Comment 1 Igor Gnatenko 2013-06-21 11:20:15 UTC
Confirm.
Comment 2 Luke 2013-06-26 18:08:17 UTC
Created attachment 81494
.docx version working in LO, broken in OO

The .docx version of this bug has been fixed in the LO importer. If you open the .docx version in OO 3.4, it will be incorrectly formatted. Can the LO .docx fix be ported to the .doc importer?
Comment 3 Xisco Faulí 2014-03-26 17:38:36 UTC
This issue is still reproducible with:
   - LibreOffice 4.1.5.3 Build ID: 1c1366bba2ba2b554cd2ca4d87c06da81c05d24
   - LibreOffice 4.2.2.1 Build ID: 3be8cda0bddd8e430d8cda1ebfd581265cca5a0f
   - LibreOffice 4.3.0.0.alpha0 Build ID: aeab0183e86fe011d32058864c02b2de4da32dc9

OS: Windows 7 Enterprise
Comment 4 Luke 2014-09-16 22:47:27 UTC
When this file is saved as .docx (attachment 81494), LO had the same issue in version 3.6.7.2, but by 4.0.0.0.beta1 it was fixed.

Would a bisect to find the commit in the 4.0.x branch that fixed the .docx importer be useful for fixing the .doc importer?
Comment 5 Matthew Francis 2014-12-05 04:42:20 UTC
Results from bibisect-43all:
(for the point at which the indent import bug was fixed for .docx)
 241d451e09694446622f9767fb76db50481c9e32 is the first bad commit
commit 241d451e09694446622f9767fb76db50481c9e32
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Mon Dec 10 08:31:03 2012 +0000

    source-hash-c3aa1cefdc6521d34a2a32c20bae1593e1edb5ba
    
    commit c3aa1cefdc6521d34a2a32c20bae1593e1edb5ba
    Author:     Fridrich Štrba <fridrich.strba@bluewin.ch>
    AuthorDate: Tue Aug 21 12:18:44 2012 +0200
    Commit:     Fridrich Štrba <fridrich.strba@bluewin.ch>
    CommitDate: Tue Aug 21 12:18:44 2012 +0200
    
        Uploading libmspub-0.0.3 release (support for MS Pub 97 and 98)
    
        Change-Id: I6ead205a272f0167157304748d7daf8ffc9211c9

# bad: [423a84c4f7068853974887d98442bc2a2d0cc91b] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e
# good: [65fd30f5cb4cdd37995a33420ed8273c0a29bf00] source-hash-d6cde02dbce8c28c6af836e2dc1120f8a6ef9932
git bisect start 'latest' 'oldest'
# bad: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
git bisect bad e02439a3d6297a1f5334fa558ddec5ef4212c574
# good: [8f4aeaad2f65d656328a451154142bb82efa4327] source-hash-1885266f274575327cdeee9852945a3e91f32f15
git bisect good 8f4aeaad2f65d656328a451154142bb82efa4327
# bad: [9995fae0d8a24ce31bcb5e9cd0459b69cfbf7a02] source-hash-8600bc24bbc9029e92bea6102bff2921bc10b33e
git bisect bad 9995fae0d8a24ce31bcb5e9cd0459b69cfbf7a02
# bad: [51b63dca7427db64929ae1885d7cf1cc7eb0ba28] source-hash-806d18ae7b8c241fe90e49d3d370306769c50a10
git bisect bad 51b63dca7427db64929ae1885d7cf1cc7eb0ba28
# bad: [446a69834acf747d9d18841ec583512ae8fa42e7] source-hash-06a8ca9339f02fccf6961c0de77c49673823b35f
git bisect bad 446a69834acf747d9d18841ec583512ae8fa42e7
# bad: [d2720e99b9e6cb7b099256cc7a6d2b3f907b8d7c] source-hash-7dd6c0a8372810f48e6bee35a11ac4ad0432640b
git bisect bad d2720e99b9e6cb7b099256cc7a6d2b3f907b8d7c
# bad: [98e26b741cd0eff4b7549d782d7db5a1e98eb1a6] source-hash-c29af1572ad15ac5199a09e5812fb8354c165329
git bisect bad 98e26b741cd0eff4b7549d782d7db5a1e98eb1a6
# good: [a72763112e846bcb1c4e4c6f1612ccab6ac73772] source-hash-4662df8a7561ce71ba00accbb5170e10818d6008
git bisect good a72763112e846bcb1c4e4c6f1612ccab6ac73772
# bad: [241d451e09694446622f9767fb76db50481c9e32] source-hash-c3aa1cefdc6521d34a2a32c20bae1593e1edb5ba
git bisect bad 241d451e09694446622f9767fb76db50481c9e32
# good: [52abf2b644b9c2396246581d02b1796dd9cd2dff] source-hash-37b9e290d9e3d20652df0abe1a1458412f3cfe2c
git bisect good 52abf2b644b9c2396246581d02b1796dd9cd2dff
# first bad commit: [241d451e09694446622f9767fb76db50481c9e32] source-hash-c3aa1cefdc6521d34a2a32c20bae1593e1edb5ba
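
The log above is a binary search over an ordered list of builds. Because this run looks for the build where the .docx import was fixed rather than broken, the labels are inverted: "bad" marks builds that already contain the fix and "good" marks builds that do not. Below is a minimal Python sketch of the same search. The `has_fix` predicate is hypothetical and stands in for opening the test document in each build by hand; the build list and the index 613 are invented for illustration.

```python
def first_build_with_fix(builds, has_fix):
    """Return the index of the earliest build for which has_fix is true.

    Assumes builds are ordered oldest to newest, the oldest lacks the fix,
    the newest has it, and the fix is never reverted in between.
    """
    lo, hi = 0, len(builds) - 1   # lo: known "good" (no fix), hi: known "bad" (fix present)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if has_fix(builds[mid]):
            hi = mid              # corresponds to `git bisect bad <mid>`
        else:
            lo = mid              # corresponds to `git bisect good <mid>`
    return hi


# Hypothetical history of 1000 builds in which the fix first appears at index 613.
builds = list(range(1000))
print(first_build_with_fix(builds, lambda build: build >= 613))
```

Each test halves the remaining range, which is why a repository with many builds needs only a handful of manual checks.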
Comment 6 Matthew Francis 2014-12-05 05:00:57 UTC
It may have been this commit; I don't see any other obvious candidates in the bibisected range.

commit b95d203bc17c83ec0fe5139f519d53ed1d842d3a
Author: Cédric Bosdonnat <cedric.bosdonnat@free.fr>
Date:   Mon Aug 20 11:29:29 2012 +0200

    fdo#53175: Don't load the default values of the styles in writerfilter
    
    ...or we may have some additional properties set on some styles.
    
    Change-Id: I5a5d307931a2a6c1f25bd2da93381d8de65c2480
Comment 7 Matthew Francis 2015-01-13 01:47:32 UTC
Confirmed by building that it was commit b95d203bc17c83ec0fe5139f519d53ed1d842d3a which fixed this case for .docx

Adding Cc: to cedric.bosdonnat.ooo@free.fr; Is there any chance this could be applied to the .doc filter as well? Thanks
Comment 8 Xisco Faulí 2015-09-08 16:05:04 UTC
I can no longer reproduce this issue with

Version: 5.0.0.5
Build ID: 1b1a90865e348b492231e1c451437d7a15bb262b
Locale: es-ES (es_ES)

and Office Word 2010

on Windows 7 (64-bit)

Thus, I close this as RESOLVED WORKSFORME
Comment 9 Luke 2015-09-09 08:24:54 UTC
Created attachment 118543
Writer vs Word

Xisco, 
You need to compare the .doc to the .docx, or Writer to Word. The closing parenthesis should be under the "W".

Still not fixed in:

Version: 5.1.0.0.alpha1+
Build ID: 9a8a4442fd6368c20cf6a3d7efa3bd42962ee12f
Comment 10 Luke 2015-09-09 08:29:35 UTC
Miklos may be interested in this, since this commit fixed the .docx issue:
http://cgit.freedesktop.org/libreoffice/core/commit/?id=b95d203bc17c83ec0fe5139f519d53ed1d842d3a

The likely solution is to not load the default values of the styles in the .doc import filter either.
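
One way to picture why loading defaults matters is the minimal Python sketch below. The dict-based style model, the property names, and the values are all hypothetical; this is not the writerfilter or .doc filter code. It only illustrates how pre-filling a style with default values sets additional properties on it, which then shadow what the style should inherit from its parent.

```python
# Hypothetical defaults an importer might pre-load into every new style.
DEFAULTS = {"left_indent": 0, "first_line_indent": 0, "align": "left"}


def resolve(style, styles):
    """Merge a style with its ancestors; the nearest definition wins."""
    resolved = {}
    while style is not None:
        for key, value in style["props"].items():
            resolved.setdefault(key, value)
        style = styles.get(style["parent"])
    return resolved


def make_style(parent, props, preload_defaults):
    """Build a style, optionally seeding it with DEFAULTS first."""
    base = dict(DEFAULTS) if preload_defaults else {}
    base.update(props)
    return {"parent": parent, "props": base}


def build(preload_defaults):
    """Create a two-level style tree and resolve the child style."""
    styles = {
        "Base": make_style(None, {"left_indent": 720}, preload_defaults),
        "Centered": make_style("Base", {"align": "center"}, preload_defaults),
    }
    return resolve(styles["Centered"], styles)


print(build(preload_defaults=True)["left_indent"])   # child's pre-loaded default shadows the parent
print(build(preload_defaults=False)["left_indent"])  # value is inherited from "Base"
```

With defaults pre-loaded, the child style carries an explicit `left_indent` it never asked for; without them, it inherits the parent's value.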
Comment 11 Xisco Faulí 2015-09-09 08:44:57 UTC
Oh, I see. Apologies for the inconvenience caused.
Comment 12 Xisco Faulí 2015-11-23 12:27:54 UTC
This issue is still present in 

Version: 5.1.0.0.alpha1+
Build ID: e6fade1ce133039d28369751b77ac8faff6e40cb
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2015-11-16_00:12:42
Locale: es-ES (es_ES)

on Windows 7
Comment 13 Robinson Tryon (qubit) 2015-12-10 01:18:31 UTC
Migrating Whiteboard tags to Keywords: (bibisected)
Comment 14 Luke 2016-05-11 02:50:54 UTC

*** This bug has been marked as a duplicate of bug 95576 ***