Created attachment 62659 [details]
Example 9.docx - original file, *.jpg - how the document looks in MS Office and LibreOffice

We have a document with the following paragraph numbering in MS Office 2007 Service Pack 3:

1
1.1 1.2 1.3 1.4 1.5 1.6 1.7
2
2.1 2.2
3
3.1 3.2
4
4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 4.10 4.11 4.12
5
and so on

If we open this document in LibO, the numbering is different:

1
1.1 1.2 1.3 1.4 1.5 1.6 1.7
2
1.8 1.9
1
3.1 3.2
4
1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 1.18 1.19 1.20 1.21
3
and so on
The same problem was confirmed in Apache OpenOffice 3.4.0 https://issues.apache.org/ooo/show_bug.cgi?id=119840
Confirmed with: LO 3.5.4.2 Build ID: own W7 debug build Windows 7 Professional SP1 64 bit Autonumeration is different than in Word 2010.
Confirmed also on Mac OS X 10.6.8 (Intel): * With LibO 3.6.2.2, the TOC is even worse (most sub-headings are missing!) * With a current master build, the TOC is again as in LibO 3.5.4. The numbering problems are also visible in the "Navigator" window, maybe even more strikingly, as this window visualizes the heading hierarchy nicely. At first I suspected that the .docx file itself was damaged, because the formatting (e.g. heading 3.0) seems somewhat inconsistent, but I opened it in MS Office 2010 on Win 7 and can confirm that the TOC looks correct (or, at least, much better) there. Changed Summary -- this is about automatic numbering of headings, not of (ordinary/all) paragraphs, right?
Right! I can say that in MS Office 2007 SP3, or even MS Office 2003 SP3 on Windows XP, the TOC looks correct (even if we agree that you are right and the file is damaged).
Even if we try to save file in different format in MS Office with Save as... (for example, in Microsoft Word 97/2000/XP/2003 (.doc) format) nothing would change - in LibO and AOO there are problems with TOC, in MSO all is fine (even in 2003 with File Format Converters, 2007, 2010). So, most likely .docx file is OK, it is not damaged.
(In reply to comment #5) > So, most likely .docx file is OK, it is not damaged. Agreed! I did not want to slander your sample file ;-) I just considered the *possibility* that the file was somewhat damaged, because I had seen a similar bug before which gave us all big headaches until we realized that a corrupted file was the reason. But anyway, if all these MS Office versions open the file correctly, even when it is saved in different formats, LibreOffice should definitely open the file correctly, too -- so this is indeed an important bug.
@ Writer experts: Hello Cédric, Luboš, Michael, and Miklós, this is another interesting problem in the import of .docx files. It exists (with some variation of the symptoms) since LibreOffice 3.3.0, so it is no regression, but nevertheless it would be well worth fixing this issue, because it might impair the use of LibreOffice for all advanced document issues. I say “might”, because right now we have only one sample file, but there is no obvious reason why it should not also happen in many other complex files ... Breaking the automatic numbering of headings is a nightmare e.g. for academic usage. Thank you very much for looking into this issue!
Same effect with AOOo 3.4.1, so it seems inherited from OOo. I can't reproduce the problem with my own docx sample documents created from Writer, so is it a) a serious misunderstanding in LibO concerning docx outline numbering, b) a damaged sample document from the reporter, c) some special effect (localization) ..., or d) something else? That never worked, so I don't see this one as a 3.5 MAB, and I am removing it (also because the lifecycle of 3.5 is terminated). But indeed, I have read about lots of difficulties in using LibO for academic work and docx document interchange. If this is a general problem, that is very serious, and this bug would be a good candidate for a HardHack. <http://wiki.documentfoundation.org/HardHacks>. But for that nomination we should know some more about the details of this problem; for example, we need a reliable sample document with known history (created with MS WORD) that was created only as a sample document. @all: Can anybody create a test kit with a reliable sample document created from WORD?
Created attachment 70885 [details] Test Kit Works fine. I created the .docx from the .odt, and there outline numbering works fine in MS WORD Viewer and "LibreOffice 3.6.4.3 rc" German UI/ German Locale [Build-ID: 2ef5aff] {pull date 2012-11-28} on German WIN7 Home Premium (64bit). Can somebody find out what the difference is from the reporter's sample, with which I can reproduce the problem?
There seems to be another new possible duplicate about TOC numbering - see bug 56798.
Created attachment 70891 [details]
re Example 9.docx: Word 2010 doc version and Word screenshots

The attached ZIP contains a doc version of Example 9.docx, as produced by Word 2010, plus some screenshots made with Word.

After looking into the file, I would say that most problems come from inconsistent formatting. There are no list styles associated with the headings; all numbering is done through direct list formatting, and some numbers are inserted as simple text. Word restarts numbering on certain level-1 paragraphs, and all the level-2 paragraphs continue accordingly (ch. 2 is followed by ch. 2.1, 2.2 etc.). But some of the level-1 paragraphs which restart numbering are not the headings themselves, but hidden paragraphs...

The bugs in LO: LO does not respect the restart of numbering, and it numbers level-1 and level-2 paragraphs independently. (Import of list formats from DOC is not very good in OO/LO. You can see that LO creates automatic list styles for some numbered paragraphs; others are numbered with direct formatting.)

Screenshot 1: first doc pages, with level-2 numbering shown in grey fields. The non-grey ones are simple numbers in the text.
Screenshot 2: Chapters 1 and 2 with formatting revealed. Just before chapter 2.1, a new list is started (starting with value 2).
Screenshot 3: Same text, with empty paragraphs and the (not automatically numbered) chapter-2 heading removed.
Screenshot 4: Same as screenshot 3, but without "formatting revealed". You cannot see the hidden paragraph starting a new list (2.1, 2.2 etc.).

You can see the hidden paragraphs in LO (if that is enabled) _after_ chapter 2 and _before_ chapter 3.1. They restart numbering on level 1 with "2" and "4", respectively (in Word); the level-2 numbering adjusts to that and continues with "2.1" and "4.1", respectively. LO does not take account of the restarted numbering, and the level-2 numbering is independent of the level-1 numbering.

Example 9's way of restarting numbered lists is a bit eccentric... It would be interesting to know if the document was hacked together like that by a user not wanting to deal with automatic numbering properly, or if Word did some of that on its own.
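To make the difference between the two behaviours concrete, here is a small toy model in Python. It is only a sketch: `number_linked` imitates what Word is described as doing above (a level-1 restart resets the level-2 counter), while `number_independent` is a hypothetical simplification that ignores restarts and keeps a separate level-2 list whose prefix stays at 1. Neither function reflects actual Word or LO code; the names and the `(level, restart)` item format are made up for illustration.

```python
from typing import List, Optional, Tuple

# One numbered paragraph: (level, restart value or None).
# Level is 1 or 2; a restart value forces the counter at that level.
Item = Tuple[int, Optional[int]]


def number_linked(items: List[Item]) -> List[str]:
    """Toy model of linked levels: level 2 follows level 1 and resets with it."""
    counters = [0, 0]
    labels = []
    for level, restart in items:
        idx = level - 1
        counters[idx] = restart if restart is not None else counters[idx] + 1
        # Any deeper level starts over when a shallower level advances.
        for deeper in range(idx + 1, len(counters)):
            counters[deeper] = 0
        labels.append(".".join(str(c) for c in counters[:level]))
    return labels


def number_independent(items: List[Item]) -> List[str]:
    """Hypothetical faulty model: restarts ignored, level 2 is its own list."""
    level1 = 0
    level2 = 0
    labels = []
    for level, _restart in items:
        if level == 1:
            level1 += 1
            labels.append(str(level1))
        else:
            # The level-2 list never sees the level-1 paragraphs, so its
            # prefix stays at 1 and its counter is never reset.
            level2 += 1
            labels.append(f"1.{level2}")
    return labels


# Chapter 1 with seven sub-headings, then a (hidden) level-1 paragraph that
# restarts at 2, followed by two sub-headings.
example: List[Item] = [(1, None)] + [(2, None)] * 7 + [(1, 2)] + [(2, None)] * 2

print(number_linked(example)[-3:])
print(number_independent(example)[-3:])
```

With this input the linked model ends in 2, 2.1, 2.2, and the independent model ends in 2, 1.8, 1.9, which is the pattern from the original report. That the toy model reproduces the symptom does not prove this is what the import filter does internally.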
Changed the summary: Problem is not restricted to DOCX but also concerns DOC import. And it's not only outlines which are concerned. If numbering had been done via headings in the test file, it would probably have worked better.
Created attachment 70894 [details]
Word 2010 doc/docx files with numbered headings

Here are some files with documents created from scratch under Word 2010. Text was written, paragraph styles applied, then numbering added using the Word "Numbering" ribbon button; no list styles. File sets as DOC, DOCX and PDF. LO is very MS compatible here:

File set 1: Word defaulted to a mixed numbering style (level 1: numbers; level 2: letters) and kept all numbered paragraphs in its own numbering group.
File set 2: Changed level 2 to number instead of letter.
File set 3: Wanted to add a numbered list as body text and used the same numbering button. Word continues numbering the fresh list with the heading numbering.
File set 4: So we "restart" the body text list numbering... (4a:) Word does it, but of course the next heading continues the numbering ("5") - and in addition to that, the following level-2 list jumps back to letters. (4b:) I can restart numbering of heading 2, and I can set the level 2 list format to number again; but I didn't manage to get heading 2.1 to start with "1" again. This is where the Word user starts hacking...
File set 5: File set 3 converted to multi-level numbering.

With all these files, LO shows the same numbering as Word, even with restarted numbering. When you open the files, it creates automatic styles WW8Num* which presumably reflect Word's numbering groups.
(re comment #13) > With all these files, LO shows the same numbering as Word, even with > restarted numbering. When you open the files, it creates automatic styles > WW8Num* which presumably reflect Word's numbering groups. With "Example 9", LO seems to get confused about this mapping to its styles. The "Heading 1" paragraph style is not on outline level 1, probably because there is no regular automatic h1 numbering in the file, as there is with the other headings (the numbers for h1 are mostly ordinary text; the restart of numbering for the headings happens in hidden paragraphs of style "List Paragraph"). LO associates the styles for headings 2 and 3 with chapter numbering, but not heading 1. If you correct that, the file looks a lot better. But LO keeps a single "numbering group" for what should be chapter 3 plus the following hidden paragraph that is supposed to restart numbering at 3. Both paragraphs are mapped to the same list style. I'm tempted to say that it would be asking too much to translate such a mixture of numbering methods into something coherent, but it's noteworthy that Apache comment #1 says that Symphony gets the numbering right: https://issues.apache.org/ooo/show_bug.cgi?id=119840#c1
I can confirm that "Lotus Symphony Release 3.0.1 Revision 20120110.2000" on German WIN7 Home Premium (64bit) shows the headings in sample 2012-06-06 03:09 UTC, Timon correctly. So it's a difficult decision: - Is it worth investing time in a fix for such strange documents? - But if other free software is able to handle that, shouldn't LibO be able to, too? @stfhell Is it possible to split this into separate bugs with brief, clear and simple bug descriptions? I'm a little overwhelmed by that enormous number of samples
> @stfhell > Is it possible to split this into separate bugs with brief clear and simple > bug descriptions? I'm a little overwhelmed with that enormous lot of samples The problem is I wouldn't know what to put exactly in the other bug reports. I was looking at how "Example 9" does its numbering and I was experimenting with autonumbering in Word to see where LO deviated and if there is any difference between DOC and DOCX import. I'm sorry for the lengthy comments but I couldn't make them shorter because I still don't know where exactly LO's problems start, I could only describe what I found in the file. Maybe it is in fact just one bug. The samples show (I think): (1) If autonumbering is used correctly in Word, LO can handle it. (2) Autonumbering in Word (without list styles) is obviously bound to lead to messy files, at some stage. Just ignore the files if you don't want to use them for any experiments. > I can confirm that "Lotus Symphony Release 3.0.1 Revision 20120110.2000" on > German WIN7 Home Premium (64bit) shows headings in sample 2012-06-06 03:09 > UTC, Timon correctly. > So it's a difficult decision: > - Is it that worth to invest time to do a fix for such strange documents? > - But if other free software is able to handle that, shouldn't LibO > be able, too? I think it would be worthwhile. "Example 9" is a typical real-life Word document. People create such files, partly because autonumbering is more or less "forced" on them by AutoCorrect. It's a feature which _should_ be used with some consideration. I regularly get Word files from people, and with LO (or OO) I can often only guess what kind of numbering the authors had in mind. Sometimes I have to use MS Word Viewer to print a PDF as a reference for the original doc's numbering. On import of DOC/DOCX, LO tries to convert Word's non-style autonumbering to a style-based numbering (the filter creates plenty of WW8 list styles). 
That is not a trivial thing to do, and I could imagine that this is the root of the problem.
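For anyone who wants to check where the restarts in a DOCX sample actually live, the direct list formatting can be read straight from the package. The sketch below is a rough helper, not part of any LO tooling: the function names are made up, and it only looks at `w:numPr` set directly on paragraphs in `word/document.xml` and at `w:startOverride` entries in `word/numbering.xml`. Numbering that comes in through a paragraph style will not show up, and it does not apply to binary DOC files.

```python
import xml.etree.ElementTree as ET
import zipfile
from typing import List, Optional, Tuple

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def read_docx_part(path: str, part: str) -> bytes:
    """Read one XML part (e.g. 'word/document.xml') from a .docx package."""
    with zipfile.ZipFile(path) as package:
        return package.read(part)


def list_numbered_paragraphs(
    document_xml: bytes,
) -> List[Tuple[Optional[str], Optional[str], str]]:
    """Return (numId, ilvl, text) for paragraphs with direct list formatting."""
    root = ET.fromstring(document_xml)
    rows = []
    for para in root.iter(f"{{{W}}}p"):
        num_pr = para.find("w:pPr/w:numPr", NS)
        if num_pr is None:
            continue  # not directly numbered (may still be numbered via style)
        num_id = num_pr.find("w:numId", NS)
        ilvl = num_pr.find("w:ilvl", NS)
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        rows.append((
            num_id.get(f"{{{W}}}val") if num_id is not None else None,
            ilvl.get(f"{{{W}}}val") if ilvl is not None else None,
            text,
        ))
    return rows


def list_start_overrides(numbering_xml: bytes) -> List[Tuple[str, str, str]]:
    """Return (numId, ilvl, start value) for every w:startOverride found."""
    root = ET.fromstring(numbering_xml)
    overrides = []
    for num in root.iter(f"{{{W}}}num"):
        for override in num.findall("w:lvlOverride", NS):
            start = override.find("w:startOverride", NS)
            if start is not None:
                overrides.append((
                    num.get(f"{{{W}}}numId"),
                    override.get(f"{{{W}}}ilvl"),
                    start.get(f"{{{W}}}val"),
                ))
    return overrides
```

Calling `list_numbered_paragraphs(read_docx_part("Example 9.docx", "word/document.xml"))` should show which paragraphs share a `numId`, and `list_start_overrides` on `word/numbering.xml` should show which lists carry an explicit restart value, which is useful when comparing against the list styles LO generates on import.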
Created attachment 90828 [details] Example 9 DOC and DOCX file, exported in pdf format in MSO 2007 SP3 and LibreOffice 4.2.0.0 beta 2. In LibreOffice 4.2.0.0.beta2 Build ID: 1a27be92e320f97c20d581a69ef1c8b99ea9885d things are much better, but only for the DOC file format. For DOCX things got worse (part of the numbering is gone entirely). Compare the attached PDFs to see the differences. In "Example 9 DOC LO 4.2.pdf" all is fine until page 4, but on page 4 everything is messed up again: 2 1.8 1.9 2 3.1 3.2 4 3.1 3.2 and so on. In "Example 9 DOCX LO 4.2.pdf" problems are visible from the first page: we see only items 1, 2, 4, and so on, and the 1.1, 1.2, and so on numbering does not appear at all.
Numbering works ok for me on Win 7 64-bit 4.3.2.2 and dev build Version: 4.4.0.0.alpha0+ Build ID: 3e2bd1e4022e25b77bcc8eba5e02c1adc57008a1 TinderBox: Win-x86@42, Branch:master, Time: 2014-10-16_01:04:13 Please test!
Win XP SP3, LibreOffice: 4.3.2.2 Build ID: edfb5295ba211bd31ad47d0bad0118690f76407d As I already wrote, not everything is fixed! On pages 1-3 now all is fine, BUT please have a look at pages 4-8 (numbering is still terrible)! 2 1.8 1.9 2 3.1 3.2 4 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11 3.12 4 4.1 4.1.1 4.1.2 4.1.3 4.1.4 4.1.5 4.1.6 4.1.7 4.1.8 4.1.9 4.1.10 4.1.11 4.2 4.2.1 4.2.2 4.2.3 4.3 4.3.1 4.3.2 4.3.3 4.3.4 4.4 4.5
This is still an issue with the latest build: Version: 4.5.0.0.alpha0+ Build ID: 4ee55eed6a34f6f061a0cd369a30afb464f9fa27
Created attachment 114395 [details] stripped-down file I removed everything before and after the point where the outline heading was desync'd. It seems to be an outline style issue. The correct headings should be: 2 2.1 2.2 3 3.1 3.2 4
Seems like a duplicate of this bug: https://bugs.documentfoundation.org//show_bug.cgi?id=76817
We usually mark new bugs as duplicate of an older one, not vice versa.
(In reply to meneerjansen00 from comment #22) > Seems like a duplicate of this bug: > https://bugs.documentfoundation.org//show_bug.cgi?id=76817 I don't think so. The other bug is about numbering that is different when inserted in a docx. This one is about numbering that looks different if opened in A or B
Created attachment 134050 [details] Misnumbering not occurring in LO 5.3.3.2 2017-06-16 Opening the file in the current stable LO version doesn't produce the undesirable result shown in earlier screenshots.
(In reply to Gabriel Bowater from comment #25) > Created attachment 134050 [details] > Misnumbering not occurring in LO 5.3.3.2 2017-06-16 > > Opening file in the current stable LO version doesn't produce the > undesirable result show in earlier screenshots. Wrong conclusion. Contents table looks ok, but numbering in text is not. Can be seen if contents table is updated.
After the following commit, attachment 114395 [details] has changed from being shown like 1.1 to 1.

author Justin Luth <justin_luth@sil.org> 2018-01-12 20:44:06 +0300
committer Miklos Vajna <vmiklos@collabora.co.uk> 2018-01-15 13:57:29 +0100
commit 7201d157a2ff2f0a8b6bb8fa57e31871187cbc81
tree 2eebe2d8f6cdacd102b79e52a081fa5471dbfec4
parent a6b69a9384801f77f4cc30a366a45561c28eab3e
tdf#76817 ooxmlimport: connect Heading to existing numbers

@Justinm, I thought you might be interested in this issue...
Created attachment 142259 [details] Bug 50774 - stripped4.doc: roundtripped by MS Word 2003 (In reply to Xisco Faulí from comment #27) > commit 7201d157a2ff2f0a8b6bb8fa57e31871187cbc81 (patch) > tdf#76817 ooxmlimport: connect Heading to existing numbers I reverted this patch today. But it is worth noting that the .doc format looks the same way in LO. That suggests that this bug is an odd-ball example. Since MS and LO internally have very different numbering implementations, it might not be possible to emulate this one. So, no - I'm not really interested in spending more time in this minefield :-)
Created attachment 153978 [details] Example 9 COMPARED.png fixed
*** This bug has been marked as a duplicate of bug 95848 ***
Verified FIXED in Version: 6.4.0.0.alpha0+ (x64) Build ID: 396869e0e71bd33f5d962779abf72f35d01245e5 Thanks Michael for fixing this with your work in Bug 95848
Still fixed only partially in Version: 6.4.5.2 (x64) Build ID: a726b36747cf2001e06b58ad5db1aa3a9a1872d6

Only the table of contents is shown completely right. But if we scroll further through the contents of the document, we see many discrepancies.

In the table of contents we see:

2.1 Purpose of the system 7
2.2 Goals of creating the system 7
3 DESCRIPTION OF THE AUTOMATION OBJECT 9
4.1 List of programs, their purpose and main characteristics 10
4.2 Requirements for methods of information exchange and means of communication for information exchange between system components and with adjacent systems 10
4.3 Requirements for the number and qualifications of system personnel who administer and maintain the system, including changes to the system configuration (adaptation to changes in legislation and calculation methods, creation of report forms, and configuration of information exchange with other information systems) 11
4.4 User training requirements 12
4.5 Purpose indicators 12
4.6 Reliability requirements 13
4.7 Safety requirements 13
4.8 Requirements for ergonomics and technical aesthetics 13
4.9 Requirements for operation, maintenance, repair and storage of system components 14
4.10 Requirements for protecting information from unauthorized access 14
4.11 Requirements for preserving information in the event of failures 15
4.12 Requirements for patent clearance

In the document text on pages 4 and 5 we see:

1.1 Purpose of the system
1.2 Goals of creating the system
2 DESCRIPTION OF THE AUTOMATION OBJECT
3.1 List of programs, their purpose and main characteristics
• Program 1
• Program 2
• Program 3
3.2 Requirements for methods of information exchange and means of communication for information exchange between system components and with adjacent systems
3.3 Requirements for the number and qualifications of system personnel who administer and maintain the system, including changes to the system configuration (adaptation to changes in legislation and calculation methods, creation of report forms, and configuration of information exchange with other information systems)
3.4 User training requirements
3.5 Purpose indicators
3.6 Reliability requirements
3.7 Safety requirements
3.8 Requirements for ergonomics and technical aesthetics
3.9 Requirements for operation, maintenance, repair and storage of system components
3.10 Requirements for protecting information from unauthorized access
3.11 Requirements for preserving information in the event of failures
3.12 Requirements for patent clearance

The numbering is again erratic, but now in the document body itself.