Bug 74359 - FILEOPEN: [RTF filter] Content piece of the table’s large cell is lost in file from Web page created in Word 2007
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 4.1.5.1 rc (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: RTF
Reported: 2014-02-02 10:09 UTC by ape
Modified: 2017-06-09 10:38 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
DOCX file (fdo#74357) saved as ODT by LibO_Dev-4.2.3.0.0+ (2.74 MB, application/vnd.oasis.opendocumentformat.text)
2014-02-24 08:57 UTC, ape

Description ape 2014-02-02 10:09:42 UTC
I saved the Web page as an MHT archive. I converted the MHT file to the RTF file using WinWord-2007 (see an attachment). I opened the RTF file using LibreOffice-4.2.0; 4.1.5; 4.0.6 and saw the error:
 The content of the large table cell is lost completely.
 I checked how other programs behave:
1. LibreOffice-4.0.6 makes the same mistake.
2. OpenOffice.org-3.1.1 opens the RTF file correctly, without this mistake.
3. LibO-3.5.7, 3.6.7 and AOO-4.0.1 have a different mistake (see bug 74356, bug 74357):
A large part of the table cell’s content is lost. The table cell, located on the tenth page of the document, contains an image whose size equals the page size. All of the cell’s content located after this big image is lost.
--
This is a loss of information and a regression compared with older programs, so the status is critical.
Comment 2 Roman Kuznetsov 2014-02-02 11:47:11 UTC
I confirm the bug.
Comment 3 Joel Madero 2014-02-02 17:06:18 UTC
I see that it's different because it's RTF/DOC/DOCX. It could very well be fine to have it as one bug report, but I'll leave it as separate for now. RTF filter bugs most definitely do not belong on the MAB list, as they are quite rare and are not going to impact many users.
Comment 4 Cor Nouws 2014-02-07 14:22:06 UTC
(In reply to comment #0)
> I saved the Web page as an MHT archive. I converted the MHT file to the RTF
> file using WinWord-2007 (see an attachment). I opened the RTF file using
> LibreOffice-4.2.0; 4.1.5; 4.0.6 and saw the error:
>  [...]

Thanks for the report.

Do you know if the same problem happens if you create such a file
 - from scratch in LibreOffice
 - by pasting in LibreOffice and then saving as rtf
 - by pasting in LibreOffice and then saving as odt ?

regards,
Cor
Comment 5 ape 2014-02-08 05:55:19 UTC
(In reply to comment #4)
> Do you know if the same problem happens if you create such a file
>  - from scratch in LibreOffice
> ...
--
Hi, Cor!

  I guess:
1. The filter does not know how to process the data (text, graphics) in a large table cell when that data is located after a large image whose size equals the page size. This error occurs in earlier versions of the program: LibreOffice-3.5.x and 3.6.x.
2. Newer programs (LibreOffice-4.1.5 and 4.2.0) do not process the table cell at all if the cell contains some data that cannot be displayed.

  Copying the data of the MHT file and pasting it into a new Writer document does not give more information about the bug, because only text is stored in the clipboard in that case.

Regards, ape.
Comment 6 ape 2014-02-24 07:26:53 UTC
 Miklos, 
I added you to CC. Your patch solved the problem with DOCX files (bug 74357) very well. Please look into the similar issue with RTF files, if you have the time and opportunity.
--
ape
Comment 7 ape 2014-02-24 08:57:03 UTC
Created attachment 94633
DOCX file (fdo#74357) saved as ODT by LibO_Dev-4.2.3.0.0+

I have been using this version of the program:
LibreOfficeDev 4.2.3.0.0+ (Build ID: 5ba682c48e449f30e3cc1ec4acac75a6122ee6d7, TinderBox: Win-x86@42, Branch:libreoffice-4-2, Time: 2014-02-22_23:03:29)
--
1. The DOCX file (attachment 93208) was opened and then saved in ODT format (see the attachment).
2. The ODT file was saved in Rich Text Format (more than 220 MB).
3. I don't see the content of the first eighteen pages when I open this RTF file using Writer.
4. I can see the contents of all pages when I open this RTF file using WinWord-2007.
--
I guess that the RTF import filter does not know how to process a cell containing a large image whose size equals the page size.
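
For repeatable testing, the DOCX to ODT to RTF round trip from steps 1 and 2 could also be scripted. The following is only a minimal sketch, assuming the `soffice` binary is on the PATH and using its `--headless`, `--convert-to` and `--outdir` options. The function names and the example file name are hypothetical, and a headless conversion may not behave exactly like Save As in the GUI.

```python
import subprocess
from pathlib import Path


def convert_command(src, target_format, outdir, soffice="soffice"):
    """Build the headless LibreOffice command that converts src to target_format.

    `soffice` is assumed to be on the PATH; pass a full path otherwise.
    """
    return [
        soffice,
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(outdir),
        str(src),
    ]


def round_trip(docx_path, outdir):
    """Convert DOCX -> ODT -> RTF and return the RTF path and its size in bytes."""
    outdir = Path(outdir)
    stem = Path(docx_path).stem
    odt = outdir / (stem + ".odt")
    rtf = outdir / (stem + ".rtf")
    # Step 1: open the DOCX file and save it as ODT.
    subprocess.run(convert_command(docx_path, "odt", outdir), check=True)
    # Step 2: save the ODT file as RTF.
    subprocess.run(convert_command(odt, "rtf", outdir), check=True)
    return rtf, rtf.stat().st_size


# Hypothetical usage (the file name is a placeholder for the downloaded attachment):
# rtf_path, size = round_trip("attachment-93208.docx", "out")
# print(rtf_path, size / 1e6, "MB")
```

The resulting RTF file can then be opened in Writer and in Word to compare which pages are shown, as in steps 3 and 4.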
Comment 8 ape 2014-02-27 15:36:48 UTC
Part of the regression is fixed in this version:
 LibreOfficeDev-4.2.3.0.0+ (Build ID: 4274001144adeb0b0a1e7da05d52c1bedbe899e5,
 TinderBox: Win-x86@42, Branch: libreoffice-4-2, Time: 2014-02-27_08:31:36).
--
Now Writer shows the first eleven pages of the RTF file (see the URL in comment 1), the same as for the DOC file (see bug 74356).
But
 if you perform these actions:
  open the ODT file (see comment 7; attachment 94633)
  and save this ODT file in RTF format (size ~224 MB),
 then Writer (e.g. Writer_4.2.1.1, _4.0.6) opens the new RTF file fine and shows all contents of all pages.
Comment 9 Joel Madero 2015-05-02 15:42:12 UTC Comment hidden (obsolete)
Comment 10 Buovjaga 2015-06-20 15:24:15 UTC
Opened original docx attachment 93208, saved to odt, saved odt to rtf. Noted huge file size (170 MB) and a borked document.

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: 3ecef8cedb215e49237a11607197edc91639bfcd
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2015-06-19_23:16:58
Locale: fi-FI (fi_FI)
Comment 11 QA Administrators 2016-09-20 10:11:34 UTC Comment hidden (obsolete)
Comment 12 ape 2017-06-09 10:14:32 UTC
Version: 5.4.0.0.beta2
Build ID: 3cc1cdd8ee50f144e5514da51800a08119754d8f
CPU threads: 8; OS: Windows 5.2; UI render: default; 
Locale: ru-RU (ru_RU); Calc: group

The program opens all pages in the document. Bug resolved.
Comment 13 Buovjaga 2017-06-09 10:38:07 UTC
Confirmed it does not lose content anymore. The table contents do overflow to the right, beyond the page boundary, but that is another issue.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.5.0.0.alpha0+
Build ID: 88d3c067831dac8cf69ebaa079f1d809d727a342
CPU threads: 8; OS: Linux 4.11; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on June 7th 2017