Bug 51262 - big rtf file from little odt file
Summary: big rtf file from little odt file
Status: RESOLVED INVALID
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 3.3.4 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Miklos Vajna
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-06-20 08:31 UTC by Roberto Innocenti
Modified: 2012-10-15 11:27 UTC

See Also:
Crash report or crash signature:


Attachments
file of 15MB (834.75 KB, application/x-bzip)
2012-06-20 08:31 UTC, Roberto Innocenti
Original file in docx (447.50 KB, application/x-bzip)
2012-06-23 02:37 UTC, Roberto Innocenti

Description Roberto Innocenti 2012-06-20 08:31:12 UTC
Created attachment 63268
file of 15MB

When I save my file as RTF, with one page of text and a 100x200 pixel image in it, it becomes 15 MB; in ODT format it is 300 KB.
Comment 1 Julien Nabet 2012-06-23 01:09:01 UTC
Could you attach the original file (ODT) so we can try to reproduce the problem?
Comment 2 Roberto Innocenti 2012-06-23 02:31:22 UTC
It's already attached to the bug.

Roberto

Comment 3 Roberto Innocenti 2012-06-23 02:37:23 UTC
Created attachment 63367
Original file in docx

Only the text of that file was changed, and it was then saved as RTF (I don't know which version of OpenOffice or LibreOffice did that).
I opened the RTF, saved it as ODT, opened it again and saved it as RTF again, and it is still 15 MB, the same as the first save from DOCX to RTF.
Comment 4 Julien Nabet 2012-06-23 12:56:06 UTC
The extension of the "Original file in docx" should be tar.gz, not tar.bz

I then reproduced the problem on a Debian x86-64 PC, with up-to-date master sources.
I tried several conversions; here they are with the corresponding sizes (in bytes):
original docx : 472793
docx -> rtf : 15727102
docx -> odt : 325384
docx -> odt -> rtf : 16017100

Finally, I removed the picture:
docx without picture : 472793
docx -> odt :  324209
docx -> odt -> rtf : 9779

So the problem seems to come from how the RTF conversion handles the picture.

Miklós: one for you?
Comment 5 Ronildo Matsuura 2012-08-07 13:19:18 UTC
I am having the same problem! 

old rtf: 82,0KB
save as new rtf and... 3,28MB!!!

I tested v3.5.4, v3.5.5 and v3.6.0.4rc
Comment 6 Mike Kaganski 2012-10-13 13:21:29 UTC
The image dimensions in the attached RTF are not 100x200, they're 1751x444x32BPP, and the image is only visually shrunk to a smaller size.

The image is stored twice in the RTF: once as EMF, once as WMF. Both have a similar size: about 3.1 MB of binary data, which becomes about 7 MB as an ASCII representation in RTF.

So the large file size is inevitable here (as RTF is a "human-readable" ASCII format); the only strange thing is the extra copy of the same picture: omitting it could cut the file size in half.
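
A rough back-of-the-envelope check of these figures, as a minimal sketch. It assumes the picture data is essentially an uncompressed 32-bit bitmap at the dimensions quoted above, and that RTF writes each binary byte as two hexadecimal characters; line breaks, control words and metafile headers are ignored, which is presumably why the reported sizes are somewhat higher. The function names are illustrative only.

```python
# Dimensions quoted in this comment: 1751 x 444 pixels at 32 bits per pixel.
WIDTH, HEIGHT, BITS_PER_PIXEL = 1751, 444, 32


def raw_bitmap_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed bitmap, ignoring any header or row padding."""
    return width * height * bits_per_pixel // 8


def rtf_hex_bytes(binary_size: int) -> int:
    """Assumption: each binary byte is written as two hex characters."""
    return binary_size * 2


raw = raw_bitmap_bytes(WIDTH, HEIGHT, BITS_PER_PIXEL)
one_copy = rtf_hex_bytes(raw)
two_copies = 2 * one_copy  # the picture is stored once as EMF and once as WMF

for label, size in [("raw bitmap", raw),
                    ("one hex copy", one_copy),
                    ("two hex copies", two_copies)]:
    print(f"{label:>15}: {size:>10} bytes ({size / 1e6:.1f} MB)")
```

Under these assumptions the two hex copies alone account for most of the roughly 15 MB file.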

ODT and DOCX are both compressed formats, and the image in this file allows for a high compression ratio, so it's natural that those formats produce files many times smaller.
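
The effect can be illustrated with a small sketch. The buffer below is a hypothetical stand-in (a single 32-bit pixel repeated), not the actual picture from the attachment, and `zlib` is used here only because ODT and DOCX are ZIP containers that normally use Deflate; real images will compress less dramatically.

```python
import zlib

# Hypothetical stand-in for a bitmap with large uniform areas:
# one 32-bit pixel repeated for a 1751 x 444 image.
pixels = bytes([255, 255, 255, 0]) * (1751 * 444)

# Hex text, roughly how RTF would carry the same bytes (two characters per byte).
hex_text = pixels.hex().encode("ascii")

# Deflate-compressed form, roughly what a ZIP-based format would store.
deflated = zlib.compress(pixels, 6)

print("raw bytes:     ", len(pixels))
print("as hex text:   ", len(hex_text))
print("deflated bytes:", len(deflated))
```

The hex form is exactly twice the raw size, while the deflated form of such repetitive data is a small fraction of it.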

And one thing to note: if the file is resaved by MS Wordpad, the image size becomes larger: about 4.5 MB (instead of 3.1), and the RTF becomes about 9 MB (only one image there, no extra EMF copy). So LO does a good job optimizing the image.
Comment 7 Miklos Vajna 2012-10-15 08:40:02 UTC
The copy is there because old readers may not be able to parse compressed formats (JPG or PNG). Can we close this as invalid, please?
Comment 8 Mike Kaganski 2012-10-15 10:33:07 UTC
I vote for closing this bug, as it is not a bug at all.

Open this 15MB RTF in MS Office Word 2010 -> Save as docx -> close it (resulting size is 463 KB); open the docx in Word -> Save as RTF -> resulting file is 16.6 MB.
Comment 9 Miklos Vajna 2012-10-15 11:27:27 UTC
OK, closing as invalid.