Bug 77536

Summary: huge cpu / memory consumption when importing pdf
Product: LibreOffice
Reporter: Riccardo Magliocchetti <riccardo.magliocchetti>
Component: filters and storage
Assignee: Not Assigned <libreoffice-bugs>
Status: RESOLVED FIXED
Severity: normal
CC: jmadero.dev
Priority: medium
Version: 4.2.3.3 release
Hardware: x86 (IA32)
OS: Linux (All)
Whiteboard:
Crash report or crash signature:
Regression By:

Description Riccardo Magliocchetti 2014-04-16 18:29:55 UTC
22,19%  soffice.bin  libpdfimportlo.so           [.] boost::shared_ptr<boost::spirit::fileiter_impl::mmap_file_iterator<char>::mapping>::operator=(boost::shared_ptr<boost::spirit::fileiter_impl::mmap_file_iterator<char>::mapp
 10,86%   xpdfimport  xpdfimport                  [.] void std::vector<unsigned char, std::allocator<unsigned char> >::emplace_back<unsigned char>(unsigned char&&)
 10,42%  soffice.bin  libpdfimportlo.so           [.] boost::detail::sp_counted_base::release()
  7,83%   xpdfimport  libpoppler.so.44.0.0        [.] GfxImageColorMap::getRGB(unsigned char*, GfxRGB*)
  7,74%   xpdfimport  xpdfimport                  [.] pdfi::writePpm_(std::vector<unsigned char, std::allocator<unsigned char> >&, Stream*, int, int, GfxImageColorMap*, bool)
  7,69%   xpdfimport  libpoppler.so.44.0.0        [.] FlateStream::getChars(int, unsigned char*)
  5,40%  soffice.bin  libpdfimportlo.so           [.] boost::spirit::impl::concrete_parser<boost::spirit::action<boost::spirit::sequence<boost::spirit::sequence<boost::spirit::strlit<char const*>, boost::spirit::kleene_star<bo
  4,96%   xpdfimport  libpoppler.so.44.0.0        [.] FlateStream::readSome()
  4,84%   xpdfimport  libpoppler.so.44.0.0        [.] GfxDeviceRGBColorSpace::getRGB(GfxColor*, GfxRGB*)
  2,34%   xpdfimport  libc-2.18.so                [.] __memmove_ssse3
  2,15%  soffice.bin  libpdfimportlo.so           [.] boost::spirit::match<boost::spirit::nil_t> boost::spirit::impl::contiguous_parser_parse<boost::spirit::match<boost::spirit::nil_t>, boost::spirit::chseq<char const*>, boost
  2,03%   xpdfimport  xpdfimport                  [.] __x86.get_pc_thunk.bx
  1,12%   xpdfimport  [kernel.kallsyms]           [k] clear_page_c
  0,85%  soffice.bin  [kernel.kallsyms]           [k] copy_user_generic_string
  0,82%  soffice.bin  libc-2.18.so                [.] __memset_sse2
  0,76%   xpdfimport  [kernel.kallsyms]           [k] copy_user_generic_string
  0,68%   xpdfimport  xpdfimport                  [.] 0x00001810
  0,54%  soffice.bin  libc-2.18.so                [.] isspace
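
To see how the samples split between the two processes, the overhead column can be summed per command. Below is a minimal Python sketch, assuming the perf rows are available as plain text with comma decimal separators as in the listing above. `sum_by_command` is a hypothetical helper name, and the symbols in the sample are abbreviated for readability.

```python
import re
from collections import defaultdict

# One `perf report` row: overhead, command, shared object, symbol type, symbol.
ROW = re.compile(r"^\s*(\d+[.,]\d+)%\s+(\S+)\s+(\S+)\s+\[(.)\]\s+(.*)$")


def sum_by_command(report: str) -> dict:
    """Sum the overhead percentages of a perf report per command (process)."""
    totals = defaultdict(float)
    for line in report.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # skip anything that is not a perf row
        # This report uses a comma as decimal separator (locale-dependent).
        totals[m.group(2)] += float(m.group(1).replace(",", "."))
    return {cmd: round(pct, 2) for cmd, pct in totals.items()}


# First four rows of the listing, with symbols abbreviated.
sample = """\
22,19%  soffice.bin  libpdfimportlo.so     [.] boost::shared_ptr<...>::operator=(...)
 10,86%   xpdfimport  xpdfimport            [.] std::vector<unsigned char>::emplace_back(...)
 10,42%  soffice.bin  libpdfimportlo.so     [.] boost::detail::sp_counted_base::release()
  7,83%   xpdfimport  libpoppler.so.44.0.0  [.] GfxImageColorMap::getRGB(unsigned char*, GfxRGB*)
"""

print(sum_by_command(sample))
```

Summed by hand over the full listing above, the rows add up to about 42% for soffice.bin and about 51% for xpdfimport, so both the importer library and the external xpdfimport helper contribute substantially.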
Comment 1 Riccardo Magliocchetti 2014-04-16 18:45:37 UTC
[Submitted by mistake while trying to close bug title autocompletion]

Unfortunately I cannot share the PDF. It was exported by LibreOffice from a pptx that is itself slow to load.

Here's the perf output for the pptx:

  5,15%  soffice.bin  libc-2.18.so                [.] _int_malloc
  5,05%  soffice.bin  libc-2.18.so                [.] _int_free
  4,02%  soffice.bin  libc-2.18.so                [.] malloc
  2,50%  soffice.bin  libharfbuzz.so.0.927.0      [.] 0x00015fa3
  2,33%  soffice.bin  libuno_sal.so.3             [.] rtl_uString_release
  2,19%  soffice.bin  libsvllo.so                 [.] SfxItemSet::GetItemState(unsigned short, unsigned char, SfxPoolItem const**) const
  2,02%  soffice.bin  libfontconfig.so.1.8.0      [.] FcCompareFamily
  1,66%  soffice.bin  libfontconfig.so.1.8.0      [.] FcCompareValueList
  1,45%  soffice.bin  libfontconfig.so.1.8.0      [.] FcStrCaseWalkerNext.part.3
  1,39%  soffice.bin  [kernel.kallsyms]           [k] read_hpet
  1,30%  soffice.bin  libc-2.18.so                [.] malloc_consolidate
  1,24%  soffice.bin  libsvllo.so                 [.] SfxItemSet::Get(unsigned short, unsigned char) const
  1,14%  soffice.bin  libuno_sal.so.3             [.] 0x00019e60
  1,07%  soffice.bin  libsvllo.so                 [.] SfxItemPool::Put(SfxPoolItem const&, unsigned short)
  1,05%  soffice.bin  libuno_sal.so.3             [.] rtl_uString_assign
  1,01%  soffice.bin  libsvllo.so                 [.] SfxItemPool::GetDefaultItem(unsigned short) const
  0,82%  soffice.bin  libfontconfig.so.1.8.0      [.] FcStrSetMember
  0,79%  soffice.bin  libuno_sal.so.3             [.] rtl_uString_acquire
  0,73%  soffice.bin  libc-2.18.so                [.] __strchr_sse2_bsf
  0,67%  soffice.bin  libpthread-2.18.so          [.] pthread_mutex_lock
  0,66%  soffice.bin  libeditenglo.so             [.] __x86.get_pc_thunk.bx
  0,61%  soffice.bin  libvcllo.so                 [.] __x86.get_pc_thunk.bx
  0,58%  soffice.bin  libc-2.18.so                [.] __x86.get_pc_thunk.bx
  0,58%  soffice.bin  libeditenglo.so             [.] ImpEditEngine::SeekCursor(ContentNode*, unsigned short, SvxFont&, OutputDevice*, unsigned short)
  0,57%  soffice.bin  libsvllo.so                 [.] SfxItemPool::IsInRange(unsigned short) const
  0,53%  soffice.bin  libstdc++.so.6.0.20         [.] operator new(unsigned int)
  0,52%  soffice.bin  libfontconfig.so.1.8.0      [.] FcConfigCompareValue
  0,52%  soffice.bin  libc-2.18.so                [.] free
  0,51%  soffice.bin  libc-2.18.so                [.] __memset_sse2
  0,51%  soffice.bin  libsvllo.so                 [.] SfxWhichIter::NextWhich()
  0,51%  soffice.bin  libc-2.18.so                [.] __memcpy_ssse3
  0,50%  soffice.bin  libstdc++.so.6.0.20         [.] 0x00062083
  0,50%  soffice.bin  libutllo.so                 [.] GetEnglishSearchFontName(rtl::OUString&)
  0,50%  soffice.bin  libsvllo.so                 [.] SfxItemSet::Set(SfxItemSet const&, unsigned char)
  0,49%  soffice.bin  libpthread-2.18.so          [.] __pthread_mutex_unlock_usercnt
  0,49%  soffice.bin  libuno_sal.so.3             [.] rtl_uString_newFromStr
  0,48%  soffice.bin  libsvllo.so                 [.] SfxItemSet::~SfxItemSet()
  0,47%  soffice.bin  libeditenglo.so             [.] CreateFont(SvxFont&, SfxItemSet const&, bool, short)
  0,46%  soffice.bin  libfontconfig.so.1.8.0      [.] FcCompare
  0,46%  soffice.bin  libsvllo.so                 [.] __x86.get_pc_thunk.bx
  0,46%  soffice.bin  libsvllo.so                 [.] SfxItemPool::Remove(SfxPoolItem const&)
  0,44%  soffice.bin  libeditenglo.so             [.] ImpEditEngine::CreateLines(long, unsigned long)
  0,43%  soffice.bin  libstdc++.so.6.0.20         [.] operator delete(void*)
  0,41%  soffice.bin  libsvllo.so                 [.] SfxItemSet::SfxItemSet(SfxItemSet const&)
  0,41%  soffice.bin  libfontconfig.so.1.8.0      [.] FcStrCmpIgnoreCaseAndDelims
Comment 2 Riccardo Magliocchetti 2014-04-16 18:55:21 UTC
By huge CPU / memory consumption I mean that my machine with 3 GB of RAM starts swapping and GNOME Shell freezes. The PDF is 33 MB.

The pptx takes around 30 seconds to load and is 1.5MB in size.
Comment 3 Philipp Weissenbacher 2014-07-09 12:52:11 UTC
Hi,

Can you download the latest Fresh version and take a look if it's still an issue?

http://www.libreoffice.org/download/libreoffice-fresh/?type=win-x86&version=4.3.0&lang=en-GB
Comment 4 Riccardo Magliocchetti 2014-07-09 19:06:56 UTC
(In reply to comment #3)
> Hi,
> 
> Can you download the latest Fresh version and take a look if it's still an
> issue?
> 
> http://www.libreoffice.org/download/libreoffice-fresh/?type=win-x86&version=4.3.0&lang=en-GB

I will give 4.3.0 a try when it is out. Thanks for reminding.
Comment 5 Riccardo Magliocchetti 2014-07-19 16:49:38 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Hi,
> > 
> > Can you download the latest Fresh version and take a look if it's still an
> > issue?
> > 
> > http://www.libreoffice.org/download/libreoffice-fresh/?type=win-x86&version=4.3.0&lang=en-GB
> 
> I will give 4.3.0 a try when it is out. Thanks for reminding.

Reproduced with 4.3.0-rc3
Comment 6 Joel Madero 2014-07-19 16:59:49 UTC
We need a test PDF - without one, neither QA nor the devs have enough to work on the problem.

Marking as NEEDINFO - if the file is confidential, I recommend altering it so that it can be shared (replace all characters with X or something like that). Then mark the bug as UNCONFIRMED.
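
One way to do the "replace all characters with X" step on the pptx source is sketched below in Python. It is a rough sketch under assumptions, not a vetted anonymizer: it only masks text runs (`<a:t>` elements) in the slide XML parts under `ppt/slides/`, and leaves images, notes, and metadata untouched. `mask_text` and `anonymize_pptx` are hypothetical helper names.

```python
import io
import re
import zipfile

# Text content of a DrawingML text run: <a:t>...</a:t>
TEXT_RUN = re.compile(r"(<a:t(?:\s[^>]*)?>)(.*?)(</a:t>)", re.DOTALL)


def mask_text(xml: str) -> str:
    """Replace every non-whitespace character inside <a:t> runs with 'X'."""
    def repl(m):
        return m.group(1) + re.sub(r"\S", "X", m.group(2)) + m.group(3)
    return TEXT_RUN.sub(repl, xml)


def anonymize_pptx(src: bytes) -> bytes:
    """Return a copy of the pptx (a zip archive) with slide text masked."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src)) as zin, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            name = item.filename
            if name.startswith("ppt/slides/") and name.endswith(".xml"):
                data = mask_text(data.decode("utf-8")).encode("utf-8")
            # All other parts (images, layouts, ...) are copied unchanged.
            zout.writestr(item, data)
    return out.getvalue()
```

Usage would be along the lines of `open("masked.pptx", "wb").write(anonymize_pptx(open("original.pptx", "rb").read()))`, followed by checking that the masked file still reproduces the problem before attaching it.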
Comment 7 Riccardo Magliocchetti 2014-07-19 17:43:26 UTC
Here [1] you can find a file that, once converted to PDF, hopefully exhibits the same behaviour. I say hopefully because I am not even able to export it to PDF: my session gets killed by the OOM killer. If someone could convert it to PDF and try to reproduce, it would be much appreciated.

[1] http://people.freedesktop.org/~rm/libo/strata.pptx
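
For anyone attempting the conversion, a memory cap can keep a runaway export from taking the whole session down. The following is a minimal Python sketch, assuming a Linux system with `soffice` on the PATH. `build_convert_command` and `run_with_memory_cap` are hypothetical helper names, and the 2 GiB figure is an arbitrary example value. The limit is an address-space limit (`RLIMIT_AS`) inherited by child processes and applied to each process separately.

```python
import resource
import subprocess


def build_convert_command(src: str, outdir: str = ".") -> list:
    """Command line for a headless pptx -> pdf conversion with soffice."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", outdir, src]


def run_with_memory_cap(cmd: list, max_bytes: int) -> int:
    """Run cmd with a per-process address-space limit; return its exit code."""
    def limit():
        # Runs in the child just before exec; allocations beyond the cap
        # fail inside the process instead of triggering the OOM killer.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))
    return subprocess.run(cmd, preexec_fn=limit).returncode


if __name__ == "__main__":
    cmd = build_convert_command("strata.pptx", "/tmp")
    # Example cap of 2 GiB (an assumption; adjust to the machine).
    print(run_with_memory_cap(cmd, 2 * 1024 ** 3))
```

With the cap in place a failing conversion should end with a non-zero exit code or a crash of soffice itself, rather than with the desktop session being killed.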
Comment 8 Riccardo Magliocchetti 2014-10-27 15:04:53 UTC
Good news! With the Debian-built libo 1:4.4.0~alpha1-2 I can convert from pptx to pdf and back from pdf to pptx, so I am closing this as RESOLVED FIXED.

In a previous conversation, mst suggested that this may be the commit that fixed the issue:
cgit.freedesktop.org/libreoffice/core/commit/?id=0ca0202a0994c0b7c99c366fd5cafd8a655df203