Bug 81272 - Libreoffice Is Very Slow Rendering Chinese Characters (because of font fallback?)
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.2.5.2 release (earliest affected)
Hardware: All Linux (All)
Importance: medium major
Assignee: Jonathan Clark
URL:
Whiteboard: target:4.4.0 target:24.8.0 inReleaseN...
Keywords: perf
Depends on:
Blocks: CJK
Reported: 2014-07-12 18:22 UTC by carldong76
Modified: 2024-06-21 14:23 UTC (History)
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
A Very Slow Chinese Document (29.13 KB, application/vnd.oasis.opendocument.text)
2014-07-12 18:22 UTC, carldong76
A Very Fast English Document (24.04 KB, application/vnd.oasis.opendocument.text)
2014-07-12 18:23 UTC, carldong76
A Very Slow Chinese Document (29.13 KB, application/vnd.oasis.opendocument.text)
2014-07-12 18:23 UTC, carldong76
perf flamegraph svg (426.55 KB, application/x-xz)
2020-01-18 02:28 UTC, Kevin Suo

Description carldong76 2014-07-12 18:22:43 UTC
Created attachment 102677
A Very Slow Chinese Document

I tried writing some long documents in Chinese. However, as the number of characters approaches 1000, scrolling becomes so slow that after scrolling once I need to wait several seconds for it to stop. I have created some simple comparison documents for Chinese and English; the English document is very fast.

I am on Funtoo Linux, amd64. The version I use is 4.2.5.2, but the problem also appeared in earlier versions. The attachments are the two simple documents I used for the comparison.
Comment 1 carldong76 2014-07-12 18:23:14 UTC
Created attachment 102678
A Very Fast English Document
Comment 2 carldong76 2014-07-12 18:23:56 UTC
Created attachment 102679
A Very Slow Chinese Document
Comment 3 Kevin Suo 2014-07-13 04:55:55 UTC
Confirmed with LibreOffice 4.3.0.2, Ubuntu 14.04 x86.

Scrolling and editing the Chinese document is very slow compared with the English document.

It may be because of font fallback: a Western font was applied to the Chinese characters. If you set a Chinese font (for example SimSun, WenQuanYi Micro Hei, etc.), it becomes fast again.

Set to NEW, changed summary to reflect the font fallback issue.
Comment 4 carldong76 2014-07-13 15:41:25 UTC
I confirmed that if I set a Chinese font explicitly, it becomes fast.
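
For readers unfamiliar with font fallback, here is a toy model of why text set in a font that lacks the needed glyphs can cost more per character. It is only an illustrative sketch: the `FONTS` coverage tables and `resolve_fonts` are hypothetical, and real fallback in LibreOffice involves text layout and is far more involved than a set lookup.

```python
# Hypothetical coverage tables: font name -> characters it has glyphs for.
FONTS = {
    "Liberation Serif": set("abcdefghijklmnopqrstuvwxyz "),
    "DejaVu Sans": set("abcdefghijklmnopqrstuvwxyz "),
    "Noto Sans CJK SC": set("中文字"),
}


def resolve_fonts(text, primary, fallbacks):
    """Return (font chosen per character, number of coverage lookups)."""
    lookups = 0
    chosen = []
    for ch in text:
        # Try the requested font first, then each fallback in order.
        for name in [primary, *fallbacks]:
            lookups += 1
            if ch in FONTS[name]:
                chosen.append(name)
                break
        else:
            chosen.append(None)  # no font covers this character
    return chosen, lookups


text = "中文字" * 100  # 300 Chinese characters

# Western font requested: every character falls through to the last fallback.
_, western_first = resolve_fonts(
    text, "Liberation Serif", ["DejaVu Sans", "Noto Sans CJK SC"]
)

# Chinese font requested: every character is covered on the first try.
_, cjk_first = resolve_fonts(
    text, "Noto Sans CJK SC", ["Liberation Serif", "DejaVu Sans"]
)

print(western_first, cjk_first)
```

In this model the Western-font case performs three lookups per character and the Chinese-font case one, which matches the direction of the observation above, though not necessarily its magnitude.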
Comment 5 Matthew Francis 2014-08-17 14:55:29 UTC
OSX 10.9.4 / LO 4.3.0.4, 4.4 master:

The "slow" document seems sluggish to me whether or not the text is set explicitly to a Chinese font

If anything it also seems rather slower on 4.4 master than in 4.3.0.4 release

If I open the document and type some "a"s, it is initially slow but responsive as long as I keep typing. However, after stopping, the next letter typed is only processed after a long pause.

I wondered based on this if spellchecking might be involved, but setting the language of the text to "None" doesn't seem to make any difference - unless there is some processing that isn't disabled when this is done
Comment 6 Matthew Francis 2014-09-04 07:20:43 UTC
Poked at this a little with callgrind. There appear to be a number of villains in this case:

1) SwTxtFrm::CollectAutoCmplWrds
Called from beneath SwLayIdle::DoIdleJob()
Workaround: Disable Tools – Autocorrect Options... – Word Completion – Collect words

2) SwTxtFrm::_AutoSpell
Called from beneath SwLayIdle::DoIdleJob()
Workaround: Disable Tools – Automatic Spell Checking

3) SwTxtNode::CountWords
Called from beneath SwLayIdle::DoIdleJob()
Workaround: None found

4) SwTxtNode::CountWords
Called from beneath DocumentStatisticsManager::IncrementalDocStatCalculate
Workaround: None found


Each of these spends a long time dissecting text using SwScanner. In addition, (3) and (4) appear to be counting the same words twice, which compounds the fact that it's a slow operation on a long paragraph.

With all four disabled (commenting code out where necessary), editing the giant paragraph in the text document is merely slow rather than intolerable.
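
To illustrate the double counting in (3) and (4): if two callers each scan the same paragraph, the scan cost is paid twice. The sketch below shows, in principle, how a shared cache would avoid the second scan. It is not how Writer is implemented: `WordCounter` is a hypothetical name, and splitting on whitespace is only a stand-in for the SwScanner word segmentation.

```python
class WordCounter:
    """Counts words once per distinct paragraph text and reuses the result."""

    def __init__(self):
        self._cache = {}
        self.scans = 0  # number of full scans actually performed

    def count(self, text):
        if text not in self._cache:
            self.scans += 1
            # Stand-in for the expensive scan; real CJK text needs a
            # dictionary-based break iterator, not whitespace splitting.
            self._cache[text] = len(text.split())
        return self._cache[text]


counter = WordCounter()
paragraph = "one two three"

idle_job_result = counter.count(paragraph)    # first caller pays for the scan
statistics_result = counter.count(paragraph)  # second caller reuses the result

print(idle_job_result, statistics_result, counter.scans)
```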
Comment 7 Matthew Francis 2014-09-04 10:54:40 UTC
For a paragraph with N continuous characters of Chinese text (e.g. N x "中"), iterating over the paragraph with SwScanner will cause xdictionary::getWordBoundary() to be called N times, each of which will call xdictionary::seekSegment(), which will in turn iterate over each of the N characters

-> N^2 operations

This needs refactoring so seekSegment() doesn't keep doing the same work over and over again
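
The quadratic behaviour can be reproduced with a toy model. This is a minimal Python sketch, not the LibreOffice code: `seek_segment_naive` and `CachedSeeker` are hypothetical stand-ins for xdictionary::seekSegment(), a "segment" is assumed to be a maximal run of CJK characters, and the counter simply tallies characters visited.

```python
def is_cjk(ch):
    # Rough check: CJK Unified Ideographs block only (an assumption here).
    return "\u4e00" <= ch <= "\u9fff"


def seek_segment_naive(text, pos, counter):
    """Find the CJK run containing pos by rescanning outward every time."""
    start = pos
    while start > 0 and is_cjk(text[start - 1]):
        start -= 1
        counter[0] += 1
    end = pos
    while end < len(text) and is_cjk(text[end]):
        end += 1
        counter[0] += 1
    return start, end


class CachedSeeker:
    """Remember the last segment and reuse it while pos stays inside it."""

    def __init__(self):
        self.text = None
        self.segment = None

    def seek(self, text, pos, counter):
        if (
            self.text is text
            and self.segment is not None
            and self.segment[0] <= pos < self.segment[1]
        ):
            return self.segment  # no rescanning needed
        self.text = text
        self.segment = seek_segment_naive(text, pos, counter)
        return self.segment


def count_steps(n):
    """Visit every position of an n-character CJK paragraph both ways."""
    text = "中" * n
    naive = [0]
    for pos in range(n):
        seek_segment_naive(text, pos, naive)
    cached = [0]
    seeker = CachedSeeker()
    for pos in range(n):
        seeker.seek(text, pos, cached)
    return naive[0], cached[0]


print(count_steps(1000))
```

For a 1000-character paragraph the naive version visits 1,000,000 characters in total and the cached version 1,000, which is the N^2 versus N difference described above.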



For Chinese text, the path through from SwScanner to xdictionary goes like this:

    frame #1: 0x0000000110919243 libi18npoollo.dylib`com::sun::star::i18n::xdictionary::seekSegment(this=0x000000010c08a000, rText=0x00007fff5fbfa930, pos=1, segBoundary=0x000000010c08a028) + 115 at xdictionary.cxx:280
    frame #2: 0x00000001109199dd libi18npoollo.dylib`com::sun::star::i18n::xdictionary::getWordBoundary(this=0x000000010c08a000, rText=0x00007fff5fbfa930, anyPos=1, wordType=3, bDirection=true) + 173 at xdictionary.cxx:412
    frame #3: 0x00000001109073e7 libi18npoollo.dylib`com::sun::star::i18n::BreakIterator_CJK::getWordBoundary(this=0x000000011ff93ab8, text=0x00007fff5fbfa930, anyPos=1, nLocale=0x00000001206d7600, wordType=3, bDirection='\x01') + 119 at breakiterator_cjk.cxx:81
    frame #4: 0x000000011090753c libi18npoollo.dylib`non-virtual thunk to com::sun::star::i18n::BreakIterator_CJK::getWordBoundary(this=0x000000011ff93ae0, text=0x00007fff5fbfa930, anyPos=1, nLocale=0x00000001206d7600, wordType=3, bDirection='\x01') + 92 at breakiterator_cjk.cxx:88
    frame #5: 0x000000011090e0e4 libi18npoollo.dylib`com::sun::star::i18n::BreakIteratorImpl::getWordBoundary(this=0x0000000117072b78, Text=0x00007fff5fbfa930, nPos=1, rLocale=0x00000001206d7600, rWordType=3, bDirection='\x01') + 612 at breakiteratorImpl.cxx:182
    frame #6: 0x000000011090e1dc libi18npoollo.dylib`non-virtual thunk to com::sun::star::i18n::BreakIteratorImpl::getWordBoundary(this=0x0000000117072ba0, Text=0x00007fff5fbfa930, nPos=1, rLocale=0x00000001206d7600, rWordType=3, bDirection='\x01') + 92 at breakiteratorImpl.cxx:186
    frame #7: 0x00000001184e85d0 libswlo.dylib`SwScanner::NextWord(this=0x00007fff5fbfa918) + 1296 at txtedt.cxx:836
Comment 8 Commit Notification 2014-09-10 14:11:57 UTC
Matthew J. Francis committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a34a8fca21c670c4e7ee147d05ed9e6e4136cbe1

fdo#81272 Speed up break iterators



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 9 Commit Notification 2014-09-10 15:49:52 UTC
Caolan McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=997d1387abcfa40eca8d15a2fe025edc4a1de040

Revert "fdo#81272 Speed up break iterators"



Comment 10 Commit Notification 2014-09-10 20:03:37 UTC
Matthew J. Francis committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=44ead04eb5fc61a3f56f783adb1509fab440e212

fdo#81272 Speed up break iterators



Comment 11 Caolán McNamara 2014-09-10 20:04:25 UTC
caolanm->fdbugs: Can we call this fixed now ?
Comment 12 Matthew Francis 2014-09-11 03:57:02 UTC
Two concerns with closing at this point:

1) The reporter had a slightly different symptom to the problem that's just been patched - that the slowdown was dependent on fonts. I never reproduced that on OSX; the reported issue may not be identical.

2) After the patch, performance with CJK text is much better in the 10,000 character range, but still poor with a few times that (30k or above is still appreciably slow on my local machine). This compares poorly with a paragraph of the same number of western text characters.


I have a few ideas on possibilities for further improving performance, but not much that's as simple as the first patch...
Comment 13 Kevin Suo 2015-01-12 14:36:42 UTC
(In reply to Matthew Francis from comment #12)
I confirm that this bug still exists in the following version:
Version: 4.4.0.2
Build ID: a3603970151a6ae2596acd62b70112f4d376b990
Locale: zh_CN
Fedora 22 X64.

Steps to reproduce:
1. Open attachment 102679 with Writer;
2. Try to delete some chars using the BACKSPACE, or try to type in some text.
--> Very slow.
Comment 14 Matthieu 2015-12-26 08:28:35 UTC
(In reply to Matthew Francis from comment #12)

I confirm that I am affected by this bug too, in this version and following the same steps as Kevin Suo:
Version: 4.4.7.2
Build ID: 4.4.7.2-1.fc22
Locale: en_US.UTF-8
Fedora 22 x64
Comment 15 QA Administrators 2017-01-03 19:55:26 UTC Comment hidden (obsolete)
Comment 16 Alex Thurgood 2018-11-07 10:41:06 UTC
Still reproducible with

Version: 6.2.0.0.alpha1+
Build ID: 740b99783b5480fcd1e5fce7c1beb5967d015041
CPU threads: 4; OS: Mac OS X 10.14.1; UI render: default; VCL: osx; 
Locale: fr-FR (fr_FR.UTF-8); Calc: threaded

Just deleting characters from the text document on MacOS Mojave using the backspace key is terribly slow.
Comment 17 Kevin Suo 2020-01-18 02:28:16 UTC
Created attachment 157238
perf flamegraph svg
Comment 18 Roman Kuznetsov 2021-09-21 18:14:29 UTC
I don't see the problem in

Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: a9cc066a86c6bd3423c5802c5a4eded55a50c754
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: en-US
Calc: threaded

Alex, Kevin, could you please retest it on your machines?
Comment 19 Philipp Weissenbacher 2022-02-25 10:09:36 UTC
I retested the attachment "A Very Slow Chinese Document" in

Version: 7.3.0.3 / LibreOffice Community
Build ID: 0f246aa12d0eee4a0f7adcefbf7c878fc2238db3
CPU threads: 8; OS: Mac OS X 12.2.1; UI render: Skia/Metal; VCL: osx
Locale: en-US (en_NL.UTF-8); UI: en-US
Calc: threaded

on a

Mac mini (M1, 2020), 16GB of RAM, on macOS 12.2.1

- Deleting, adding, scrolling and searching for characters is as fast as can be expected

- If I search for "字" and enable "Find All", it takes roughly 3 seconds for the results to appear.
Comment 20 Kevin Suo 2022-02-25 10:53:33 UTC
WORKSFORME in
Version: 7.4.0.0.alpha0+ / LibreOffice Community
Build ID: 02e1be8883a08ab17f3e890a834ab88f13c5867d
CPU threads: 8; OS: Linux 5.16; UI render: default; VCL: gtk3
Locale: zh-CN (zh_CN.UTF-8); UI: zh-CN
Build Platform: Fedora34@X64, Branch:master, bibisect-linux-64-7.4-CN
Calc: threaded

Close as RESOLVED WORKSFORME.
Comment 21 Peter Nowee 2022-10-23 16:19:39 UTC
Sorry for the late reply, but I am definitely still seeing this bug.

To reproduce:
1. Tools -> Options -> LibreOffice Writer -> Basic Fonts (Asian)
   Set the Default (top row) to a Western font, for example 
   Liberation Serif or DejaVu Sans.
   (Alternatively, set it to an Asian font, close LibreOffice and
   uninstall that Asian font from your computer.)
2. Open attachment 102679 with Writer;
3. In the middle of the first page, try to delete some chars using
   the BACKSPACE, or try to type in some text. Slow to Very slow.
4. Search for "字" and enable "Find All". Very slow, even unresponsive
   for a while.

To show that the (fallback) font is the problem:
5.a. Change the font of the text to an Asian font, for example
     Noto Sans CJK SC or AR PL UKai TW.
or:
5.b. Tools -> Options -> LibreOffice Writer -> Basic Fonts (Asian)
     Set the Default (top row) to an Asian font, for example
     Noto Sans CJK SC or AR PL UKai TW.

Now steps 3 and 4 become normal speed (fast) again.

Version: 7.4.1.2 / LibreOffice Community
Build ID: 40(Build:2)
CPU threads: 2; OS: Linux 5.10; UI render: default; VCL: x11
Locale: en-US (en_US.UTF-8); UI: en-US
Debian package version: 1:7.4.1-1~bpo11+2
Calc: threaded
Comment 22 Commit Notification 2024-05-29 07:38:10 UTC
Jonathan Clark committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/0b6a07f07dd05d0db4ddeedb9b112e26b5fd5eb5

tdf#81272 Improved CJK fallback rendering performance

It will be available in 24.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 23 Jonathan Clark 2024-05-29 08:01:36 UTC
For this fix, I focused on the specific issue that was originally reported: slow Chinese text rendering due to font fallback, which resolved when a Chinese font was applied to the text.

I was able to reproduce this issue on my machine.

Using a test document and converting to PDF on the command line, I measured the following:
Without a Chinese font set - 6.3s
With a Chinese font set - 2.2s

After the change, I measured:
Without a Chinese font set - 2.8s
With a Chinese font set - 2.2s

Or, an ~85% reduction in overhead specifically attributable to font fallback. While it's always possible to do more, profiler data shows that font fallback is no longer the standout source of overhead, so it might be better to focus further performance work elsewhere.
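
For reference, the ~85% figure follows from the four timings if the run with a Chinese font set is treated as the baseline and anything above it as fallback overhead. A short sketch of that arithmetic (the function name is only illustrative):

```python
def fallback_overhead_reduction(before_no_font, after_no_font, with_font):
    """Fraction of the fallback overhead removed by the change."""
    overhead_before = before_no_font - with_font  # 6.3 - 2.2 = 4.1 s
    overhead_after = after_no_font - with_font    # 2.8 - 2.2 = 0.6 s
    return (overhead_before - overhead_after) / overhead_before


reduction = fallback_overhead_reduction(6.3, 2.8, 2.2)
print(f"{reduction:.0%}")  # prints "85%"
```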

Based on this font fallback performance improvement, I am marking this bug fixed.