Bug 108703 - LibO is trying to access OpenOffice\dict_word.brk & dict_word_en.brk when scrolling through a document
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 6.0.0.0.alpha0+ (earliest affected)
Hardware: All Windows (All)
Importance: medium trivial
Assignee: Eike Rathke
URL:
Whiteboard: target:6.0.0 target:5.4.3
Keywords: bibisected, bisected, regression
Duplicates: 111078
Depends on:
Blocks: Too-Much-File-Access
Reported: 2017-06-22 20:22 UTC by Telesto
Modified: 2017-10-13 20:04 UTC (History)

Description Telesto 2017-06-22 20:22:25 UTC
Description:
LibO is trying to access OpenOffice\dict_word.brk & dict_word_en.brk when scrolling through a document. 

Steps to Reproduce:
1. Open attachment 133938
2. Enable spell-check
3. Use Process Monitor with the filter set to soffice.bin
4. Scroll the document up and down

Actual Results:  
LibO is trying to access OpenOffice\dict_word.brk & dict_word_en.brk

Expected Results:
LibO should probably not access these files; the OpenOffice path seems a bit outdated to me


Reproducible: Always

User Profile Reset: No

Additional Info:
Found in
Version: 6.0.0.0.alpha0+
Build ID: 18f513145477d4621290253d936dad7a40ee4c05
CPU threads: 4; OS: Windows 6.19; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-06-21_06:40:38
Locale: nl-NL (nl_NL); Calc: CL

but not in
5.4.0.0b2

icuuc59.dll	udata_getRawMemory_59 + 0x55
icuuc59.dll	icu_59::CStr::operator() + 0x448
icuuc59.dll	icu_59::CStr::operator() + 0xcbc
icuuc59.dll	udata_open_59 + 0x5c
i18npoollo.dll	com_sun_star_i18n_BreakIterator_get_implementation + 0x1982
i18npoollo.dll	com_sun_star_i18n_BreakIterator_get_implementation + 0x13f0
i18npoollo.dll	i18npoollo.dll + 0x6dfc
editenglo.dll	SfxEnumItem<enum SvxFrameDirection>::GetValue + 0x4c02
editenglo.dll	SvxTabStop::GetFill + 0xf72d
editenglo.dll	EditEngine::CompleteOnlineSpelling + 0x41
<unknown>	0x77a41e29
vcllo.dll	Timer::Invoke + 0xd
vcllo.dll	Application::Execute + 0x18e
sofficeapp.dll	sofficeapp.dll + 0xd626
vcllo.dll	DeInitVCL + 0xb5a
vcllo.dll	SVMain + 0x29
sofficeapp.dll	soffice_main + 0x79
soffice.bin	soffice.bin + 0x1021
KERNEL32.DLL	BaseThreadInitThunk + 0xe
ntdll.dll	RtlInitializeExceptionChain + 0x85
ntdll.dll	RtlInitializeExceptionChain + 0x58



User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
Comment 1 Buovjaga 2017-06-29 11:22:29 UTC
Repro.

Version: 6.0.0.0.alpha0+ (x64)
Build ID: e0f67add2ec56706ce06a03572535266f21c0303
CPU threads: 4; OS: Windows 6.19; UI render: default; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2017-06-27_23:04:56
Locale: fi-FI (fi_FI); Calc: group
Comment 2 Telesto 2017-08-01 09:20:23 UTC
Similar behavior can be observed in Writer:
1. Open Writer
2. Tools -> Autocorrect -> Autocorrect Options
3. Select a language and monitor the accessed files. OpenOffice\dict_word.brk & dict_word_en.brk will be accessed multiple times.
Comment 3 Telesto 2017-08-12 07:17:22 UTC
It could be related to the last two commits:
https://opengrok.libreoffice.org/history/core/i18npool/CustomTarget_breakiterator.mk
Comment 4 Buovjaga 2017-08-12 07:46:51 UTC
(In reply to Telesto from comment #3)
> It could be related to the last two commits:
> https://opengrok.libreoffice.org/history/core/i18npool/CustomTarget_breakiterator.mk

That is a makefile. We would need to find the codepath accessing the brks.
Comment 5 Mike Kaganski 2017-09-18 09:29:26 UTC
The code is in BreakIterator_Unicode::loadICUBreakIterator() - see i18npool/source/breakiterator/breakiterator_unicode.cxx
Comment 6 Mike Kaganski 2017-09-18 11:29:33 UTC
/cygdrive/d/sources/bibisect-win32-5.4
$ git bisect log
# bad: [ce4dd90e7ca9dbdd95cd371173de6fc199859a4d] source sha:f200d5700782ae179fd96b6ad4b0fe8e7edd1616
# good: [633bfe84509c1953415e5dd0f564098a16890131] source sha:4136757b4e51c4e6f7cb4132c95538a7f831ef2c
git bisect start 'master' 'oldest'
# good: [b0dbbec4cf8fe5d5e886cce07fd4f377e4f2559e] source sha:c2b1336b7b2fbec0172c09e247593bd43320f5fd
git bisect good b0dbbec4cf8fe5d5e886cce07fd4f377e4f2559e
# good: [b0dbbec4cf8fe5d5e886cce07fd4f377e4f2559e] source sha:c2b1336b7b2fbec0172c09e247593bd43320f5fd
git bisect good b0dbbec4cf8fe5d5e886cce07fd4f377e4f2559e
# good: [92bbe846cf3ea2aac2b05cdf4a4d2db61b9f555e] source sha:975440b9189602b5a10059d892cb09e6849148f7
git bisect good 92bbe846cf3ea2aac2b05cdf4a4d2db61b9f555e
# bad: [825d9d3aeb62f626969e464f4078b0146436e4dc] source sha:3965f4cb28676133dc37a926e56b4d612e2a57ba
git bisect bad 825d9d3aeb62f626969e464f4078b0146436e4dc
# bad: [eb3522634054ad4468092ed913695c9f95520ca0] source sha:52c8f47e8774304d207ef15c010b204ead291077
git bisect bad eb3522634054ad4468092ed913695c9f95520ca0
# good: [27a5d811a7277ece30d85638624aaf8189c39db9] source sha:b7324ecbf36aae49627d5a5ff250a94de3abc4aa
git bisect good 27a5d811a7277ece30d85638624aaf8189c39db9
# good: [c15f20a098368f7c75c766cfcd4c88f6d7f374eb] source sha:41f5c11c3b5f5b57f480dd809b850fe563b53691
git bisect good c15f20a098368f7c75c766cfcd4c88f6d7f374eb
# good: [d84f74bc7f8b2f4b1cf10a608bb9019e5b367881] source sha:1b471124df251011b0053900cb82ceb0f3d8be86
git bisect good d84f74bc7f8b2f4b1cf10a608bb9019e5b367881
# bad: [3fec25994b18f379577d1943334801c359977535] source sha:e5b1c5374464f6f86b3d331fb89c0e126008136a
git bisect bad 3fec25994b18f379577d1943334801c359977535
# bad: [9112df67eeffe2e6e6a8fdd57a304fb9561758bf] source sha:7c4c9947b8e52ce67af1ab131ed583a41f0ddbfa
git bisect bad 9112df67eeffe2e6e6a8fdd57a304fb9561758bf
# bad: [736f2e8f9636983e01543b69e2a13a22cf2d9e94] source sha:437105b940d997d742bd5e31cfa0ce4b949b29f2
git bisect bad 736f2e8f9636983e01543b69e2a13a22cf2d9e94
# bad: [eeb91e595170c71795c3e004ab70b35412d6f6df] source sha:6d187d88829fc4cbf8400636f17c4e2a684e2117
git bisect bad eeb91e595170c71795c3e004ab70b35412d6f6df
# bad: [5bc3ad4c9bf310826254e66e1ad2c0ae2cefa7a2] source sha:55c5b27bd683a7c36f07c1be781d8baad30b4571
git bisect bad 5bc3ad4c9bf310826254e66e1ad2c0ae2cefa7a2
# bad: [e55fb0c2766464b78871292469367ad26dab7361] source sha:fabad007c60958f2ff87e8f636ff6a798ad1f963
git bisect bad e55fb0c2766464b78871292469367ad26dab7361
# first bad commit: [e55fb0c2766464b78871292469367ad26dab7361] source sha:fabad007c60958f2ff87e8f636ff6a798ad1f963

https://cgit.freedesktop.org/libreoffice/core/commit/?id=fabad007c60958f2ff87e8f636ff6a798ad1f963

author	Eike Rathke <erack@redhat.com>	2017-04-21 23:24:19 (GMT)
committer	Eike Rathke <erack@redhat.com>	2017-04-26 17:48:18 (GMT)
commit	fabad007c60958f2ff87e8f636ff6a798ad1f963
tree	cc7b5c588235f251b71738b128bfaa0b3ce291b0
parent	1b471124df251011b0053900cb82ceb0f3d8be86
Upgrade to ICU 59.1
Comment 7 Eike Rathke 2017-09-18 15:57:12 UTC
Where's the problem?
Comment 8 Eike Rathke 2017-09-18 16:00:26 UTC
Or rather, *what* is the problem?
Comment 9 Buovjaga 2017-09-18 16:17:31 UTC
(In reply to Eike Rathke from comment #8)
> Or rather, *what* is the problem?

The location program\OpenOffice does not exist and the accesses are superfluous.
Comment 10 Buovjaga 2017-09-18 16:19:06 UTC
(In reply to Buovjaga from comment #9)
> (In reply to Eike Rathke from comment #8)
> > Or rather, *what* is the problem?
> 
> The location program\OpenOffice does not exist and the accesses are
> superfluous.

More precisely, it tries to create the files and the result is: PATH NOT FOUND in Process monitor
Comment 11 Mike Kaganski 2017-09-18 18:12:55 UTC
(In reply to Eike Rathke from comment #7)
> Where's the problem?
> Or rather, *what* is the problem?

Prior to that commit, scrolling in Calc caused no disk I/O. Starting with this commit, and up to current master, there is massive disk I/O from attempts to open those non-existent files. We don't (and didn't) ship those files, so something must have changed that now triggers calls to BreakIterator_Unicode::loadICUBreakIterator() which were not made before. The "OpenOffice" name is hardcoded in that method.

This I/O is thought to be one of the possible causes of the drastic UI performance regression on Windows in 5.4.
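
For illustration only, the paths seen in Process Monitor can be thought of as a base directory, the hardcoded "OpenOffice" name, a rule name and an optional language suffix. The helper below is a hypothetical sketch of that composition; breakRulePath and its parameters are made up for this example and are not taken from breakiterator_unicode.cxx or from ICU.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not LibreOffice or ICU code): joins a base directory,
// the hardcoded "OpenOffice" name, a rule name and an optional language
// suffix into the kind of path reported in this bug.
std::string breakRulePath(const std::string& programDir,
                          const std::string& rule,
                          const std::string& lang)
{
    std::string path = programDir + "\\OpenOffice\\" + rule;
    if (!lang.empty())
        path += "_" + lang;   // e.g. dict_word -> dict_word_en
    return path + ".brk";
}
```

With "program" as the base directory, this yields program\OpenOffice\dict_word.brk and program\OpenOffice\dict_word_en.brk, the two names from the bug title.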
Comment 12 Eike Rathke 2017-09-25 17:35:18 UTC
(In reply to Mike Kaganski from comment #11)
> The I/O is thought as one of possible causes of drastic UI performance
> regression on Windows in 5.4.

That can't be, because the bisect above points to the upgrade to ICU 59.1, which we don't have in 5.4 (it was reverted because Vista couldn't be supported then).
Comment 13 Eike Rathke 2017-09-25 18:37:57 UTC
Investigating.
Comment 14 Eike Rathke 2017-09-25 18:42:01 UTC
*** Bug 111078 has been marked as a duplicate of this bug. ***
Comment 15 Commit Notification 2017-09-28 10:54:03 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=0f6b9eca2007629e61d3ab9cdf2a616a33cbdefe

i18n-perf: cache map of breakiterators, tdf#108703

It will be available in 6.0.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
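
The commit subject mentions caching a map of break iterators. The following is a minimal sketch of that general idea, not the code from the commit: LoadedRules, loadRulesSlow, getRulesCached and g_loadCount are hypothetical names, and the sketch assumes single-threaded use. The expensive load path (where the file probing would happen) runs once per distinct rule/locale pair, and later requests are served from the map.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a loaded break iterator; not a LibreOffice or ICU type.
struct LoadedRules {
    std::string rule;
    std::string locale;
};

// Counts how often the expensive load path runs (instrumentation for this sketch).
static int g_loadCount = 0;

// Hypothetical expensive loader: stands in for the code path that probes the disk.
std::shared_ptr<LoadedRules> loadRulesSlow(const std::string& rule,
                                           const std::string& locale)
{
    ++g_loadCount;
    return std::make_shared<LoadedRules>(LoadedRules{rule, locale});
}

// Cache keyed on (rule, locale): the slow path runs once per distinct key.
// Assumption: single-threaded use; a real implementation would need locking.
std::shared_ptr<LoadedRules> getRulesCached(const std::string& rule,
                                            const std::string& locale)
{
    static std::map<std::pair<std::string, std::string>,
                    std::shared_ptr<LoadedRules>> cache;
    const auto key = std::make_pair(rule, locale);
    const auto it = cache.find(key);
    if (it != cache.end())
        return it->second;          // cache hit: no load, no disk access
    auto loaded = loadRulesSlow(rule, locale);
    cache.emplace(key, loaded);
    return loaded;
}
```

Requesting ("dict_word", "en") twice returns the same object and triggers a single load; a different locale triggers one more.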
Comment 16 Eike Rathke 2017-09-28 11:19:48 UTC
(In reply to Eike Rathke from comment #12)
> (In reply to Mike Kaganski from comment #11)
> > The I/O is thought as one of possible causes of drastic UI performance
> > regression on Windows in 5.4.
> 
> That can't be because above it says to be bisected to the upgrade to ICU
> 59.1, which we don't have in 5.4 (reverted because Vista couldn't be
> supported then).
Or actually it could be, but then the bisect result is wrong. ICU's behaviour of looking for files first and only then for data objects is old; it may just be triggered much more frequently now in 5.4.
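
The "files first, then data objects" order described here can be sketched roughly as follows. This is a simplified model using only the C++ standard library, not ICU's implementation; kBuiltinData, openData and g_fileProbes are hypothetical names. It shows why a file that is never shipped costs one failed filesystem probe per call even though the lookup ultimately succeeds from built-in data.

```cpp
#include <cassert>
#include <fstream>
#include <map>
#include <string>

// Hypothetical table standing in for data objects compiled into the library.
static const std::map<std::string, std::string> kBuiltinData = {
    {"dict_word_en", "built-in rules"},
};

// Number of filesystem probes made so far (instrumentation for this sketch).
static int g_fileProbes = 0;

// Try the file on disk first; fall back to the built-in table only if the
// file cannot be opened. Returns false if neither source has the data.
bool openData(const std::string& dir, const std::string& name, std::string& out)
{
    ++g_fileProbes;
    std::ifstream file(dir + "/" + name + ".brk", std::ios::binary);
    if (file) {
        out = "from file";
        return true;
    }
    const auto it = kBuiltinData.find(name);
    if (it == kBuiltinData.end())
        return false;
    out = it->second;               // served from built-in data after a failed probe
    return true;
}
```

Calling openData repeatedly for a directory that does not exist succeeds each time from the built-in table, but the probe counter grows by one per call, which matches the repeated failed accesses reported above.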
Comment 17 Eike Rathke 2017-09-28 11:36:30 UTC
Pending review https://gerrit.libreoffice.org/42906 for 5-4
Comment 18 Commit Notification 2017-10-13 20:04:44 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-5-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=2bb31e9ff5f5433f29ee02bc673193bc1093e0fe&h=libreoffice-5-4

i18n-perf: cache map of breakiterators, tdf#108703

It will be available in 5.4.3.
