Bug 88695 - RowSet and its clones share a cache: can lead to cache thrashing
Summary: RowSet and its clones share a cache: can lead to cache thrashing
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: reviewed:2022
Keywords: difficultyInteresting, easyHack, skillCpp, skillSql
Depends on:
Blocks:
 
Reported: 2015-01-22 08:38 UTC by Lionel Elie Mamane
Modified: 2022-11-08 21:37 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Lionel Elie Mamane 2015-01-22 08:38:17 UTC
A RowSet (dbaccess::ORowSet in dbaccess/source/core/api/RowSet.[ch]xx) and its clones (created by member function createResultSet, type dbaccess:ORowSetClone) share a cache (dbaccess::ORowSetCache).

This can lead to constant cache trashing when the rowset and its clone (or two clones) are in very different positions and operations on them are interleaved, so that the cache's window is constantly moved back and forth.

What can we do about that?

1) They each have their own cache. That's a no-go because then one rowset/clone becomes blind to the inserts, updates and deletes of the other rowset/clone.

2) They each have their own cache, but the caches cooperate. Complex.

3) We improve dbaccess::ORowSetCache to manage *several* windows instead of only one, so that at all times least one window serves each ORowSet(Clone). If two ORowSet(Clone)s can be served by the same window, all the better. A window "dies" when all ORowSet(Clone)s move out of it. That looks like the most reasonable solution. For simplicity, windows never overlap; if they do, they get merged?
Comment 1 Lionel Elie Mamane 2015-01-22 09:02:02 UTC
The cache window is m_pMatrix in in dbaccess::ORowSetCache (file dbaccess/source/core/api/ORowSetCache.[ch]xx). The idea would be to have something like a collection windows, accessible through their respective ranges, instead of a single window.

Range is just two integers, the first row of it and the first row that is not inside (so that the range is from inclusive to exclusive, like m_nStartPos and m_nEndPos are now). They replace m_nStartPos and m_nEndPos which should be removed.

To choose the data structure that will hold the range-to-window mapping, the basic requirements are:
1) Given a position (row number),
   to have easy/fast lookup of the window that position is in.
2) Easy/fast insertion of a new window, merge of two windows,
   split of a window in two and resizing and moving of windows.

The general solution would be something like a balanced lookup tree. Since we don't intend to have many windows, a simple array of (range, window) pairs could be adequate.

The idea is to keep a single m_nPosition/m_bAfterLast/etc, since that's the current design.

As a second step, we could separate all of those into "one per attached RowSet(Clone)", which would avoid the need for the RowSet(Clone)s to constantly reposition their cache for fear that another RowSet(Clone) has moved it. On second thought, this might be the only way to go since else the cache wouldn't know when to kill an unused window, since it wouldn't know when it is unused!

(In reply to Lionel Elie Mamane from comment #0)
> 2) They each have their own cache, but the caches cooperate. Complex.

Since apparently clones cannot make any insert/updates, these things would be less difficult than I thought. You can investigate that and do that instead of you wish. Essentially, I have in mind a design that the cache of the RowSet knows of the caches of the RowSetClones and thus invalidates/updates/... them on modification/insertion/deletion.
Comment 2 Pranav Kant 2015-03-13 21:49:51 UTC
I would like to work on this.
Comment 3 Lionel Elie Mamane 2015-03-14 02:28:04 UTC
(In reply to Pranav Kant from comment #2)
> I would like to work on this.

Great. Let me know of your progress, and of any question you might have.
Comment 4 Robinson Tryon (qubit) 2015-12-13 13:24:14 UTC Comment hidden (obsolete)
Comment 5 Robinson Tryon (qubit) 2016-02-18 14:51:31 UTC Comment hidden (obsolete)
Comment 6 Hossein 2022-11-08 14:23:49 UTC
Re-evaluating the EasyHack in 2022

This issue is still relevant. The related methods/variables are almost unchanged, and the problem seem to persists. But, we need a test case to actually show the problem. After fixing the issue, we need a test to show that the problem is fixed.

One side note: This problem is usually called "cache thrashing" and not "cache trashing".
https://en.wikipedia.org/wiki/Thrashing_(computer_science)