Bug Hunting Session
Bug 48345 - Display wrong datasets when scrolling through tables
Summary: Display wrong datasets when scrolling through tables
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base (show other bugs)
Version:
(earliest affected)
3.5.2 release
Hardware: All All
: medium critical
Assignee: Lionel Elie Mamane
URL:
Whiteboard: target:3.6.0 target:3.5.5
Keywords:
: 49522 50721 (view as bug list)
Depends on:
Blocks: mab3.5 47520
  Show dependency treegraph
 
Reported: 2012-04-05 09:33 UTC by racoon
Modified: 2012-06-10 15:14 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments
Example Database File (171.93 KB, application/vnd.sun.xml.base)
2012-04-05 09:33 UTC, racoon
Details

Note You need to log in before you can comment on or make changes to this bug.
Description racoon 2012-04-05 09:33:02 UTC
Created attachment 59526 [details]
Example Database File

Hello,

when I scroll through any tables by using the Page-Down-Key and go up with the Page-Up-Key, Base shows me in every line the same wrong dataset.

Steps to Reproduce:
Please open the attached file, then open the table and scroll down some pages with the Page-Down-Key. Then use the Page-Up-Key and you will see, that in every line is the same dataset.
Comment 1 Julien Nabet 2012-04-08 01:20:46 UTC
I reproduced this behaviour on pc Debian x86-64 on master branch (updated today).
No error/warning messages in console.
HSQL dataengine.
Comment 2 Lionel Elie Mamane 2012-05-15 09:41:13 UTC
There's something wrong in how the window is populated in function dbaccess::ORowSetCache::fill (file dbaccess/source/core/api/RowSetCache.cxx), but I don't think that the problem is in that function; it is basically a loop of

    for(; _bCheck && _aIter != _aEnd; _aIter++, _nPos++)
    {
        m_pCacheSet->fillValueRow(*_aIter, _nPos);
        _bCheck = m_pCacheSet->next();
    }

Note that the *first* execution of this loop gives correct data; this shows e.g. if you scroll back to the *first* row in the table, that first row is displayed correctly.

Subsequent executions of the loop are fucked up and fill in wrong data (from another row than expected).

I followed the call of fillValueRow, it executes correctly but with wrong data. It seems to me that m_pCacheSet is not positioned correctly, that's it. This would mean the bug is in the call to m_pCacheSet->next(). The latter is basically a delegation to m_xDriverSet->next(), so the problem must be there. 

Ah, this is an OKeySet... Looking at OKeySet::next()... Yup, I think I nailed it.

Testing...
Comment 3 Lionel Elie Mamane 2012-05-15 09:42:11 UTC
*** Bug 49522 has been marked as a duplicate of this bug. ***
Comment 4 Not Assigned 2012-05-15 09:53:22 UTC
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "features/base-preview":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=e0ef49f034561fcfb024482c1e532c9cae6f5d14&g=features/base-preview

fdo#48345 need to refresh row also when not m_bRowCountFinal
Comment 5 Not Assigned 2012-05-15 09:59:17 UTC
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8fca982829c03995e74faf17e5777b3aa61b91dc

fdo#48345 need to refresh row also when not m_bRowCountFinal
Comment 6 Lionel Elie Mamane 2012-05-15 10:09:18 UTC
Requested review for 3.5 branch... May take a few days.
Comment 7 Caolán McNamara 2012-05-18 08:38:18 UTC
FWIW the offending lines appear to come from http://cgit.freedesktop.org/libreoffice/core/commit/?id=1ae17f5b03cc14844fb600ca3573a96deb37ab3b so maybe its worth seeing if https://issues.apache.org/ooo/show_bug.cgi?id=102625 is still fixed now
Comment 8 Not Assigned 2012-05-18 08:45:21 UTC
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "libreoffice-3-5":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=222ba799ad57b0ff93ac103c8af65c706ca4cf37&g=libreoffice-3-5

fdo#48345 need to refresh row also when not m_bRowCountFinal


It will be available in LibreOffice 3.5.5.
Comment 9 Alex Thurgood 2012-05-31 00:58:16 UTC
(In reply to comment #8)
> Lionel Elie Mamane committed a patch related to this issue.
> It has been pushed to "libreoffice-3-5":
> 
> http://cgit.freedesktop.org/libreoffice/core/commit/?id=222ba799ad57b0ff93ac103c8af65c706ca4cf37&g=libreoffice-3-5
> 
> fdo#48345 need to refresh row also when not m_bRowCountFinal
> 
> 
> It will be available in LibreOffice 3.5.5.


Hi all,

How will this impact db performance, especially with Java connections ? 

Ocke's solution, which caused this display problem in the first place, was only a partial solution to the problem that OOo and LO load the whole dataset into memory even though this is not (on the whole) required. 


Alex
Comment 10 Lionel Elie Mamane 2012-05-31 02:05:35 UTC
(In reply to comment #7)
> FWIW the offending lines appear to come from
> http://cgit.freedesktop.org/libreoffice/core/commit/?id=1ae17f5b03cc14844fb600ca3573a96deb37ab3b
> so maybe its worth seeing if
> https://issues.apache.org/ooo/show_bug.cgi?id=102625 is still fixed now

My analysis is that #i102625# was *never* really fixed in whole; in the referenced commit, the committer (Ocke) (claims he) fixed a *different* performance problem than what the reporter was complaining about.

(In reply to comment #9)
> (In reply to comment #8)

>> http://cgit.freedesktop.org/libreoffice/core/commit/?id=222ba799ad57b0ff93ac103c8af65c706ca4cf37&g=libreoffice-3-5

>> fdo#48345 need to refresh row also when not m_bRowCountFinal

> How will this impact db performance, especially with Java connections ?

What I'm sure about: my patch does not lead to any requerying if the data is already in LibreOffice memory. Basically, I call refreshRow more often. But this code near the top of refreshRow:

    if ( m_aKeyIter->second.second.second.is() )
    {
        m_xRow = m_aKeyIter->second.second.second;
        return;
    }

says that if the data is already in memory, the m_xRow reference is just changed to point to it. No refetching. So there is no additional database traffic, unless the needed data is not in memory.

> Ocke's solution, which caused this display problem in the first place, was only
> a partial solution to the problem that OOo and LO load the whole dataset into
> memory even though this is not (on the whole) required.

I don't think it changes what OOo/LibO stores in memory; maybe it removes duplicates (things that were stored twice are now stored once? Only in RowSetCache instead of also in KeySet?). It changes the RowSetCache/KeySet interaction so that RowSetCache tells KeySet not to bother to refresh the row (and potentially requery the DB) under some circumstances (when it calls FOO_checked(..., sal_False)). I suppose it is when it knows it will never ask the KeySet for that data anyway, because it has it in cache? I'm rather suspicious of that, basically RowSetCache tells KeySet "If *anybody* asks, give them wrong data; that's OK because I won't ask". Two problems:

 - Will anybody else than the RowSetCache ask? Maybe not, but I'm not sure.

 - That's not robust code. If RowSetCache asks KeySet for that data (subtle bug), we have a "wrong data" bug.

However, I have no smoking gun for my suspicions.
Comment 11 Terrence Enger 2012-05-31 08:02:14 UTC
(In reply to comment #10)
> 
> What I'm sure about: my patch does not lead to any requerying if the data is
> already in LibreOffice memory. Basically, I call refreshRow more often. But
> this code near the top of refreshRow:
> 
>     if ( m_aKeyIter->second.second.second.is() )
>     {
>         m_xRow = m_aKeyIter->second.second.second;
>         return;
>     }
> 
> says that if the data is already in memory, the m_xRow reference is just
> changed to point to it. No refetching. So there is no additional database
> traffic, unless the needed data is not in memory.

Bug 47520 reports that the condition in the if-statement can fail with a past-the-end iterator and a SIGABRT.  Perhaps it would be helpful to look at that bug report.
Comment 12 Lionel Elie Mamane 2012-05-31 09:04:35 UTC
(In reply to comment #11)
> (In reply to comment #10)

>>     if ( m_aKeyIter->second.second.second.is() )

> Bug 47520 reports that the condition in the if-statement can fail with a
> past-the-end iterator and a SIGABRT.  Perhaps it would be helpful to look at
> that bug report.

I see no reference to that if-statement in bug 47520. Maybe you meant another bug?
Comment 13 Terrence Enger 2012-05-31 09:43:43 UTC
(In reply to comment #12)
> I see no reference to that if-statement in bug 47520. Maybe you meant another
> bug?

Silly me, I meant bug 50372 "BASE SIGABRT: dereference a past-the-end iterator".  I am sorry for the misdirection.
Comment 14 Lionel Elie Mamane 2012-06-01 10:02:11 UTC
(In reply to comment #10)
> (In reply to comment #9)
>> How will this impact db performance, especially with Java connections ?

> What I'm sure about: my patch does not lead to any requerying if the data is
> already in LibreOffice memory.

I'm starting to get a glimpse how complex the issue is. Here's my current understanding:

What I said before is technically true, *but* the data USUALLY IS NOT in LibO memory. It is only we inserted / updated / ... that data OURSELVES. So *yes*, this change leads to more requerying again and essentially reverts the optimisation seen in the buglog of #i102625#. This optimisation was: if we just fetched a row in our "SELECT * FROM ...", then use the driver row directly to serve subsequent column requests. That optimisation was buggily implemented and pointed to the driver row even when one did *not* just fetch a row, thereby using another row.

OTOH, this optimisation leads to problems with ODBC drivers (see e.g. bug 47520), because one is not allowed to fetch the same data twice: primary key columns are fetched once by the KeySet for its internal use, and then once by the wrapping windowed cache (ORowSetCache).

So basically, in KeySet we have to read the *whole* row *once*, cache the result, and then use that to fill the internals of KeySet (the primary key columns) and to answer any getShort/getLong/getString/... query.

We *could* make the cache permanent (store it in m_aKeyIter->second.second.second), but that would take undue amounts of memory for big tables, and it feels particularly stupid that we have an intelligent "windowed" cache on top of that, as we would cache the whole table anyway... Why have another cache on top of that? So I'll make that cache invalid (memory freed) as soon as one moves in the KeySet (with absolute(i), next(), previous(), first(), last(), etc).

This will get us back the optimisation that we don't stupidly read just the primary key a row in "SELECT * FROM table" and then immediatly do a new query "SELECT * FROM table WHERE primary_key=current_row" to fetch the rest of the data in that row, which is what was done in #i102625# was ultimately about. OTOH, if we have moved away from the row, we *will* refetch it.

This looks to me like the right balance between memory consumption and network (LibO<->DataBase) traffic.

That "optimisation done right" will IMO *not* go into 3.5, though. Only 3.6.
Comment 15 Lionel Elie Mamane 2012-06-04 10:30:41 UTC
(In reply to comment #14)
> (In reply to comment #10)
>> (In reply to comment #9)

>>> How will this impact db performance, especially with Java connections ?

> *yes*, this change leads to more requerying again and essentially reverts the
> optimisation seen in the buglog of #i102625#.

> So basically, in KeySet we have to read the *whole* row *once*, cache the
> result, and then use that to fill the internals of KeySet (the primary key
> columns) and to answer any getShort/getLong/getString/... query.

I just pushed a reimplementation of that optimisation to master (LibreOffice 3.6 beta). Testing appreciated.
Comment 16 frofa 2012-06-10 15:14:11 UTC
*** Bug 50721 has been marked as a duplicate of this bug. ***