56110 – re-write stream operators to methods

Bug 56110 - re-write stream operators to methods

Summary: re-write stream operators to methods

Status:	RESOLVED FIXED

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	LibreOffice (show other bugs)
Version: (earliest affected)	4.0.0.0.alpha0+ Master
Hardware:	Other All

Importance:	medium normal
Assignee:	Not Assigned

URL:
Whiteboard:
Keywords:	difficultyBeginner, easyHack, skillCpp, topicCleanup

Depends on:
Blocks:

Reported:	2012-10-18 01:07 UTC by Michael Meeks
Modified:	2016-02-18 16:37 UTC (History)
CC List:	3 users (show)

See Also:	39428
Crash report or crash signature:

Attachments
Step 2 of my proposed process (4.96 KB, text/plain) 2013-02-22 12:45 UTC, Christopher Hotchkiss	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Michael Meeks 2012-10-18 01:07:33 UTC

The SvStream class exports a lot of operators for both reading and writing values to binary files. We need to clean that up and create new methods eg. 'readInt32()' that return a defined default result - ie. zero for ints, false for bools, "" for strings etc. ie. perhaps:

sal_Int32 readInt32(sal_Int32 nDefault = 0);

We should take care as we do the change to catch those (rather few) places where the undefined result is used as a feature: ie. "if we failed to read, the default was the old value of the result". ie. this sort of thing:

SvStream& SvStream::operator>>(sal_uInt16& r)
{
    sal_uInt16 n = 0;
    READNUMBER_WITHOUT_SWAP(sal_uInt16, n)
    if (good())
    {
        if (bSwap)
            SwapUShort(n);
        r = n;
    }
    return *this;
}

is really bad news for security auditing - and it'd be lovely to kill it everywhere.

Similar changes for writing would also be great - again, to avoid casual changes to types causing knock-on breakage in files.

It is possible that a magic CLANG plugin could auto-generate such a patch, which may be worth investigating due to the many call-sites involved.

Comment 1 leighman 2012-12-06 14:26:47 UTC

EasyHack with code details -> NEW

Comment 2 Christopher Hotchkiss 2013-02-16 00:00:27 UTC

Just so I understand this change, the SvStream class should stay, it is just the stream operators that should go, correct? I can reuse the other methods in the class.

Comment 3 Michael Meeks 2013-02-16 21:37:29 UTC

> Just so I understand this change, the SvStream class should stay

Yes - that's right ! :-) Almost certainly this work is best done as a clang plugin - checkout compilerplugins/README - at least we should be able to catch the most simple cases that way: when a locally scoped variable is deliberately initialized and then there is some branch afterwards on it's result.

Of course, the clang thing makes it less of an 'easy' hack - so feel free to start this by hand to get a feel for what needs doing.

Thanks !

Comment 4 Christopher Hotchkiss 2013-02-20 00:29:08 UTC

I gave this change a shot on a single method and it has been quite a major pain due to the impacting code sites (I've put 40+ hours in with no end in sight). I'm going to start over on it and I would like to know if doing it in a more incremental way would be okay for the project.

In short, how about these steps:
 1. Create the new read and write methods, possibly with templates and specific implementations of the template.
 2. Mark the stream operators with the "__attribute__((deprecated))" flag so that the compiler screams about the usage.
 3. Switch the unit tests over to the new methods.
 4. Push to Gerrit for a review. (Is this appropiate at this point?)
 5. Start on a series of passes to fix all the warnings.
 6. Push each pass up to Gerrit for review.
 7. Once the warnings are clean, deleted the now unused methods.

Also other than ccache is there anything else I can do to speed up compiles, every time I touch a header it seems I have to eat a 3+ hour compile.

Comment 5 Christopher Hotchkiss 2013-02-22 12:45:32 UTC

Created attachment 75311 [details]
Step 2 of my proposed process

I tested out step 2 of the proposed process just to get an idea of the size of this change. By applying the following patch to the build we get 12,203 extra warnings.

My question is due to the huge of the change is submitting an incremental set of patches okay? If so, I'll get started and submit step 1.

Comment 6 Michael Meeks 2013-02-22 14:47:27 UTC

Hi Christopher:

> I gave this change a shot on a single method and it has been quite a major
> pain due to the impacting code sites (I've put 40+ hours in with no end in
> sight).

Right ! :-) prolly worth starting with a lesser used type eg. 'double' :-)

> I'm going to start over on it and I would like to know if doing it
> in a more incremental way would be okay for the project.

Yes - that's fine; we love incremental approaches in general.

> 2. Mark the stream operators with the "__attribute__((deprecated))"
> flag so that the compiler screams about the usage.

I think that's something you might want locally but not in the tinderbox builds some of which tend to compile with warnings-as-errors :-)

> Also other than ccache is there anything else I can do to speed up
> compiles, every time I touch a header it seems I have to eat
> a 3+ hour compile.

ccache will not help at all if there is a header change; are you using icecream ? (if you have idling lan connected hardware that is ;-)

> I tested out step 2 of the proposed process just to get an idea
> of the size of this change. By applying the following patch to
> the build we get 12,203 extra warnings.

Right - so - this is why (in general) I think it is better to write a Clang plugin - to automate the conversion of 90+% of those call-sites where we can reasonably guess that the data was not intended to be initialized to some default state in the case of read failure; and then audit the harder ones manually.

> My question is due to the huge of the change is submitting an
> incremental set of patches okay? If so, I'll get started and
> submit step 1.

Sure - doing this incrementally is fine - but -really- it would be a better use of your brain (I think) to install clang, and have a go at the compilerplugins/ directory - at least to catch the 'easy' cases :-)

How does that sound ? [ and thanks so much for looking at this ! ]

Comment 7 Julien Nabet 2013-09-14 12:19:01 UTC

Any update here?
Anyway, I put fdo#39428 in see also.

Comment 8 Michael Meeks 2013-09-14 14:24:49 UTC

I guess poking at the compilerplugins/ and the clang approach here would be really good; in theory we can automatically re-write this :-)

Comment 9 Björn Michaelsen 2013-10-04 18:47:25 UTC

adding LibreOffice developer list as CC to unresolved EasyHacks for better visibility.

see e.g. http://nabble.documentfoundation.org/minutes-of-ESC-call-td4076214.html for details

Comment 10 Robinson Tryon (qubit) 2013-10-19 01:23:40 UTC

Removing comma from whiteboard (please use a space to delimit values in this field)
https://wiki.documentfoundation.org/QA/Bugzilla/Fields/Whiteboard#Getting_Started

Comment 11 Björn Michaelsen 2015-02-10 09:05:33 UTC

ASSIGNED->NEW: Hasnt changed since 2013 apparently ...

Comment 12 Noel Grandin 2015-02-10 09:19:31 UTC

I've already done the bulk of this. 

There might be some remnants left in other modules of operator<< and operator>> acting on stream-like classes, but they're definitely not EasyHack material

Comment 13 Robinson Tryon (qubit) 2015-12-15 16:27:27 UTC

Migrating Whiteboard tags to Keywords: (EasyHack,DifficultyBeginner,SkillCpp,TopicCleanup)
[NinjaEdit]

Comment 14 Robinson Tryon (qubit) 2016-02-18 16:37:24 UTC

Remove LibreOffice Dev List from CC on EasyHacks
(curtailing excessive email to list)
[NinjaEdit]