Bug 54110 - Create script to identify most active bug wranglers
Summary: Create script to identify most active bug wranglers
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): unspecified
Hardware: Other All
Importance: medium normal
Assignee: leighman
URL:
Whiteboard:
Keywords: difficultyBeginner, easyHack, topicQA
Depends on:
Blocks: 54163
Reported: 2012-08-27 09:08 UTC by Björn Michaelsen
Modified: 2015-12-16 05:17 UTC

See Also:
Crash report or crash signature:


Attachments
Find top 10 bug reporters and wranglers (1.09 KB, text/x-python)
2012-08-27 16:57 UTC, leighman
Improve printing (905 bytes, patch)
2012-08-28 16:18 UTC, leighman
Use python3-compatible print syntax (1.15 KB, patch)
2012-08-28 19:13 UTC, leighman
New version of full script (1.19 KB, text/x-python)
2012-08-28 19:15 UTC, leighman
Tested python3 compatible (1.22 KB, text/x-python)
2012-08-28 19:49 UTC, leighman
bugzilla statistics (22.93 KB, application/vnd.oasis.opendocument.spreadsheet)
2012-08-28 20:23 UTC, Rainer Bielefeld Retired
Improved version of qawrangler-stats.py (5.33 KB, text/x-python)
2013-04-28 23:28 UTC, Marc Garcia
Refactor patch (7.23 KB, patch)
2013-04-29 11:47 UTC, Marc Garcia

Description Björn Michaelsen 2012-08-27 09:08:56 UTC
So far we have no good way to find the most active bug wranglers: we can look for those who reported the most bugs, but hardly for those who make the most changes and comments. However, this should not be too hard to find out.
The archives of the libreoffice-bugs list at:

 http://lists.freedesktop.org/archives/libreoffice-bugs/

have all the information needed.

So this task is to create a Python or Perl script (ideally committed to the dev-tools repository in the end) that:
- takes one of the archives from the site above
- searches with regexps for both:
  - ^.*ReportedBy:.*$ (for reporters)
  - ^.* changed:$ (for bugwranglers)
- and reports the total counts found

That would allow the QA team to publish data on who is most active in QA on a monthly basis and help motivate people to become active bug wranglers.
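
A minimal sketch of the counting step described above, assuming the archive has already been downloaded and decompressed into a string. The two patterns come from the description; the capture groups, the function names `count_activity` and `top`, and the assumption that the wrangler's name precedes " changed:" on the same line are illustrative and not taken from any attached script.

```python
import re
from collections import Counter

# Patterns from the task description. The capture groups are an
# assumption about where the name or address sits on the line.
REPORTER_RE = re.compile(r"^.*ReportedBy:[ \t]*(.*)$", re.MULTILINE)
WRANGLER_RE = re.compile(r"^(.*) changed:$", re.MULTILINE)


def count_activity(archive_text):
    """Return (reporters, wranglers) as Counters keyed by the matched name."""
    reporters = Counter(m.strip() for m in REPORTER_RE.findall(archive_text))
    wranglers = Counter(m.strip() for m in WRANGLER_RE.findall(archive_text))
    return reporters, wranglers


def top(counter, n=10):
    """Return the n most frequent entries as (name, count) pairs."""
    return counter.most_common(n)
```

Calling `top(wranglers, 10)` on the second Counter would give the kind of ranking the QA team could publish each month.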
Comment 1 leighman 2012-08-27 16:57:05 UTC
Created attachment 66182 [details]
Find top 10 bug reporters and wranglers

Serious case of 'my first Python' so probably could be done better but I think it does what is required.
Comment 2 Björn Michaelsen 2012-08-28 11:10:20 UTC
Awesome work!

Could you add yourself as developer at:

 http://wiki.documentfoundation.org/Development/Developers

and send the license blurb to libreoffice@lists.libreoffice.org?

I pushed the script (with you as author) to:

 http://cgit.freedesktop.org/libreoffice/contrib/dev-tools/tree/scripts/qawrangler-stats.py

Maybe you want to send a mail to the qa-list with the stats of the last 3 months or so? I don't want to steal your thunder there.
Comment 3 leighman 2012-08-28 16:18:37 UTC
Created attachment 66235 [details]
Improve printing

I made the printout a bit prettier.
It's diffed against an old local file, hopefully that is still useful.
QA list post done, tho Thunderbird mangled the columns.
License blurb in the works
Comment 4 leighman 2012-08-28 19:13:08 UTC
Created attachment 66241 [details]
Use python3-compatible print syntax

Use python3-compatible print syntax
Comment 5 leighman 2012-08-28 19:15:35 UTC
Created attachment 66242 [details]
New version of full script

May now be python3 compatible.
urllib2 might cause a problem, haven't tried.
Comment 6 Rainer Bielefeld Retired 2012-08-28 19:24:25 UTC
(In reply to comment #5)
> urllib2 might cause a problem, haven't tried.

Your suspicion came true. My result when I try it on Win7 64-bit: "ImportError: No module named urllib2"
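
For reference, this ImportError is what Python 3 raises because `urllib2` was split up and `urlopen` now lives in `urllib.request`. A common way to support both interpreters is a guarded import. This is only a sketch of that pattern, not necessarily what the later attachment does, and the helper name `fetch` is hypothetical.

```python
try:
    # Python 3: urlopen lives in urllib.request
    from urllib.request import urlopen
except ImportError:
    # Python 2: fall back to the old urllib2 module
    from urllib2 import urlopen


def fetch(url):
    """Download url and return the raw response body as bytes."""
    response = urlopen(url)
    try:
        return response.read()
    finally:
        response.close()
```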
Comment 7 leighman 2012-08-28 19:49:35 UTC
Created attachment 66244 [details]
Tested python3 compatible

Another full new version, this time tested compatible with python 3 on Ubuntu 12.04
Dunno if Libreoffice wants python 3 or not
Comment 8 Rainer Bielefeld Retired 2012-08-28 20:23:03 UTC
Created attachment 66247 [details]
bugzilla statistics

Now it also works for me, great work! I never found a manageable way to get statistics on wrangling activity.

The attached document shows the result of a Bugzilla statistics query (link in the table heading), which differs in some details. I see many reporters I know in the table the script provides, but I missed Bjoern, so I became curious. I never checked it systematically, but for my own activity the Bugzilla statistics were always reliable. Now I wonder what quirk in the mail archives makes the script find different results?
Comment 9 Rainer Bielefeld Retired 2012-08-29 05:15:37 UTC
(In reply to comment #8)

> ... statistic always was reliable. Now I wonder what trick might be in 
> the mail archives that the script finds different results?

The weak point is the man behind the keyboard, not the script or the archive.
Today I checked again and immediately saw that the selection "Bug creation" was missing from my Bugzilla query.
A correct query shows 43 reports for me with time range 2012-06-30 ... 2012-08-01 and 39 reports for me with time range 2012-07-01 ... 2012-07-31 (I have some difficulty understanding the details of these time range query fields). So the result "41" from the mailing list archive is very plausible.

Later I will write a short blog post about this great tool. I would really like to see it in the Extensions repository, where it would be more prominently available; please see Bug 54163.

Please excuse me for the noise concerning the results!
Comment 10 Björn Michaelsen 2012-08-29 08:19:17 UTC
Pushed as:

 http://cgit.freedesktop.org/libreoffice/contrib/dev-tools/commit/?id=6958a67487b1f9da3c253630ab62f066fc9bc05f

I took the liberty of extending the range to the Top 30, as:

- the top ten are mostly trusted old hands, so we know about them
- it might make people look at the end of the list and see: "hey, I can do that too!"
- it's exactly these casual contributors we want to motivate
Comment 11 leighman 2012-08-29 11:30:12 UTC
Only comment is the column headers still say Top 10
Comment 12 Rainer Bielefeld Retired 2012-08-29 13:08:20 UTC
Comment on attachment 66247 [details]
bugzilla statistics

Obsolete: wrong results
Comment 13 Björn Michaelsen 2012-08-29 16:02:02 UTC
(In reply to comment #11)
> Only comment is the column headers still say Top 10

Already fixed. Thanks again for this awesome script!
Comment 14 Marc Garcia 2013-04-28 23:28:07 UTC
Created attachment 78586 [details]
Improved version of qawrangler-stats.py

I made some improvements to the script developed for this bug report (I think it's neater to reuse this old bug report than to create a new one for this).

Among the improvements:
 - Made the script more modular
 - Made it more Pythonic
 - Added copyright information
 - Usage information is displayed when called with -h parameter
 - Made it more extensible: new kinds of information (e.g. the platform or the component of reported bugs) can be added simply by adding a name and a regular expression to a defined constant
 - Made it more customizable: as well as the year and month, the number of wranglers to display (-n parameter) and the output format (see next point) can be specified on the command line
 - Added an alternative output as CSV (using the parameter --csv)
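
To illustrate the "name plus regular expression in a constant" idea, here is a minimal sketch. The constant name `PATTERNS`, the function `collect`, and especially the "platforms" pattern are assumptions made for illustration; they are not copied from the attached script, and the actual layout of the platform field in the archive mails may differ.

```python
import re
from collections import Counter

# One entry per statistic: name -> compiled pattern with one capture group.
# Adding a new statistic means adding one line here.
PATTERNS = {
    "reporters": re.compile(r"^.*ReportedBy:[ \t]*(.*)$", re.MULTILINE),
    "changers": re.compile(r"^(.*) changed:$", re.MULTILINE),
    # Hypothetical extra field, shown only to demonstrate extensibility.
    "platforms": re.compile(r"^[ \t]*Platform:[ \t]*(.*)$", re.MULTILINE),
}


def collect(archive_text):
    """Return {statistic name: Counter of captured values}."""
    return {
        name: Counter(value.strip() for value in pattern.findall(archive_text))
        for name, pattern in PATTERNS.items()
    }
```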

Due to slow internet connection (can't clone libreoffice git repo), I can't upload a patch at the moment, but I'll do it soon.
Comment 15 leighman 2013-04-29 11:34:06 UTC
Re copyright notice: This was done for Libreoffice and AFAIK has nothing to do with ASF so does not need that clause.
Comment 16 Marc Garcia 2013-04-29 11:47:26 UTC
Created attachment 78598 [details]
Refactor patch

Not sure about the LibreOffice policy on the copyright note; in other projects I contributed to, all files were supposed to have the copyright disclaimer.

I attach a patch for the changes to the script. I already sent it to libreoffice@lists.freedesktop.org

While the script was working fine, and most of the code is basically the same, the refactor improved the script in the following ways:
 - Made the script more modular
 - Made it more Pythonic
 - Added copyright information
 - Usage information is displayed when called with -h parameter
 - Made it more extensible: new kinds of information (e.g. the platform or the component of reported bugs) can be added simply by adding a name and a regular expression to a defined constant
 - Made it more customizable: as well as the year and month, the number of authors to display (-n parameter) and the output format (see next point) can be specified on the command line
 - Added an alternative output as CSV (using the parameter --csv), as well as the original human readable output
 - Added a new group to report, "commentators", together with existing "wranglers" (renamed "changers") and "reporters"
 - Displaying the full name of the authors (e.g. "My Name <myname@libreoffice.org>" instead of simply "myname@libreoffice.org")
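
A rough sketch of how the -n and --csv options listed above could be wired up with the standard library. Only the two option names come from this comment; the default of 30, the `dest` name `count`, the column headers and the function names are assumptions, and the year/month arguments are left out for brevity.

```python
import argparse
import csv
import io


def parse_args(argv=None):
    """Parse the command line; -h usage output is provided by argparse."""
    parser = argparse.ArgumentParser(
        description="Report the most active bug reporters and wranglers.")
    parser.add_argument("-n", dest="count", type=int, default=30,
                        help="number of authors to display (default assumed)")
    parser.add_argument("--csv", action="store_true",
                        help="print CSV instead of the human-readable table")
    return parser.parse_args(argv)


def render(rows, as_csv):
    """Format (name, count) pairs as CSV or as a fixed-width table."""
    if as_csv:
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(["name", "count"])
        writer.writerows(rows)
        return buf.getvalue()
    return "".join("{0:<40} {1:>5}\n".format(name, count)
                   for name, count in rows)
```

With this layout the ranking code stays independent of the output format, so another format would only need a further branch in `render`.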
Comment 17 Björn Michaelsen 2013-04-29 17:21:43 UTC
The script was never part of OOo or Apache, so the license header is kinda wrong.

Please consider using gerrit, which makes it much easier to review your changes:

 git clone git://gerrit.libreoffice.org/dev-tools.git dev-tools

then:

 cd dev-tools
 vim scripts/qawrangler-stats.py
 git commit .
 git push ssh://gerrit.libreoffice.org:29418/dev-tools HEAD:refs/for/master

See https://wiki.documentfoundation.org/Development/gerrit/setup for details ...
Comment 18 Robinson Tryon (qubit) 2015-12-16 05:17:49 UTC Comment hidden (obsolete)