Bug 124141 - Create a document analyser for LibreOffice triage and QA
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Keywords: difficultyMedium, easyHack, skillJava, skillPython, skillScript, skillUno, topicQA
Depends on:
Reported: 2019-03-17 22:29 UTC by Björn Michaelsen
Modified: 2020-03-30 18:13 UTC
CC List: 4 users

See Also:
Crash report or crash signature:

The attachment contains my easy hack for the document analyser for a LibreOffice document (777 bytes, text/plain)
2019-04-09 16:25 UTC, ipshii1609
Script for counting elements in *.odt documents (1.29 KB, text/x-python)
2020-03-30 18:13 UTC, Sebastian O.

Description Björn Michaelsen 2019-03-17 22:29:43 UTC
Issues often arise only with specific documents, especially performance issues. When triaging such issues, it would help LibreOffice QA and developers to have an overview of what might be special about a given document.

This EasyHack is to create a script or LibreOffice extension that provides statistics about a document. For a text document including for example:

- number of paragraphs
- number of pages
- number of images/embedded media
- number of tracked changes (redlines)
- number of styles, bookmarks, tables, indexes, text frames, OLE objects, sections, hyperlinks, references, comments ...

The extension should produce its output as simple text, so that it can easily be copied and pasted into a bug report. For other document types, other information might be relevant. To keep the scope simple, it is fine to start with basic numbers about text documents.
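
As a rough illustration of the kind of script described above, here is a minimal sketch in Python. It assumes the document is an .odt file, which is a ZIP archive whose body lives in content.xml, and it simply counts elements by their ODF tag names. The function names `count_elements` and `format_report` and the category labels are hypothetical and chosen only for this example; the list of categories is deliberately incomplete, and page counts and styles are not covered.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument specification.
NS = {
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "draw": "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0",
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
}

# Report label -> list of (namespace prefix, element name) to count.
# The labels are illustrative; extend this table to add categories.
CATEGORIES = {
    "paragraphs": [("text", "p")],
    "headings": [("text", "h")],
    "tables": [("table", "table")],
    "images": [("draw", "image")],
    "text frames": [("draw", "text-box")],
    "embedded objects": [("draw", "object"), ("draw", "object-ole")],
    "sections": [("text", "section")],
    "hyperlinks": [("text", "a")],
    "bookmarks": [("text", "bookmark"), ("text", "bookmark-start")],
    "comments": [("office", "annotation")],
    "tracked changes": [("text", "changed-region")],
}


def count_elements(odt):
    """Count elements in content.xml of an .odt (path or binary file object)."""
    with zipfile.ZipFile(odt) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    # Map fully qualified tag ("{namespace}name") to its report label.
    wanted = {}
    for label, names in CATEGORIES.items():
        for prefix, local in names:
            wanted["{%s}%s" % (NS[prefix], local)] = label

    counts = dict.fromkeys(CATEGORIES, 0)
    for element in root.iter():
        label = wanted.get(element.tag)
        if label is not None:
            counts[label] += 1
    return counts


def format_report(counts):
    """Render the counts as plain text, one 'label: number' pair per line."""
    return "\n".join("%s: %d" % (label, n) for label, n in counts.items())
```

Calling `print(format_report(count_elements("example.odt")))`, where `example.odt` stands for any local file, would produce text that can be pasted into a report. Note that this counts every matching element in content.xml, so paragraphs inside table cells and text frames are included, and content stored in styles.xml (such as headers and footers) is not.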

Steps to Reproduce:

Actual Results:

Expected Results:

Reproducible: Always

User Profile Reset: No

Additional Info:
Comment 1 Anuj Agrawal 2019-03-19 13:25:36 UTC
I'm Anuj Agrawal. I'd like to work on this issue. Could you please elaborate on the format of the text output you want the script to generate?
Comment 2 ipshii1609 2019-04-09 16:25:28 UTC
Created attachment 150623
The attachment contains my easy hack for the document analyser for a LibreOffice document
Comment 3 Pankaj Kumar 2019-04-23 07:55:03 UTC
Hi Anuj Agrawal,
Any update on the bug?
Comment 4 Ebrain Mirambeau 2019-07-31 04:33:44 UTC
I took a look at the attached solution and I think that it's both incomplete and written in Python. I would like to work on it. Any objections?
Comment 5 gs_1001 2020-01-10 17:12:50 UTC
Can somebody please provide an update on this bug? One user did provide a script written in Python. Is this bug still open? If so, please comment on it in the context of that script.
Comment 6 Piya 2020-01-20 15:10:47 UTC
I would like to start working on this bug. Wish me luck!
Comment 7 Buovjaga 2020-03-14 19:39:12 UTC
It seems Piya abandoned this, so unassigning.
Comment 8 Sebastian O. 2020-03-30 18:13:50 UTC
Created attachment 159166
Script for counting elements in *.odt documents

Hello everyone!

I fixed some errors in the previously attached script:
- No counting of tables
- No counting of images

I also rewrote it to make future additions easier.

To be done:

- Adding more categories
- Fixing page counting for documents with manual page breaks, or finding a proper way to count pages.

Will try to add more stuff in the near future.
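
One possible route for the page-counting item above, sketched here under assumptions rather than taken from the attached script: the OpenDocument format lets the producing application store statistics in a `meta:document-statistic` element in meta.xml, with attributes such as `meta:page-count`, `meta:paragraph-count`, `meta:table-count` and `meta:image-count`. Reading them avoids reconstructing the layout from page breaks. The function name `read_document_statistics` is hypothetical, and the values are only as current as the last save; they may be missing entirely for files written by other tools.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI of the ODF "meta" vocabulary.
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def read_document_statistics(odt):
    """Return the meta:document-statistic attributes of an .odt as {name: int}.

    `odt` is a path or a binary file object. An empty dict is returned when
    the file carries no statistics element.
    """
    with zipfile.ZipFile(odt) as archive:
        root = ET.fromstring(archive.read("meta.xml"))

    stat = root.find(".//{%s}document-statistic" % META_NS)
    if stat is None:
        return {}

    # ElementTree exposes namespaced attributes as "{namespace}name";
    # strip the namespace so keys read like "page-count".
    prefix = "{%s}" % META_NS
    return {
        name[len(prefix):]: int(value)
        for name, value in stat.attrib.items()
        if name.startswith(prefix)
    }
```

For example, `read_document_statistics("example.odt").get("page-count")` would give the stored page count of a hypothetical local file, or `None` if it was not recorded.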