Bug 118006 - Add support for Quattro Pro .qpw format, looking for sample documents [libwps]
Status: NEW
Alias: None
Product: Document Liberation Project
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: medium enhancement
Assignee: Devansh Varshney
Reported: 2018-06-05 09:01 UTC by osnola
Modified: 2025-04-10 08:41 UTC

Attachments
some QPW9 files (73.89 KB, application/zip)
2018-06-05 09:01 UTC, osnola
files resaved with Quattro pro x9 (71.91 KB, application/zip)
2018-07-08 06:43 UTC, raal
file Quattro pro x9 (4.50 KB, application/x-ole-storage)
2018-07-20 21:11 UTC, raal

Description osnola 2018-06-05 09:01:42 UTC
Created attachment 142530
some QPW9 files

Hello,
as I am adding a filter to convert QPW files in libwps, and I only have some Quattro Pro 9 files (and maybe some Quattro Pro 12 files), I am looking for sample documents created by more recent versions of Quattro Pro:
- either to add code to convert them,
- or to check that the filter will not try to convert them (if the result is too incomplete).

Notes:
- the code to convert .qpw9 files can be found in the current source of libwps (if someone wants to improve it), so it will probably appear in LibreOffice 6.2,
- I attach a zip file which contains some QPW9 files: 
  + some of these files are test files (resaved in QPW9) found on bugzilla, or
     bz.apache.org, ... ; 
  + passwordtoto.qpw is a small protected file with the password toto,
  + others are personal test files which make no sense.
If someone can open them in a newer version of Quattro Pro and "save as" to get new-version files, it will be easier to see what changes between each version.
- in all cases, knowing the version of QPW which created a file is "a plus".
Comment 1 Xisco Faulí 2018-07-06 12:31:24 UTC
Hi osnola,
I've done a public call for documents in the dev/qa/project mailing list [1] and in Telegram.

[1] http://document-foundation-mail-archive.969070.n3.nabble.com/Libreoffice-qa-Looking-for-Quattro-Pro-qpw-files-td4243343.html
Comment 2 osnola 2018-07-06 13:28:06 UTC
Thanks for the public call :-)

Note:
- I recently compiled an online version of libwps with emscripten:
	http://libwps.sourceforge.net/convertWPS.html
  which can be used to test which files are currently converted, what
  problems/bugs exist, ...
Comment 3 raal 2018-07-08 06:43:49 UTC
Created attachment 143385
files resaved with Quattro pro x9
Comment 4 raal 2018-07-20 21:11:16 UTC
Created attachment 143661
file Quattro pro x9
Comment 5 Xisco Faulí 2018-10-18 11:45:57 UTC
Putting to NEW
Comment 6 osnola 2023-05-04 07:18:00 UTC
I don't know if the conversion of SheetColor.qpw is correct; but the other files seem to open correctly in LibreOffice 7.0. Maybe we can close this bug report.
Comment 8 Devansh Varshney 2024-04-01 09:34:14 UTC
Is this related -

https://tika.apache.org/2.9.0/api/org/apache/tika/parser/wordperfect/QuattroProParser.html

also,

http://web.archive.org/web/20021023090332/http://www.corel.com/partners_developers/ds/CO32SDK/docs/qp7/qpf3recd.htm

contains documentation for the Quattro Pro 7 file format. Is that format different from the .qpw format used in more recent versions of Quattro Pro?
Comment 9 osnola 2024-04-02 09:09:53 UTC
This seems related:
- when I tried to add .qpw support, it was to extend the previous quattro pro file filter that converted .wb2, .wb3, ...
- I'm not sure, but Quattro Pro 7 seems to create .wb3 files, so there must already be a filter for it in libwps (at least that's the impression I get from looking at the format documentation). It's probably not perfect; if so, it would be useful to have files that are not converted correctly.
Comment 10 Devansh Varshney 2024-04-02 16:47:33 UTC
Do I understand correctly that I would have to create a new class (file), qpwimport.cxx, to handle the parsing for this format?

Moreover, I found this discussion -

https://forum.openoffice.org/en/forum/viewtopic.php?p=33845&sid=b23aa0956bbc019ec892a2135aba2a90#p33845
Comment 11 Devansh Varshney 2024-04-03 06:41:20 UTC
Some notes on the limitations and challenges of importing large Quattro Pro (.qpw) files into other spreadsheet applications such as Excel and OpenOffice.org (now Apache OpenOffice and LibreOffice):

1. Quattro Pro's .qpw format supports larger spreadsheet sizes compared to pre-2007 Excel format:

   a. .qpw format: 18,276 columns x 1,000,000 rows
   b. Pre-2007 Excel format: 256 columns x 65,536 rows

2. Even with the newer Excel 2007 format (16,384 columns x 1,000,000 rows), it still falls short of accommodating the maximum sizes supported by Quattro Pro.

3. OpenOffice.org (and its successors) have been using the old Excel file format limitations, which poses challenges when importing large Quattro Pro sheets.

From OOo BZ -

https://bz.apache.org/ooo/show_bug.cgi?id=30215 (Support 1048576 rows)
https://bz.apache.org/ooo/show_bug.cgi?id=5460  (Corel Quattro Pro import filter/file open)



Some points for me -

1. We have to ensure that the import filter can handle the maximum spreadsheet sizes supported by the .qpw format (18,276 columns x 1,000,000 rows).

2. Implement the necessary parsing and data extraction logic to read and interpret the .qpw format correctly, considering any format-specific features and record types.

3. Map the imported data accurately to the corresponding cells, formulas, and formatting in LibreOffice Calc, taking into account any differences in the way data is represented between the two formats.

4. Optimize the import process to handle large spreadsheets efficiently, considering memory usage and performance.

5. Thoroughly test the import filter with a wide range of .qpw files, including large and complex spreadsheets, to ensure robustness and reliability.
Comment 12 osnola 2024-04-04 07:41:38 UTC
I hope I'm not talking too much nonsense, I'm getting old. If I remember correctly, OpenOffice had a filter for old QuattroPro files at least .wb2 ( which also opened some .wb3 files because the format hadn't evolved much ) and maybe .wq1 and .wq2 files ( not sure)

This filter disappeared when the filters written in java were removed. I have since rewritten filters for .wq1-2, .wb1-3 and qpw (at least the first files, the .qpw format may have evolved since then) in libwps... This filter can undoubtedly be improved.
Comment 13 Dennis Roczek 2024-04-18 09:37:39 UTC
Well you are picking the rare corner cases.

(In reply to Devansh Varshney from comment #11)
> The limitations and challenges in importing large Quattro Pro (.qpw) files
> into other spreadsheet applications like Excel and OpenOffice.org (now
> Apache OpenOffice and LibreOffice). 
> 
> 1. Quattro Pro's .qpw format supports larger spreadsheet sizes compared to
> pre-2007 Excel format:
> 
>    a. .qpw format: 18,276 columns x 1,000,000 rows
>    b. Pre-2007 Excel format: 256 columns x 65,536 rows
> 
> 2. Even with the newer Excel 2007 format (16,384 columns x 1,000,000 rows),
> it still falls short of accommodating the maximum sizes supported by Quattro
> Pro.
> 
> 3. OpenOffice.org (and its successors) have been using the old Excel file
> format limitations, which poses challenges when importing large Quattro Pro
> sheets.
OpenOffice still has these limitations (or, better said, does not support the full 2007+ OOXML row limit); LibreOffice increased the supported sheet size in 7.4 (1,048,576 rows x 16,384 columns). 

Even if somebody used that many rows, I believe we finally added a warning that LibreOffice is not able to load every row/column from the file.

> From OOo BZ -
> 
> https://bz.apache.org/ooo/show_bug.cgi?id=30215 (Support 1048576 rows)
> https://bz.apache.org/ooo/show_bug.cgi?id=5460  (Corel Quattro Pro import
> filter/file open)
> 
> 
> 
> Some points for me -
> 
> 1. We have to ensure that the import filter can handle the maximum
> spreadsheet sizes supported by the .qpw format (18,276 columns x 1,000,000
> rows).
Don't. I believe it is still a rare corner case and should be handled last, as file format support is much more important (better to import nearly everything with as many features as possible than to import all rows without any features) ;-)
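
To illustrate the warning mentioned above, here is a minimal Python sketch of clipping a sheet's used range to Calc's limits. The limits are the LibreOffice 7.4 figures quoted in this comment; `clip_to_calc` is a hypothetical helper for illustration, not an existing LibreOffice or libwps function.

```python
# Sheet size limits quoted in this comment for LibreOffice 7.4.
LO_MAX_ROWS = 1_048_576
LO_MAX_COLS = 16_384


def clip_to_calc(used_rows, used_cols,
                 max_rows=LO_MAX_ROWS, max_cols=LO_MAX_COLS):
    """Return ((rows, cols) that fit, whether anything was cut off).

    Hypothetical helper: the caller would show a warning when the
    second value is True.
    """
    kept = (min(used_rows, max_rows), min(used_cols, max_cols))
    truncated = kept != (used_rows, used_cols)
    return kept, truncated


if __name__ == "__main__":
    # A sheet using the 18,276 columns mentioned in comment 11.
    kept, truncated = clip_to_calc(1000, 18_276)
    if truncated:
        print(f"warning: only {kept[0]} rows x {kept[1]} columns loaded")
```

Under these assumptions, only files that really use columns beyond 16,384 would trigger the warning, which fits the "rare corner case" point.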
Comment 14 Devansh Varshney 2024-04-30 16:23:45 UTC
(In reply to Dennis Roczek from comment #13)
> Well you are picking the rare corner cases.

Understood. I would like to take this, but first I will probably finish some of my almost complete PRs: https://gerrit.libreoffice.org/q/owner:varshney.devansh614@gmail.com


Moreover, I have also requested to work on support for the Histogram chart (if I get selected), or else the Box and Whisker chart, in LO. :)

But, if time permits, how should I first approach this? 
By introducing a code structure for the qpw files similar to the one that already exists for the .wb* file formats, and then gradually adding support for the features found in qpw files?

I mean creating a new set of source files for the .qpw filter, such as qpw.cxx, qpwform.cxx, and qpwstyle.cxx, in the same directory sc/source/filter/qpro .

The first steps might look like -

a. Identifying and validating the .qpw file format.
b. Reading the file header and extracting relevant information.
c. Parsing the file structure and identifying the different components.


TL;DR - how do I approach this?
Comment 15 Dennis Roczek 2024-05-02 19:22:24 UTC
Check https://github.com/Distrotech/libwps
This is the library in question. (Hope I provided the correct repository. Or did we finally migrate that to TDF servers?)
Comment 16 osnola 2024-05-03 10:51:21 UTC
The correct repository is https://sourceforge.net/projects/libwps/
The parsers can be found in
- src/lib/QuattroDos* : .wq1-.wq2
- src/lib/Quattro9* : .qpw
- src/lib/Quattro* (not followed by 9 or Dos ) : .wb1-.wb3
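
As a quick reference, the list above can be written as a small Python lookup. This is only a sketch built from that list: `PARSER_PREFIX` and `parser_prefix` are hypothetical names, and choosing by extension is a simplification (real format detection would need to look at the file contents).

```python
# Extension -> source file prefix, copied from the list above.
PARSER_PREFIX = {
    "wq1": "src/lib/QuattroDos",
    "wq2": "src/lib/QuattroDos",
    "wb1": "src/lib/Quattro",   # Quattro* not followed by 9 or Dos
    "wb2": "src/lib/Quattro",
    "wb3": "src/lib/Quattro",
    "qpw": "src/lib/Quattro9",
}


def parser_prefix(filename):
    """Return the libwps source prefix for a file name, or None."""
    if "." not in filename:
        return None
    ext = filename.rsplit(".", 1)[-1].lower()
    return PARSER_PREFIX.get(ext)


if __name__ == "__main__":
    for name in ("budget.qpw", "old.wb2", "older.wq1", "other.xls"):
        print(name, "->", parser_prefix(name))
```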
Comment 17 Devansh Varshney 2024-05-03 16:04:59 UTC
Thank you. I will try to do this; I am currently working on support for the Histogram chart in LO :)
Comment 18 Devansh Varshney 2025-04-09 11:24:25 UTC
https://wiki.documentfoundation.org/index.php?title=Development/GSoC/Ideas_without_a_mentor&diff=791446&oldid=790222


bugs.documentfoundation.org/show_bug.cgi?id=118006

This is what I have understood -

**Structure:** Quattro9.cpp reads records. For spreadsheet data, it delegates to Quattro9Spreadsheet.cpp. For graphics, it delegates to Quattro9Graph.cpp. For formulas, Quattro9Spreadsheet.cpp reads the binary data and then uses QuattroFormulaManager (from QuattroFormula.cpp) to translate the opcodes.

So I should run LO with a local libwps build on all samples and document precisely which formulas work, which give #NAME? (likely a function mapping issue), which give #VALUE! or other errors (likely an operand/operator issue), and which formatting/data types are missing or wrong. Is that right?
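
The triage described here could be tallied with a small script. The following is a minimal Python sketch, assuming the displayed cell values have already been exported as (cell, text) pairs by some other means; `triage` and `LIKELY_CAUSE` are hypothetical names, and the cause labels are just the guesses from this comment.

```python
from collections import Counter

# Guessed causes, taken from the comment above (not verified).
LIKELY_CAUSE = {
    "#NAME?": "function mapping",
    "#VALUE!": "operand/operator",
}


def triage(cell_results):
    """Count results per guessed cause.

    cell_results: iterable of (cell_ref, displayed_text) pairs.
    Any other text starting with '#' or 'Err:' is counted as
    'other error'; everything else is counted as 'ok'.
    """
    report = Counter()
    for _ref, shown in cell_results:
        text = str(shown).strip()
        if text in LIKELY_CAUSE:
            report[LIKELY_CAUSE[text]] += 1
        elif text.startswith("#") or text.startswith("Err:"):
            report["other error"] += 1
        else:
            report["ok"] += 1
    return report


if __name__ == "__main__":
    sample = [("A1", "3"), ("A2", "#NAME?"), ("A3", "#VALUE!")]
    for cause, count in sorted(triage(sample).items()):
        print(cause, count)
```

A cell whose text legitimately starts with "#" would be miscounted, so the tally is only a first pass before looking at individual formulas.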
Comment 19 osnola 2025-04-10 08:16:17 UTC
Libwps is based on librevenge (see https://www.documentliberation.org ). So if you get librevenge and compile it (or install it as a package), you can compile libwps as a stand-alone project and get the basic executables wks2csv, wks2raw, wks2text. Then install libodfgen and compile writerperfect to obtain wks2ods.


Note: 
- if you configure it with --enable-full-debug, launching a conversion will also generate an ascii file in the current directory (which contains what it has "found").
- yes, currently QuattroFormula.cpp is used to convert formulas in Quattro Pro 9 files... and in previous Quattro Pro files, as I did not find many differences between these formats. If there are too many differences, creating a new QuattroFormula9.cpp file may make sense.
Comment 20 osnola 2025-04-10 08:41:21 UTC
One last thing: Quattro Pro lets you add plug-ins to extend functions, which makes function conversion complicated: I don't know if the same plug-in always has the same number and the same name, but function names are likely to be different if the plug-in version is French, American, ...

An example: 
MN0.ascii:0018a8 [Entries(DLLIdFunct)[lib]:id=3,QENG,]0b000a00030004000051454e4700
MN0.ascii:0018b6 [Entries(DLLIdFunct)[func]:id=1,BESSELI,]0c000d00010007000042455353454c4900
MN0.ascii:0018c7 [Entries(DLLIdFunct)[func]:id=2,BESSELJ,]0c000d00020007000042455353454c4a00
MN0.ascii:0018d8 [Entries(DLLIdFunct)[func]:id=3,BESSELK,]0c000d00030007000042455353454c4b00
MN0.ascii:0018e9 [Entries(DLLIdFunct)[func]:id=4,BESSELY,]0c000d00040007000042455353454c5900
MN0.ascii:0018fa [Entries(DLLIdFunct)[func]:id=5,BINTONUM,]0c000e00050008000042494e544f4e554d00
MN0.ascii:00190c [Entries(DLLIdFunct)[func]:id=6,BINTOHEX,]0c000e00060008000042494e544f48455800
MN0.ascii:00191e [Entries(DLLIdFunct)[func]:id=7,BINTOOCT,]0c000e00070008000042494e544f4f435400
MN0.ascii:001930 [Entries(DLLIdFunct)[func]:id=12,DELTA,]0c000b000c0005000044454c544100
MN0.ascii:00193f [Entries(DLLIdFunct)[func]:id=13,GESTEP,]0c000c000d0006000047455354455000
MN0.ascii:00194f [Entries(DLLIdFunct)[func]:id=14,HEXTOBIN,]0c000e000e00080000484558544f42494e00
MN0.ascii:001961 [Entries(DLLIdFunct)[func]:id=15,HEXTOOCT,]0c000e000f00080000484558544f4f435400
MN0.ascii:001973 [Entries(DLLIdFunct)[func]:id=38,BASE,]0c000a0026000400004241534500
MN0.ascii:001981 [Entries(DLLIdFunct)[func]:id=39,CEILING,]0c000d0027000700004345494c494e4700
MN0.ascii:001992 [Entries(DLLIdFunct)[func]:id=40,EVEN,]0c000a0028000400004556454e00
MN0.ascii:0019a0 [Entries(DLLIdFunct)[func]:id=41,FACTDOUBLE,]0c00100029000a000046414354444f55424c4500
MN0.ascii:0019b4 [Entries(DLLIdFunct)[func]:id=42,FLOOR,]0c000b002a00050000464c4f4f5200
MN0.ascii:0019c3 [Entries(DLLIdFunct)[lib]:id=4,QFINANCE,]0b000e0004000800005146494e414e434500
MN0.ascii:0019d5 [Entries(DLLIdFunct)[func]:id=9,ABDAYS,]0c000c00090006000041424441595300
MN0.ascii:0019e5 [Entries(DLLIdFunct)[func]:id=11,BUSDAY,]0c000c000b0006000042555344415900
MN0.ascii:0019f5 [Entries(DLLIdFunct)[func]:id=12,FBDAY,]0c000b000c00050000464244415900
with a plug-in QENG (id=3) and functions BESSELI (id=1), ..., FLOOR (id=42), then a plug-in QFINANCE (id=4), ...
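
The hex payloads in the dump above can be decoded with a few lines of Python. This is only a sketch based on what the sample lines seem to show, not on the libwps source: a little-endian 16-bit record type (0x0b for a plug-in, 0x0c for a function), a 16-bit data length, a 16-bit id, a 16-bit name length, one byte of unknown meaning (0 in every sample), then the name and a trailing zero. `parse_dll_record` and `build_table` are hypothetical names.

```python
import struct

KINDS = {0x0B: "lib", 0x0C: "func"}  # inferred from the dump only


def parse_dll_record(hex_payload):
    """Decode one DLLIdFunct payload into (kind, id, name)."""
    raw = bytes.fromhex(hex_payload)
    rec_type, rec_len = struct.unpack_from("<HH", raw, 0)
    body = raw[4:4 + rec_len]
    if len(body) != rec_len:
        raise ValueError("truncated record")
    ident, name_len = struct.unpack_from("<HH", body, 0)
    # body[4] is a byte of unknown meaning; it is skipped here.
    name = body[5:5 + name_len].decode("ascii")
    return KINDS.get(rec_type, "unknown"), ident, name


def build_table(payloads):
    """Map (plug-in id, function id) -> (plug-in name, function name)."""
    table = {}
    current_lib = None
    for payload in payloads:
        kind, ident, name = parse_dll_record(payload)
        if kind == "lib":
            current_lib = (ident, name)
        elif kind == "func" and current_lib is not None:
            table[(current_lib[0], ident)] = (current_lib[1], name)
    return table


if __name__ == "__main__":
    table = build_table([
        "0b000a00030004000051454e4700",          # lib QENG
        "0c000b000c0005000044454c544100",        # func DELTA
        "0b000e0004000800005146494e414e434500",  # lib QFINANCE
        "0c000b000c00050000464244415900",        # func FBDAY
    ])
    for key, value in sorted(table.items()):
        print(key, value)
```

The table is keyed by the pair (plug-in id, function id) because the function ids repeat between plug-ins in this dump: id=12 is DELTA under QENG and FBDAY under QFINANCE.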