Bug 55066 (123-flaws) - FILEOPEN: flaws in importing 123 files
Summary: FILEOPEN: flaws in importing 123 files
Status: NEW
Alias: 123-flaws
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
3.5.5.3 release
Hardware: x86-64 (AMD64) All
: high normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: difficultyInteresting, easyHack, skillCpp
Depends on:
Blocks: Format-Filters
  Show dependency treegraph
 
Reported: 2012-09-18 19:13 UTC by Horst
Modified: 2019-03-24 01:26 UTC (History)
11 users (show)

See Also:
Crash report or crash signature:


Attachments
4 sheets, last sheet no numbers/formulas (70.55 KB, application/vnd.lotus-1-2-3)
2012-09-18 19:13 UTC, Horst
Details
universal-content.123 (5.02 KB, application/vnd.lotus-1-2-3)
2012-09-18 19:15 UTC, Horst
Details
universal-content2.ods expected result (9.11 KB, application/vnd.oasis.opendocument.spreadsheet)
2012-09-18 19:16 UTC, Horst
Details
parsing of universal-content2 (26.42 KB, text/plain)
2016-04-02 09:38 UTC, osnola
Details
new parsing of universal-content2 (34.88 KB, text/plain)
2016-12-24 06:36 UTC, osnola
Details
the current conversion of woodscrews.123 with libwps (422.13 KB, application/vnd.oasis.opendocument.spreadsheet)
2017-03-04 11:00 UTC, osnola
Details
the current conversion of UNIVERSAL-CONTENT2.123 with libwps (19.90 KB, application/vnd.oasis.opendocument.spreadsheet)
2017-03-04 11:01 UTC, osnola
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Horst 2012-09-18 19:13:45 UTC
Created attachment 67347 [details]
4 sheets, last sheet no numbers/formulas

Description:
When importing any 123-file the following elements are not translated
1) Cell Borders
2) Charts
3) Background/Font colors
4) Date/Time Formats show only the serial date number no formating
5) Cell width is set in CALC to "Optimal Cell Width", Merge & Center Cells not converted.
This behavior was tested from 3.5.5release to 3.6.2rc except versions that crash import.

for Version 3.6.2rc additional:
missing data/numbers and formulas.
On my workbooks it happend only on the last sheet (woodscrews.123)
with universal-content2.123 on both sheets.
I am also attaching universal-content2.ods (expected result)
Comment 1 Horst 2012-09-18 19:15:12 UTC
Created attachment 67348 [details]
universal-content.123
Comment 2 Horst 2012-09-18 19:16:16 UTC
Created attachment 67349 [details]
universal-content2.ods expected result
Comment 3 Not Assigned 2012-09-18 22:52:59 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=09ce51c3e3151ac73a306518deb81119fa159775

import values and formulas from lotus-123, fdo#55066



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 4 Not Assigned 2012-09-19 10:17:21 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=cc0ceb51e2e99eaa2ae90506ef4b26bdb7f28b39&g=libreoffice-3-6

import values and formulas from lotus-123, fdo#55066


It will be available in LibreOffice 3.6.3.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 5 Not Assigned 2012-09-19 13:49:03 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "libreoffice-3-6-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=eff2d03eb41b7bbf7c8201e378785c04e6afe906&g=libreoffice-3-6-2

import values and formulas from lotus-123, fdo#55066


It will be available already in LibreOffice 3.6.2.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 6 Markus Mohrhard 2012-09-24 16:32:34 UTC
So the value and formula import is fixed.

Sadly I won't be able to spend time on adding the other other features. I'll make this an easy hack. The lotus file format looks easy enough and the lotus import code is simple enough too.

This task ca be done by anybody with a little knowledge about C++.
Comment 7 Markus Mohrhard 2012-09-24 16:35:00 UTC
Some code pointers:

Lotus import code is in sc/source/filter/lotus/ and a file format specification is at http://www.schnarff.com/file-formats/lotus-1-2-3/WSFF2.TXT and http://www.schnarff.com/file-formats/lotus-1-2-3/WSFF1.TXT.

If you are interested in extending the support please contact me and I'll provide some smaller test documents.
Comment 8 Horst 2012-09-24 21:56:47 UTC
(In reply to comment #7)
> Some code pointers:
> 
Thank you for your "unsittlichen Antrag" :) but there are 2 problems:
1) I never programmed in C++ only Delphi/Pascal, FORTRAN (miner one) but would take me a few weeks to get into it.
2) I went over the the *.txt files and I think all of this is already implemented (they are from 1985 DOS-version). And I could not find anything so far for the missing pieces like BG color, Font color, Borders, Merge&center (WIN-version). I might be able to figure it out with Hex-Editor and a lot of trial&error.

My recommendation for importing .123 files to LO (Excel has NO importer!!):
If you have 123/Smartsuite on your computer save the file as .XLS and then import it to LO. If you need the Smartsuite they still sell it for around $50.- and install it. It run ok on my Win7 64-bit but some help does not work and some minor booboos.
Comment 9 Horst 2012-09-24 22:02:33 UTC
FYI: Just remenbered that I have the new IBM/Lotus Symphony 3.0.1 installed.
This guys don't even support any 123 file formats though based on OO3.4 (I think)
Comment 10 Horst 2012-09-28 20:39:21 UTC
Import works fine (last page data and formulas) in 3.6.2.2 rc2.

Found a few .WK4 worksheets to import but they all return a "Read-Error unkown Lotus 1-2-3 file format".

Happens in all versions that do not crash 3.5.5 til 3.6.2.2.
Comment 11 Björn Michaelsen 2013-10-04 18:46:10 UTC
adding LibreOffice developer list as CC to unresolved EasyHacks for better visibility.

see e.g. http://nabble.documentfoundation.org/minutes-of-ESC-call-td4076214.html for details
Comment 12 Björn Michaelsen 2014-01-17 00:43:32 UTC Comment hidden (obsolete)
Comment 13 yosi 2015-11-02 03:50:22 UTC

(In reply to Markus Mohrhard (retired) from comment #7)
> Some code pointers:
> 
> Lotus import code is in sc/source/filter/lotus/ and a file format
> specification is at
> http://www.schnarff.com/file-formats/lotus-1-2-3/WSFF2.TXT and
> http://www.schnarff.com/file-formats/lotus-1-2-3/WSFF1.TXT.
> 
> If you are interested in extending the support please contact me and I'll
> provide some smaller test documents.

Intersting extanding support, pls. reply, problems with simulate bug and running on my existing version.
Comment 14 Robinson Tryon (qubit) 2015-12-13 09:51:20 UTC Comment hidden (obsolete)
Comment 15 Shubham Tibra 2016-01-09 12:07:12 UTC
I'd like to work on this bug, is it still open to work on.

But I'll need some help. I am new here.

I didn't understand how the import code works and how it uses the file format?

Can someone help me?
Comment 16 Urmas 2016-01-09 22:06:44 UTC
What version of Lotus was that file created with?
Comment 17 osnola 2016-01-10 09:26:57 UTC
Hello,

(In reply to Shubham Tibra from comment #15)
> I'd like to work on this bug, is it still open to work on.
> 
> But I'll need some help. I am new here.
> 
> I didn't understand how the import code works and how it uses the file
> format?
> 
> Can someone help me?

the source of the LibreOffice's filter(*) are in sc/source/filter/lotus.
For instance, the version number of universal-content.123 (0x1005) is read in
https://docs.libreoffice.org/sc/html/filter_8cxx_source.html#l00109, so this function will return eWK123...

(*) concerning older Lotus's files, there are read via the libwps's filter : the entry point on the source is http://sourceforge.net/p/libwps/code/ci/master/tree/src/lib/Lotus.cpp#l310 ...
Comment 18 Shubham Tibra 2016-01-11 08:49:15 UTC
(In reply to osnola from comment #17)
> Hello,
> 
> (In reply to Shubham Tibra from comment #15)
> > I'd like to work on this bug, is it still open to work on.
> > 
> > But I'll need some help. I am new here.
> > 
> > I didn't understand how the import code works and how it uses the file
> > format?
> > 
> > Can someone help me?
> 
> the source of the LibreOffice's filter(*) are in sc/source/filter/lotus.
> For instance, the version number of universal-content.123 (0x1005) is read in
> https://docs.libreoffice.org/sc/html/filter_8cxx_source.html#l00109, so this
> function will return eWK123...
> 
> (*) concerning older Lotus's files, there are read via the libwps's filter :
> the entry point on the source is
> http://sourceforge.net/p/libwps/code/ci/master/tree/src/lib/Lotus.cpp#l310
> ...

Got it.Thanks for the help :)
Comment 19 Robinson Tryon (qubit) 2016-02-18 14:52:05 UTC Comment hidden (obsolete)
Comment 20 Jose 2016-04-01 10:57:35 UTC
Trying to fix import cell borders, the only reference I find to borders is in lotattr.cxx:

const ScPatternAttr& LotAttrCache::GetPattAttr( const LotAttrWK3& rAttr )

void LotAttrCache::LotusToScBorderLine( sal_uInt8 nLine, ::editeng::SvxBorderLine& aBL )

The last 2 methods are the only ones deal with borders in the sc/source/filter/lotus/   folder.

But neither of them are executed when importing a .123 file. So at the moment I guess I have to call those methods someway to start fixing this bug.

Am I right ?

If anyone would like to aport some light he will be more than welcome.
Comment 21 Jose 2016-04-01 13:43:07 UTC
Well, I can see that those 2 methods can read and apply the border when the file extension is .wk3
The files attached are .123 , so there is no any method to read attributes from that extension.
Comment 22 osnola 2016-04-02 09:38:58 UTC
Created attachment 124026 [details]
parsing of universal-content2

Hello,
I modified my version of libwps to do some introspection of this file(*), 
i.e. to obtain these file.

Notes:
- OP_CreatePattern123 in sc/source/filter/lotus/op.cxx contains some code to parse 
  the CellStyle fields that are not in libwps, ...
- I try to read the LineStyle structures by improvisation, so there may be some 
  mistakes, i.e. previously the patternId, the width, the dashId were stored in 1 
  uint16_t now there are stored in 3 uint16_t
- I try to code the reading of 0x0601, 0x0701 and 0x0008 structures, but there 
  maybe some mistakes.
- Datafe6E codes also the styles with id=0x6XXX(but I have never seems styles with 
  such id in previous files...)

(*) by trying to recognize some new fields. I need to do some checking then I will do a commit probably tomorrow...
Comment 23 Jose 2016-04-02 21:30:22 UTC
(In reply to osnola from comment #22)
> Created attachment 124026 [details]
> parsing of universal-content2
> 
> Hello,
> I modified my version of libwps to do some introspection of this file(*), 
> i.e. to obtain these file.
> 
> Notes:
> - OP_CreatePattern123 in sc/source/filter/lotus/op.cxx contains some code to
> parse 
>   the CellStyle fields that are not in libwps, ...
> - I try to read the LineStyle structures by improvisation, so there may be
> some 
>   mistakes, i.e. previously the patternId, the width, the dashId were stored
> in 1 
>   uint16_t now there are stored in 3 uint16_t
> - I try to code the reading of 0x0601, 0x0701 and 0x0008 structures, but
> there 
>   maybe some mistakes.
> - Datafe6E codes also the styles with id=0x6XXX(but I have never seems
> styles with 
>   such id in previous files...)
> 
> (*) by trying to recognize some new fields. I need to do some checking then
> I will do a commit probably tomorrow...

Did you commit it ?

If so can you say the reference.

Thanks
Comment 24 osnola 2016-04-03 07:43:22 UTC
Hello,
(In reply to Jose from comment #23)
> Did you commit it ?
> 
> If so can you say the reference.
> 
I just commit it (see  https://sourceforge.net/p/libwps/code/ci/94c6921e313e132daa551b77b451fc06a70a3031/ )

Notes concerning libwps:
- currently the conversion on Lotus123 is only active when libwps is compiled in 
  DEBUG mode and the results are worse than the LibreOffice lotus' filter because 
  there is no code to retrieve the cells' formats,
- when it is configured with the flag --enable-full-debug, it also creates some 
  files to see what is parsed in the current repository (i.e. in this case a file 
  MN0.ascii that I have renamed in .text)
Comment 25 osnola 2016-12-24 06:36:19 UTC
Created attachment 129914 [details]
new parsing of universal-content2
Comment 26 osnola 2016-12-24 07:14:04 UTC
Hello,
as I have done some works in libwps to try to convert some Lotus 123 spreadsheets, I updated the introspection's result.

Some notes concerning universal-content2.text :
- in fact, a 123's file is a list of streams; the main stream is "123" but there can be other streams if the spreadsheet contains charts, ... . The streams' structure is stored at the end of the file, I found the same structure to store streams in some RagTime's file, so I named it OLE1Struct (but it is probably not related to OLE),
- 1b000a00e60f(MainStyle) is indeed important as it stored the relation between a "major" style and its definition ; ie. we need to use it to retrieve the cell's styles based on a major style,
- there is at least 4 different "setStyle" zones (the zones which begin 0108XXXX):
    + 01080200: define the font's cell's style,
    + 01080400: define the numbering format of a cell
    + 01080800: define the cell's borders + ???
    + 01080c00: define the column/row size, page breaking, ...

Concerning libwps, I update the code to accept to convert Lotus' 123 SmartSuite 97 spreadsheets (files which begin by 00001a000310): now, it must be able to retrieve most cell's formatting and some basic graphic shapes.

Note: The same code seems be able to convert more recent Lotus' files but libwps will only accept to convert them if it is compiled in debug mode.
Comment 27 Tor Lillqvist 2016-12-27 11:06:31 UTC
Is this really "difficultyBeginner" ?
Comment 28 jani 2016-12-27 11:10:46 UTC
I agree with you.
Comment 29 osnola 2016-12-27 12:49:24 UTC
(In reply to Tor Lillqvist from comment #27)
> Is this really "difficultyBeginner" ?

This depends what we want to add: 
- adding color's font or background cell's colors may be relatively easy (as I have added a similar's code in libwps)
- retrieving cells borders, date/time format is a little more complicated (ie. we can copy the code's of libwps, but this may require some refactoring in the lotus' filter),
- while retrieving chart, ... is probably very difficult.

Now, the main problem is that the current code seems to suppose that there is only the "123" stream in 123's file which is not true: in fact, a 123's file is an archive file which contains different streams (which can explain some missing data), ...

Note:
- currently, I have updated libwps to retrieve more data in Lotus SmartSuite 1997 spreadsheet files => the next libwps's release will convert these files. I will probably try to enhance it to convert more recent lotus files in some months.
Comment 30 osnola 2017-03-04 11:00:30 UTC
Created attachment 131625 [details]
the current conversion of woodscrews.123 with libwps
Comment 31 osnola 2017-03-04 11:01:07 UTC
Created attachment 131626 [details]
the current conversion of UNIVERSAL-CONTENT2.123 with libwps
Comment 32 osnola 2017-03-04 11:07:21 UTC
Hello,
just for note, I have added support for Lotus SmartSuite 98's files in the current sources of libwps (and also support to read Lotus wk1,wk3,wk4,123 files which have a password).
=> when a new version of libwps will be released and integrated to LibreOffice, I guess that this will improve a lot the conversions (I have attached the conversions' files that I now obtain)...
Comment 33 Horst 2017-03-05 21:35:09 UTC
I tried to convert WOODSCREWS.123 to LO 5.3.0.3 and it is still the old reported mess.
The converted woodscrews.123 with libwps is almost perfect.

My question: In which version of LO is change made?
Comment 34 Horst 2017-03-05 21:37:24 UTC Comment hidden (obsolete)
Comment 35 osnola 2017-03-06 08:10:24 UTC
(In reply to Horst from comment #34)
> I tried to convert WOODSCREWS.123 to LO 5.3.0.3 and it is still the old
> reported mess.
> The converted woodscrews.123 with libwps is almost perfect.
> 
> My question: In which version of LO is change made?

Currently, the conversions were done with the current source of libwps: https://sourceforge.net/p/libwps/wiki/Home/, which also contains a filter to convert lotus' spreadsheet file (this filter is currently used by LO to convert wk1, wk3 and wk4 files).

The changes will appear in LO when a new version of libwps will be released(*), integrated to LO and the following problem: https://www.mail-archive.com/libreoffice@lists.freedesktop.org/msg184957.html is corrected...

(*) I will probably do that in the following days.