Bug 35411 - Increase maximum token size in formula expression
Summary: Increase maximum token size in formula expression
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium enhancement
Assignee: Eike Rathke
URL:
Whiteboard: target:4.2.0 target:4.1.0.2 target:4....
Keywords:
: 50419 (view as bug list)
Depends on:
Blocks:
 
Reported: 2011-03-18 02:49 UTC by Novetica
Modified: 2023-10-18 17:00 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Novetica 2011-03-18 02:49:30 UTC
Hi,
currently the size for calc formulas is 512 tokens. This could be low for some advanced and complex spreadsheet application.

Please rise it to 8192 (as MS Excel) or implement a dedicated option to unlock/change the feature.

I've tried to change following const and recompile:

#define MAXCODE      8192
#define MAXSTRLEN    8192
#define MAXJUMPCOUNT 512  

Now I'm able to enter long formula that correctly works but I'm not able to save as odf (calc closes without any debug error). I can correctly save the document with long formula as xlsx but I lose all Libreoffice feature (pivot, ...).

Please help,

N.
Comment 1 Kohei Yoshida 2011-03-21 18:58:56 UTC
BTW, is Excel having a limit of 8192 tokens documented somewhere?
Comment 2 Novetica 2011-03-22 03:51:54 UTC
(In reply to comment #1)
> BTW, is Excel having a limit of 8192 tokens documented somewhere?

Hi Kohei,
I'm indeed not sure that Excel support 8192 tokens, probably this number was referred somewhere on Formula chars size (http://user.services.openoffice.org/en/forum/viewtopic.php?f=9&t=33995). A guy reported that Excel supports around 900 tokens, maybe a rise to 1024 could be enough.

Excel 2007 specs:
http://office.microsoft.com/en-us/excel-help/excel-specifications-and-limits-HP010073849.aspx

Length of formula contents	8,192 characters
Internal length of formula	16,384 bytes
Iterations	32,767
Arguments in a function	255
Nested levels of functions	64
Comment 3 Kohei Yoshida 2011-11-30 14:59:12 UTC
Adding Eike in CC.  Eike, is there any gotchas that you could see when increasing the max token size from 512 to 8192?
Comment 4 Kohei Yoshida 2011-11-30 16:25:12 UTC
Adding this to my list of things for 3.5, but will wait for Eike's input.
Comment 5 Eike Rathke 2011-12-01 04:53:56 UTC
Generally I don't see any problem increasing these numbers. However, to be sure we're talking about same things here:

MAXCODE is the maximum numbers of tokens in a formula. I doubt any sane mind would overview 8192 numbers, operators, strings, ... in one formula ;-) but we can certainly increase this to whatever Excel supports. Note that Excel counts things differently, see below.

MAXSTRLEN is the maximum string length of one token if the token is a literal string. It looks like Excel supports inline strings of 8192 characters if a formula consists only of one or two strings ("Length of formula contents"). I don't see any problem if we increase MAXSTRLEN to 8192.

MAXJUMPCOUNT is the maximum number of jumps a CHOOSE() can have, so whatever Excel has there ("Arguments in a function" is defined as 255) we could support as well. It's only the formula wizard that can't cope with more than 32 variable arguments (for any function, while the core does), that could be tackled another day.

It is not exactly clear to me what Excel defines as "Length of formula contents" and what the difference is to "Internal length of formula", I assume the contents are what we define as MAXSTRLEN and the internal length is a mixture of number of tokens plus lengths of inline literal strings, as Excel internally stores these things differently and string contents within the token array.

One caveat I saw when looking up our numbers: some constants are defined multiply at different places because one certain mind didn't clean that up when splitting things.. this of course is a potential source of trouble:

MAXCODE
formula/inc/formula/FormulaCompiler.hxx
sc/inc/compiler.hxx

MAXJUMPCOUNT
formula/inc/formula/FormulaCompiler.hxx
formula/source/core/api/token.cxx
inc/compiler.hxx

These must be unified and defined at one place formula/inc/formula/FormulaCompiler.hxx

Problems the bug-submitter encountered in saving to ODF may be related to that, but may also have a different cause, so be prepared for more work lying ahead than just unifying and increasing these constants.
Comment 6 Kohei Yoshida 2011-12-07 06:23:51 UTC
Thanks Eike, for your detailed input.

@Novetica, I do need some example formula expression that you enter which Calc currently refuses in order for us to make any changes.  Thanks!
Comment 7 bfoman (inactive) 2012-06-22 04:14:42 UTC
*** Bug 50419 has been marked as a duplicate of this bug. ***
Comment 8 Timon 2012-06-22 04:54:37 UTC
(In reply to comment #6)
> Thanks Eike, for your detailed input.
> 
> @Novetica, I do need some example formula expression that you enter which Calc
> currently refuses in order for us to make any changes.  Thanks!

Here is an example https://bugs.freedesktop.org/attachment.cgi?id=62156 with detailed description https://bugs.freedesktop.org/show_bug.cgi?id=50419
Comment 9 Florian Reisinger 2012-08-14 13:57:36 UTC
Dear bug submitter!

Due to the fact, that there are a lot of NEEDINFO bugs with no answer within the last six months, we close all of these bugs.

To keep this message short, more infos are available @ https://wiki.documentfoundation.org/QA/NeedinfoClosure#Statement

Thanks for understanding and hopefully updating your bug, so that everything is prepared for developers to fix your problem.

Yours!

Florian
Comment 10 Florian Reisinger 2012-08-14 13:58:55 UTC Comment hidden (obsolete)
Comment 11 Florian Reisinger 2012-08-14 14:03:28 UTC Comment hidden (obsolete)
Comment 12 Florian Reisinger 2012-08-14 14:05:44 UTC Comment hidden (obsolete)
Comment 13 bfoman (inactive) 2012-08-23 13:04:47 UTC
Reopening as this seems to be important for advanced and complex spreadsheets.
Comment 14 Florian Reisinger 2012-08-23 17:20:16 UTC
Changing target from 3.5 to 3.7 and increasing importance a bit...
Comment 15 Rainer Bielefeld Retired 2012-08-24 16:26:08 UTC
Target Back to Kohei's version.

I see an Enhancement Request here, limitations inherited from OOo

Did already somebody check ODF compatibility issues? I doubt, but may be there are defined some limitations?

@Florian:
Target has nothing to do with importance, and why 3.7 should show more importance than 3.5 is not easy to understand. Please leave this whiteboard tag to developers, except there are apparent mistakes.

@Kohei, @Eike:
Is Attachment 62156 [details] of Bug Bug 50419 already enough "real life" example? To me that seems quite a stretch and not a real application.

@Novetica/all:
Do you have more realistic examples? If they are not for public, please collect them and leave a note here, so that Kohei can demand the documents when he returns to this Enhancement Request.
Comment 16 Andrew 2012-11-23 10:32:30 UTC
I also hit this bug while importing to Calc from MS Exel.
I've got file from partner and it imported with errors.

Inside it have formulae like:
IF(H3=N3,O3,IF(J3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),K3,IF(L3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),M3,IF(N3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),O3,IF(P3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),Q3,IF(R3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),S3,IF(T3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),U3,IF(V3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),W3,IF(X3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),Y3,IF(Z3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),AA3,IF(AB3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),AC3,IF(AD3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),AE3,IF(AF3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),AG3,IF(AH3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),AI3,IF(AJ3=MIN(J3,L3,N3,P3,R3,T3,V3,X3,Z3,AB3,AD3,AF3,AH3,AJ3),AK3,"-")))))))))))))))
Comment 17 Eike Rathke 2013-07-01 20:04:21 UTC
For the records, the above expression would be parsed into 542 tokens, of which after 511 the expression is truncated after the last ,V3,X3,

This indeed seems to relate to MAXCODE.

Grabbing this.
Comment 18 Eike Rathke 2013-07-01 20:07:13 UTC
Note that increasing MAXSTRLEN doesn't seem to make sense, Excel doesn't support literal strings longer than 255 characters in expressions, or at least one can't enter them in the UI.
Comment 19 Commit Notification 2013-07-01 22:13:33 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=9c1ca6dca3b553c302a635357e33591605343b99

resolved fdo#35411 redefined MAXCODE 512 to FORMULA_MAXTOKENS 8192



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 20 Eike Rathke 2013-07-01 22:27:55 UTC
Pending review
for 4-1 as https://gerrit.libreoffice.org/4667
for 4-0 as https://gerrit.libreoffice.org/4668
Comment 21 Commit Notification 2013-07-02 05:37:52 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1698abe554b0a657899b17381822efa4c46a8ca3&h=libreoffice-4-1

resolved fdo#35411 redefined MAXCODE 512 to FORMULA_MAXTOKENS 8192


It will be available in LibreOffice 4.1.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 22 Commit Notification 2013-07-02 05:38:12 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a19db54f78720dc8200b3504e84cd19d445f4d18&h=libreoffice-4-0

resolved fdo#35411 redefined MAXCODE 512 to FORMULA_MAXTOKENS 8192


It will be available in LibreOffice 4.0.5.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 23 Buovjaga 2023-10-18 17:00:58 UTC
Notes for unit test writers:

Revert has to be done manually.