Bug 97977 - FILEOPEN: XSLX - MODE results in wrong calculation
Summary: FILEOPEN: XSLX - MODE results in wrong calculation
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium normal
Assignee: Winfried Donkers
URL:
Whiteboard: target:6.1.0
Keywords: difficultyInteresting, easyHack, filter:ooxml, skillCpp
: 116724 (view as bug list)
Depends on:
Blocks: Function-iWorksNumbers
  Show dependency treegraph
 
Reported: 2016-02-18 13:19 UTC by Dennis Roczek
Modified: 2018-04-01 10:08 UTC (History)
8 users (show)

See Also:
Crash report or crash signature:


Attachments
Mode using OOXML (8.17 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2016-02-18 13:19 UTC, Dennis Roczek
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dennis Roczek 2016-02-18 13:19:16 UTC
Created attachment 122768 [details]
Mode using OOXML

Related to Bug 97887 and to Bug 97334:

The ODF spec defines MODE so that the smallest number is the result when multiple numbers occurs the same time (so MODE(6;6;2;2;3) results in 2)

The OOXML spec results the first number (so 6). See attached file (you might have to recalculate the numbers).
Comment 1 Buovjaga 2016-02-20 15:42:50 UTC
Yeah all are 1 after recalc.

Win 7 Pro 64-bit Version: 5.2.0.0.alpha0+
Build ID: ef02de2698d90fd874bddf3146165cbe85487bc5
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; 
TinderBox: Win-x86@39, Branch:master, Time: 2016-02-19_23:40:50
Locale: fi-FI (fi_FI)
Comment 2 jani 2016-02-24 14:48:48 UTC
Missing code pointer (as noted in a email)
Comment 3 Dennis Roczek 2016-02-24 16:07:20 UTC
http://opengrok.libreoffice.org/xref/core/sc/source/core/tool/interpr3.cxx#3447

search for
ocModalValue
and
SC_OPCODE_MODAL_VALUE

to get an idea how a function is implemented, see
https://cgit.freedesktop.org/libreoffice/core/commit/?id=8539039e0c108da7d0306c37962415ce7e271c84
Comment 4 manuel.defranceschi 2016-06-28 13:06:21 UTC
I assigned the easyHack to myself.
Comment 5 jani 2016-06-28 14:22:30 UTC
(In reply to manuel.defranceschi from comment #4)
> I assigned the easyHack to myself.


Look forward to see a patch for this. We have made a step by step guide to help you get started:
https://wiki.documentfoundation.org/Development/GetInvolved


Have fun
jan i
Comment 6 jani 2016-07-31 07:37:51 UTC
a polit ping, still working on this issue?
Comment 7 jani 2016-08-31 08:16:03 UTC
Unassigning.
Comment 8 Tamás Zolnai 2016-12-03 14:14:59 UTC
Hi guys,

What is the expected behavior here?
I see 1, 9 and 5 here in the cells which is the same as what MS EXCEL shows.
Comment 9 Dennis Roczek 2016-12-03 14:22:16 UTC
Hi Tamas,
for the example in the file:
=MODE(9;1;5;1;9;5;6;6)

ODF: 1 (smallest number in in the result set)
OOXML: 9 (first occurred number in the result set)

So basically we need a new "MODE" function, e.g. MODE.EXCEL or MODE_ADD or the like.
Comment 10 abhilash300singh 2017-01-18 20:24:24 UTC
Hi Dennis,
I see GetSortArray at http://opengrok.libreoffice.org/xref/core/sc/source/core/tool/interpr3.cxx#3739 returns the input array in an ascending sorted fashion. 

Do we already have a method that returns input array untouched, without any sorting? 

I suspected GetNumberSequenceArray at http://opengrok.libreoffice.org/xref/core/sc/source/core/tool/interpr3.cxx#3619 ( without reading it's implementation ) to be so. On debugging, I found it wasn't so, and the returned array was sorted in a strange fashion.

If we already have a method that returns input array untouched, the algorithm implementation for this should be trivial using hash.

Thanks.
Comment 11 Dennis Roczek 2017-01-19 13:03:14 UTC
BTW, just for the reference:


[13:30:55] <erAck> DennisRoczek: can you point out where "The OOXML spec results the first number (so 6)." is actually defined?
[13:42:05] <DennisRoczek> erAck: Description: Computes the most frequently occurring of the numeric values of its arguments. If the set of values contains more than one most-frequent value, the first occurrence of any most-frequent value in the list is used as the result. 
[13:42:24] <DennisRoczek> page 2294 in ECMA-376-1:2016 
[13:42:34] <erAck> DennisRoczek: found it; 18.17.7.219
Comment 12 Eike Rathke 2017-01-19 13:30:08 UTC
(In reply to abhilash300singh from comment #10)
> If we already have a method that returns input array untouched, the
> algorithm implementation for this should be trivial using hash.

How would hash be related?

You will still have to have a sorted array to determine the frequencies, so an "untouched input array" wouldn't help much, I guess, unless you create a map of bins to count in a different way, which likely is slower. However, it's available as GetNumberSequenceArray() which also GetSortArray() uses before sorting. Note also that the result isn't the first value encountered, but "the first occurrence of any most-frequent value in the list". So MODE(9;1;5;1;9;5;1;5) should return 1.

GetSortArray() can be a vector<long>* pIndexOrder passed to obtain the corresponding position order of the sorted array.

Loading MODE from Excel documents probably should map that to MODE.SNGL then, which already is available and currently shares implementation with ScModalValue().
Comment 13 jani 2017-05-14 07:43:02 UTC Comment hidden (obsolete)
Comment 14 Xisco Faulí 2017-06-14 02:22:20 UTC Comment hidden (obsolete)
Comment 15 Xisco Faulí 2017-07-15 02:31:31 UTC Comment hidden (obsolete)
Comment 16 Xisco Faulí 2017-08-15 02:23:41 UTC Comment hidden (obsolete)
Comment 17 Xisco Faulí 2017-09-15 15:27:41 UTC Comment hidden (obsolete)
Comment 18 Xisco Faulí 2017-09-16 02:04:33 UTC Comment hidden (obsolete)
Comment 19 Xisco Faulí 2017-09-17 02:04:58 UTC
A polite ping, still working on this bug?
Comment 20 Manuj Vashist 2018-02-04 23:35:06 UTC
Is this still open?
If yes then what has to be done here, as I can see MODE function 
follows the OOXML specs and MODE.SNGL follows ODF specs.
Comment 21 Dennis Roczek 2018-02-26 14:58:42 UTC
Hi Manuj,

yes, that bug is still open.

See Comment 9 for steps to reproduce.
Comment 22 Winfried Donkers 2018-02-27 10:19:20 UTC
There are 3 MODE functions in Calc: MODE, MODE.SNGL and MODE.MULT.
The first is should be fully ODF compliant, the other 2 are OOXML functions.
The Excel function MODE (originating from Add-In function pack) has been declared obsolete by Microsoft and replaced with MODE.SNGL and MODE.MULT. The 'old' MODE function has retained for the time being.

So, if MODE fully complies with ODFF1.2, that is OK. Upon exporting to OOXML, it ought to be renamed to MODE.SNGL of MODE.MULT (haven't time to figure out which, right now). Currently it isn't, it's saved as MODE in OOXML.

Export to xls is more difficult. Creating a new function in Calc for an obsolete function and file type does seem to be the way to go.
(BTW, this is a general problem with Add-in functions that are being replaced by Microsoft with new versions.)
Comment 23 Dennis Roczek 2018-02-27 15:40:39 UTC
(In reply to Winfried Donkers from comment #22)
> There are 3 MODE functions in Calc: MODE, MODE.SNGL and MODE.MULT.
> The first is should be fully ODF compliant, the other 2 are OOXML functions.
> The Excel function MODE (originating from Add-In function pack) has been
> declared obsolete by Microsoft and replaced with MODE.SNGL and MODE.MULT.
> The 'old' MODE function has retained for the time being.
> 
> So, if MODE fully complies with ODFF1.2, that is OK.
Hi Olivier, can you add a notice to the help that MODE is ODF-complaint? This is missing in the help actually. (just noticed)

> Upon exporting to
> OOXML, it ought to be renamed to MODE.SNGL of MODE.MULT (haven't time to
> figure out which, right now). Currently it isn't, it's saved as MODE in
> OOXML.
Well then we do have another problem. :-(

I just tested with Excel 2016 O365 (desktop app) version: all three variants of Mode (with the following numbers (9;1;5;1;9;5;6;6)) give as a result 9 back.

LibreOffice Calc does calculate a 1 (as stated in our help).
 
> Export to xls is more difficult. Creating a new function in Calc for an
> obsolete function and file type does seem to be the way to go.
Well, yes and no. The file format isn't open for extensions, this won't work except mapping MODE.SNGL (or MODE.MULT) to MODE.

> (BTW, this is a general problem with Add-in functions that are being
> replaced by Microsoft with new versions.)
Depend if we have already a function integrated since ages. I wouldn't invent new functions for such minimal differences...
Comment 24 Dennis Roczek 2018-02-27 15:41:30 UTC
Hi Olivier,

I missed to add you in the last comment, see Comment 23.

Thanks.
Comment 25 Commit Notification 2018-02-27 18:14:42 UTC
Olivier Hallot committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/help/commit/?id=dd09149dfe0135a4abbd1abbcee15d115af0f982

tdf#97977 Mark MODE function as ODFF compliant
Comment 26 Winfried Donkers 2018-02-28 11:36:45 UTC
(In reply to Dennis Roczek from comment #23)
[...]
> > Upon exporting to
> > OOXML, it ought to be renamed to MODE.SNGL of MODE.MULT (haven't time to
> > figure out which, right now). Currently it isn't, it's saved as MODE in
> > OOXML.
> Well then we do have another problem. :-(
> 
> I just tested with Excel 2016 O365 (desktop app) version: all three variants
> of Mode (with the following numbers (9;1;5;1;9;5;6;6)) give as a result 9
> back.
> 
> LibreOffice Calc does calculate a 1 (as stated in our help).
>  

Currently, MODE and MODE.SNGL are identical in Calc. 
I can change that so that MODE.SNGL and MODE.MULT return 9 with MODE.XX(9;1;5;1;9;5;6;6), i.e. the first value with the maximum occurences.
That way the OOXML-functions are fully compatible with Excel.
(Not assigning myself to this bug report as the fix for MODE.SNGL and MODE.MULT are no fix for MODE.)

Mapping MODE to MODE.SNGL when exporting to OOXML has no effect; the results in Calc stay different from those in Excel.

The difference for these special cases will probably remain unless either OOXML specs of ODFF standard is changed.
Comment 27 Commit Notification 2018-03-15 17:48:31 UTC
Winfried Donkers committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c71e8df90b916f32a1d7d3a0849d0cdf14924475

tdf#97977 related : make MODE.SNGL comply with Excel.

It will be available in 6.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 28 Winfried Donkers 2018-03-20 16:57:43 UTC
MODE.SNGL is now compatible with Excel; MODE.MULT will follow shortly.
As MODE is essentially an ODFF function, I propose to leave that function as it is and open separate bug report to add to the help for MODE something like "MODE, being compliant with ODFF-standard, is not 100% percent compatible with Excel's function MODE. For compatibilty with Excel, use MODE.SNGL of MODE.MULT.".
Comment 29 Commit Notification 2018-03-29 19:47:05 UTC
Winfried Donkers committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=fc121c32ca0effb47d219066a6848435cba562cc

tdf#97977 make MODE.MULT fully compatible with Excel.

It will be available in 6.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 30 Winfried Donkers 2018-03-31 14:34:02 UTC
Tests with Excel show that Excel handles MODE from ods documents (which should follow ODFF definition) and xlsx documents the same, i.e. identical as MODE.SNGL.

I think all possible changes are done, now.
Comment 31 Dennis Roczek 2018-03-31 14:47:37 UTC
(In reply to Winfried Donkers from comment #30)
> Tests with Excel show that Excel handles MODE from ods documents (which
> should follow ODFF definition) and xlsx documents the same, i.e. identical
> as MODE.SNGL.
> 
> I think all possible changes are done, now.

Oh wow, then they have changed that behavior. (I posted in one of their comment boxes using office excel online a message that they do not follow any standard)

Thanks Winfried. :-)

(And where do I post the same for Apple Numbers? *g*)
Comment 32 Dennis Roczek 2018-03-31 14:51:08 UTC
*** Bug 116724 has been marked as a duplicate of this bug. ***
Comment 33 Winfried Donkers 2018-04-01 10:08:13 UTC
(In reply to Dennis Roczek from comment #31)
> (In reply to Winfried Donkers from comment #30)
> > Tests with Excel show that Excel handles MODE from ods documents (which
> > should follow ODFF definition) and xlsx documents the same, i.e. identical
> > as MODE.SNGL.
> > 
> > I think all possible changes are done, now.
> 
> Oh wow, then they have changed that behavior. (I posted in one of their
> comment boxes using office excel online a message that they do not follow
> any standard)

I didn't say that Excel follows any standard that I know of, just that Excel handles ODFF-MODE function as if it is Excel's MODE function.

> Thanks Winfried. :-)

You're welcome. Unfortunately the time I can spare is limited, so there are lots of bug reports that I don't see or can't fix in the time available to me.