Bug 142993 - Sampling with replacement should still allow "keep order"
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 6.3.0.4 release
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: implementationError
Depends on:
Blocks: Data-Statistics
Reported: 2021-06-23 01:03 UTC by Stéphane Guillou (stragu)
Modified: 2024-03-21 00:24 UTC

See Also:
Crash report or crash signature:


Description Stéphane Guillou (stragu) 2021-06-23 01:03:08 UTC
Description:
When using the Sampling tool, the options "With replacement" and "Keep order" are exclusive. Ticking one unticks the other one.

The user should be able to create a sample with replacement while still preserving the order in which the values appear in the input range.

The UX would also improve, as the two options currently use tickboxes, which make it look like both can be ticked at the same time.

Steps to Reproduce:
1. Open Calc
2. Create a range of values, for example the sequence 1 to 10 in the range A1:A10
3. Select said range
4. Open the sampling dialogue: Data > Statistics > Sampling...
5. Input range should be the selected range
6. Results to: B1
7. Sample size: any value above 1
8. Sampling method: Random
9. Tick "With replacement" and "Keep order"

Actual Results:
The two options are exclusive. Ticking one unticks the other.

Expected Results:
The two options don't need to be exclusive: one might want sampling to happen with replacement, but still keep the sampled values in their original order.
For example:
- Sample the sequence: 1, 3, 2, 4
- Use "With replacement" and "Keep order" and a sample size of 6
- Get an output like: 1, 1, 1, 2, 4, 4
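
The expected behaviour can be illustrated with a minimal Python sketch. This is only an illustration of the idea, not LibreOffice code: the function name and the `rng` parameter are made up for the example. It draws row positions with replacement, then sorts the positions so the output follows the order of the input range.

```python
import random


def sample_with_replacement_keep_order(values, sample_size, rng=None):
    """Draw sample_size items with replacement, returned in input order.

    Hypothetical helper for illustration only; not part of LibreOffice.
    """
    if not values:
        raise ValueError("input range is empty")
    rng = rng or random.Random()
    # Draw positions rather than values, so repeated values in the
    # input range remain distinguishable.
    positions = [rng.randrange(len(values)) for _ in range(sample_size)]
    # Sorting the drawn positions restores the order of the input range.
    positions.sort()
    return [values[i] for i in positions]


source = [1, 3, 2, 4]
print(sample_with_replacement_keep_order(source, 6))
```

Because positions are sorted rather than values, a sampled 3 would always come before a sampled 2 for this input, matching its order in the range.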


Reproducible: Always


User Profile Reset: No



Additional Info:
Tested on:

Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: e3086b58eb5427d520b86c185f9d911bb6f7a3a0
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-06-21_15:37:11
Calc: threaded

and:

Version: 7.2.0.0.beta1 / LibreOffice Community
Build ID: c6974f7afec4cd5195617ae48c6ef9aacfe85ddd
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

and:

Version: 7.0.6.2
Build ID: 144abb84a525d8e30c9dbbefa69cbbf2d8d4ae3b
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded
Comment 1 Buovjaga 2021-07-19 16:57:58 UTC
Repro

NixOS
Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: b1df9c67349cf4cc5be4128d797aefb87f50e38f
CPU threads: 16; OS: Linux 5.13; UI render: default; VCL: x11
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: threaded
Comment 2 Buovjaga 2021-08-10 17:45:41 UTC
Looks like it has behaved like this since the options were introduced in 6.3
Comment 3 QA Administrators 2023-08-11 03:05:44 UTC Comment hidden (obsolete)
Comment 4 Stéphane Guillou (stragu) 2024-03-19 07:58:21 UTC
Still reproduced in:

Version: 24.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 479b5bbe8ca2177ba7574e7aa2308b5d0de1895c
CPU threads: 8; OS: Linux 6.5; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: CL threaded

Same on Windows.

Rafael, not sure if it's something you'd be interested in?
Comment 5 Rafael Lima 2024-03-20 14:30:05 UTC
(In reply to Stéphane Guillou (stragu) from comment #4)
> Rafael, not sure if it's something you'd be interested in?

Indeed... this seems like an implementation error. Also it does not make sense to have 2 check boxes that behave as radio buttons.

I'll take a look into it.
Comment 6 Rafael Lima 2024-03-21 00:24:07 UTC
Apparently this issue is "intentional".

The method ScSamplingDialog::PerformRandomSamplingKeepOrder [1] assumes that when "Keep Order" is checked, the sampling must be "without replacement". The implementation does not consider the possibility of replacement.

To fix this issue, we would have to implement a new method to perform sampling with replacement while keeping the original order. This is doable, but not trivial. I'll add this to my to-do list, hopefully in time for 24.8.
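
As a rough idea of what such a method could do, here is a sketch in Python for brevity. It is not the code in SamplingDialog.cxx, and the function name is invented for illustration. It counts how often each row is drawn, then walks the input range once from top to bottom and repeats each value accordingly, so the sample size may exceed the number of rows.

```python
import random
from collections import Counter


def keep_order_with_replacement(values, sample_size, rng=None):
    """Sample with replacement, emitting values in input order.

    Hypothetical sketch; the real dialog code is C++ and may differ.
    """
    rng = rng or random.Random()
    n = len(values)
    if n == 0:
        return []
    # Count how many times each input position is drawn.
    hits = Counter(rng.randrange(n) for _ in range(sample_size))
    # Walk the input range once, in order, repeating each value as
    # many times as its position was drawn (zero times if never drawn).
    result = []
    for pos, value in enumerate(values):
        result.extend([value] * hits[pos])
    return result


print(keep_order_with_replacement([1, 3, 2, 4], 6))
```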

[1] https://opengrok.libreoffice.org/xref/core/sc/source/ui/StatisticsDialogs/SamplingDialog.cxx