Bug 107779 - Sorting breaks rows data alignement when functions include columns with no data
Summary: Sorting breaks rows data alignement when functions include columns with no data
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
5.3.3.2 release
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Sorting Calc-Function
  Show dependency treegraph
 
Reported: 2017-05-11 17:21 UTC by Luca
Modified: 2020-06-26 05:43 UTC (History)
6 users (show)

See Also:
Crash report or crash signature:


Attachments
Empty cells not sorted, even if have a color background (34.03 KB, image/png)
2020-06-13 15:58 UTC, Xavier Guillot
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Luca 2017-05-11 17:21:09 UTC
In Tools / Options / LibreOffice Calc / General
select:

[X] Update references when sorting range of cells.


1) Setup a Calc spreadsheet this way:

      A             B          C        D

1     a       =sum(C1:D1)      1        1
2     b       =sum(C2:D2)      2        2
3     c       =sum(C3:D3)      3        3


This way a is worth 2, b is worth 4, c is worth 6, you see:

      A             B            C        D

1     a             2            1        1
2     b             4            2        2
3     c             6            3        3

2) Now select all then sort by column A descending using Data / Sort. You see, as expected:

      A             B            C        D

1     c             6            3        3
2     b             4            2        2
3     a             2            1        1


3) Now revert to original sorting (ctrl+z), then modify function in B, as to include column E - which doesn't contain data - in the sum:

      A             B            C        D

1     a       =sum(C1:E1)      1        1
2     b       =sum(C2:E2)      2        2
3     c       =sum(C3:E3)      3        3

The result is the same as in 1, as expected.

4) Now repeat 2): select all then sort by column A descending using Data / Sort. You get the wrong result in column B:

      A             B            C        D

1     c             2            3        3
2     b             4            2        2
3     a             6            1        1

That's because the references in column B have been wrongfully reordered,
following column A but ignoring C and  D:

      A             B            C        D

1     c       =somma(C3:E3)      3        3
2     b       =somma(C2:E2)      2        2
3     a       =somma(C1:E1)      1        1


5) If (step 3) you add any value in column D, then repeat step 2), sorting order is correct. That's to say, starting from:

      A             B            C        D       E

1     a       =somma(C1:E1)      1        1
2     b       =somma(C2:E2)      2        2
3     c       =somma(C3:E3)      3        3       3

The descending order on column A produces, as expected:

      A             B            C        D       E

1     c             9            3        3       3
2     b             4            2        2
3     a             2            1        1

=================================================================

This is bad. And it's not the only case when Calc breaks the raws integrity without any warning. Also see bug # 107385:
https://bugs.documentfoundation.org/show_bug.cgi?id=107385

--
Comment 1 Eike Rathke 2017-05-11 18:05:43 UTC
With "Update references when sorting range of cells" this is expected. Without a value in column E the sort range is A1:D3, but a reference like C3:E3 points outside that range so is not moved with the data but adjusted to point to the original location. That's exactly why the option exists.
Comment 2 Luca 2017-05-12 10:08:36 UTC
(In reply to Eike Rathke from comment #1)
> With "Update references when sorting range of cells" this is expected.
> Without a value in column E the sort range is A1:D3, but a reference like C3:E3 
> points outside that range so is not moved with the data but adjusted to point 
> to the original location. That's exactly why the option exists. 

That doesn't make sense to me: if I select all (step 2) I explicity tell the program to sort all the rows and columns. That's how the sorting function normally works in Calc. 

If this behaviour is intentional, then the bug is not to warn users that their sorting area has been arbitrary restricted and rows integrity will be lost. 

I also see that warnings about the selected sorting area extention seem to have disappeared in last versions of Calc. I think they should be reinstated. There should be no situation when the records integrity is at risk without any warning.

Please, understand that not intuitive, dangerous and incoherent behaviours cannot be considered as "normal".
Comment 3 Xisco Faulí 2017-07-26 17:38:10 UTC
Moving back to UNCONFIRMED. It needs to be confirmed by a third person
Comment 4 Jean-Baptiste Faure 2017-08-05 20:58:06 UTC
(In reply to Luca from comment #0)
> In Tools / Options / LibreOffice Calc / General
> select:
> 
> [X] Update references when sorting range of cells.

Why did you check this option ?
Please, have a look at bug 81309 and bug 81633 to understand where this option comes from.
bug 81309: Sorting should automatically adjust references
bug 81633: Sorting shouldn't always automatically adjust references

Set status to NEEDINFO, please set it back to UNCONFIRMED once requested
informations are provided.

Best regards. JBF
Comment 5 Luca 2017-12-26 10:05:13 UTC
(In reply to Jean-Baptiste Faure from comment #4)
> (In reply to Luca from comment #0)
> > In Tools / Options / LibreOffice Calc / General
> > select:
> > 
> > [X] Update references when sorting range of cells.
> 
> Why did you check this option ?
> Please, have a look at bug 81309 and bug 81633 to understand where this
> option comes from.
> bug 81309: Sorting should automatically adjust references
> bug 81633: Sorting shouldn't always automatically adjust references
> 
> Set status to NEEDINFO, please set it back to UNCONFIRMED once requested
> informations are provided.

(Sorry for my delayed reply)

The option [X] Update references when sorting range of cells
is necessary when you have two sheets in the same file with cells in a column of the first sheet pointing to corresponding cells in a column of the second sheet, because without that option, resorting one sheet would result in loosing data rows integrity:

Sheet 1:                             Sheet 2:

A                   B                A            B   

=$'Sheet 2'.A1      1a               a            2a 
=$'Sheet 2'.A2      1b               b            2b
=$'Sheet 2'.A3      1c               c            2c

(values a, b, c, must appear in column As of Sheet 1 and Sheet 2, always connected to corresponding values in column Bs, no matter how you sort one sheet or the other)

But I think you're missing the point. There are two issues here: 1) lacking of warnings, whenever a sorting operation would result in data loss by breaking the rows integrity; 2) arbitrary restriction of a specifically selected area to perform an operation that will result in data loss (this is the worst problem, see my previous messages for details). 

As for the issue in 1): when you select one column and then try a sorting operation, the program already issues a warning and proposes an extension of the selection. The case I described here are far more dangerous and less intuitive, all the more reason they should need a warning.
Comment 6 Buovjaga 2017-12-26 19:32:44 UTC
Luca: would bug 96464 solve it for your use case?
Comment 7 Jean-Baptiste Faure 2018-06-24 19:43:19 UTC
Please answer question in comment #6.

Status set to NEEDINFO, please set it back to UNCONFIRMED once requested
informations are provided.

Best regards. JBF
Comment 8 Luca 2019-01-02 14:07:05 UTC
(In reply to Buovjaga from comment #6)
> Luca: would bug 96464 solve it for your use case?

I guess not, if I understand how it works. What is needed is:

1) an option to never ever brake the integrity of a line without a warning
2) if the user selects a range of lines/rows/cells, then applies any sorting operation, the sorting must affect completely and exclusively the selected area, no exceptions.
3) In the help file, a clear, complete and exhaustive documentation of Calc behaviours in every case.

I have yet another simple case where I would need the option
[X] Update references when sorting range of cells

      A           B           C      

1     value1      value3             
2     value2      value4
3                             =B2 (value4)

If I modify sorting of rows 1 and 2 according to the column B values, I want the C3 reference updated so that it always points to value4, no matter if it's in B1 or B2. But I can't use the options because it's too dangerous. Without 1) and 2) as I described here above, it's very likely that you will end up with loosing your datasheet completely without even knowing, because of the messed up lines.
Comment 9 Luca 2019-01-05 16:38:41 UTC
(In reply to Luca from comment #8)

> I have yet another simple case where I would need the option
> [X] Update references when sorting range of cells
> 
>       A           B           C      
> 
> 1     value1      value3             
> 2     value2      value4
> 3                             =B2 (value4)
> 
> If I modify sorting of rows 1 and 2 according to the column B values, I want
> the C3 reference updated so that it always points to value4, no matter if
> it's in B1 or B2. But I can't use the options because it's too dangerous.
> Without 1) and 2) as I described here above, it's very likely that you will
> end up with loosing your datasheet completely without even knowing, because
> of the messed up lines.

I would add: even the more confusing all this is, if you consider that when manually resorting rows (alt+drag) the reference in C3 gets updated, but if you use the sorting function instead to obtain the same result, it does not.  

If all this is intentionally by design, well, it's very bad design, in my opinion. Counterintuitive and dangerous. For sure it would require extensive and exhaustive description of all possible cases, in the help file.
Comment 10 Oliver Grimm 2019-07-09 09:08:37 UTC
when I was trying to confirm this bug, I did not find the option "[X] Update references when sorting range of cells." in the german (de-de) version of the sort dialog. Has it been removed recently? my version here is 6.1.5.2.
Comment 11 Buovjaga 2019-07-09 10:46:39 UTC
(In reply to Oliver Grimm from comment #10)
> when I was trying to confirm this bug, I did not find the option "[X] Update
> references when sorting range of cells." in the german (de-de) version of
> the sort dialog. Has it been removed recently? my version here is 6.1.5.2.

It is still in Tools / Options / LibreOffice Calc / General: Input settings.
Above it is "Expand references when new columns/rows are inserted".
Below it is "Highlight selection in column/row headers".
Comment 12 Oliver Grimm 2019-07-10 09:21:57 UTC
ok, thanks, found it now B-]

While I do understand the two use cases for setting or unsetting this option, I do not understand Eikes reasoning in comment #1. Why should a cell be left out of sorting only because if refers to something outside the selected range?

When I read "[X] Update references when sorting range of cells." I assume that references should be updated after sorting all non-formula data. Nothing else.

Therefore I tend to second Lucas point here:
from comment #5: "2) arbitrary restriction of a specifically selected area" is a problem, if not bug, in my opinion. It does not affect my daily work since I do not use that option. But it needs to be solved.
Comment 13 b. 2020-05-11 19:20:07 UTC
from a short check: 

it holds with not checking the option 'update references' and including the empty colum E in the range to be sorted ...
Comment 14 Luca 2020-05-12 08:02:13 UTC
(In reply to b. from comment #13)
> it holds with not checking the option 'update references' and including the
> empty colum E in the range to be sorted ...

Yes, it's alreade been told in comment #2. That's exactly why it should happen as well with the option 'update references' checked.

It must be understood that any uncommon behaviour that breaks the integrity of rows is *dangerous*, it causes loss of data! All the more when the uncommon behaviour (in this case, a silent reduction of the area selected by the user) gets applied only in some specific situation and not all the similar ones, by the same application. This is worst than bad design: it's intentional sneaky sabotage of user's intentions.
Comment 15 Xavier Guillot 2020-06-13 15:58:58 UTC
Created attachment 161956 [details]
Empty cells not sorted, even if have a color background

I confirm the bug : some cells, if they are on a border of a table and the column has no value, even if they are selected to be sorted and have a background color, stay at the same place and are not moved with the next ones, which creates a visual break and adds work to rearrange the presentation.

For example in my screenshot : often I have incomplete tables - ie an agronomic test after harvest, I enter the grain yield and sort from best to lowest productive (except row 1), but I keep an empty column to add later the oil content. I set to (some or all) rows different colors.

If the column with empty cells is on the right of the table / selection, they are not sorted (columns E-F-G in my test).

If they are on the middle, they are well moved with the others (M-N-O).

Also, if I add on first example, on column C only one value on a cell without any colored background and keep all others empty, all the rows are well sorted and "empty" cells moved.

Cells should not considered as empty if a background color has been set ; if one cell is not seen as empty, the whole column must not be seen as empty, too, and be sorted. Or more logical, if the cells are selected for sorting, they should be sorted, empty or not...