Bug 107779 - Sorting breaks rows data alignement when functions include columns with no data
Sorting breaks rows data alignement when functions include columns with no data
 Status: NEW None LibreOffice Unclassified Calc (show other bugs) 5.3.3.2 release All All medium enhancement Not Assigned Sorting Calc-Function Show dependency tree / graph

 Reported: 2017-05-11 17:21 UTC by Luca 2022-02-12 10:44 UTC (History) 8 users (show) heiko.tietze ilmari.lauhakangas jbfaure logistikka newbie-02 olivier.hallot Valeryan_24 xiscofauli

Attachments
Empty cells not sorted, even if have a color background (34.03 KB, image/png)
2020-06-13 15:58 UTC, Xavier Guillot
Details

 Note You need to log in before you can comment on or make changes to this bug.
 Luca 2017-05-11 17:21:09 UTC ```In Tools / Options / LibreOffice Calc / General select: [X] Update references when sorting range of cells. 1) Setup a Calc spreadsheet this way: A B C D 1 a =sum(C1:D1) 1 1 2 b =sum(C2:D2) 2 2 3 c =sum(C3:D3) 3 3 This way a is worth 2, b is worth 4, c is worth 6, you see: A B C D 1 a 2 1 1 2 b 4 2 2 3 c 6 3 3 2) Now select all then sort by column A descending using Data / Sort. You see, as expected: A B C D 1 c 6 3 3 2 b 4 2 2 3 a 2 1 1 3) Now revert to original sorting (ctrl+z), then modify function in B, as to include column E - which doesn't contain data - in the sum: A B C D 1 a =sum(C1:E1) 1 1 2 b =sum(C2:E2) 2 2 3 c =sum(C3:E3) 3 3 The result is the same as in 1, as expected. 4) Now repeat 2): select all then sort by column A descending using Data / Sort. You get the wrong result in column B: A B C D 1 c 2 3 3 2 b 4 2 2 3 a 6 1 1 That's because the references in column B have been wrongfully reordered, following column A but ignoring C and D: A B C D 1 c =somma(C3:E3) 3 3 2 b =somma(C2:E2) 2 2 3 a =somma(C1:E1) 1 1 5) If (step 3) you add any value in column D, then repeat step 2), sorting order is correct. That's to say, starting from: A B C D E 1 a =somma(C1:E1) 1 1 2 b =somma(C2:E2) 2 2 3 c =somma(C3:E3) 3 3 3 The descending order on column A produces, as expected: A B C D E 1 c 9 3 3 3 2 b 4 2 2 3 a 2 1 1 ================================================================= This is bad. And it's not the only case when Calc breaks the raws integrity without any warning. Also see bug # 107385: https://bugs.documentfoundation.org/show_bug.cgi?id=107385 --``` Eike Rathke 2017-05-11 18:05:43 UTC `With "Update references when sorting range of cells" this is expected. Without a value in column E the sort range is A1:D3, but a reference like C3:E3 points outside that range so is not moved with the data but adjusted to point to the original location. That's exactly why the option exists.` Luca 2017-05-12 10:08:36 UTC ```(In reply to Eike Rathke from comment #1) > With "Update references when sorting range of cells" this is expected. > Without a value in column E the sort range is A1:D3, but a reference like C3:E3 > points outside that range so is not moved with the data but adjusted to point > to the original location. That's exactly why the option exists. That doesn't make sense to me: if I select all (step 2) I explicity tell the program to sort all the rows and columns. That's how the sorting function normally works in Calc. If this behaviour is intentional, then the bug is not to warn users that their sorting area has been arbitrary restricted and rows integrity will be lost. I also see that warnings about the selected sorting area extention seem to have disappeared in last versions of Calc. I think they should be reinstated. There should be no situation when the records integrity is at risk without any warning. Please, understand that not intuitive, dangerous and incoherent behaviours cannot be considered as "normal".``` Xisco Faulí 2017-07-26 17:38:10 UTC `Moving back to UNCONFIRMED. It needs to be confirmed by a third person` Jean-Baptiste Faure 2017-08-05 20:58:06 UTC ```(In reply to Luca from comment #0) > In Tools / Options / LibreOffice Calc / General > select: > > [X] Update references when sorting range of cells. Why did you check this option ? Please, have a look at bug 81309 and bug 81633 to understand where this option comes from. bug 81309: Sorting should automatically adjust references bug 81633: Sorting shouldn't always automatically adjust references Set status to NEEDINFO, please set it back to UNCONFIRMED once requested informations are provided. Best regards. JBF``` Luca 2017-12-26 10:05:13 UTC ```(In reply to Jean-Baptiste Faure from comment #4) > (In reply to Luca from comment #0) > > In Tools / Options / LibreOffice Calc / General > > select: > > > > [X] Update references when sorting range of cells. > > Why did you check this option ? > Please, have a look at bug 81309 and bug 81633 to understand where this > option comes from. > bug 81309: Sorting should automatically adjust references > bug 81633: Sorting shouldn't always automatically adjust references > > Set status to NEEDINFO, please set it back to UNCONFIRMED once requested > informations are provided. (Sorry for my delayed reply) The option [X] Update references when sorting range of cells is necessary when you have two sheets in the same file with cells in a column of the first sheet pointing to corresponding cells in a column of the second sheet, because without that option, resorting one sheet would result in loosing data rows integrity: Sheet 1: Sheet 2: A B A B =\$'Sheet 2'.A1 1a a 2a =\$'Sheet 2'.A2 1b b 2b =\$'Sheet 2'.A3 1c c 2c (values a, b, c, must appear in column As of Sheet 1 and Sheet 2, always connected to corresponding values in column Bs, no matter how you sort one sheet or the other) But I think you're missing the point. There are two issues here: 1) lacking of warnings, whenever a sorting operation would result in data loss by breaking the rows integrity; 2) arbitrary restriction of a specifically selected area to perform an operation that will result in data loss (this is the worst problem, see my previous messages for details). As for the issue in 1): when you select one column and then try a sorting operation, the program already issues a warning and proposes an extension of the selection. The case I described here are far more dangerous and less intuitive, all the more reason they should need a warning.``` Buovjaga 2017-12-26 19:32:44 UTC `Luca: would bug 96464 solve it for your use case?` Jean-Baptiste Faure 2018-06-24 19:43:19 UTC ```Please answer question in comment #6. Status set to NEEDINFO, please set it back to UNCONFIRMED once requested informations are provided. Best regards. JBF``` Luca 2019-01-02 14:07:05 UTC ```(In reply to Buovjaga from comment #6) > Luca: would bug 96464 solve it for your use case? I guess not, if I understand how it works. What is needed is: 1) an option to never ever brake the integrity of a line without a warning 2) if the user selects a range of lines/rows/cells, then applies any sorting operation, the sorting must affect completely and exclusively the selected area, no exceptions. 3) In the help file, a clear, complete and exhaustive documentation of Calc behaviours in every case. I have yet another simple case where I would need the option [X] Update references when sorting range of cells A B C 1 value1 value3 2 value2 value4 3 =B2 (value4) If I modify sorting of rows 1 and 2 according to the column B values, I want the C3 reference updated so that it always points to value4, no matter if it's in B1 or B2. But I can't use the options because it's too dangerous. Without 1) and 2) as I described here above, it's very likely that you will end up with loosing your datasheet completely without even knowing, because of the messed up lines.``` Luca 2019-01-05 16:38:41 UTC ```(In reply to Luca from comment #8) > I have yet another simple case where I would need the option > [X] Update references when sorting range of cells > > A B C > > 1 value1 value3 > 2 value2 value4 > 3 =B2 (value4) > > If I modify sorting of rows 1 and 2 according to the column B values, I want > the C3 reference updated so that it always points to value4, no matter if > it's in B1 or B2. But I can't use the options because it's too dangerous. > Without 1) and 2) as I described here above, it's very likely that you will > end up with loosing your datasheet completely without even knowing, because > of the messed up lines. I would add: even the more confusing all this is, if you consider that when manually resorting rows (alt+drag) the reference in C3 gets updated, but if you use the sorting function instead to obtain the same result, it does not. If all this is intentionally by design, well, it's very bad design, in my opinion. Counterintuitive and dangerous. For sure it would require extensive and exhaustive description of all possible cases, in the help file.``` Oliver Grimm 2019-07-09 09:08:37 UTC `when I was trying to confirm this bug, I did not find the option "[X] Update references when sorting range of cells." in the german (de-de) version of the sort dialog. Has it been removed recently? my version here is 6.1.5.2.` Buovjaga 2019-07-09 10:46:39 UTC ```(In reply to Oliver Grimm from comment #10) > when I was trying to confirm this bug, I did not find the option "[X] Update > references when sorting range of cells." in the german (de-de) version of > the sort dialog. Has it been removed recently? my version here is 6.1.5.2. It is still in Tools / Options / LibreOffice Calc / General: Input settings. Above it is "Expand references when new columns/rows are inserted". Below it is "Highlight selection in column/row headers".``` Oliver Grimm 2019-07-10 09:21:57 UTC ```ok, thanks, found it now B-] While I do understand the two use cases for setting or unsetting this option, I do not understand Eikes reasoning in comment #1. Why should a cell be left out of sorting only because if refers to something outside the selected range? When I read "[X] Update references when sorting range of cells." I assume that references should be updated after sorting all non-formula data. Nothing else. Therefore I tend to second Lucas point here: from comment #5: "2) arbitrary restriction of a specifically selected area" is a problem, if not bug, in my opinion. It does not affect my daily work since I do not use that option. But it needs to be solved.``` b. 2020-05-11 19:20:07 UTC ```from a short check: it holds with not checking the option 'update references' and including the empty colum E in the range to be sorted ...``` Luca 2020-05-12 08:02:13 UTC Comment hidden (no-value) ```(In reply to b. from comment #13) > it holds with not checking the option 'update references' and including the > empty colum E in the range to be sorted ... Yes, it's alreade been told in comment #2. That's exactly why it should happen as well with the option 'update references' checked. It must be understood that any uncommon behaviour that breaks the integrity of rows is *dangerous*, it causes loss of data! All the more when the uncommon behaviour (in this case, a silent reduction of the area selected by the user) gets applied only in some specific situation and not all the similar ones, by the same application. This is worst than bad design: it's intentional sneaky sabotage of user's intentions.``` Xavier Guillot 2020-06-13 15:58:58 UTC Comment hidden (off-topic) ```Created attachment 161956 [details] Empty cells not sorted, even if have a color background I confirm the bug : some cells, if they are on a border of a table and the column has no value, even if they are selected to be sorted and have a background color, stay at the same place and are not moved with the next ones, which creates a visual break and adds work to rearrange the presentation. For example in my screenshot : often I have incomplete tables - ie an agronomic test after harvest, I enter the grain yield and sort from best to lowest productive (except row 1), but I keep an empty column to add later the oil content. I set to (some or all) rows different colors. If the column with empty cells is on the right of the table / selection, they are not sorted (columns E-F-G in my test). If they are on the middle, they are well moved with the others (M-N-O). Also, if I add on first example, on column C only one value on a cell without any colored background and keep all others empty, all the rows are well sorted and "empty" cells moved. Cells should not considered as empty if a background color has been set ; if one cell is not seen as empty, the whole column must not be seen as empty, too, and be sorted. Or more logical, if the cells are selected for sorting, they should be sorted, empty or not...``` Robert Lacroix 2020-08-25 20:59:16 UTC Comment hidden (off-topic) ```(In reply to Xavier Guillot from comment #15) > I confirm the bug : some cells, if they are on a border of a table and the > column has no value, even if they are selected to be sorted and have a > background color, stay at the same place and are not moved with the next > ones, which creates a visual break and adds work to rearrange the > presentation. > > [...] > > If the column with empty cells is on the right of the table / selection, > they are not sorted (columns E-F-G in my test). > > [...] > > Cells should not considered as empty if a background color has been set ; if > one cell is not seen as empty, the whole column must not be seen as empty, > too, and be sorted. Or more logical, if the cells are selected for sorting, > they should be sorted, empty or not... I confirm that LO-Calc 5.2.7.2 has the OP's mentioned behaviour. I also use colour-coded cells for missing data, but I avoid extending tables on the right into empty columns for keyboarding ergonomics. I tried this anyway in Excel today (version 1808 - Microsoft Office 365 ProPlus) and I grudgingly acknowledge that Excel sorts empty columns with background colour on the right side of a table when they are included in the sort range. Excel even offers a sort option to sort by background colour. So the behaviour desired by the OP is that of Excel. For the sake of compatibility, I am setting this to NEW. The change will affect only users who create sort ranges with empty cells on the right side. The rest of us won't even notice the difference.``` Robert Lacroix 2020-08-25 21:27:32 UTC Comment hidden (obsolete) ``` *** This bug has been marked as a duplicate of bug 88002 ***``` Jean-Baptiste Faure 2022-02-09 15:11:23 UTC Comment hidden (obsolete) ```I(In reply to Robert Lacroix from comment #17) > > *** This bug has been marked as a duplicate of bug 88002 *** No, this bug report can't be a duplicate of bug 88002 because its description does not make any reference to empty cells with format. In the case of empty cells with format, you can use the sort option "include formats". Setting back to NEW. Set importance to enhancement because it currently works as expected and the OP want a change in the current behavior. I do not support such a change. For me you can't expect define a set of sort options that satisfies all users when you create tables with mutual dependencies. For me example from comment #8 is completely arbitrary, some other user may want another behavior. So my proposition is to close this bug report as WontFix. Best regards. JBF``` Buovjaga 2022-02-09 15:16:57 UTC Comment hidden (obsolete) ```(In reply to Jean-Baptiste Faure from comment #18) > I(In reply to Robert Lacroix from comment #17) > > > > *** This bug has been marked as a duplicate of bug 88002 *** > > No, this bug report can't be a duplicate of bug 88002 because its > description does not make any reference to empty cells with format. In the > case of empty cells with format, you can use the sort option "include > formats". > > Setting back to NEW. Set importance to enhancement because it currently > works as expected and the OP want a change in the current behavior. > > I do not support such a change. For me you can't expect define a set of sort > options that satisfies all users when you create tables with mutual > dependencies. > For me example from comment #8 is completely arbitrary, some other user may > want another behavior. > > So my proposition is to close this bug report as WontFix. Let's ask for a blessing for wontfix from UX team.``` Katie Chan 2022-02-09 19:00:09 UTC Comment hidden (obsolete) ```(In reply to Jean-Baptiste Faure from comment #18) > I(In reply to Robert Lacroix from comment #17) > > > > *** This bug has been marked as a duplicate of bug 88002 *** > > No, this bug report can't be a duplicate of bug 88002 because its > description does not make any reference to empty cells with format. In the > case of empty cells with format, you can use the sort option "include > formats". The behaviour described in comment #16 is a duplicate of bug 88002. It's exactly the behaviour of the example given in comment 2 over there. > Setting back to NEW. Set importance to enhancement because it currently > works as expected and the OP want a change in the current behavior. The behaviour asked of at least in comment #16 is not an enhancement but a regression. It is the current behaviour but that itself was a change from the behaviour previously shown prior to v4.3.5. I'll take further comment back over there. Regards``` Jean-Baptiste Faure 2022-02-09 20:34:46 UTC Comment hidden (obsolete) ```(In reply to Katie Chan from comment #20) > (In reply to Jean-Baptiste Faure from comment #18). > [...] > The behaviour described in comment #16 is a duplicate of bug 88002. It's > exactly the behaviour of the example given in comment 2 over there. Yes, but this behavior is not the problem described by the bug reporter. So it has nothing to do in this bug report. Best regards. JBF``` Heiko Tietze 2022-02-10 08:56:33 UTC `To me, "Update references when sorting range of cells" is clearly a switch to not sort the cells but the references. But at least we have a documentation gap.` Eike Rathke 2022-02-10 16:13:39 UTC ```Clearly sort always sorts, regardless of the option.. but if and how the references are adjusted is exactly the point of that option, which was introduced at some time because people don't agree what a sort should actually do with references. If unchecked, references are not touched, so relative references still point to the same relative positions if their formula cell moved, i.e. to within the same sorted row stay on that row after the sort. If checked, references pointing (partially) outside of the sorted area are adjusted to still point to the original range, which for relative references *changes* (hence updates) the reference. The behaviour the bug submitter wants is the option *unchecked* (which is the default), i.e. after the sort on row 1 there's =SUM(C1:E1) and then all works as expected. However, with the option checked the current behaviour comes to a surprise because the sort range is shrunk to the actual used area and thus the formula points outside. Shrinking is done because sorting empty areas (no data and no attribution) is superfluous and sorting those areas as well would just slow down things significantly. This is especially important *because* people hit Ctrl+A select all or select entire rows or columns to sort an actually smaller range. In case the option is checked, shrinking the sort range must take references in formula cells into account, if they point to within the selected area but outside the used area, then stop shrinking there if so to have the references adjusted (or rather not adjusted) as expected.``` Luca 2022-02-12 10:44:56 UTC ```(In reply to Eike Rathke from comment #23) > However, with the option checked the current behaviour comes to a surprise > because the sort range is shrunk to the actual used area and thus the > formula points outside. Shrinking is done because sorting empty areas (no > data and no attribution) is superfluous It's the logic of "arbitrarily shrinking user selected areas" that's plain dangerous. When a user specifically selects something, then the program should execute. If there is some reason not to execute, the program should issue a warning (and user should be able to disable that warning). Program's own arbitrary decision not to comply with user's disposition, without any warning, is dangerously bad design. In this particular case it causes data loss. This is priority against any other consideration. > and sorting those areas as well would just slow down things significantly. Is the slowing down caused by empty cells sorting really an issue? Maybe not a very scientific test, but I just added a value in cell ZZ10000 to extend the "select all" area in one of my spreadsheets to a ~5 million empty cells area and I didn't notice any significant slowdown when sorting. > This is especially important *because* people hit Ctrl+A select all or > select entire rows or columns to sort an actually smaller range. That's true, and why that happens? Right because of general lack of transparency about how areas to be sorted are arbitrarily selected by the program without warning, nor appropriate documentation. E.g., if there are empty rows or columns creating gaps between data, and I hit the sorting button without selecting anything, who says that I want to sort only the area within the gaps around the cell that has focus? That's Calc's arbitrary decision, breaking integrity of my records, no warning. If gaps are outside the window and I don't remember that there are some, here again, I lose data. This cause insecurity. To be sure and avoid data loss, users just hit ctr+a all the times. In the example case, the program should instead select all or nothing at all. If I need to partially sort a sheet, I will select the area myself. > In case the option is checked, shrinking the sort range must take references > in formula cells into account, if they point to within the selected area but > outside the used area, then stop shrinking there if so to have the > references adjusted (or rather not adjusted) as expected. Yes, that would solve this problem, but not the others. Till breaking records is considered a normal harmless behaviour, there will always be problems with the sorting functions.```