Description: The Data Filter option for deduplicating a selected list will reproduce the second iteration of the first item at the sequential location where it is encountered and then complete the deduplication correctly Steps to Reproduce: With the enclosed sheet select the data in column E MENU>DATA>More Filters>Advanced filter>Options check No Duplicates Left mouse in the upper address box adjacent to "undefined" Select the data in column E - Bug 1 it doesn't auto fill the selection Check Copy results to Left mouse in the lower address box adjacent to "undefined" Focus H1 and "OK" Observe that the second iteration of "One" is replicated at its unsorted location and then the remaining entries are deduped. The limitation of the three groups of seven is not significant - merely a short list to demonstrate. All other duplications have been removed when I intentionally replicated 119 full-length names ten times with only the first item duplicating once. Actual Results: The first item is duplicated when found in the list but only the 2nd iteration of that item. Expected Results: deduplicated list Reproducible: Always User Profile Reset: No OpenGL enabled: Yes Additional Info: Version: 7.3.5.2 (x64) / LibreOffice Community Build ID: 184fe81b8c8c30d8b5082578aee2fed2ea847c01 CPU threads: 4; OS: Windows 10.0 Build 19043; UI render: Skia/Raster; VCL: win Locale: sv-SE (en_GB); UI: en-GB Calc: threaded
Created attachment 182425 [details] Small sample ready to prove the error
Try following the steps described here. They worked for me: https://help.libreoffice.org/latest/en-US/text/scalc/guide/remove_duplicates.html?DbPAR=CALC#bm_id3153896 As for your workflow, notice that for some reason the option "Range contains column labels" is checked by default and cannot be unchecked (at least on my end). I really could not figure out why it can't be unchecked, but this is the reason "One" appears twice... the first time it is considered as the column label.
As for the issue that the selected range does not appear in the dialog, I created a separate bug report to discuss it (see bug 150937).
(In reply to Rafael Lima from comment #3) > As for the issue that the selected range does not appear in the dialog, I > created a separate bug report to discuss it (see bug 150937). I was busy typing a response when you posted the additional info - quite topical really. I will try to identify my earlier report of the same ilk. I still think this excerpt from my abandoned comment is worth consideration; Whilst I never tried the method defined in the link, I'm quite happy to accept that it functions under some circumstances but not when approaching it by another regular menu access. ........ I have reported earlier in a different bug report scenario where the selection isn't passed to the process but needs to be re-selected from within the process. I guess I was just unlucky
Is there a need to keep this report open? Sorry, just quickly looked at the conversation and I'm wondering about it.
(In reply to Buovjaga from comment #5) > Is there a need to keep this report open? Sorry, just quickly looked at the > conversation and I'm wondering about it. As I understand it, the discussions have identified numerous ways in which it fails and numerous workarounds. If you believe another report adequately covers the issue, feel the pros and cons are cancelling each other out or you're happy to see duplicates in what purports to be a deduplicated list then I won't lose any sleep over this report being abandoned. I imagine deduping is rarely a repeating process in any spreadsheet. I certainly only ever attempt it -as a standalone process - when I have a potentially duplicated list that needs cleaning up before being incorporated into the working environment. As It's simple enough to get Excel to dedupe and paste the data back to LO that's what I have utilised as my workaround.
So you use a closed source piece of software to address a limitation of Calc? Just a reminder that the RemoveDuplicates extension exists, and that it has a compatible license with LibO. This means it could be implemented by default without issues.
(In reply to Pedro from comment #7) > So you use a closed source piece of software to address a limitation of Calc? Not Any More > Just a reminder that the RemoveDuplicates extension exists, and that it has > a compatible license with LibO. This means it could be implemented by > default without issues. Sweet, Thanks. Is it my lucky day and you're going to tell me there's a setting to make it case-insensitive? I noticed two similar extensions, Remove and Remove...Fast. I grabbed the first as it has a better user rating - The right choice? (without liability:) ) Should I "pull" the report as there's an authorised extension for the job?
(In reply to Buovjaga from comment #5) > Is there a need to keep this report open? Sorry, just quickly looked at the > conversation and I'm wondering about it. What about the: "Range contains column labels" being forcefully checked by default and grayed out (so it cannot be unchecked)? Moreover, there _seems_ to be no way to select a range that doesn't include the first non-empty cell (E1 in this case). For example, in the first box we could type-in (or select) the range $E$2:$E$33, but it would ultimately result in "This range doesn't contain a valid query." The workaround for this seems to be to select the range $E$2:$E$33 beforehand (i.e. before opening the advance filter dialog). In any case, the first box is not pre-filled with the selection range. A minor side note. In LO, the (pre)selection is not exactly (quote): "Select the data in column E" I mean that the selection cannot be the entire column E, but only the specific cells with the relevant data ($E$1:$E$33, or $E$2:$E$33, or whatever specific range), but not the entire column. Emphasis on _data_ (range), not in (entire) column.