Calc still can't remove duplicates?

Can Calc seriously still not remove duplicates easily?

I have an order sheet I exported from a management program that came to me with a ton of duplicates (there’s a row for each item being shipped to a person, even though a lot of their items will be shipped in one package).

I thought I’d give LibreOffice a shot again since I just reinstalled Windows, and I have found nothing but non-solution after non-solution for removing duplicate rows based on the data in the order number column.

Every tutorial I come across just has some made-up spreadsheet with 6-10 rows in it that has one column in it (who would realistically ever have one column?) and only removes duplicate rows that are entirely duplicates of each other. On top of that they actually expect you to manually click and drag over all the data you want to filter with/out, or type in a range. I have over 500 orders. Ridiculous.

Please tell me there is some way to remove rows based on duplication in one column with this program. There must be a way. If not, just point me to a program that can do what I need.

(Please don’t ask questions about why I need the data a certain way or anything else that isn’t helpful, all I literally need is a way to remove rows from a sheet based on duplicate data in one column.)

Removing duplicate data is simple, powerful and fast in Calc.

Menu Data->Standard filter
Options, mark No duplicates

A high percentage of errors are caused by inconsistent source data.

I see you didn’t read my post.

Please: Either don’t ask questions at all, or specify everything to the last detail, making sure that no questions in return are needed. The statements and questions by @Villeroy (e.g.) are simply necessary to avoid bad and dangerous advice, because that kind of specification is missing.

One case Villeroy obviously didn’t consider: Do you want to remove all rows for which a second row with the same content in the chosen column does exist?
If not, it’s indispensable to unambiguously specify which one of all those rows shall be kept.

Is the “packaged” shipping an item of its own?
If so: In what way are “packaging” rows labeled?

Try with Ctrl+*.

I think that Creating Pivot Tables could help.
Would you share a reduced sample file to test? Edit your question, and use the Upload icon (in the middle of the toolbar).
Thanks.

I see you did not read my answer carefully.

Seriously, no. And this is how it should be.

Again: Shall all the rows having a duplicate in the relevant column be removed, or shall one “unicate” be kept? In the second case: Which one?

I did but it is very confusing trying to identify and interpret the single sentence describing the situation amongst the complaint.

But none of the records you describe are actually duplicates. When certain columns are ignored they become duplicates, in which case the first comment (or any of the tutorials you’ve come across) will likely work. If a small sample data set was included I’d confirm.

hey flywire. feel free to see yourself out. thanks.

Wow! First time here? Nothing ever contributed? Great.

Clarifications concerning my questions, and the remarks by @Villeroy would have been useful.
You helped us a lot to waste time.

Maybe my lack of knowledge of the English language made me misunderstand some of your expressions, but I think I will give you the answer you deserve.

I don’t think LibreOffice needs you to give it a shot. You are free to use LibreOffice or not, but you don’t have to do it any favors by using it.

Surely it would be better to make a tutorial with hundreds of example rows and that would fit your needs… But the fools who write the tutorials are lazy and don’t want to write so many lines.

Of course. You need it!!!

Surely many of those who help here have a computer consultancy, and even they probably charge for advice, but I think they are in this forum to help (for free) and to recommend the use of LibreOffice.

Of course! You can’t waste your time answering the questions of those who want to help you!

What a fool, he can’t even read, when you explain it so well!

Yes, let him go, he has been helping for a long time, and you stay, you have just arrived demanding.

Oh! really! Wasting time on something that may not be useful for you!

thanks for nothing. I’ll just try the actual solutions people have posted in this thread.

Hello,

Just tested this extension:

Remove Duplicates

with:

Version: 7.2.4.1 / LibreOffice Community
Build ID: 27d75539669ac387bb498e35313b970b7fe9c4f9
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

Test had:

2758 Records to start with 13 columns

Criteria was one column to check for dups

Result was 206 rows remaining

Time was less than 1 minute

If the duplicates are only duplicates in some column(s), any deletion removes information. How does that extension decide which duplicates are to be deleted and which one is to stay?

I will give this a try!

If we assume that both records describe the same person, how can a software decide which postal address is the right one and delete the other one?

Doe | John | 1996-12-30 | 33 Nutbush Rd.
Doe | John | 1996-12-30 | 66 Pineapple Av.

exactly…

that is what I meant when I said: a high percentage of errors are due to “inconsistencies in the data”.

In my test case it did not matter. The information I was looking for was common to all of the rows, and I simply wanted only one record per MAC address.

It depends upon what you need. Easier to review 206 rows vs 2758. This may be what OP wants and I can see where this is useful at times.

Test was with one column (OP requirement) but can include any or all.
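
For anyone who wants to see the "one record per value in one column" idea outside Calc, here is a minimal Python sketch. It does not claim to show how the Remove Duplicates extension works internally (which row it keeps is not stated in this thread); it simply keeps the first row seen for each value in the chosen column. The function name keep_first_per_key, the column name mac, and the sample rows are all made up for illustration.

```python
import csv
import io


def keep_first_per_key(rows, key):
    """Return rows with only the first occurrence of each value in column `key`."""
    seen = set()
    kept = []
    for row in rows:
        value = row[key]
        if value not in seen:
            seen.add(value)
            kept.append(row)
    return kept


# Hypothetical sample data standing in for an exported sheet.
SAMPLE_CSV = """mac,host,port
aa:bb:cc:00:00:01,alpha,1
aa:bb:cc:00:00:02,beta,2
aa:bb:cc:00:00:01,alpha,7
aa:bb:cc:00:00:03,gamma,3
aa:bb:cc:00:00:02,beta,9
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
unique_rows = keep_first_per_key(rows, "mac")
for row in unique_rows:
    print(row["mac"], row["host"], row["port"])
```

Keeping the last occurrence instead would retain ports 7, 9 and 3 for the same three MAC addresses, so the rule for which row survives still has to be chosen by a person.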

Without extension and duplicates in column A:
Sort the entire list by column A.
In row 2 of a spare column, enter the formula =OR(A1=A2;A2=A3), fill it down to the last row, and filter by that column.
The resulting row set should display all duplicates.
Finally, decide which rows should be deleted and which ones should be preserved.
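
The sort-and-flag steps above can be sketched in Python to show what the OR formula is checking: after sorting, a value is part of a duplicate group if it equals the entry directly above or directly below it. The function name flag_duplicates and the order numbers are hypothetical, and this is only an illustration of the logic, not a replacement for the Calc steps.

```python
def flag_duplicates(values):
    """For a sorted list, mark each entry that equals its neighbour above or below."""
    flags = []
    for i, value in enumerate(values):
        same_as_previous = i > 0 and values[i - 1] == value
        same_as_next = i + 1 < len(values) and values[i + 1] == value
        flags.append(same_as_previous or same_as_next)
    return flags


# Hypothetical order numbers; sorting first puts equal values next to each other.
order_numbers = sorted(["1003", "1001", "1002", "1001", "1003", "1003", "1004"])
flags = flag_duplicates(order_numbers)
for number, is_duplicate in zip(order_numbers, flags):
    print(number, "DUPLICATE" if is_duplicate else "")
```

Every member of a duplicate group is flagged, including the first one, so nothing is deleted automatically; the flagged rows are only the candidates to review.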