Remove duplicates in column A while keeping only the first value in column B?

If I have a table where the first two columns contain:

FIRST 23
FIRST 25
SECOND 17
SECOND 44
THIRD 33

How do I remove duplicates so that only the first occurrence of each value in column A is kept, regardless of what is in column B? For the above data, the result should be:

FIRST 23
SECOND 17
THIRD 33

The table actually has a few hundred thousand rows, so it is not practical to do this manually.

I tried following some videos, but they only cover the case where the whole row is a duplicate, not just one column.

There is this extension: Remove Duplicates Fast » Libreoffice Extensions

And tdf#85976 is implemented in the upcoming v.25.2 (see comment 72; some polishing, such as UI fine-tuning and documentation, is still in progress).

Thank you. I see that tdf#85976 is being pushed to the daily builds for comment. I’ll be sure to try it out and see if it does what I’m after.

In the meantime I was able to figure out a hack. In a table with no heading row:

  1. Move column B contents to column C.
  2. Copy C1 to B1 (otherwise B2 picks up an empty B1 when A1 and A2 match), then place the following formula in B2:
    =IF(A1=A2,B1,C2)
  3. Copy that formula all the way down column B to the last data row.
  4. Copy column A.
  5. Paste column A into column D.
  6. Copy column B.
  7. Paste special, numbers, into column E.
  8. Copy C1 to E1. With steps 7 and 8, column E now contains only the first numeric value for each value that is duplicated in column D.
  9. Select columns D and E.
  10. Remove duplicates with the standard LibreOffice method: Data, Filter, Standard Filter, with the no-duplicates option enabled and the results copied to G1.
  11. Copy columns G and H and paste them wherever you wish to use the cleaned contents.
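
For anyone who wants to check the logic outside Calc, here is a minimal Python sketch of what the steps above appear to do. The function names are made up for illustration, and the sample rows are the ones from the question. As far as I can tell, the formula only compares each row with the row directly above it, so the sketch assumes (like the hack) that equal values in column A sit next to each other.

```python
def carry_first_value(rows):
    """Mimic =IF(A1=A2,B1,C2): repeat the first value down each run of equal keys."""
    filled = []
    for key, value in rows:
        if filled and filled[-1][0] == key:
            value = filled[-1][1]  # same key as the row above: reuse its value
        filled.append((key, value))
    return filled


def drop_duplicate_rows(rows):
    """Whole-row de-duplication that keeps the first copy of each row."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


rows = [("FIRST", 23), ("FIRST", 25), ("SECOND", 17), ("SECOND", 44), ("THIRD", 33)]
cleaned = drop_duplicate_rows(carry_first_value(rows))
print(cleaned)  # [('FIRST', 23), ('SECOND', 17), ('THIRD', 33)]
```

If the values in column A are not grouped together, sorting by column A first should be needed for either the formula or this sketch to give the expected result.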

image

@mariosv That doesn't work: »no duplicates« compares the whole row, not only Column_A (unless you explicitly select only Column_A for the filter).
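
To illustrate the difference, here is a small Python sketch (not LibreOffice code; the function names are hypothetical) that compares de-duplicating on the whole row with de-duplicating on the first column only, using the sample data from the question.

```python
def unique_rows(rows):
    """Keep the first copy of each complete row (whole-row comparison)."""
    return list(dict.fromkeys(rows))


def first_per_key(rows):
    """Keep only the first row seen for each value in the first column."""
    first = {}
    for key, value in rows:
        first.setdefault(key, value)  # later rows with the same key are ignored
    return list(first.items())


rows = [("FIRST", 23), ("FIRST", 25), ("SECOND", 17), ("SECOND", 44), ("THIRD", 33)]

print(len(unique_rows(rows)))  # 5: no two complete rows are identical
print(first_per_key(rows))     # [('FIRST', 23), ('SECOND', 17), ('THIRD', 33)]
```

Because no two complete rows match in the sample, a whole-row comparison removes nothing, which is why comparing on column A alone is what the question needs.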

My error.