REMOVEDUPLICATES functionality

I'm looking for a simple Basic macro that does exactly what (‘MS’) RemoveDuplicates does. I've played around with createFilterDescriptor & filter(oFilterDesc) for most of last night and today, and it just doesn't want to play ball.
I have 5k account codes (strings) in column ‘A’. If I do it through the UI, I end up with a list of around 150. The 150 ‘feels’ correct. I just can’t replicate that result in a macro.

Version: 24.2.6.2 (X86_64) / LibreOffice Community
Build ID: 420(Build:2)
CPU threads: 16; OS: Linux 6.8; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Ubuntu package version: 4:24.2.6-0ubuntu0.24.04.1
Calc: threaded

Any help would be appreciated.

And for bonus points: why are there TableFilterField, TableFilterField2 and TableFilterField3? How do I know when to use which one?

thx Team.

Data>Filter>More>Standard Filter…
Define the columns that make up a duplicate and choose < not empty > as criterion for each of them.
Under “Options” choose “No duplicates”.

For a working macro, inspect the FilterDescriptor you have generated with the above steps.
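
As a rough illustration of those dialog steps in Basic, here is a minimal sketch. The sheet name "Summary" and the output cell (column F, row 1 of the same sheet) are assumptions made only for the example, and the macro has not been checked against 24.2. The descriptor is created from, and applied to, the column A range itself rather than the whole sheet.

```basic
Sub CopyUniqueColumnA()
    ' Sketch only: "Summary" and the output position are hypothetical.
    Dim oSheet As Object, oCursor As Object, oRange As Object, oDesc As Object
    Dim oField(0) As New com.sun.star.sheet.TableFilterField
    Dim oTarget As New com.sun.star.table.CellAddress

    oSheet = ThisComponent.Sheets.getByName("Summary")

    ' Move a cursor to the last used cell to learn the last used row.
    oCursor = oSheet.createCursor()
    oCursor.gotoEndOfUsedArea(False)

    ' Column A only, from the first row down to the last used row.
    oRange = oSheet.getCellRangeByPosition(0, 0, 0, oCursor.RangeAddress.EndRow)

    ' Empty descriptor for this range; Field is relative to the range.
    oDesc = oRange.createFilterDescriptor(True)
    oField(0).Field = 0
    oField(0).Operator = com.sun.star.sheet.FilterOperator.NOT_EMPTY
    oDesc.setFilterFields(oField())

    ' Same options as in the dialog: no header row, "No duplicates",
    ' and "Copy results to" a separate position.
    oDesc.ContainsHeader = False
    oDesc.SkipDuplicates = True
    oDesc.CopyOutputData = True
    oTarget.Sheet = oSheet.RangeAddress.Sheet
    oTarget.Column = 5
    oTarget.Row = 0
    oDesc.OutputPosition = oTarget

    oRange.filter(oDesc)
End Sub
```

The source column is left untouched; the unique values are copied to the output position.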

First of all, “simple Basic macros” hardly exist.
What does the Microsoft product do actually?
How does it identify what makes up a duplicate? Artificial Intelligence?
Which record is a duplicate to be removed, and which is the single record to keep?

A true database can be set up to not accept any duplicate records (defining what makes up a duplicate on your own), so all your evaluations and aggregations can presume that duplicates do not exist in that record set.

If “functionality” means “by a function” (instead of an interactively used tool), you need a helper function that decides whether a strip repeats a previous one (above).
It’s simpler to do this with VBAsupport.
Subsequently the function FILTER() (new in V24.8) can be used.
See also:
disask115747OutFilterRepetitions.ods (19.3 KB)

Thanks for your reply. When I dumped MS for Ubuntu, I saw an opportunity to de-MS myself and skill up. Initially I thought of installing Oracle RDBMS / SQL-DEVELOPER. Yes, I appreciate your reply and am aware that life would be so much easier if I used a DB.

Private Sub cWsRmveDupsAcctCodeCol( fWSName As String )

    Dim owSAcct         As Object
    Dim owSSmmy         As Object
    Dim owSSmmyColA     As Object
    Dim oCsr            As Object
    Dim oCritRange      As Object
    Dim oDataRange      As Object
    Dim oFiltDesc       As Object
    Dim oCopyTo         As Object
    Dim oFilterFld(0)   As New com.sun.star.sheet.TableFilterField
    Dim oFilterDesc     As Object

    Dim cSummaryIdx     As Integer
    Dim cSummaryRows    As Long
    Dim cSummaryCols    As Long
    Dim cSMOutputTo     As Object

    ' Find the last used row and column of the sheet.
    ' createCursor() takes no argument on a sheet.
    Set owSSmmy = ThisComponent.Sheets.getByName( fWSName )
    Set oCsr = owSSmmy.createCursor()
    oCsr.gotoEndOfUsedArea( True )
    cSummaryRows = oCsr.RangeAddress.EndRow
    cSummaryCols = oCsr.RangeAddress.EndColumn

    ' c0 is not declared in this snippet; presumably a module-level constant for 0.
    ' The column A range is built here but not used by the filter below.
    Set owSSmmyColA = owSSmmy.getCellRangeByPosition( c0, c0, c0, cSummaryRows )

FilterColA:
    ' The descriptor is created from, and applied to, the whole sheet.
    oFilterDesc = owSSmmy.createFilterDescriptor( True )

    oFilterDesc.IsCaseSensitive = False
    oFilterDesc.SkipDuplicates = True
    oFilterDesc.UseRegularExpressions = False
    oFilterDesc.SaveOutputPosition = False
    oFilterDesc.Orientation = com.sun.star.table.TableOrientation.ROWS
    oFilterDesc.ContainsHeader = False
    oFilterDesc.CopyOutputData = True

    ' Copy the filtered rows to the third sheet, column F, first row.
    cSMOutputTo = oFilterDesc.OutputPosition
    cSMOutputTo.Sheet = 2
    cSMOutputTo.Column = 5
    cSMOutputTo.Row = 0
    oFilterDesc.OutputPosition = cSMOutputTo

    ' Single criterion: first column is not empty.
    oFilterFld(0).Field = 0
    oFilterFld(0).IsNumeric = False
    oFilterFld(0).Operator = com.sun.star.sheet.FilterOperator.NOT_EMPTY
    oFilterDesc.setFilterFields( oFilterFld() )
    owSSmmy.filter( oFilterDesc )

    Set oCsr = Nothing
    Set owSSmmy = Nothing
    Set owSAcct = Nothing

End Sub

I've spent most of my time tinkering with the FILTER() method, and I can't get it to work. But having read through your ods code, a thought occurred to me: maybe there is an easier way to approach the problem. Two sweeps of the range: the first to identify dups, the second to delete them. This approach also suits because the end result stays in column A. With the Filter() method, it would appear that I need to copy the results somewhere else.
Thx for your reply.
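
A minimal sketch of that two-sweep idea, done on the data array rather than cell by cell, could look like the following. It is an illustration under assumptions, not a tested replacement for RemoveDuplicates: the sheet name "Summary" is hypothetical, comparison is case-insensitive, blank cells are dropped, the first occurrence of each code is kept, and any formulas in column A are replaced by their values.

```basic
Sub RemoveDuplicatesInColumnA()
    ' Sketch only: "Summary" is a hypothetical sheet name.
    Dim oSheet As Object, oCursor As Object, oRange As Object
    Dim aData As Variant
    Dim aKeys() As String, aOut() As Variant
    Dim lLastRow As Long, i As Long, j As Long, lKept As Long
    Dim sKey As String, bDrop As Boolean

    oSheet = ThisComponent.Sheets.getByName("Summary")
    oCursor = oSheet.createCursor()
    oCursor.gotoEndOfUsedArea(False)
    lLastRow = oCursor.RangeAddress.EndRow

    ' Read column A in one call: an array of one-element row arrays.
    oRange = oSheet.getCellRangeByPosition(0, 0, 0, lLastRow)
    aData = oRange.getDataArray()

    ReDim aKeys(lLastRow)   ' lower-cased codes that have been kept so far
    ReDim aOut(lLastRow)    ' rows to write back

    ' Sweep 1: keep a row only if its code is non-blank and not seen before.
    lKept = 0
    For i = 0 To lLastRow
        sKey = LCase(CStr(aData(i)(0)))
        bDrop = (sKey = "")
        j = 0
        Do While (Not bDrop) And (j < lKept)
            If aKeys(j) = sKey Then bDrop = True
            j = j + 1
        Loop
        If Not bDrop Then
            aKeys(lKept) = sKey
            aOut(lKept) = aData(i)
            lKept = lKept + 1
        End If
    Next i

    ' Sweep 2: fill the remaining rows with blanks and write everything back.
    For i = lKept To lLastRow
        aOut(i) = Array("")
    Next i
    oRange.setDataArray(aOut())
End Sub
```

The inner loop compares each code against the codes already kept, so with about 5k rows and about 150 unique codes it stays well under a million comparisons. Only column A is rewritten; other columns are not shifted.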

A ‘Handle Duplicate Records’ dialog was added in 25.2 to select/remove duplicate records in Calc.

Thanks and Merry Christmas, MikeKaganski. But still on 24.

The new function was implemented as part of tdf#85976. Starting from comment 8 there, the “Remove Duplicates” extension is discussed (and in comment 15, its improvement “Remove Duplicates Fast”). You may install either, and inspect their code (both are written in Basic).

But in general, your original question asks exactly this: “I’m looking at some code; help me with what I don’t show you”.

Thx again.