Macro to remove duplicates in a column

Select the Column and go to

→Data→Filter→StandardFilter:
Criteria: <not empty>
Options:
    [x]no Duplicates

thanks but Macro please

but example of Spreadsheet please

u need example ??? for this ?
just take any 1 column in a sheet with some duplicate values and i just need a macro to remove the duplicates and leave the column with unique values
rows should not be deleted or hidden coz there could be other columns next to our column and they should not be affected

Do you want to get an answer or do you prefer to blame experienced volunteers for remarks based on their experience.
Unfortunately I can’t point you to a site for community support meeting your expectations.
Farewell !

aber bitteschön:
please select Cellrange in Question and run:

def remove_duplicates(*_):
    doc = XSCRIPTCONTEXT.getDocument()
    selection = doc.CurrentSelection
    sheet = selection.Spreadsheet
    data = selection.DataArray
    out = []
    for row in data:
        if not row in out and row[0]:
            out.append(row)
    cursor = sheet.createCursorByRange(selection)
    cursor.clearContents( 2**10-1 ) #sum of Cellflags[1]
    cursor.collapseToSize(1,len(out) )
    cursor.setDataArray(out)

[1]Constants' Group CellFlags

gerngeschehn

2 Likes

Some Basic example for filter is here: LibreOffice Calc: Filter via Macro?
I also know the good article, but in Czech (you can use some translator) OpenOffice.cz | Filtrování dat v buňkách

Example:
standard-filter.ods (242.8 kB)

1 Like

thanks Kammy

Based on python function by @karolus.
LO Basic. Well, there will be a little more lines.
Note 1: The Collection object here is responsible for “uniqueness”.
Note 2: All values are treated as strings. If the data contains the same numeric values in the form of a number and a string, the first one will be left.
Note 3: Case insensitive. The first occurrence will also be left.
Note 4: The cursor.DataArray can accept both an array of arrays and a 2D array.
Note 5: Formulas will be replaced with their values.
Note 6: All formats and comments on cells will be deleted.
Note 7: The merged cells will be converted to a single cell.
If there are merged cells, the current range can be interrupted when selecting using Ctrl+Shift+↓.

The advantages of Python data structures and operators (list, in) are obvious.

Before running the macro, select the required range (not the entire column).

Sub RemoveDuplicates
	Dim col As New Collection
	Dim selection As Object, sheet As Object, cursor As Object
	Dim data(), out()
	Dim i&, item
	Dim prompt$

	prompt = Chr(10) _
	 & "To remove duplicates, select the current region yourself" & Chr(10) _
	 & "and call this utility again."

	selection = ThisComponent.CurrentSelection
	If Not selection.supportsService("com.sun.star.sheet.SheetCellRange") Then
		MsgBox "The cell range is not selected." & prompt _
		 , MB_ICONEXCLAMATION, "Selection Error"
		Exit Sub
	ElseIf selection.Columns.Count > 1 Then
		MsgBox "Only a single-column cell range is supported." & prompt _
		 , MB_ICONEXCLAMATION, "Selection Error"
		Exit Sub
	End If
	sheet = selection.Spreadsheet
	If selection.Rows.Count = sheet.Rows.Count Then
		MsgBox "The entire column selection is not allowed." & prompt _
		 , MB_ICONEXCLAMATION, "Selection Error"
		Exit Sub		
	End If

	data = selection.DataArray

	On Error Resume Next
	For i = 0 To UBound(data)
		item = data(i)(0)
		col.Add item, CStr(item)
	Next
	On Error GoTo 0

	out = CollectionTo2DArray(col)
	cursor = sheet.createCursorByRange(selection)
	With cursor
		.clearContents(2^10 - 1)  'sum of CellFlags (1023)
		.collapseToSize(1, UBound(out) + 1)  'nColumns:=1, nRows
		.DataArray = out
	End With
End Sub

Function CollectionTo2DArray(c As Collection)
	Dim i&
	Dim a(): ReDim a(0 To c.Count - 1, 0)

	For i = 1 To c.Count
		a(i - 1, 0) = c(i)
	Next
	CollectionTo2DArray = a
End Function

Fixed an error:
col.Additem, CStr(item)

Quite a decent result. Usually you should allow the selection of the entire column and in the code intersect the selection with the area used current region.

2 Likes

Everything is as you asked.

1 Like

To get the unique values is also possible with the ScriptForge library like this.

Sub getUniques
	dim oDoc as object, oSheet as object, oRange as object, data(), dataUniques(), i&
	oDoc=ThisComponent
	oSheet=oDoc.CurrentController.ActiveSheet
	oRange=oSheet.getCellRangeByName("A1:A100") 'your column
	data=oRange.getDataArray()
	dim data1D(ubound(data))
	for i=lbound(data) to ubound(data) 'transform array from array(array(value)) to array(value)
		data1d(i)=data(i)(0)
	next i
	GlobalScope.BasicLibraries.loadLibrary("ScriptForge") 'load ScriptForge library
	dataUniques=Unique(data1D, true) 'true=case sensitive
	xray dataUniques
End Sub

But I think it will be faster in Python.

1 Like

LO Basic. This is an advantage, if it is important. But the Сollection object will cope with duplicates faster (no sorting & ReDim Preserve). Python doesn’t count.


Edit: And it looks like Unique() to change the order of the rows due to sorting (did not check).

1 Like

I tested the Standard Filter to remove duplicates. Numbers are treated as strings, the first occurrence remains. The formats are copied. The ‘Case sensitive’ checkbox has no effect (bug?): it is always case sensitive. But this is inconvenient!

remove-duplicates.ods (11.4 KB)


Version: 7.2.0.4 / LibreOffice Community Build ID: fa0a9af5681a0acc7909a705f92c1ada8a658552 CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3 Locale: ru-RU (ru_RU.UTF-8); UI: en-US Calc: threaded
1 Like

yes thanks Kamil that was quite interesting. but i need to do some python learning to test it that way. i assume it wont be so hard coz lots of coding experience in some other languages like vba
the libre application on our client system doesnt support python etc. and no plugins etc. can be downloaded installed.
caught up with some other modules in this libre app im building so havent tried this yet. will let u know ok

Many thanks Eeigor
will try that along with dear Kamil’s solution but please confirm 1 thing
the worksheets i will be performing these macros on, will be hidden from the user’s view.
so the quest is - will these sort of codes need bringing the sheet into view and selecting it etc ?
I have written several of these macros in vba involving worksheets that dont need to mimic things like first select the sheet or range etc.
i mean those macros run fine without such actions.
wondering if Libre has similiar capabilities especially in this solution u’ve given

Yes, it is necessary. This is a typical example. If you want to work with a certain range (whether on a hidden sheet), but not with the range selected by the user, add the appropriate parameter to the procedure and pass it.
Sub RemoveDuplicates(oRange As Object)

1 Like

@bloke, if you are satisfied with the solution, mark it. By the way, I use the VBA Collection object. LO Basic supports it in native mode. This should be familiar to you.

1 Like

yes just saw Collection earlier today . same as VBA. will try ur code and some r&d
yes will mark it and thanks for your interest

Please note that both StandardFilter and AdvancedFilter cannot delete duplicates in case-insensitive mode! No tricks helped.
remove-duplicates.ods (11.7 KB)

@eeigor:
If you found a bug to report then please report a bug and link back to it here when created. Thanks.