Ask Your Question
1

Find distinct values in a column formula for calc

asked 2017-06-19 11:20:22 +0200

adrianbucks gravatar image

Hi there! I am trying to find a formula that will find all the distinct values from a column where the values repeat in random order. The values are text. I have found something for Excel but it gives me just empty cell.

edit retag flag offensive close merge delete

Comments

And the "something for Excel" is ...?

erAck gravatar imageerAck ( 2017-06-19 11:40:27 +0200 )edit

=IFERROR(INDEX($A$2:$A$10, MATCH(0, COUNTIF($B$1:B1,$A$2:$A$10) + (COUNTIF($A$2:$A$10, $A$2:$A$10)<>1), 0)), "")

adrianbucks gravatar imageadrianbucks ( 2017-06-19 11:50:19 +0200 )edit

Where $A$2:$A$10 is the source list and $B$1:B1 is the column header where the formula is inserted starting from B2

adrianbucks gravatar imageadrianbucks ( 2017-06-19 11:54:37 +0200 )edit

Would you mind to explain what column is the one from which you want to get the distinct values, and what contents are expected to reside in the other column of the two occurring in the formula?
The above comment crossed mine in time. I cannot get a reasonable meaning, however.

Lupp gravatar imageLupp ( 2017-06-19 11:58:48 +0200 )edit

An explanation of the structure of this Excel formula can be found here.

librebel gravatar imagelibrebel ( 2017-06-20 12:10:06 +0200 )edit

Finally some explanation, thanks.

erAck gravatar imageerAck ( 2017-06-20 19:46:06 +0200 )edit

Fwiw, all the formula expressions given there work for me with the example of "Ronnie";"David";... if it doesn't work in your case you're likely doing something different (or you're using an older version in which it doesn't work, I tried 5.2 and 5.3)

erAck gravatar imageerAck ( 2017-06-20 20:12:02 +0200 )edit

@erAck: Suppose the formula was not entered for array-evaluation. As a one -cell-array-formula correctly filled down as far as needed it works for me.

Lupp gravatar imageLupp ( 2017-06-20 21:51:34 +0200 )edit

2 Answers

Sort by » oldest newest most voted
0

answered 2017-06-20 02:11:20 +0200

librebel gravatar image

updated 2017-06-20 02:12:35 +0200

Hello @adrianbucks, you could use a Filter to list, or copy, all the distinct rows from a range in Calc.

Manually:

1. Select the entire Range to find distinct rows in ( e.g. "A1:B99" ),
2. Choose the menu "Data : More Filters : Standard Filter...",
3. In the dialog box that appears, in the first row of the Filter Criteria, set the Field Name to "- none -",
4. Expand the Options by clicking on the small triangle, 
5. Check the checkbox "No duplications",
6. Check the checkbox "Copy results to:" and enter a full Target CellAddress into the textbox ( e.g. "Sheet1.D1" ),
7. Uncheck the checkbox "Keep filter criteria",
8. If your source range does not contain a header, uncheck the checkbox "Range contains column labels",
9. If case matters while searching for distinct rows, check the checkbox "Case sensitive",
10. Click OK.

By Macro:

Sub filterDistinct( strSourceRange As String, strTargetCell As String, Optional bContainsHeader As Boolean, Optional bCaseSensitive As Boolean )
REM Uses a Filter to copy distinct rows from the specified Source Range into a new Range that starts from the specified Target Cell.
REM <strSourceRange>    : specifies the Range to find distinct rows in, e.g. "A1:B99".
REM <strTargetCell>     : specifies the Cell to put the first found distinct row in, e.g. "D1".
REM <bContainsHeader>   : OPTIONAL - pass TRUE if the Source Range contains a Header.
REM <bCaseSensitive>    : OPTIONAL - pass TRUE if case matters while searching for distinct rows.
    Dim oSheet As Object, oSourceRange As Object, oFilter As Object
    oSheet = ThisComponent.CurrentController.ActiveSheet
    oSourceRange = oSheet.getCellRangebyName( strSourceRange )
    oFilter = oSourceRange.createFilterDescriptor( True )
    oFilter.SkipDuplicates = True
    oFilter.CopyOutputData = True
    oFilter.OutputPosition = oSheet.getCellRangebyName( strTargetCell ).CellAddress
    If Not IsMissing( bContainsHeader ) Then oFilter.ContainsHeader = bContainsHeader
    If Not IsMissing( bCaseSensitive ) Then oFilter.IsCaseSensitive = bCaseSensitive
    oSourceRange.filter( oFilter )
End Sub

Example call : filterDistinct( "A2:A10", "B2" )

edit flag offensive delete link more
0

answered 2017-06-19 20:31:38 +0200

Lupp gravatar image

updated 2017-06-20 11:41:04 +0200

I would suggest to not use convoluted formulae. Introducing a helper column which may be hidden in the working sheet makes things clear and simple, and avoids problems with maintenance, in specific with scaling and enhancements. A solution regarding this suggestion you find on Sheet2 of this attached demo. You dont need to permit macros to see how it works.

I also demonstrate a solution based on user functions on Sheet1 of the attached demo. As already said I would not recommend it though.

(Editing:)
Meanwhile I studied once more the "Excel-Formula" given in the OQ, as it does not contain anything that might be suspected to work differently in LibO Calc. The results:
1 - The formula is convoluted and will be rather inefficient (relevant only for much more rows).
2 - The formula can only work in a reasonable sense if entered for array-evaluation (Ctrl+Shift+Enter).
3 - Put into B2 the formula must be filled down into additional 9 cells. (If dragging: Press Ctrl.)
4 - The formula will work then also in Calc.
5 - The formula does not return all the distinct elements contained in the source range.
6 - It only returns the elements occurring exactly once in the source range.
7 - To also get contents occurring more than once (without repetition) cannot be achieved by a simple variation of the formula.

There may be a problem with an ambiguity of the term "distinct values".
However, also @librebel interpreted the term the same way I did.
The above statements are demonstrated in this new version of the demo (Sheet2). Instead of column B I used column P for this. See also columns N an R.

edit flag offensive delete link more

Comments

as i understand the semantic difference between "Unique values" and "Distinct values" from the same article linked to in the comments above:

Unique values are values that occur exactly once in the original list.
Distinct values are ( Unique values plus all 1st-occurrences of duplicate values ).
In other words, Distinct values are all values from the original list, after removal of all duplicate values.
librebel gravatar imagelibrebel ( 2017-06-20 13:12:57 +0200 )edit

@librebel: Ok.Concerning the terminology I agree with the quoted explanation. However, I often found "unique" used in the sense of "distinct" in forum contributions. With respect to the formula given in the linked article for the selection of the distinct values, I would state as a flaw that blank cells in the source range result in a 0 (zero) element in the selection. ("No blanks except at the end" must be assured.)

Lupp gravatar imageLupp ( 2017-06-20 14:15:03 +0200 )edit

Anyway I would still recommend to solve tasks of the kind using either helper columns with formulae of low complexity - or interactive means as @librebel already suggested.
Inventing formulae of the kind discussed in the linked article is an intelectual adventure rather than a way to a recommendable solution, imo.
(BTW: The solution based on "my" user functions by
{=TRANSPOSE(XTEXTSPLIT("&|&";-103;XTEXTJOIN("&|&";1;A2:A100);100))} is rather efficient.)

Lupp gravatar imageLupp ( 2017-06-20 14:19:00 +0200 )edit
Login/Signup to Answer

Question Tools

1 follower

Stats

Asked: 2017-06-19 11:20:22 +0200

Seen: 548 times

Last updated: Jun 20