Count unique values in one column per uniques in another column

Hi,
I have a 3 column dataset. Column A has the data point names, Column B has class names for each data point. Column C has sub-class names for each data point. Every class has its own unique set of sub-classes. I would like to know the number of sub-classes for each class.

Put more succinctly, I’d like to count every unique value in column C for every unique value in column B.

In the example, I have a list of 11 students who have classes in Sports and Music categories. In sports there are Football, and Basketball sub-classes. In Music, Guitar, Piano, and Drums… The numbers I want to find is 2 for Sports and 3 for Music.
In the real data all relevant values are in string format. There are over 20k data points, 11 classes and over 200 sub-classes.
I know that I can make lists for classes and sub-classes and use countifs, but this is impractical as, sadly, there is no straightforward way of filtering unique values in a column in LO calc.
I tried to do it with pivot tables but couldn’t, I’m not very competent with them. May be an array formula would work but I don’t know how to use them well either.
Thanks a lot in advance and have a very nice day!
(LO v.25.2.4.3)
ask_lo_sample_count_unique_class.ods (16.1 KB)

ask_lo_sample_count_unique_class_Pivot.ods (22.6 KB)

1 Like

Thanks a lot for the quick reply. This counts values in column A (Name). I’d like to count values in column C (Sub-Classes).

Right-click>Properties
remove “Count Name”
Drag “Sub-class” from the right box to “Data fields”
Double-click, change Sum to Count

1 Like

Nope. This gives the same counts because it counts every value, not unique values. Refer to the description please. I want the number of unique subclasses which should be 3 for music and 2 for sports.

You need to separate a record set of unique classes and subclasses and analyze that by means of formulas or some pivot table. Since a spreadsheet is not a database, dealing with record sets is difficult and prone to errors.
Possible approaches:
0. Transferring the sheet data to a newly created true database would be the fool-proof and solid solution to problems like this.

  1. LibreOffice comes with a database component which can use your spreadsheet as a data source and do that job without creating a new database.
  2. A spreadsheet filter can be set up to copy filter results to separate cell range (Select 2 columns, menu:Data>Filter>Standard Filter, Options)
  3. Recent versions of LO come with a new array function FILTER, however this is extremely prone to errors unless you are familiar with maintainance of array functions.
1 Like

Regarding point 1:
Create a database connection to your spreadsheet as I describe in How to remember sort criterion in Calc? - #3 by Villeroy
When it comes to queries follow this path:
Create a new query in SQL view. The “formula” is simply and self-explaining: SELECT DISTINCT "Class", "Sub-class" FROM "SourceRange" “SourceRange” is the name of the spreadsheet’s database range explained in the other topic.

Save the query under some descriptive name and then the database. Close the database. Nothing has been converted, copied nor imported. Your data are still in the spreadsheet document and nowhere else.
Back in the Calc window, hit Ctrl+Shift+F4 for the data source window and drag the icon of your query from the left pane onto a spreadsheet cell.
Now you have the record set to analyze the set of unique (sub-)categories.
When the source data on your original sheet have changed, you have to save and reload the spreadsheet document before you can refresh (update) the linked import range via Data>Refresh

1 Like

Formula alternatives:

  1. Place the following formula in cell C2 and drag it down:
=COUNTA(UNIQUE(FILTER($C$2:$C$12; $B$2:$B$12=B2)))

For a large number of rows this is not efficient.

  1. Place the following formula in cell F1 and press Enter:
=UNIQUE(B1:C12; 0)

Now you have only unique class and subclass values ​​in columns F:G.

1 Like

and now add to A12:C13:

Karim	Sports	Tennis
Karim	Sports	Violin

FILTER, UNIQUE, SORT suffer from the same problem. Array formulas do not expand with growing data sets. You’ve got to maintain these formulas.

In the conditions of this topic, we can set the interval with a reserve and not adjust the formula for 20 years: :slight_smile:

=UNIQUE(B1:C99999; 0)
1 Like

ask_lo_sample_count_unique_class_PKG.ods (17,1 KB)

2 Likes

Thank you for taking the time to explain and pointing me new directions. This seems to be the better way going forward. I’m flagging PKG’s reply as solution as it does exactly what I required.

Thanks for the answer to my hidden question :slight_smile: Unique function is surely nifty.