[Calc] How would I make a generic count of items not in a master list?

JosephStaleknight · June 21, 2019, 8:12am

So I have a spreadsheet in Calc that has a master list in column A followed by a random list in column B, going something like this:

   | A      | B
1  | RED    | RED
2  | YELLOW | YELLOW
3  | GREEN  | YELLOW
4  | BLUE   | GREEN
5  |        | BLUE
6  |        | RED
7  |        | YELLOW
8  |        | WHITE
9  |        | RED
10 |        | GREEN
11 |        | BLUE
12 |        | BLACK
13 |        | BROWN
14 |        | GREEN
15 |        | BLUE
16 |        | RED
17 |        | YELLOW
18 |        | GREEN
19 |        | BLUE

And I know how to use COUNTIF to count how many times each color from A occurs in B. For example, I’d use COUNTIF(B:B,A1) in cell C1 for the number of times RED shows up in B, COUNTIF(B:B,A2) in C2 for YELLOW, etc. But there are also values that are in B but not in A (e.g. BLACK, WHITE, BROWN), and I want to lump the count of those together into a single count of “Other” colors since they don’t appear as often. How would I go about doing that with an auto-adapting formula?

I’ve tried using COUNTA(B:B) - SUM($C1:$C4), but I have to manually change that when increasing the size of A–say, to add BLACK and WHITE but let BROWN remain unlisted. For that same reason, using COUNTIFS(B:B, “<>” & A1, B:B, “<>” & A2,…) was not feasible for me (especially if A were to end up massive). I also tried working with SUMPRODUCT and arraying ISERROR(MATCH()), but both give me “0” instead of the right count. Finally, COUNTIF(B:B,"<>" & A:A) just gives me the length of list B, which I definitely don’t want.

Essentially, what I want to do is almost the same as this, but instead of adding arbitrary sums I just want to count values not found in a master list.

Thanks in advance for the help!

Lupp · June 21, 2019, 9:15am

Have a look into this example.

JosephStaleknight · June 21, 2019, 5:25pm

Okay, that works. I also found out a way to make the formula account for changes in master list size so I don’t have to manually update it.

Thanks for the help!

Lupp · June 21, 2019, 5:49pm

If you found something valuable, you should post it. Forums arem’t one-way, but peer-to-peer.

JosephStaleknight · June 21, 2019, 6:15pm

Basically, on the spreadsheet you provided I added COUNTA(D:D)-1 in place of $A$1 in the formula in cell C1. That way, it would count the number of master list items dynamically and reduce the number of times the list length in A1 would need to be referenced. (The “-1” accounts for the header.) And for that matter, I could replace the first instance of A1 with the formula in that cell to further minimize the number of references needed.

In truth, this is for a pie chart I’ve been making for a related spreadsheet–I wanted to have a “miscellaneous” category for items that don’t appear frequently enough to warrant their own count, such that it would be reflected in the chart automatically.

Lupp · June 21, 2019, 6:34pm

Why do you think reducing the number of references to a single cell might be preferrable over reducing the number of times a subexpession like COUNTA(D:D)-1 needs to be recalculated?
(Yes the “Others” piece of the cake is common.)

JosephStaleknight · June 21, 2019, 8:51pm

What was going through my head at the time was that I didn’t want to leave any extraneous cells filled in for a messy spreadsheet. Then again, I see the point you’re making about not needing to recalculate subexpressions too often–especially if doing so requires excess memory and resources in the process.