Calc: count cells that contain a given string, but are not equal to it

Imagine you wan to count cells containing the word “cat” in them.

The cells containing sentences like:

“Cats like milk”
“Dogs don’t like cats”
“Every cat has nine lives”
“A dog is a man’s best friend”

You want to know how many cells contain the string “cat” - this should give a value of 3 for the above cells.

I know COUNTIF can match exact cell contents, but what about cells that contain other data aswell?
I guess something like a regular expression would be handy, but I have no idea how to do that.

Also imagine you want to use another cell as source for the keyword to check against:

“cat”
“dog”
“man”

This should produce respectively:

3
2
1

How could this be done?

The definition of word is indeed ambiguous, as in the example singular and plural should be found. If it is only about sub strings then this “simple” approach would do, without the not neglectable overhead of regular expressions, assuming string to be found in cell A1 and data in B1:B4

=SUMPRODUCT(IFERROR(SEARCH(A1;B1:B4)>=1;0);LEN(B1:B4)>LEN(A1))

This would also find cat in concatenate of course.

What is the purpose of LEN(B1:B4)>LEN(A1) in the formula?

Contained, but not equal to.

Ah! Thanks! I have read the caption slightly differently: “Also count cells that contain a given string, but are not equal to it”… possibly my interpretation is wrong, I agree.

As you mentioned, COUNTIF supports regular expressions - which depends on a setting.

So given the use of regular expressions is enabled, and also that COUNTIF uses case-insensitive comparison, using the following formula will count all “Cat/cat/cAts/…” that are whole words somewhere in texts in column A:

=COUNTIF(A:A; ".*\bcats?\b.*")

If you don’t need to limit yourself to only count whole words, but also want to count words like “concatenate”, then you’d remove both \b from the expression. If you want to find whole word given in B1, then you might use this:

=COUNTIF(A:A; ".*\b\Q" & B1 & "\E\b.*")

First version finds also cat in cat (i.e. equal to) that was to be exempted.

Second version should be

=COUNTIF(A:A; ".*\b\Q" & B1 & "\E\b.*")

instead, note the swapped first .* vs *., but also that finds equal occurrences.

Corrected the error noted by @erAck. Thanks!

Hello

your question is a bit ambiguous but I try an answer using an approach with regular expressions:

Assumption(s) for the example:

  • For better readability the range of sentences is named: CLAIMS (e.g. CLAIMS is the range of cells A1:A4
  • Since you consider various find strings, the pattern to be searched for is in a separate cell (in my example in B7
  • The result will be into C7
  • You want to find words Cat, cat, Cat, Cats, cat's, Cat's

then a (! - there might be others and smarter ones) solution may look like this:

Cell B7: \b[Cc]at[s’]{0,}\b (this is the regular expression)
Cell C7: =SUMPRODUCT(ISTEXT(REGEX(CLAIMS,B7,,"g"))*ISTEXT(REGEX(CLAIMS,B7,,"g")))

Hope that helps

@mikekaganski’s answer simplifies my answer to a minimum: Assure Tools -> Options -> LibreOffice Calc -> Calculate -> Option: Enable regular expression in formulas is set and change cells to:
Cell B7: .*\b[Cc]at[s’]{0,}\b.*
Cell C7: =COUNTIF(CLAIMS,B7)