# text search in multi text cells

Sample text:

fred, bill Agnes joy sofi andy  <|> wen bill OJ <|> krig huff Bill plwer, Snuffer
sin Trent, fordy summy, oka  <|> jin huter Bill
dred dir
orka bill, lorry
bill dass
erta
pop lanmc


10 cells, find the total number of Bills, case insensitive C:\fakepath\Untitled 1.ods

Found out and uploaded file. Interesting suggestion from Michel but doesn't work, however, counta runs but gives wrong answer

201019 attached spreadsheet shows various attempts, the cell with 13 is a COUNTA version but the correct answer is 8. Only bill Bill bill, Bill, should be picked up, not embedded bill like hillbilly [good point Lupp]

I wasn't aware that LO had differently named commands in different culture languages [I'm Czech, Australian citizen, knowledge of French, German, Swedish, Japanese is seriously bad but entschuldige is a start :-)

(Edited to format and see shared file.)

edit retag close merge delete

Hello @luber

Kind regards, Michel

( 2020-10-17 14:31:14 +0100 )edit

Thanks for clearly describing what <|> and , means.

( 2020-10-17 14:57:39 +0100 )edit

Sorry, can't paste a spreadsheet segment <|> is a cell break , is just a random comma. Probably 10 cells etc is likely to be easiest while I make up a new spreadsheet. Thanks for the great ideas. Lu

( 2020-10-17 15:40:42 +0100 )edit

Should "hillbilly" be found as an occurrence of "Bill"? If not, in what way must the substrings to find be delimited?
Can you assure that the substrings to search for not contain any reserved characters concerning RegEx?
Information of this kind is substantial if you want to get suggested reliable solutions.

( 2020-10-18 14:58:57 +0100 )edit

Hello @luber1

Could you be also more specific about what went wrong ? Edit your question and provide the data set, the related result and the expected result.

( 2020-10-18 16:02:33 +0100 )edit

@luber1, Don't forget to check the mark () to the left of the answer that solves your question.

( 2020-10-19 00:32:26 +0100 )edit

Hello @luber1

In the newly appended file, the formula close to the solution is in F3=COUNTIF(B2:E9;".*(B|b)ill.* " ). Just remove the two white spaces : the first one between final * and " and the second one between " and )

The right formula is =COUNTIF(B2:E9;".*(B|b)ill.*")

Kind regards, Michel

( 2020-10-19 00:37:27 +0100 )edit

The newly appended file actually was made with the "wildcard" setting under
>Tools>Options>LibreOffice Calc>Calculate>>Formulae wildcards
It must be changed to Enable regular expressins in formulae.
(Wildcards are by far not powerful enough to do real work. They are just promoted by MS.)
With sufficiently recent versions you may use the switch (?i) to make the search case insensitive independent of the global setting. As opposed to the already suggested way this will apply, however, not to just one character, but to all characters following, as long as not a (?-i) switches back.

You may also take the opportunity to learn about Regex, and about a few additional techniques the playful way. I will post a respective attachment Imade "just for fun" to an otherwise probably obsolete answer.

( 2020-10-19 01:34:28 +0100 )edit

Sort by » oldest newest most voted

Formula in F2 =SUMPRODUCT(IF(ISNUMBER(SEARCH("bill";B2:E9))))

C:\fakepath\Untitled 2.ods

EDIT 2020-10-26 08:40:00

This will also count compoundwords like "hillbilly".

Then a correction to the formula : =SUMPRODUCT(IF(ISNUMBER(SEARCH("\<bill\>";B2:E9))))

or a new formula: =COUNTIF(B2:E9;".*\<bill\>.*")

more

Nice oneliner, just keep its limitations in mind:

1. This will only count one accurance of the word bill for each cell.
2. This will also count compoundwords like "hillbilly".

Still a nice and crisp oneliner. I'll give it an upvote. :-)

( 2020-10-25 17:43:09 +0100 )edit

like kiss solutions :-) upvoting

( 2020-10-25 19:24:22 +0100 )edit

The Statements

=SUMPRODUCT(IF(ISNUMBER(SEARCH("\<bill\>";B2:E9))))
=COUNTIF(B2:E9;".*\<bill\>.*")


both give me Zero results. Is there a trick to this?

( 2020-10-26 11:33:38 +0100 )edit
1

Is the following set under Tools → Options → LibreOffice Calc → Calculate?

( 2020-10-26 12:00:19 +0100 )edit

Indeed, that is what i was missing. Thanks. And other might also be glad for this hint. :-)
That only leaves the one count per cell limitation.

( 2020-10-26 12:16:58 +0100 )edit
1

Great, thank you Gentlefolk. With "enable regular expressions in formulas" both formulas worked [although I got some weird answers but closed and reopened and retoggled "Enable.." both worked and in a different sheet but only 1 bill per cell which was OK for me anyway

( 2020-10-27 12:43:04 +0100 )edit

Hello @luber1

I try a proposal : use COUNTIF with a regular expression, like =COUNTIF(A3:A12;"=.*(B|b)ill.*") if the source data are in range A3:A12

.* looks for any number of any kind of characters

(B|b) allows either choice between B or b

Allow for regular expressions to be used in formulas by Alt+F12>LibreOffice Calc> Calculate> Enable regular expressions in formulas

Based on my own confusion between COUNTA and COUNTIF as, in French, they are translated respectively NBVAL and NB.SI, which do not look exactly the same but close by, @luber1, if your working language is not English, in order to catch the right function name in your own language, transit temporarily by English. Close all other LO documents. Open a new blank spreadsheet. Activate the option Alt+F12>LibreOffice Calc> Formula>Use English function names>OK. Then type in a set of data and two formulas, one with COUNTA(...) and the other one with COUNTIF(...). Save the spreadsheet. De-activate the option for Use of English function names. Re-open the spreadsheet and compare the names of both functions in you own language.

Hope this helps too.

To show the community that the question has found its answer, please click on the ✓ aside the top of the correct answer and vote by clicking the ^ caret of all usefull answer.

Kind regards, Michel

more

@mgl, If do you share a sample file with the formula, it will be automatically translated to the user language.

( 2020-10-19 00:29:43 +0100 )edit

RegEx are a very powerful means for searching and manipulating strings. Concerning the usage in Calc this power was significantly improved by the implementation of the REGEX() function since version 6.2.
You may learn about ways to use RegEx for purposes like addressed by the question in a playful way using the attached spreadsheet. C:\fakepath\ask271841FindAndCountStringsInCellTexts.ods

The Regex engine used by LibreOffice is the one by ICU.
A very recommendable site concerning RegEx is (imo) https://www.regular-expressions.info/....

more

Just 4fun,

here is a macro function solution. Just place it into Tools > Macros ... > Edit Macros and you can call it like any other calc function via:

=CMIR("Bill",B2:D10)

• 1st argument is the string to search for (case insensitiv)
• 2nd is a selected cell range
• Return value is the count of found matches

Here the macro function code:

rem (c)ount (m)atches (i)n  (r)ange
function CMIR(needle as string, data() as array) as integer
count=0
for i = lbound(data,1)to ubound(data,1)
for j = lbound(data,2)to ubound(data,2)
parts() = Split(lcase(data(i,j)), lcase(needle))
blub = UBound(parts()) - LBound(parts())
if ( blub > 0 ) then
count=count+blub
endif
next
next
CMIR=count
end function


--- update ---

As promised, here is the regex version (2v1) had to fix some bugs sorry

usage:

=CMIR2("\b[B|b][Ii][Ll][Ll]\b",B2:D9)


code:

rem (c)ount (m)atches (i)n (r)ange version2v1
function CMIR2(needle as string, data() as array) as integer
Dim oTextSearch               ' TextSearch service.
Dim sStrToSearch As String    ' String to search.
Dim aSearchResult             ' com.sun.star.util.SearchResult
Dim aSrcOpt As New com.sun.star.util.SearchOptions
Dim enLocale As New com.sun.star.lang.Locale
Dim rank As Long
Dim count as long
enLocale.Language = "en"
enLocale.Country = "US"
oTextSearch = CreateUnoService("com.sun.star.util.TextSearch")
With aSrcOpt
'https://api.libreoffice.org/docs/idl/ref/SearchFlags_8idl.html
.searchFlag = com.sun.star.util.SearchFlags.REG_EXTENDED
.Locale = enLocale
'Supports ABSOLUTE, REGEXP, and APPROXIMATE
.algorithmType = com.sun.star.util.SearchAlgorithms.REGEXP
.searchString = needle 'rem define word bounds
'.transliterateFlags = com.sun.star.i18n.TransliterationModulesNew.UPPERCASE_LOWERCASE
End With
oTextSearch.setOptions(aSrcOpt)
count=0
rank = 0
for i = lbound(data,1)to ubound(data,1)
for j = lbound(data,2)to ubound(data,2)
sStrToSearch = data(i,j)
aSearchResult = oTextSearch.searchForward(sStrToSearch, 0, Len(sStrToSearch)  )
rank=0
Do While aSearchResult.subRegExpressions > 0
count=count+1
sStrToSearch = mid(sStrToSearch, aSearchResult.endOffset(0) +1 , Len(sStrToSearch))
aSearchResult = oTextSearch.searchForward(sStrToSearch, 0, Len(sStrToSearch)  )
Loop
next
next
CMIR2=count
end function


This Function can handel the hillbilly problem very gracefully. By using word boundarys \b

Hope it helps (even more)

more

1

interesting loop - I'll have to get my head around some of the variables..

( 2020-10-19 00:27:35 +0100 )edit
1

did a short check, both solutions work, @mgl and @igorlius, both dynamic, different results: igorlius counts the occurence of the search term in the range, including multiple occurences in a cell, while mgl counts 'cells containing the search term', and thus neglects dupli- or multicates in one cell,
'total number of Bills' from the question isn't cleanly unambigous, but sounds more like 'all' to me, thus question: 3, mgl: 2, igorlius: 1

( 2020-10-19 01:42:59 +0100 )edit

@luber1 Sorry my solution does not handle the whole word search (aka. hillbilly) I might comeback later and create a differnt solution / macro which actually uses the regex service. Hope it still helps a little for now.

( 2020-10-19 11:39:00 +0100 )edit

Currently counta variations result in 0 and the countif variations give a number result but a higher number than actual - it seems to get confused by commas

( 2020-10-22 09:10:48 +0100 )edit

@luber1 i updated my answer with the promised regex version. Hope it helps. - Had to patch the function, because it had a bug. Sorry.

( 2020-10-25 15:16:47 +0100 )edit