Ask Your Question
1

text search in multi text cells

asked 2020-10-17 14:24:55 +0100

luber1 gravatar image

updated 2020-10-19 00:46:13 +0100

Sample text:

fred, bill Agnes joy sofi andy  <|> wen bill OJ <|> krig huff Bill plwer, Snuffer
sin Trent, fordy summy, oka  <|> jin huter Bill
dred dir
orka bill, lorry
bill dass
erta
pop lanmc

10 cells, find the total number of Bills, case insensitive C:\fakepath\Untitled 1.ods

Found out and uploaded file. Interesting suggestion from Michel but doesn't work, however, counta runs but gives wrong answer

201019 attached spreadsheet shows various attempts, the cell with 13 is a COUNTA version but the correct answer is 8. Only bill Bill bill, Bill, should be picked up, not embedded bill like hillbilly [good point Lupp]

I wasn't aware that LO had differently named commands in different culture languages [I'm Czech, Australian citizen, knowledge of French, German, Swedish, Japanese is seriously bad but entschuldige is a start :-)

(Edited to format and see shared file.)

edit retag flag offensive close merge delete

Comments

Hello @luber

Please edit your question and attach the spreadsheet.

Kind regards, Michel

mgl gravatar imagemgl ( 2020-10-17 14:31:14 +0100 )edit

Thanks for clearly describing what <|> and , means.

Opaque gravatar imageOpaque ( 2020-10-17 14:57:39 +0100 )edit

Sorry, can't paste a spreadsheet segment <|> is a cell break , is just a random comma. Probably 10 cells etc is likely to be easiest while I make up a new spreadsheet. Thanks for the great ideas. Lu

luber1 gravatar imageluber1 ( 2020-10-17 15:40:42 +0100 )edit

Should "hillbilly" be found as an occurrence of "Bill"? If not, in what way must the substrings to find be delimited?
Can you assure that the substrings to search for not contain any reserved characters concerning RegEx?
Information of this kind is substantial if you want to get suggested reliable solutions.

Lupp gravatar imageLupp ( 2020-10-18 14:58:57 +0100 )edit

Hello @luber1

Could you be also more specific about what went wrong ? Edit your question and provide the data set, the related result and the expected result.

mgl gravatar imagemgl ( 2020-10-18 16:02:33 +0100 )edit

@luber1, Don't forget to check the mark (Correct answer mark) to the left of the answer that solves your question.

LeroyG gravatar imageLeroyG ( 2020-10-19 00:32:26 +0100 )edit

Hello @luber1

In the newly appended file, the formula close to the solution is in F3=COUNTIF(B2:E9;".*(B|b)ill.* " ). Just remove the two white spaces : the first one between final * and " and the second one between " and )

The right formula is =COUNTIF(B2:E9;".*(B|b)ill.*")

Kind regards, Michel

mgl gravatar imagemgl ( 2020-10-19 00:37:27 +0100 )edit

The newly appended file actually was made with the "wildcard" setting under
>Tools>Options>LibreOffice Calc>Calculate>>Formulae wildcards
It must be changed to Enable regular expressins in formulae.
(Wildcards are by far not powerful enough to do real work. They are just promoted by MS.)
With sufficiently recent versions you may use the switch (?i) to make the search case insensitive independent of the global setting. As opposed to the already suggested way this will apply, however, not to just one character, but to all characters following, as long as not a (?-i) switches back.

You may also take the opportunity to learn about Regex, and about a few additional techniques the playful way. I will post a respective attachment Imade "just for fun" to an otherwise probably obsolete answer.

Lupp gravatar imageLupp ( 2020-10-19 01:34:28 +0100 )edit

4 Answers

Sort by » oldest newest most voted
2

answered 2020-10-25 15:59:14 +0100

PKG gravatar image

updated 2020-10-26 08:42:10 +0100

Formula in F2 =SUMPRODUCT(IF(ISNUMBER(SEARCH("bill";B2:E9))))

C:\fakepath\Untitled 2.ods


EDIT 2020-10-26 08:40:00

This will also count compoundwords like "hillbilly".

Then a correction to the formula : =SUMPRODUCT(IF(ISNUMBER(SEARCH("\<bill\>";B2:E9))))

or a new formula: =COUNTIF(B2:E9;".*\<bill\>.*")

edit flag offensive delete link more

Comments

Nice oneliner, just keep its limitations in mind:

  1. This will only count one accurance of the word bill for each cell.
  2. This will also count compoundwords like "hillbilly".

Still a nice and crisp oneliner. I'll give it an upvote. :-)

igorlius gravatar imageigorlius ( 2020-10-25 17:43:09 +0100 )edit

like kiss solutions :-) upvoting

newbie-02 gravatar imagenewbie-02 ( 2020-10-25 19:24:22 +0100 )edit
1

answered 2020-10-17 14:40:47 +0100

mgl gravatar image

updated 2020-10-19 00:00:53 +0100

Hello @luber1

I try a proposal : use COUNTIF with a regular expression, like =COUNTIF(A3:A12;"=.*(B|b)ill.*") if the source data are in range A3:A12

.* looks for any number of any kind of characters

(B|b) allows either choice between B or b

Allow for regular expressions to be used in formulas by Alt+F12>LibreOffice Calc> Calculate> Enable regular expressions in formulas

Complement to my answer

Based on my own confusion between COUNTA and COUNTIF as, in French, they are translated respectively NBVAL and NB.SI, which do not look exactly the same but close by, @luber1, if your working language is not English, in order to catch the right function name in your own language, transit temporarily by English. Close all other LO documents. Open a new blank spreadsheet. Activate the option Alt+F12>LibreOffice Calc> Formula>Use English function names>OK. Then type in a set of data and two formulas, one with COUNTA(...) and the other one with COUNTIF(...). Save the spreadsheet. De-activate the option for Use of English function names. Re-open the spreadsheet and compare the names of both functions in you own language.

Hope this helps too.


To show the community that the question has found its answer, please click on the ✓ aside the top of the correct answer and vote by clicking the ^ caret of all usefull answer.

Please do not use Add Answer but edit your original question to enhance the details of your question (answers are reserved for solutions to a problem on this Q&A site). Thanks in advance …

Kind regards, Michel

edit flag offensive delete link more

Comments

@mgl, If do you share a sample file with the formula, it will be automatically translated to the user language.

LeroyG gravatar imageLeroyG ( 2020-10-19 00:29:43 +0100 )edit
0

answered 2020-10-19 01:47:08 +0100

Lupp gravatar image

updated 2020-10-19 02:01:08 +0100

RegEx are a very powerful means for searching and manipulating strings. Concerning the usage in Calc this power was significantly improved by the implementation of the REGEX() function since version 6.2.
You may learn about ways to use RegEx for purposes like addressed by the question in a playful way using the attached spreadsheet. C:\fakepath\ask271841FindAndCountStringsInCellTexts.ods

The Regex engine used by LibreOffice is the one by ICU.
A very recommendable site concerning RegEx is (imo) https://www.regular-expressions.info/....

edit flag offensive delete link more
0

answered 2020-10-18 21:23:19 +0100

updated 2020-10-25 18:12:07 +0100

Just 4fun,

here is a macro function solution. Just place it into Tools > Macros ... > Edit Macros and you can call it like any other calc function via:

=CMIR("Bill",B2:D10)
  • 1st argument is the string to search for (case insensitiv)
  • 2nd is a selected cell range
  • Return value is the count of found matches

Here the macro function code:

rem (c)ount (m)atches (i)n  (r)ange
function CMIR(needle as string, data() as array) as integer
    count=0
    for i = lbound(data,1)to ubound(data,1) 
        for j = lbound(data,2)to ubound(data,2)
            parts() = Split(lcase(data(i,j)), lcase(needle))
            blub = UBound(parts()) - LBound(parts())
            if ( blub > 0 ) then
                count=count+blub
            endif
        next 
    next 
    CMIR=count
end function

--- update ---

As promised, here is the regex version (2v1) had to fix some bugs sorry

usage:

=CMIR2("\b[B|b][Ii][Ll][Ll]\b",B2:D9)

code:

rem (c)ount (m)atches (i)n (r)ange version2v1
function CMIR2(needle as string, data() as array) as integer
  Dim oTextSearch               ' TextSearch service.
  Dim sStrToSearch As String    ' String to search.
  Dim aSearchResult             ' com.sun.star.util.SearchResult
  Dim aSrcOpt As New com.sun.star.util.SearchOptions
  Dim enLocale As New com.sun.star.lang.Locale
  Dim rank As Long
  Dim count as long
  enLocale.Language = "en"
  enLocale.Country = "US"
  oTextSearch = CreateUnoService("com.sun.star.util.TextSearch")
  With aSrcOpt
    'https://api.libreoffice.org/docs/idl/ref/SearchFlags_8idl.html
    .searchFlag = com.sun.star.util.SearchFlags.REG_EXTENDED
    .Locale = enLocale
    'Supports ABSOLUTE, REGEXP, and APPROXIMATE
    .algorithmType = com.sun.star.util.SearchAlgorithms.REGEXP
    .searchString = needle 'rem define word bounds
    '.transliterateFlags = com.sun.star.i18n.TransliterationModulesNew.UPPERCASE_LOWERCASE
  End With
    oTextSearch.setOptions(aSrcOpt)
    count=0
    rank = 0
    for i = lbound(data,1)to ubound(data,1) 
        for j = lbound(data,2)to ubound(data,2)
              sStrToSearch = data(i,j)
               aSearchResult = oTextSearch.searchForward(sStrToSearch, 0, Len(sStrToSearch)  )
                rank=0
                    Do While aSearchResult.subRegExpressions > 0
                        count=count+1
                        sStrToSearch = mid(sStrToSearch, aSearchResult.endOffset(0) +1 , Len(sStrToSearch))
                        aSearchResult = oTextSearch.searchForward(sStrToSearch, 0, Len(sStrToSearch)  )                     
                  Loop
        next 
    next 
    CMIR2=count
end function

This Function can handel the hillbilly problem very gracefully. By using word boundarys \b

Hope it helps (even more)

edit flag offensive delete link more

Comments

1

interesting loop - I'll have to get my head around some of the variables..

luber1 gravatar imageluber1 ( 2020-10-19 00:27:35 +0100 )edit
1

did a short check, both solutions work, @mgl and @igorlius, both dynamic, different results: igorlius counts the occurence of the search term in the range, including multiple occurences in a cell, while mgl counts 'cells containing the search term', and thus neglects dupli- or multicates in one cell,
'total number of Bills' from the question isn't cleanly unambigous, but sounds more like 'all' to me, thus question: 3, mgl: 2, igorlius: 1

newbie-02 gravatar imagenewbie-02 ( 2020-10-19 01:42:59 +0100 )edit

@luber1 Sorry my solution does not handle the whole word search (aka. hillbilly) I might comeback later and create a differnt solution / macro which actually uses the regex service. Hope it still helps a little for now.

igorlius gravatar imageigorlius ( 2020-10-19 11:39:00 +0100 )edit

Currently counta variations result in 0 and the countif variations give a number result but a higher number than actual - it seems to get confused by commas

luber1 gravatar imageluber1 ( 2020-10-22 09:10:48 +0100 )edit

@luber1 i updated my answer with the promised regex version. Hope it helps. - Had to patch the function, because it had a bug. Sorry.

igorlius gravatar imageigorlius ( 2020-10-25 15:16:47 +0100 )edit
Login/Signup to Answer

Question Tools

2 followers

Stats

Asked: 2020-10-17 14:24:55 +0100

Seen: 91 times

Last updated: 34 mins ago