How do I find the number of occurrences of a specific word in a large document?

I am trying to see how many times an author uses certain specific words in his book; it is a vocabulary analysis. The “Find” function reports where they are located, but provides no total count of them.

Hello

→Edit→Find & Replace→Find All, followed by →Tools→Word Count, should fit your needs exactly.

Yes, of course, the following lines of code are completely out of scope for that question, but just for fun, here is some Python code that counts every unique word in a given Writer document and stores the result, sorted by most common words first, in a new Calc document. (Take the challenge, my dear Basic guys :wink: )

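from collections import Counter

def word_stats():
    # XSCRIPTCONTEXT is provided by the LibreOffice Python macro runtime
    doc = XSCRIPTCONTEXT.getDocument()
    desktop = XSCRIPTCONTEXT.getDesktop()
    load = desktop.loadComponentFromURL
    text = doc.Text.String
    # count whitespace-separated tokens, then sort by count, most common first
    out = Counter(text.split())
    out = sorted(out.items(), key=lambda x: x[1], reverse=True)
    # "_blank" opens the new Calc document in a new frame
    outdoc = load("private:factory/scalc",
                  "_blank",
                  0,
                  (),)

    sheet = outdoc.Sheets.getByIndex(0)
    # two columns (word, count), one row per unique word, starting at row 2
    outrange = sheet.getCellRangeByPosition(0,
                                            1,
                                            len(out[0])-1,
                                            len(out))
    outrange.setFormulaArray(tuple(out))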
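
One caveat with splitting on whitespace is that "dog," and "dog" or "The" and "the" are counted as different words. A minimal standalone sketch of the counting step that folds case and drops punctuation might look like this. The function name count_words and the sample sentence are hypothetical, and the text could just as well come from doc.Text.String as in the macro:

```python
import re
from collections import Counter

def count_words(text):
    """Return (word, count) pairs, most common first, ignoring case and punctuation."""
    # \w+ matches runs of letters, digits and underscores;
    # the optional '\w+ part keeps contractions such as "don't" together.
    words = re.findall(r"\w+(?:'\w+)*", text.lower())
    return Counter(words).most_common()

sample = "The cat saw the dog. The dog, startled, ran."
print(count_words(sample)[:2])  # [('the', 3), ('dog', 2)]
```

The returned list has the same shape as the sorted list in the macro, so it could be written to a sheet in the same way.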

Use ‘F & R’ with RegEx.
‘Search For:’ (\bThisword\b), where Thisword is the literal word whose occurrences you want to count, independent of letter case.
‘Replace With:’ $1, which replaces the found word with exactly itself.

‘Replace All’ will run the replacement without actually changing anything, and will report how many occurrences it replaced.

Warning! I just tried this method again and it was broken: the $1, which should stand for the found occurrence of the word, was wrongly inserted as a literal. Do not apply this! At the very least, press Ctrl+Z immediately after the ‘F & R’.
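
For a count that never touches the document, the same whole-word, case-insensitive match can be sketched in Python on a plain string. This is only an illustration: the function name count_word and the sample text are hypothetical, and the text is assumed to have been extracted from the document beforehand.

```python
import re

def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    # re.escape protects against regex metacharacters in the word;
    # \b on both sides restricts matches to whole words.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))

sample = "This word, thisword, and ThisWord again; not thiswords."
print(count_word(sample, "Thisword"))  # 2
```

Because nothing is replaced, there is no risk of a literal $1 ending up in the text.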

@Lupp

What do you think is the purpose of the Find All button?

I thought it was to find all the occurrences. However, I was not aware that it now shows the number of occurrences in the status bar. Just found it. Did you tell me and iamlocutus? If so: Thanks! My ‘Replace All’ always showed the number in a message box, which I liked.

Hi

If your goal is to analyze the vocabulary, you may find this extension interesting.

Regards