I am trying to find out how many times an author uses certain specific words in his book, as a vocabulary analysis. The “Find” function reports where the words are located, but provides no total count of them.
→Edit→Find & Replace…
→Find All, followed by
→Tools→Word Count should fit your needs exactly.
Yes, of course, the following lines of code are completely out of scope for this question, but just for fun, here is some Python code that builds the count of each unique word for a given Writer document and stores the output, sorted with the most common words first, in a new Calc document. (Take the challenge, my dear Basic guys.)
    from collections import Counter

    def word_stats():
        doc = XSCRIPTCONTEXT.getDocument()
        desktop = XSCRIPTCONTEXT.getDesktop()
        load = desktop.loadComponentFromURL
        text = doc.Text.String
        # (word, count) pairs, most common words first
        out = Counter(text.split()).most_common()
        if not out:
            return  # empty document, nothing to write
        outdoc = load("private:factory/scalc", "_blank", 0, ())
        sheet = outdoc.Sheets.getByIndex(0)
        # two columns (word, count), one row per unique word
        outrange = sheet.getCellRangeByPosition(0, 0, 1, len(out) - 1)
        # setDataArray accepts mixed text and numbers
        outrange.setDataArray(tuple(out))
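For anyone who wants to try the counting step outside LibreOffice first, here is a minimal sketch in plain Python. It is not the macro above: the function name word_frequencies, the sample sentence, and the tokenizing rule (lowercase everything, keep runs of letters and inner apostrophes) are my own assumptions, chosen so that "The" and "the," are counted as the same word.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Return (word, count) pairs, most frequent first, ignoring case."""
    # Assumed tokenizing rule: runs of letters, optionally joined by apostrophes.
    # Punctuation and digits are dropped, so "whale!" and "whale;" both become "whale".
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())
    return Counter(words).most_common()

# Made-up sample text, only to show the call.
sample = "The whale! The sea, the whale; the ship."
freq = word_frequencies(sample)
for word, count in freq:
    print(word, count)
```

Unlike text.split(), this does not treat "whale!" and "whale;" as two different words, which probably matters for a vocabulary analysis.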
Use ‘F & R’ with ‘Regular expressions’ enabled.
Search for: (\bThisword\b) where Thisword is the literal word whose occurrences you want to count, regardless of letter case (with ‘Match case’ unchecked).
Replace with: $1 which replaces the found word with exactly itself.
‘Replace All’ then runs the replacement without actually changing anything, and reports how many occurrences it replaced.
Warning! I just tried this method again and it was broken: the $1, which should stand for the found occurrence of the word, was wrongly inserted as a literal. Do not apply this! At the very least, press Ctrl+Z immediately after the ‘F & R’.
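For a count that does not touch the document at all, the same whole-word pattern can be sketched in plain Python. This is only an illustration: the function name count_word and the sample text are made up here, and Python's re module is not guaranteed to behave exactly like the regex engine in LibreOffice.

```python
import re

def count_word(text, word):
    """Count whole-word occurrences of `word` in `text`, ignoring letter case."""
    # \b on both sides keeps "the" from matching inside "Cathedral";
    # re.escape guards against regex metacharacters in the search word.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Made-up sample text, only to show the call.
sample = "The cat and the Cathedral. THE end."
total = count_word(sample, "the")
print(total)
```

Nothing is replaced here, so there is nothing to undo afterwards.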
What do you think is the purpose of the ‘Find All’ button?
I thought it was to find all the occurrences. However, I was not aware that it now shows the number of occurrences in the status bar. Just found it. Did you tell me and iamlocutus? If so: Thanks! My ‘Replace All’ always showed the number in a message box, as I liked it.
If your goal is vocabulary analysis, you may find this extension interesting.