How can I count the number of occurrences of individual Chinese characters in a document?

I want to count the number of occurrences of each separate word or Chinese character in a document. The words have already been sorted into groups. I would like a word count of the words within that group. I am using LibreOffice (can be Writer or Calc). I would prefer not to do this in terminal mode.
For example:
the = 12,
with = 5,
take = 2.

Writer counts Chinese characters as words and shows the count at the bottom of the Writer window, either for the entire text or for a text selection. A selection takes priority.

Alternatively, you can use Tools > Word Count. With this function you can see the count for a selection and for the entire text at the same time.

Be careful to understand how Arabic numerals and symbols are counted.

Thank you, however I am not looking for the total count, but the individual count for each word. For example, how many times does apple occur, and how many times does orange occur, and so on.

Then follow the answer of @karolus

I can’t seem to find out where this answer is, on Apr 4, by @karolus – please clarify how to find it.


What about Edit → Search & Replace → Search For … → Find All?

I want to find the frequency of many different characters. I would have to type each one individually this way. Any way to count the frequency of different characters all at the same time? In English or Roman letters, this would be the same as finding the frequency of A, B, C, and so on, to Z.

In the search field, concatenate the Chinese characters with |, similar to apple|orange|A|B|C, then click More Options → [x] Regular expressions.
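For readers who want to see what such an alternation counts, here is a minimal Python sketch of the same idea outside LibreOffice. The function name count_alternatives and the sample sentence are made up for illustration, and Python's regex engine is not the one LibreOffice uses, so treat it as an analogy rather than a description of the dialog.

```python
import re
from collections import Counter

def count_alternatives(text, targets):
    """Count how often each target string matches, using one a|b|c pattern."""
    if not targets:
        return Counter()
    # Try longer targets first so a multi-character word is not split
    # into its single characters when both are in the list.
    ordered = sorted(targets, key=len, reverse=True)
    pattern = "|".join(re.escape(t) for t in ordered)
    return Counter(re.findall(pattern, text))

# Hypothetical sample text
sample = "我吃苹果，你吃橘子，我也吃橘子"
counts = count_alternatives(sample, ["我", "吃", "橘子"])
```

Note that Find All in the dialog only reports the matches together, whereas the sketch keeps one tally per alternative.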

But IMHO, for a detailed report on every word or symbol you need some kind of macro.
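As a rough idea of what the core of such a macro might do, here is a plain Python sketch. It is not a LibreOffice macro: it assumes the document text has already been obtained as a string, and the function names char_frequencies and word_frequencies are hypothetical.

```python
from collections import Counter

def char_frequencies(text):
    """Count every letter-like character (this includes Chinese characters).

    Whitespace, digits and punctuation are skipped via str.isalpha().
    """
    return Counter(ch for ch in text if ch.isalpha())

def word_frequencies(text):
    """Count space-separated words, ignoring case and edge punctuation.

    Only meaningful for scripts that separate words with spaces.
    """
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return Counter(w for w in words if w)

# Most frequent entries first, e.g. for a report
chars = char_frequencies("我吃苹果，你吃橘子").most_common()
words = word_frequencies("The cat and the dog.").most_common()
```

Counting per character sidesteps word segmentation, which Chinese text would otherwise need because it has no spaces between words.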

The above answers do not really answer the question. Clearly, this feature is missing and should be added.

I have found a workaround to use in the meantime. Use the Search and Replace menu instead, then click Replace All. When this is done, a popup box will say that the phrase was replaced X times, which gives you the count. After this, undo the replacement with Ctrl+Z / Cmd+Z.

If you replace each match of ‘Search For’ with exactly the same text, you won’t need the ‘Undo’. With RegEx enabled, the special character & in ‘Replace With’ means “whatever was found”.
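The "replace with itself and read the count" trick has a close analogue in Python, sketched below purely for illustration. Python does not treat & specially in the replacement; \g<0> plays the role of "whatever was found". The helper name count_by_identity_replace is made up.

```python
import re

def count_by_identity_replace(text, pattern):
    """Replace every match with itself and return (text, number_of_matches).

    The text comes back unchanged, so no undo step is needed.
    """
    return re.subn(pattern, r"\g<0>", text)

unchanged, n = count_by_identity_replace("apple orange apple", "apple")
```

Here n holds the number of replacements, which is the count the popup box would report.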

This is about counting in Calc.
Assuming one of the “groups” mentioned in the original question is contained in A1, and in B1 a character, a word or a phrase to search for, then the formula

will return the number of occurrences. Overlapping occurrences will each be counted. “Words” occurring inside longer words will also be counted. Knowing nothing about Chinese, I cannot tell whether this will be acceptable. Using RegEx and a construct with SEARCH instead of the direct comparison of strings, additional control over the counting may be achieved.
The construct OFFSET(INDIRECT("$A$1");0; ... is used to avoid errors caused by lost references after deletion of a row or a column in specific cases. Omitting this, INDIRECT("$A$1") can be replaced by $A$1.
The content of any cell is limited to a maximum of 65535 characters.
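To make the counting behaviour described above concrete (overlapping occurrences counted each, matches inside longer words included), here is a small Python sketch. It is not the Calc formula, only an illustration of the same counting rule; the name count_overlapping is hypothetical.

```python
def count_overlapping(haystack, needle):
    """Count occurrences of needle in haystack, including overlapping ones."""
    if not needle:
        return 0
    count = 0
    start = haystack.find(needle)
    while start != -1:
        count += 1
        # Advance by one character only, so overlapping matches are found too
        start = haystack.find(needle, start + 1)
    return count

overlapping = count_overlapping("aaaa", "aa")            # counts positions 0, 1, 2
inside_words = count_overlapping("pineapple apple", "apple")
```

For comparison, Python's built-in "aaaa".count("aa") skips overlaps and gives a smaller number, which is the kind of difference to watch for when choosing a counting method.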

Hi @Lupp, if I’m not wrong, Chinese is DBCS text (double byte), and for those in calc there are some special functions like MIDB() LEFTB() RIGHTB() LENB().

Could someone supply an example file? I think a solution based on my recent suggestion will work anyway, possibly with marginal adaptation. Of course, I am not opposed to a more efficient solution based on an extension / add-in or on specialised functions. (If I had the problem, I would surely consider a solution based on a general-purpose language/IDE.)


I do not know about Chinese characters, but for separate words you can try the Linguist extension:

  • Download then
  • Tools ▸ Extension Manager ▸ Add ▸ select it from your download folder
  • Quit LibreOffice.

The extension is deployed on the next launch. A new Linguist menu is added in Writer.