Searching for words in single column that are similar, not exact duplicates

Trying to search for duplicates of company names in a single column. However, sometimes punctuation and spacing are different, so my formula does not always work. Please see my example below from my list of ~4k company names.

The formula used works for the first four company names listed — the result is that they are not duplicates and the word ‘DUPLICATE’ does not appear in the formula column. The formula also works for the last two listed — result is that they ARE duplicates and the word DUPLICATE appears in the formula column.

However, the formula does not work for the middle three company names listed (Dasher Dancer & Prancer PC). It does not identify them as duplicates, even though they are pretty much duplicates except for punctuation, spacing and caps.

What could I modify on my formula to make it work for those three names? Or what other function and/or formula would identify those three names as duplicates, or otherwise identify how similar they are? Am I asking the impossible?

Formula is: =IF(H557=558,“DUPLICATE”,"").

Here’s the example from my list:

Blitzen, C.P.A., P.C.
Comet LLC
Christen M. Cupid, CPA, CFP
Dasher & Co.
Dasher Dancer & Prancer, PC
Dasher, Dancer & Prancer, P.C.
Donner & Associates CPAs, PC
Donner & Associates CPAs, PC


Hello Bea,

I have built a solution which, however, requires an help column. First, the company names are freed from the characters “.&” and additional spaces. If you want to remove more characters, add the character separated by a “|” character to the first part in the regex formula. Caution: Some characters must be masked with a “” character.
In the second column I count the occurrences in all cells and give the hint DUPLICATE if the entry occurs more than 1 time.
Please test the formulas in the example document to see if this corresponds to your wishes.


Small improvements are certainly possible. However, I would need more details.
If this answer has answered the question sufficiently, then please mark your question as answered. The tick next to the answer that turns green with mouse over.


Hello and thanks, Jurgen. Your solution definitely works for numbers, and I agree - a help column is needed to remove the & and additional spaces, and I’m now on the right path.