Formula to Return TRUE if a string has at least N unique characters, FALSE otherwise?

I have a large number of strings of arbetrary length.

I would like to filter out any strings which contain fewer than three unique characters.
For example “121212,” and “ABABABABB” would fail but “123” and “ABC” would pass.

Is there a formula I can run down a column with my strings as inputs and get the desired true/false output.

1 Like
=SUMPRODUCT(
    COUNTIF(
        MID($A1;COLUMN($A1:$I1);1);
        MID($A1;COLUMN($A1:$I1);1)
    )=1;
    LEN(
        MID($A1;COLUMN($A1:$I1);1)
    )
)

Any version with strings up to 8 characters. Replace $I1 for more columns.

=SUBSTITUTE(SUBSTITUTE(A1;LEFT(A1););LEFT(SUBSTITUTE(A1;LEFT(A1);));)<>""
3 Likes

Thank you, this is so close. Looking at this and the other solutions, I think my use of the word “unique” has caused some confusion about my question.

When I said, “unique characters in a string,” rather than the number of characters in the string of which there is exactly one, I meant the size of the smallest set of characters that are needed to compose that string.

Some examples:

12345, trivially returns 5.
123451, returns 5 because you need the set of characters {1, 2, 3, 4, 5} to make the string.
assassin, returns 4 because you need the set of characters {a, s, i, n} to make that string.

I would prefer to use “distinct” in such a case, but a lot of experience with open office sites made me accept “unique” with that meaning.
However, I’m not a native English speaker, and I still don’t know what to say if actually “unique” is meant as “occurring exactly once”.

I think I’d use “Non-repeating” to mean “occuring exactly once.”

is the most elegant solution. It removes all occurances of the first character, then all occurances of the first remaining character, and if there is still something left, there must be at least 3 different characters.

2 Likes

What about the “N” in the subject?
I would take the “three” in the first sentence of the question as an example. What if you replace it by (e.g.) 9?

seems the relevant adjective would rather be … " monstruous " :wink:

You can still use the spreadsheet in manual mode. ask115939.ods (21.6 KB) (up to 100 characters)

For better usability you may use my generic list keeping macro InsertCalcRows_LO.odt, which expands all formulas as you insert new rows within or directly below some list.

If user code is acceptable, the following (reusable) macro would help

Function repeatedCharactersRemoved(pString As String) As String
Dim out As String, w_char As String
out     = ""
Do While Len(pString)>0
 w_char  = Left(pString, 1)
 pString = Join(Split(pString, w_char), "")
 out     = out & w_char
Loop
repeatedCharactersRemoved = out
End Function

The sheet would then contain formulas like

=LEN(REPEATEDCHARACTERSREMOVED(A13))>=3

Or

=LEN(REPEATEDCHARACTERSREMOVED((UPPER(A13))>=3
2 Likes

Oh praise algortheos, using user code is super easy in Libre Calc. I was worried it was gonna be an ordeal to try this but it’s about as simple plug and chug as it gets. This is working perfectly, thank you.

aka replace(pString, w_char, "") :wink:

1 Like

Yes.
But that’s -4 characters longer.:wink:

  :+1:

and to keep an eye on UNO :

Function repeatedCharactersRemoved(pString As String) As String
  uniq  = com.sun.star.container.EnumerableMap.create("string","boolean")
    
  for  i = 1 to len(pString)
    w_char = mid(pString,  i, 1)
    if not uniq.containsKey(w_char)    then
      uniq.put(w_char, true)
      repeatedCharactersRemoved = repeatedCharactersRemoved & w_char
    end if
  next i
End Function