The find() function doesn't find the space between two words in a csv file

I imported a csv file into calc. The 2nd column contains people’s first and last names, separated by a space. For example, “John Smith”, but without the quotes. When I try to use the find() function to get the position of the
space between the names, ie: =find(" ",b2), it doesn’t work. But if I manually edit the field to delete whatever char is between the two names and put a space there, then my =find(" ";b2) correctly returns the position of the space. Obviously I don’t want to edit every name to manually insert a space. Why is it doing this and how do I fix it?
I suspect it has something to do with whether charactors are stored as single bytes or double-bytes, but I’m just speculating.

If there is a different space, e.g. em-space (U+2003) then your formula won’t find it. You could do one of the following:

  • Copy the space from your version of John Smith and use that in the FIND formula
  • Click Edit > Find & Replace, tick Regular Expressions box. In the Find field enter \s and in the Replace field enter a normal space, then click Replace All and your existing formulas will work because all variations of space will be replaced by a standard space.
  • Copy the space from your version of John Smith then select your column and add a column beside it, click Data > Text to Columns and as separators tick Other and paste in the space that you copied, maybe tick Space as well for good measure. Click OK. Your column will be separated into columns at the spaces
3 Likes

How doesn’t it work? Do you get an error (which one), wrong position, or is the formula displayed verbatim?

Have you enabled a locale which uses comma as decimal separator? I see that your sample “working formula” uses semicolon (which always should work), while the “not working” one uses comma.

Thanks for your questions regarding my issue with using find() on text

that includes a space.

On further investigation I don’t think it’s a problem with Calc. Here’s why I

say that.

The csv files in which the issue arises are files I downloaded from a website named

fangraphs.com. I’ve downloaded files from there hundreds of times before and

opened them with Calc and did not have this issue. Yesterday I discovered that

when I open these recently downloaded files using NOTEPAD and try to find text

which contains a space, it tells me it could not find any matches.

That shows that the problem is in the file itself. The files appear identical to the ones

I’ve downloaded from fangraphs in the past but it appears that the way spaces are

represented has somehow changed internally. I’m going to look for a way to examine

the csv file in byte format and see if that is helpful.

If you have further suggestions I’d appreciate hearing from you again.

Bill Bell

1 Like

And if you gave a link to the file or uploaded an excerpt here, one could take a look and tell what the actual content was.

In LibreOffice if I click Insert - Special character and then in the search box I type space I get 13 different space types found.

Presumably, one of those other types of space has been used. It might be so that first name and last name can be separated easily, for example Dick van Dyke might have an em space between Dick and van but a normal space between van and Dyke

1 Like

If you use SEARCH() instead of FIND() you could use regular expressions and then search for unicode-whitespace instead of only the “usual” ASCII-space.

2 Likes

Thanks for the suggestion. I’ll try it out.

Those suggestions sound very promising. I like the way you think. Thanks.

Simply selecting a problematic character in Calc, and pressing Alt+X would turn it into its Unicode U+NNNN notation.

3 Likes

Okay thanks, I tried your suggestion (alt x) and the result was U+00a0.
Google tells me that is a non-breaking space, so at least now I know what it
is. That’s progress. Now I just need to do some research into how to tell calc
functions to look for a non-breaking space.

Thanks!

Using alt-x revealed that the mysterious spaces were unicode U-00a0, which is a

non-breaking space. I can now do find and replace them with normal spaces.

Problem solved! By the way, as this was my first time using the libreoffice help

system, do I need to close the topic? Is it sufficient that I marked your reply

as the solution or is there another step needed to close it?

1 Like

Marking an answer (or comment) as solution is fine; most contributors here generally dislike closing questions, because that prevents other answers in a year, from someone who found them, and so prevents making this topic a better source of information for others. So you did it perfectly.

Thank you!