
# Replace Full Space with Zero Width Space using Regular Expressions

I need to replace every ordinary (full) space between می and the following word with a zero-width character such as U+FEFF (Zero Width No-Break Space). For example, می روم must change to می‌روم (that space is entered on the keyboard by pressing Shift + Space on Fedora Linux). Any similar combination must follow the same rule.

I am trying to use the Find-Replace Dialog with the following fields:

This just changes sentences to:

How can I tell it to just replace the full-space with zero-width space?



Regular expressions don't help in this case. Disable that option, and enable Current selection only if needed. Enter one ordinary space (the character you want to replace) into Find: and a literal ZERO WIDTH SPACE into Replace:, then run Replace All.

You can get a specimen of ZERO WIDTH SPACE by typing 200B somewhere and pressing Alt+X immediately afterwards. Then select and cut that invisible character and paste it into Replace:.

Sorry, I don't know whether typing 200B followed by Alt+X works exactly the same way in a right-to-left layout.
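
Since several invisible characters come up in this thread, here is a small Python sketch (run outside LibreOffice) for checking which code point is which. The `CANDIDATES` list and the `describe` helper are names made up for this illustration. Note that U+200B, U+200C and U+FEFF are three different characters; Persian text normally uses U+200C (ZERO WIDTH NON-JOINER) after می.

```python
import unicodedata

# Characters mentioned in this thread, written as escapes so they stay visible.
CANDIDATES = [
    "\u0020",  # the ordinary space to be replaced
    "\u200b",  # what typing 200B and pressing Alt+X produces
    "\u200c",  # the joiner commonly used in Persian spelling
    "\ufeff",  # the code point named in the question
]

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in CANDIDATES:
    print(describe(ch))
```

Pasting a character copied from the document into `describe(...)` shows which one it actually is.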

I studied the images in more detail.
The Arabic word (or particle) in front of the space, together with the opening square bracket, is part of the match and is therefore re-inserted by the "&". In addition, you insert it explicitly together with a (supposed) trailing ZERO WIDTH SPACE, so it ends up doubled. Here is a regex and a replacement in Western style that should do what you seem to want, provided you correctly transliterate it right-to-left for your needs.
Find: (myWord) ([:alpha:]) (with a space between (myWord) and ([:alpha:])!)
Replace: $1​$2
The replacement looks a bit misleading, because there must be, and actually is, a ZERO WIDTH SPACE between $1 and $2.
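
The same capture-group idea can be tried outside LibreOffice. The following is only a rough Python sketch, not the dialog's own regex dialect: `MI`, `ZWSP` and `join_prefix` are names invented for this example, `[^\W\d_]` stands in for `[:alpha:]`, and the leading `\b` is an added assumption so that words merely ending in می are left alone.

```python
import re

MI = "\u0645\u06cc"   # the prefix from the question (MEEM + FARSI YEH)
ZWSP = "\u200b"       # ZERO WIDTH SPACE, as suggested in the answer

# Group 1: the prefix as a whole word. Group 2: the first letter of the next word.
# Only the single ordinary space between the two groups is dropped.
PATTERN = re.compile(rf"\b({MI}) ([^\W\d_])")

def join_prefix(text: str, joiner: str = ZWSP) -> str:
    """Replace the space after the prefix with `joiner`, keeping both groups."""
    return PATTERN.sub(lambda m: m.group(1) + joiner + m.group(2), text)

sample = MI + " " + "\u0631\u0648\u0645"   # prefix, space, following word
print(ascii(join_prefix(sample)))           # escapes make the joiner visible
```

Passing `"\u200c"` as `joiner` inserts a ZERO WIDTH NON-JOINER instead, if that is the character the text needs.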


Thanks for the input. I usually use Ctrl + Shift + U + [Unicode code] to enter Unicode characters on Linux. It seems Alt + X does not work for me.

( 2020-07-21 13:17:53 +0200 )

I'm concerned that replacing all spaces will connect all my words in the document, which is not pleasant. This should be doable via regex, e.g. every space that is surrounded by می and another word should be replaced with the desired character.

( 2020-07-21 13:20:23 +0200 )

Of course you can find the specific spaces you want to replace with a regex.
My statement above should probably have read "RegEx cannot help you with getting a ZERO WIDTH SPACE into Replace:". A case of "narrow thinking" on my part.
However, I can hardly help you with a proper regex without clearly seeing what you tried, without an example document for testing, and without any experience with right-to-left texts (in documents and/or UI dialogs).
You can learn how to define the needed Find: context with a regex from https://www.regular-expressions.info/.... If you prefer not to use lookbehind/lookahead, an alternative is to enclose the context strings in parentheses and to re-insert them through references in Replace:, using the $ character, which is supported for this purpose only there. If you supply an example document with a sufficient set of clear examples, it may ...

( 2020-07-21 13:41:31 +0200 )
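
For comparison, the lookbehind/lookahead variant mentioned above could look roughly like this in Python. It is a sketch under assumptions, not a tested LibreOffice search string: `MI`, `ZWSP` and `join_with_lookaround` are hypothetical names, and the `\b` inside the lookbehind is an added guard against words that only end in می.

```python
import re

MI = "\u0645\u06cc"   # the prefix from the question (MEEM + FARSI YEH)
ZWSP = "\u200b"       # ZERO WIDTH SPACE; pass "\u200c" for a non-joiner instead

# Lookbehind: the space must directly follow the prefix standing as a whole word.
# Lookahead: the space must be followed by a letter.
# Only the space itself is matched, so no group references are needed.
PATTERN = re.compile(rf"(?<=\b{MI}) (?=[^\W\d_])")

def join_with_lookaround(text: str, joiner: str = ZWSP) -> str:
    """Replace only those spaces whose context matches both lookarounds."""
    return PATTERN.sub(joiner, text)

sample = MI + " " + "\u0631\u0648\u0645"   # prefix, space, following word
print(ascii(join_with_lookaround(sample)))
```

Because the context is asserted rather than consumed, the replacement is just the joiner character, which avoids the doubling that happens when the matched context is re-inserted by hand.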

( 2020-07-21 14:49:15 +0200 )

Thanks Lupp. I just added a sample of Persian text. In fact, Arabic and Persian use similar scripts, but they are different languages. See https://pastebin.com/tAZkgLix

( 2020-07-21 15:24:50 +0200 )

( 2020-07-21 15:27:51 +0200 )

Just used this technique to replace 500 combinations in a long text!

( 2020-07-21 15:30:08 +0200 )

Yes. I knew that Farsi/Persian belongs to the "Indogermanic" languages. I didn't know how to distinguish them. Arabic and similar scripts are just like a ravel of nematodes to me, though I respect the fact that they can be used superbly to create works of art. The world is so rich. But these many scripts are annoying nonetheless ...

( 2020-07-21 15:54:35 +0200 )