In Calc (version 6.2.4.2) there is a dedicated function called »REGEX«.
(REGEX Function)
Basically this is a neat feature, but the documentation doesn’t say much about WHICH regex syntax the function expects. The link to the ICU documentation confuses more than it helps.
What I want to accomplish:
Searching for a string (DOI format) in a cell.
The regex in Python 3.7 for this:
\b(10[.][0-9]{4,}(?:[.][0-9]+)*\/(?:(?![\"&\'<>])\S)+)\b
Test strings:
Comput Methods Programs Biomed. 2018 Nov;166:33-38. doi: 10.1016/j.cmpb.2018.09.006. Epub 2018 Sep 12.
Forensic Sci Int Genet. 2019 Jan;38:39-47. doi: 10.1016/j.fsigen.2018.10.005. Epub 2018 Oct 9.
Surv Ophthalmol. 2019 Mar - Apr;64(2):233-240. doi: 10.1016/j.survophthal.2018.09.002. Epub 2018 Sep 22. Review.
Regex101 matches correctly.
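For comparison, here is a minimal sketch of how the pattern behaves in Python on the three test strings above. It assumes only the standard `re` module. The helper name `find_doi` is hypothetical and introduced just for illustration. This shows the Python side only and says nothing about what Calc accepts.

```python
import re

# Pattern copied verbatim from the post. A triple-quoted raw string is used
# so the literal can contain both " and ' without extra escaping.
DOI_PATTERN = re.compile(
    r'''\b(10[.][0-9]{4,}(?:[.][0-9]+)*\/(?:(?![\"&\'<>])\S)+)\b'''
)

TEST_STRINGS = [
    "Comput Methods Programs Biomed. 2018 Nov;166:33-38. "
    "doi: 10.1016/j.cmpb.2018.09.006. Epub 2018 Sep 12.",
    "Forensic Sci Int Genet. 2019 Jan;38:39-47. "
    "doi: 10.1016/j.fsigen.2018.10.005. Epub 2018 Oct 9.",
    "Surv Ophthalmol. 2019 Mar - Apr;64(2):233-240. "
    "doi: 10.1016/j.survophthal.2018.09.002. Epub 2018 Sep 22. Review.",
]


def find_doi(text):
    """Return the first DOI-like substring in text, or None (hypothetical helper)."""
    match = DOI_PATTERN.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for line in TEST_STRINGS:
        print(find_doi(line))
```

The trailing `\b` makes the match end on a word character, so the period that closes the sentence after each DOI is not part of the result.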
Calc returns error code 508 for this formula:
=REGEX(D2;"\b(10[.][0-9]{4,}(?:[.][0-9]+)*\/(?:(?![\"&\'<>])\s)+)\b"))
Which »regex language« (syntax flavor) does this function expect?