Word characters are [\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Connector_Punctuation}\u200c\u200d].
Can anyone translate this ^ for me?
What is in the classes \p{Mark} and \p{Connector_Punctuation}?
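For reference, here is a minimal sketch of how those two classes could be inspected, assuming Python's standard `unicodedata` module rather than the regex engine quoted above. As far as I can tell, \p{Mark} corresponds to the Unicode general categories Mn, Mc and Me (combining marks such as accents), and \p{Connector_Punctuation} to the category Pc (underscore and similar joiners). The helper names `is_mark` and `is_connector` are made up for illustration.

```python
import unicodedata

def is_mark(ch: str) -> bool:
    # General categories Mn, Mc and Me all start with "M" (combining marks).
    return unicodedata.category(ch).startswith("M")

def is_connector(ch: str) -> bool:
    # Connector punctuation is general category Pc.
    return unicodedata.category(ch) == "Pc"

# A few sample characters: (character, description)
samples = [
    ("_", "underscore"),
    ("\u203f", "undertie"),
    ("\u0301", "combining acute accent"),
    ("\u0903", "Devanagari sign visarga"),
    ("\u20dd", "combining enclosing circle"),
    ("-", "hyphen-minus"),
    ("'", "apostrophe"),
]

for ch, name in samples:
    print(f"U+{ord(ch):04X} {name}: category={unicodedata.category(ch)}, "
          f"mark={is_mark(ch)}, connector={is_connector(ch)}")
```

If that reading is right, the hyphen and apostrophe fall in neither class, so \w would not match them, and it would not match a space either.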
Reason for asking:
I have this string:
The Old Gymnasium, Vicarage Playing Fields, Llanbadarn Road.
I want to return Llanbadarn Road only.
I’m trying to tell the regex I want “a string starting just after comma-space and ending just before full stop/period, but ignore all commas except the last one”.
I thought I could do that by returning everything between ,_ and . if the only characters found are 0-9A-Za-z (though it would be nice to allow ' and - too).
"(?<=,\s).*(?=\.)" returns Vicarage Playing Fields, Llanbadarn Road
"(?<=,\s)\w*(?=\.)" returns #NA
"(?<=,\s)[0-9A-Za-z]*(?=\.)" returns #NA
Any tips? Thanks.
*(underscore is a space: markdown is unhappy without it)
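To make the goal concrete, here is a minimal sketch of one pattern that seems to behave the way I described, assuming Python's `re` module rather than the tool I'm actually using (so the syntax may need adjusting). The idea is to forbid commas inside the match, or to use an explicit character class that includes the space, which my attempts above left out. The variable names are only for illustration.

```python
import re

address = "The Old Gymnasium, Vicarage Playing Fields, Llanbadarn Road."

# Variant 1: anything except a comma, between ", " and the full stop.
# Because the match cannot contain a comma, only the last segment qualifies.
no_commas = re.compile(r"(?<=,\s)[^,]*(?=\.)")

# Variant 2: an explicit whitelist of letters, digits, space, ' and -.
# The space is in the class, so "Llanbadarn Road" is matched as a whole.
whitelist = re.compile(r"(?<=,\s)[0-9A-Za-z' -]+(?=\.)")

for pattern in (no_commas, whitelist):
    match = pattern.search(address)
    print(match.group(0) if match else None)  # prints "Llanbadarn Road"
```

Neither \w nor [0-9A-Za-z] matches a space, which would explain why my second and third attempts found nothing.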