RegEx object pattern .*? being greedy

RegEx object pattern .*? seems to be greedy when followed by \(

I think this greedy behavior began in 25.8.2, 25.8.3 or 25.8.4, and it is also present in 26.2.1.

Here is an example macro:

Public Sub Main()

    html = "a-percent"">(+2.57%) another-percent"">(-0.75%)"

    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = False
    regEx.IgnoreCase = True


    ' Works as expected
    regEx.Pattern = "percent"".*?(\-?[0-9,]+(?:\.[0-9]+)?)\%\)"

    Set Matches = regEx.Execute(html)
    If Matches.Count > 0 Then
        msgbox(Matches.Item(0))
        Percent = Replace(Matches.Item(0).SubMatches.Item(0), ",", "")
        msgbox(Percent)
    End If


    ' Does not work as expected... .*? seems to be greedy when followed by \(
    regEx.Pattern = "percent"".*?\((\-?[0-9,]+(?:\.[0-9]+)?)\%\)"

    Set Matches = regEx.Execute(html)
    If Matches.Count > 0 Then
        msgbox(Matches.Item(0))
        Percent = Replace(Matches.Item(0).SubMatches.Item(0), ",", "")
        msgbox(Percent)
    End If

End Sub

Oh, you are writing to the LibreOffice Ask site about a bug (?) you found in a Microsoft component.
You are using the COM object VBScript.RegExp, which is a deprecated part of MS Windows. Its behavior is defined by its developers, not by LibreOffice. So it is incorrect to say “this problem began in this or that version of LibreOffice”; the correct statement would be “it began after this or that update of my Windows”.

If it weren’t deprecated there, I would suggest that you file a bug with MS.

On the other hand, why do you consider that “greedy”? Your second regex can only match percent">(+2.57%) another-percent">(-0.75%). It starts at percent", and continues until it finds an opening parenthesis followed by an optional minus and then a mandatory digit (and the rest of the pattern). Starting from the first percent", that sequence occurs only in the second parenthesized piece; in the first one, the opening parenthesis is followed by a plus sign, which is neither a minus nor a digit. A lazy .*? still expands as far as it must for the rest of the pattern to match, so this seems to work as expected.
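To check this reasoning without the COM object, here is a minimal sketch that runs the same two patterns through Python's re module. It rests on the assumption that Python's engine and VBScript.RegExp treat these particular constructs (lazy .*?, non-capturing groups, escaped punctuation) the same way. The third pattern, p3, is a hypothetical variant that is not in the question: it accepts an optional plus sign as well as a minus, which may be what was intended.

```python
import re

# Same test string as in the macro (Basic's doubled quotes become single quotes here).
html = 'a-percent">(+2.57%) another-percent">(-0.75%)'

# Pattern 1 from the question: no literal "(" is required before the number.
p1 = re.compile(r'percent".*?(\-?[0-9,]+(?:\.[0-9]+)?)\%\)', re.IGNORECASE)

# Pattern 2 from the question: "(" must be followed directly by an optional "-" and a digit.
p2 = re.compile(r'percent".*?\((\-?[0-9,]+(?:\.[0-9]+)?)\%\)', re.IGNORECASE)

# Hypothetical variant (an assumption, not from the question): allow "+" or "-" after "(".
p3 = re.compile(r'percent".*?\(([+\-]?[0-9,]+(?:\.[0-9]+)?)\%\)', re.IGNORECASE)

m1 = p1.search(html)
m2 = p2.search(html)
m3 = p3.search(html)

print(m1.group(0), "|", m1.group(1))  # percent">(+2.57%) | 2.57
print(m2.group(0), "|", m2.group(1))  # percent">(+2.57%) another-percent">(-0.75%) | -0.75
print(m3.group(0), "|", m3.group(1))  # percent">(+2.57%) | +2.57
```

Note that pattern 1 matches the first piece only because .*? swallows the "(+" and the capture starts at the digit, so the sign is lost. With the sign allowed explicitly, as in p3, the lazy quantifier stops at the first parenthesized piece.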
