Regular Expressions in Basic

Good Morning

I would like to implement some regular expressions in Basic. Is there a RegExp object, as there is in VBA, or should I use the VBScript.RegExp object? Thank you, as usual, for all of your help.

Regards

Zafar

There is a built-in regex module in Python, so the question remains: why do you buy wood screws and try to hammer them into concrete?


I accept that. Maybe I will translate my code from Basic to Python. I will see how much my Python knowledge has improved.
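As a rough idea of what such a translation could look like, here is a minimal sketch using Python's built-in re module. The sample text, the date pattern, and the helper name describe are made up for illustration; they are not taken from the original Basic code.

```python
import re

# Hypothetical sample data, chosen only for illustration.
TEXT = "Invoice 2024-0042 was paid on 2024-03-15"
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")  # ISO-style date


def describe(text):
    """Return (first date found, its 0-based start index, text with dates masked)."""
    match = DATE.search(text)          # first match, or None if there is none
    if match is None:
        return None, -1, text
    masked = DATE.sub("<date>", text)  # replace every match
    return match.group(0), match.start(), masked


first, start, masked = describe(TEXT)
print(first, start)
print(masked)
```

Note that Python string indexes are 0-based, whereas Basic's Mid() is 1-based, which is an easy source of off-by-one errors when porting.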

On the LibreOffice help page for the ScriptForge.String service (SF_String), or on any other help page, select the Basic module and enter “regex” in the search box.

You will read that the String class of the ScriptForge library implements, among others, the following methods:

  • FindRegex()
  • IsRegex()
  • ReplaceRegex()

NB: behind the scenes, the implementation uses the com.sun.star.util.TextSearch UNO service.


I suppose you need to search within a string. That is not impossible in Basic, only masochistic :) It is based on the service com.sun.star.util.TextSearch.

Sub regexpStringSearch
	Dim oSearch As Object, oSearchParam As Object, oFound As Object
	Dim iStart As Long, iEnd As Long, sentence As String, reg As String, s As String
	
	sentence = "not easy but possible" 'string to search
	reg = "[a-z0-9.-]+" 'regular expression
	
	oSearch = CreateUnoService("com.sun.star.util.TextSearch")
	oSearchParam = CreateUnoStruct("com.sun.star.util.SearchOptions")
	With oSearchParam
		.algorithmType = com.sun.star.util.SearchAlgorithms.REGEXP 'search by regular expression
		.searchString = reg
	End With
	oSearch.setOptions(oSearchParam)
	iEnd = 0 'offsets are 0-based; start at the beginning of the string
	oFound = oSearch.searchForward(sentence, iEnd, Len(sentence)) 'first match
	Do While oFound.subRegExpressions > 0 '0 means no (further) match
		iStart = oFound.startOffset(0) 'start offset of the whole match
		iEnd = oFound.endOffset(0) 'offset just past the end of the whole match
		s = Mid(sentence, iStart + 1, iEnd - iStart) 'Mid is 1-based, hence the +1
		MsgBox s
		iEnd = iEnd + 1 'continue one character past the match
		If iEnd > Len(sentence) Then Exit Do 'nothing left to search
		oFound = oSearch.searchForward(sentence, iEnd, Len(sentence)) 'next match
	Loop
End Sub
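
For comparison, since Python was suggested earlier in the thread, the same loop is much shorter with Python's re module. This is only a sketch: it reuses the sentence and pattern from the macro above, the function name find_all is made up, and Python's regex flavor is not identical to the ICU flavor behind TextSearch, so more complex patterns may behave differently.

```python
import re

SENTENCE = "not easy but possible"  # string to search (same as in the Basic macro)
PATTERN = r"[a-z0-9.-]+"            # same regular expression


def find_all(pattern, text):
    """Return (start, end, matched_text) for every non-overlapping match."""
    return [(m.start(), m.end(), m.group(0)) for m in re.finditer(pattern, text)]


for start, end, found in find_all(PATTERN, SENTENCE):
    print(start, end, found)
```

Here re.finditer takes care of advancing past each match, so no manual offset bookkeeping is needed.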

Well, that is interesting. I might implement it, or I might go with Python.

(This is not an objection to the answer already given. However, I have never used this method myself, and I don’t know whether it can be used with components/objects other than Writer/Text.)

  1. Some components or objects can create an instance of the service com.sun.star.util.SearchDescriptor. Among them are the Writer model and the Calc-related object types SheetCellRange (including any complete sheet and any single cell) and SheetCellRanges. Depending on the “creator” (text or sheet related) they act differently. There is a Boolean property .SearchRegularExpression. (This isn’t the place for a tutorial.)
  2. LibreOffice 6.2 and later implement the spreadsheet function REGEX(). Like any such function, it can be called with the help of an instance of the service com.sun.star.sheet.FunctionAccess. This function works with strings regardless of their source. You can use it independently of any Calc model (e.g. from code running for a Writer document that is the current ThisComponent). An annoying complication is that one relevant variant of the call requires an omitted parameter to be passed explicitly as missing.
    Which tool or method is preferable will depend on the use case. In any case, you may have to consider that there is little regular about regular expressions (a proverb going back to Owen Genat).
    Python certainly, and VBA most likely (I don’t know for sure), come with different RegEx flavors, and there is no generally accepted standard. Regexp Tutorial - Shorthand Character Classes can help with a first study. I personally only use ICU RegEx (Regular Expressions | ICU Documentation), which is also the flavor used by LibreOffice.
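
A small illustration of how the flavors differ, as a sketch in Python: ICU accepts Unicode property classes such as \p{L} (any letter), whereas Python's built-in re module rejects that escape. The sample string and the helper name compiles are made up for illustration.

```python
import re


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# \p{L} is valid in ICU (and therefore in LibreOffice), but not in Python's re.
print(compiles(r"\p{L}+"))

# A common approximation in Python: word characters minus digits and underscore.
LETTERS = re.compile(r"[^\W\d_]+")
print(LETTERS.findall("Zürich 2024"))
```

So a pattern written for LibreOffice may need adjusting before it runs under Python, and the other way round.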

I would not recommend using the VBScript.RegExp object, because it is not Unicode-friendly.
As @Lupp pointed out, LO’s implementation of regular expressions is based on ICU.

ICU’s Regular Expressions package provides applications with the ability to apply regular expression matching to Unicode string data.
