REGEX support in Base queries and in Basic search and replace string functions

I have looked for ways to use REGEX in queries in Base (internal database) and in Basic macros to do string replacements. None of them seems available.
REGEX works quite well in Writer and is expanded in the beautiful Alt and Search and Replace extension. So I would hope these features would be implemented in Base and in Basic too…

Don’t know much about REGEX but this seems to work in a macro - see this post .

As far as Base goes, Base is just a front end to a database. The database included in LO is HSQLDB embedded. This and all databases operate via SQL commands. REGEX is not in any SQL I know of. However, there are other means to do string manipulation using SQL.

Hello @silvain-dupertuis,

Firstly, your question concerning Regular Expressions in Base Queries.

According to the official HSQLDB documentation, there are some regex functions supported via the “HyperSQL Built-in Functions”:
REGEXP_MATCHES(), REGEXP_REPLACE(), REGEXP_SUBSTRING(), and REGEXP_SUBSTRING_ARRAY().

However, when i tried to use these particular regexp functions in my Base Query definition, it caused a runtime error ( LibreOffice Version: 5.3.1.2 Build ID: 1:5.3.1-0ubuntu2 ).

It appears there are 3 different kinds of built-in functions in HSQLDB:

  • SQL Standard Functions
  • JDBC Open Group CLI Functions
  • HyperSQL Built-In Functions

The above mentioned regexp functions are stated to belong to the third category of built-in functions, supplied by HyperSQL. This is puzzling to me, since i thought that HyperSQL was included in LibreOffice. I intend to raise a separate question about this.

Another idea that might perhaps work, ( i never did this myself ), is to define your own regexp function in SQL using SQL Language Routines or Java Language Routines.

Some really basic pattern-matching for strings using escape sequences and wildcards ( also known as the “poor man’s regexp” ), is provided by the LIKE -predicate which can be used in WHERE -clauses in your Base Query definition.

As for the second part of your question, concerning Regular Expressions in BASIC Search & Replace string functions, this could be easily accomplished by summoning Python’s powerful “re” -module from BASIC.
Example code:

Function callPythonFunction( strPyFunction$, strPyFile$, aParams() ) As Variant
REM Invokes a Python function and returns the function result.
REM <strPyFunction>:	Name of the Python function to be invoked, e.g. "my_function".
REM <strPyFile>:		Name of the Python module that contains the function to be invoked, e.g. "test.py".
REM <aParams>:			Array of parameters to pass onto the function to be invoked.
	Dim oMasterScriptProviderFactory As Object, oMasterScriptProvider As Object, oScript As Object
	Dim aOutParamIndex(), aOutParam()
	oMasterScriptProviderFactory = createUnoService( "com.sun.star.script.provider.MasterScriptProviderFactory" ) 
    oMasterScriptProvider = oMasterScriptProviderFactory.createScriptProvider( "" ) 
	oScript = oMasterScriptProvider.getScript( "vnd.sun.star.script:" & strPyFile & "$" & strPyFunction & "?language=Python&location=user" )
	callPythonFunction = oScript.invoke( aParams(), aOutParamIndex(), aOutParam() )
End Function

Function regexp_ReplaceAll( strFind As String, strReplace As String, inString As String, flags As Long ) As String
REM This Function invokes a Python function to perform a RegExp Find & Replace on the input string, and returns the resulting string.
REM <strFind>	: The Regular Expression to find in the input string.
REM <strReplace>	: The Regular Expression to place instead of the found occurrences in the input string.
REM <inString>	: The input string on which to perform the RegExp Find & Replace.
REM <flags>		: Integer specfying the Search Flags to be used; Add these flags to combine them:
REM				2=IGNORECASE; 4=LOCALE; 8=MULTILINE; 16=DOTALL; 32=UNICODE; 64=VERBOSE; 128=DEBUG; 256=ASCII.
REM				( see https://docs.python.org/3/library/re.html )
REM The Python function is named "regexp_replaceall" and is stored inside a file called "test.py" inside the LibreOffice User Scripts Folder for Python.
REM ( In Ubuntu this folder is located at: "/home/username/.config/libreoffice/4/user/Scripts/python" ).
REM If your own Python regexp function is named differently, or is stored in another .py file,
REM then you should alter either or both of the two constants below:

	Const cPyFunction As String = "regexp_replaceall"
	Const cPyModule As String = "test.py"

REM Inside the file "test.py" there should be these 5 lines ( without the apostrophes ' at the start ):

'#!/usr/bin/env python3
'def regexp_replaceall( find, replace, instring, flags ):
'	import re
'	return re.sub( find, replace, instring, 0, flags )
'g_exportedScripts = regexp_replaceall,

	REM Now we can call the Python function via BASIC as follows:
	Dim aParams(3) : aParams(0) = strFind : aParams(1) = strReplace : aParams(2) = inString : aParams(3) = flags
	regexp_ReplaceAll = callPythonFunction( cPyFunction, cPyModule, aParams() )
	
End Function

Function regexp_FindAll( strFind As String, inString As String, flags As Long )
REM This Function invokes a Python function to perform a regexp Find, and returns an array of found strings.
REM <strFind>:	The Regular Expression to find in the input string.
REM <inString>:	The input string in which to search for the Regular Expression.
REM <flags>:	Integer specfying the Search Flags to be used; Add these flags to combine them:
REM				2=IGNORECASE; 4=LOCALE; 8=MULTILINE; 16=DOTALL; 32=UNICODE; 64=VERBOSE; 128=DEBUG; 256=ASCII.
REM				( see https://docs.python.org/3/library/re.html )
REM The Python function is named "regexp_findall" and is stored inside a file called "test.py" inside the LibreOffice User Scripts Folder for Python.
REM ( In Ubuntu this folder is located at: "/home/username/.config/libreoffice/4/user/Scripts/python" ).
REM If your own Python regexp function is named differently, or is stored in another .py file,
REM then you should alter either or both of the two constants below:

	Const cPyFunction As String = "regexp_findall"
	Const cPyModule As String = "test.py"

REM Inside the file "test.py" there should be these 5 lines ( without the apostrophes ' at the start ):

'#!/usr/bin/env python3
'def regexp_findall( find, instring, flags ):
'	import re
'	return re.findall( find, instring, flags )
'g_exportedScripts = regexp_findall,

	REM Now we can call the Python function via BASIC as follows:
	Dim aParams(2) : aParams(0) = strFind : aParams(1) = inString : aParams(2) = flags
	regexp_FindAll = callPythonFunction( cPyFunction, cPyModule, aParams() )
	
End Function

@librebel You are right about REGEXP in HyperSQL. However it is not available in the version 1.8 embedded which comes with LO. Have tested it in v2.3.2 and it works.

Thanks for confirming that the HyperSQL REGEXP functions work in version 2.3.2 @Ratslinger.
Unfortunately i don’t know if that version can also be used for embedded databases yet…