Headless server mode: Not able to provide data into document context

Hello, our product uses a LibreOffice macro to add custom functions that send requests to our backend to calculate results. This scheme has worked fine so far, but now we need to process a huge number of documents as fast as possible, recalculating all formulas in each file.

The first problem is that starting a headless LibreOffice instance takes too long, roughly 0.8-1.5 seconds per start. We found that we can instead start a single headless LibreOffice in server mode and connect to it.
But that leads to a second problem: we cannot pass any information (such as a JWT token and an internal object ID) along when opening the document. Maybe we just don't know how.

Is it possible to pass data to the document, without editing it, when we invoke the macro?
Or, alternatively, can we reduce the startup time of the LibreOffice instance?

Here is an example of how we run our macro from Python:

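import uno
from com.sun.star.beans import PropertyValue


def prop(name, value):
    # Helper not shown in the original post; assumed to build a PropertyValue
    p = PropertyValue()
    p.Name = name
    p.Value = value
    return p


def run_macro(file_url, token, internal_id):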
    ctx = uno.getComponentContext()
    resolver = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", ctx
    )

    ctx_remote = resolver.resolve(
        "uno:socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext"
    )
    smgr = ctx_remote.ServiceManager
    desktop = smgr.createInstanceWithContext("com.sun.star.frame.Desktop", ctx_remote)
    props = [
        prop("Hidden", True),
        prop("ReadOnly", True),
        prop("AsTemplate", False),
        prop("UpdateDocMode", 0),
        prop("MacroExecutionMode", 0),
        prop("OpenNewView", False),
        prop("SkipImages", True),
        prop("FilterName", "calc8"),  # "calc8" is a filter name, not a filter option
    ]
    # We need to load the document with token and internal_id to get access to it in the Macro
    doc = desktop.loadComponentFromURL(file_url, "_blank", 0, props)
    doc.calculateAll()

And here is the structure of the macro code:

import unohelper


# XINTERFACE stands for the add-in's own IDL interface (elided in this post)
class CompanyExt(unohelper.Base, XINTERFACE):
    version = "1.5"
    def __init__(self, ctx):
        self.internal_id = ...
        self.token = ...
        ...
    def CUST_FORMULA(self, *args):
        ...
        # request here
        ...

def createInstance(ctx):
    return CompanyExt(ctx)

g_ImplementationHelper = unohelper.ImplementationHelper()
g_ImplementationHelper.addImplementation(
    createInstance, 'com.company.company.oxt',
    ('com.sun.star.sheet.AddIn',),)

LibreOffice quick-start in Windows…

I may have read something about passing parameters to macros on the command line, but I don't remember the result.
Alternatively, your macro could read a companion file.
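
The companion-file idea could look roughly like this: the client writes a small JSON file next to the document, and the add-in reads it back from the document URL. The `.params.json` suffix and all three function names are made up for illustration, and the sketch assumes the document is a local `file://` URL.

```python
import json
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def sidecar_path(file_url):
    """Map file:///dir/book.ods to /dir/book.ods.params.json (hypothetical naming)."""
    doc_path = Path(url2pathname(urlparse(file_url).path))
    return doc_path.with_name(doc_path.name + ".params.json")


def write_params(file_url, token, internal_id):
    # Called by the client before loadComponentFromURL
    path = sidecar_path(file_url)
    payload = {"token": token, "internal_id": internal_id}
    path.write_text(json.dumps(payload), encoding="utf-8")
    return path


def read_params(file_url):
    # Called inside the add-in; returns None when no companion file exists
    path = sidecar_path(file_url)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Note that this puts the token on disk, so the file should be removed once the document has been processed.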

:thinking:

The usual pattern is not to start a process for each task, but to keep one instance running all the time, listening on a port, and to call that instance from scripts. Alternatively, keep an instance open without listening and launch new instances for the tasks using the same user profile path: those new instances detect the already running primary instance and simply delegate the job to it, so the overhead is minimal.
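
As a rough sketch of the first pattern, a single long-lived instance can be started once and left listening on the port that the client script connects to. The helper names, the profile directory, and the assumption that `soffice` is on `PATH` are illustrative; the switches themselves are standard LibreOffice command-line options.

```python
import subprocess
from pathlib import Path


def build_soffice_command(port=2002, profile_dir="/tmp/lo_profile", binary="soffice"):
    """Return the argument list for one headless instance listening on a socket."""
    profile_url = Path(profile_dir).absolute().as_uri()
    return [
        binary,
        "--headless",
        "--invisible",
        "--norestore",
        "--nologo",
        # A dedicated profile keeps this instance separate from desktop sessions
        f"-env:UserInstallation={profile_url}",
        f"--accept=socket,host=127.0.0.1,port={port};urp;",
    ]


def start_server(**kwargs):
    # Start once at service startup, not once per document
    return subprocess.Popen(build_soffice_command(**kwargs))
```

The client then resolves `uno:socket,host=127.0.0.1,port=2002;urp;StarOffice.ComponentContext` as in the question, paying the startup cost only once.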


So after a few hours of vibe coding, I found a working solution.
First, before starting the calculation, I add the parameters as user-defined document properties like this:

    doc = desktop.loadComponentFromURL(file_url, "_blank", 0, props)
    user_props = doc.getDocumentProperties().getUserDefinedProperties()
    user_props.addProperty(
        "internal_id", 0, internal_id
    )
    user_props.addProperty(
        "token", 0, token
    )
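
One thing to watch here: `addProperty` raises an exception if the document already carries a property of that name. A defensive variant, sketched below, checks first and overwrites the value if needed. `set_user_property` is a hypothetical helper name; the UNO calls are those of `XPropertyContainer` and `XPropertySet`, which the user-defined properties object implements.

```python
def set_user_property(user_props, name, value):
    """Add a user-defined property, or overwrite it if it already exists."""
    info = user_props.getPropertySetInfo()
    if info.hasPropertyByName(name):
        # Property is already present: just update its value
        user_props.setPropertyValue(name, value)
    else:
        # 0 = no special property attributes, as in the snippet above
        user_props.addProperty(name, 0, value)
```

It would be called as `set_user_property(user_props, "token", token)` in place of the direct `addProperty` calls.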

Then, in the macro code (an add-in, if I understand correctly), I read these parameters in every function that is called:

    def get_active_calc_doc(self):
        smgr = self.ctx.ServiceManager
        desktop = smgr.createInstanceWithContext("com.sun.star.frame.Desktop", self.ctx)
        comps = desktop.getComponents()
        enum = comps.createEnumeration()
        while enum.hasMoreElements():
            comp = enum.nextElement()
            url = comp.getURL()
            # Check if we are not taking the same document that is being processed
            if self.url != url:
                self.url = url
                return comp
            self.logger.info("Props not changed")
        return None

    def getParams(self):
        # Access current document
        doc = self.get_active_calc_doc()
        if not doc:
            return
        props = doc.getDocumentProperties().getUserDefinedProperties()

        self.internal_id = props.getPropertyValue("internal_id")
        self.token = props.getPropertyValue("token")

    def FormulaRequest(self, *args):
        self.getParams()
        try:
            res = request(self.internal_id, self.token, self.host, *args)
        except Exception as e:
            self.logger.error(f"Request Exception {args=} {e=}")
            return "#VALUE"
        self.logger.debug(f"Request {args=} {res=}")
        return res
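
On reliability: `get_active_calc_doc` returns the first open component whose URL differs from the last one seen, which can pick the wrong document when several are open, and `getPropertyValue` raises if a document lacks the properties. A somewhat more defensive sketch, assuming that only documents prepared by the client carry both properties, selects by the presence of the properties rather than by URL. The function names are hypothetical, and this still cannot tell two prepared documents apart, so it relies on the client loading and closing one document at a time.

```python
REQUIRED = ("internal_id", "token")


def read_doc_params(doc, names=REQUIRED):
    """Return {name: value} if doc has all user-defined properties, else None."""
    # Some components (e.g. the Basic IDE) expose no document properties;
    # this assumes a missing method surfaces as AttributeError
    if not hasattr(doc, "getDocumentProperties"):
        return None
    props = doc.getDocumentProperties().getUserDefinedProperties()
    info = props.getPropertySetInfo()
    if not all(info.hasPropertyByName(n) for n in names):
        return None
    return {n: props.getPropertyValue(n) for n in names}


def find_params(desktop):
    """Scan open components; return params of the first prepared document."""
    enum = desktop.getComponents().createEnumeration()
    while enum.hasMoreElements():
        params = read_doc_params(enum.nextElement())
        if params is not None:
            return params
    return None
```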

I’m not sure whether this code is as good as it could be. If anyone has more details on how to make it more reliable, please add a reply.