Is it possible to convert vnd.sun.star.tdoc paths?

Is it possible to convert vnd.sun.star.tdoc:/1/Scripts/python/MyFile into the actual file path?
It would be very helpful for an extension that I am writing if this is possible.

I am working with Python, but I will take any example available.

There is no file path for anything inside the ODF package. The content of a package is not mounted as a filesystem, so you can’t have a file path to it.

I would not call it converting, but you could work around the problem @mikekaganski described by “mounting” the zip file. Tools such as mount-zip already exist for this, but putting the open file onto the file system may not really be your intention.

Maybe a Python library that makes the package contents available to Python, such as ZipFS/PyFilesystem, is closer to what you are looking for, but remember the condition named by @karolus: there is not always a saved file…
https://docs.pyfilesystem.org/en/latest/reference/zipfs.html
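For a document that has already been saved, the standard library's zipfile module may be enough to read a script out of the package without mounting anything. This is only a minimal sketch: the helper name read_package_member is made up for illustration, the member name Scripts/python/MyFile.py is an assumption about how the script is stored, and the in-memory archive merely stands in for a real saved file.

```python
import io
import zipfile


def read_package_member(package, member):
    """Return the text of one member of an ODF package (a zip archive)."""
    with zipfile.ZipFile(package) as zf:
        return zf.read(member).decode("utf-8")


# Build a tiny stand-in for a saved document in memory (hypothetical content).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(
        "Scripts/python/MyFile.py",
        "def say_hello():\n    return 'Hello World!'\n",
    )
buf.seek(0)

# `package` may be a path to a saved file or, as here, a file-like object.
source = read_package_member(buf, "Scripts/python/MyFile.py")
print(source)
```

This reads whatever was last saved to disk, so unsaved changes in the open document are not visible this way.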

Actually, you might want to avoid an XY problem, by specifying what specifically this question tries to solve in your extension, so that alternatives could be explored, like using TransientDocumentsContentProvider Service and friends.

Not what you asked for, but given a URL like vnd.sun.star.tdoc:/1, this Basic code produces the document’s URL (if any):

sub getDocURL(tdoc_url)
  xUCB = CreateUnoService("com.sun.star.ucb.UniversalContentBroker")
  xId = xUCB.createContentIdentifier(tdoc_url)
  xContent = xUCB.queryContent(xId)
  dim args(0 to 0) as new com.sun.star.beans.Property
  args(0).Name = "DocumentModel"
  args(0).Handle = -1
  dim command as new com.sun.star.ucb.Command
  command.Name = "getPropertyValues"
  command.Handle = -1
  command.Argument = args
  result = xContent.execute(command, 0, Nothing)
  model = result.getObject(1, Nothing)
  MsgBox model.URL
end sub

And given a URL like vnd.sun.star.tdoc:/1/Scripts/python/MyFile, this code would give you an instance of the TransientDocumentsStreamContent service:

function getStream(tdoc_stream_url)
  xUCB = CreateUnoService("com.sun.star.ucb.UniversalContentBroker")
  xId = xUCB.createContentIdentifier(tdoc_stream_url)
  getStream = xUCB.queryContent(xId)
end function

And given a function to get the stream content, something like this Python code (untested!) could give you an instance of XInputStream Interface for the stream content:

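import uno
import unohelper
from com.sun.star.ucb import Command
from com.sun.star.io import XActiveDataSink
from com.sun.star.ucb import OpenCommandArgument2
from com.sun.star.ucb.OpenMode import DOCUMENT

class MyDataSink(unohelper.Base, XActiveDataSink):
    """Minimal XActiveDataSink that just stores the stream it is handed."""
    def __init__(self):
        self.inputStream = None

    def setInputStream(self, aStream):
        self.inputStream = aStream

    def getInputStream(self):
        return self.inputStream

def getStream(tdoc_stream_url):
    # Python port of the Basic getStream function above;
    # assumes the code runs inside the office process
    ctx = uno.getComponentContext()
    ucb = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.ucb.UniversalContentBroker", ctx)
    return ucb.queryContent(ucb.createContentIdentifier(tdoc_stream_url))

def getInputStream(tdoc_stream_url):
    tdocStream = getStream(tdoc_stream_url)
    sink = MyDataSink()
    arg = OpenCommandArgument2(DOCUMENT, 0, sink, (), ())
    command = Command("open", -1, arg)
    # Python has no Nothing; pass None for the command environment
    tdocStream.execute(command, 0, None)
    return sink.getInputStream()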

Goal

What I am trying to accomplish is dynamic Python creation and loading.
My extension is aiming to add features similar to what the PY mode brings to MS Excel.
See: YouTube Video.

I am still in the experimental stage. So far I can write a formula =PY() into a Calc sheet and it then pops up a dialog.
In the dialog I can write Python code.

When I exit the dialog it will automatically save the source code into a python library for the current document.

In the Read and Write Python code example below you can see that it is possible to write the code and read it back. At this point I could load the source directly into Python. In the Load source code example below you can see that it is possible to load source code and get a result.

Read and Write Python code example

from __future__ import annotations
import logging
import uno

from ooodev.calc import CalcDoc
from ooodev.loader import Lo
from ooodev.loader.inst.options import Options
from ooodev.utils.string.str_list import StrList


def main():

    loader = Lo.load_office(connector=Lo.ConnectPipe(), opt=Options(log_level=logging.DEBUG))
    doc = CalcDoc.create_doc(loader=loader, visible=True)
    try:
        psa = doc.python_script
        assert psa is not None
        code = StrList(sep="\n")
        code.append("from __future__ import annotations")
        code.append("import inspect")
        code.append("import sys")
        code.append()
        code.append("def say_hello(*args) -> None:")
        with code.indented():
            code.append('print("Hello World!")')
        code.append()
        code_str = str(code)
        assert psa.is_valid_python(code_str)
        # vnd.sun.star.tdoc:/1/Scripts/python/MyFile
        psa.write_file("MyFile", code_str, allow_override=True)
        psa_code = psa.read_file("MyFile")
        assert psa_code == code_str
        # pth = uno.systemPathToFileUrl("vnd.sun.star.tdoc:/1/Scripts/python/MyFile")
        # print(pth)
    finally:
        doc.close()
        Lo.close_office()


if __name__ == "__main__":
    main()


Load source code Example

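from __future__ import annotations
from typing import Any  # needed for the run_script() return annotation


def numpy_script() -> str: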
    return """
import sys
import numpy as np
# Create a 2-D array, set every second element in
# some rows and find max per row:

x = np.arange(15, dtype=np.int64).reshape(3, 5)
x[1:, ::2] = -99

x.max(axis=1)

# Generate normally distributed random numbers:
rng = np.random.default_rng()
samples = rng.normal(size=2500)
samples
"""

def run_script(script: str) -> Any:
    lines = script.split("\n")
    if not lines:
        return None
    # pop the last line
    last_line = ""
    while last_line == "":
        try:
            last_line = lines.pop().strip()
            if last_line.startswith("#"):
                last_line = ""
        except IndexError:
            return None
    glbs = globals()
    exec("\n".join(lines), glbs)
    return eval(last_line, glbs)

def main():
    script = numpy_script()
    result = run_script(script)
    print(result)
    # even though sys and numpy are not imported until run_script() is called this still works
    assert "numpy" in sys.modules


if __name__ == "__main__":
    main()
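Splitting the script on newlines breaks as soon as the final expression spans several lines or carries a trailing comment. A possible alternative, sketched here under the assumption that only the value of a trailing expression is wanted, is to let the ast module find the last statement. The namespace argument and the "<cell>" filename are illustrative choices, not part of the code above.

```python
import ast


def run_script(script, namespace):
    """Execute `script` in `namespace`; return the value of a trailing expression, if any."""
    tree = ast.parse(script, mode="exec")
    if not tree.body:
        return None
    last = tree.body[-1]
    if isinstance(last, ast.Expr):
        # Split off the final expression so it can be evaluated for its value.
        tree.body.pop()
        exec(compile(tree, "<cell>", "exec"), namespace)
        expr = ast.Expression(last.value)
        return eval(compile(expr, "<cell>", "eval"), namespace)
    # The script ends with a statement (assignment, def, ...): nothing to return.
    exec(compile(tree, "<cell>", "exec"), namespace)
    return None


ns = {}
result = run_script("x = [1, 2, 3]\n# a comment\nsum(x)  # trailing expression\n", ns)
print(result)  # 6
```

Passing an explicit namespace instead of globals() also keeps the executed code from leaking names into the loader's own module.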

Some considerations

So, I have not figured it all out yet, but the general idea is to be able to enter Python code into a cell that then returns a <> DataFrame string result. This will indicate to the user that the cell contains a dataframe, a data series, a value, or …

Once the Python code for the cell has been run, the user will see <> DataFrame in the cell, or similar. The user can then right-click on the cell and choose to expand the results into a cell value or a cell range. With the contents expanded, the user can right-click and choose to contract the results again. This is similar to how it is done with PY in MS Excel.

I need to consider how Calc recalculates cell values, and dynamically re-run the Python code that gives the results for the cell if needed.

Other python cells need to be able to access the code from previous python cells.

So if I create a =PY() in cell A1 that has a pandas dataframe named df then df needs to be available to cell A2 if it is a python cell.
This means the Python code must be dynamically loaded (and reloaded) as needed. In Python it is possible to reload modules.
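As a rough illustration of that idea, each cell's source could be turned into its own module object and seeded with the names defined by earlier cells. Everything named here (load_cell_module, the py_cell_A1 style module names, the shared dict) is hypothetical, and a plain dict stands in for a pandas dataframe to keep the sketch self-contained.

```python
import sys
import types


def load_cell_module(name, source, shared):
    """Create (or replace) a module from source text, seeded with names from earlier cells."""
    module = types.ModuleType(name)
    module.__dict__.update(shared)  # names from earlier cells become visible
    exec(compile(source, f"<{name}>", "exec"), module.__dict__)
    sys.modules[name] = module  # importable by name without touching sys.path
    # Publish this cell's public names for later cells.
    shared.update(
        {k: v for k, v in module.__dict__.items() if not k.startswith("_")}
    )
    return module


shared = {}
a1 = load_cell_module("py_cell_A1", "df = {'col': [1, 2, 3]}", shared)
a2 = load_cell_module("py_cell_A2", "total = sum(df['col'])", shared)
print(a2.total)  # 6
```

Calling load_cell_module again with the same name and new source replaces the earlier module, which is one way to get the "reload" behaviour when a cell is edited.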

Also, all the Python code needs to be saved with the document and reloaded when the document is opened.
Because it is easy to read and write Python code into the document library, I thought this might be a good way to import the modules when the document is opened. Maybe I could copy the Python module into a temp directory and add the temp directory to Python's sys.path when the document is loaded or when a new Python module is created.
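A minimal sketch of the temp-directory variant, assuming the source text has already been read from the document. The helper write_and_import and the module name doc_cell_code are invented for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Avoid stale bytecode caches when the same file is rewritten in quick succession.
sys.dont_write_bytecode = True

tmp_dir = Path(tempfile.mkdtemp(prefix="py_cells_"))
sys.path.insert(0, str(tmp_dir))


def write_and_import(name, source):
    """Write source to <tmp_dir>/<name>.py and import it, reloading if already loaded."""
    (tmp_dir / f"{name}.py").write_text(source, encoding="utf-8")
    importlib.invalidate_caches()  # make the new file visible to the import system
    if name in sys.modules:
        return importlib.reload(sys.modules[name])
    return importlib.import_module(name)


mod = write_and_import("doc_cell_code", "VALUE = 1\n")
print(mod.VALUE)
mod = write_and_import("doc_cell_code", "VALUE = 22\n")
print(mod.VALUE)
```

The directory is left in place here; a real extension would probably want to remove it, and its sys.path entry, when the document closes.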

I am still working it out.

Most of the code I have on this so far can be found here

Thanks for the feedback so far everybody. I will be examining this more intently after coffee.

IMO, the idea to manipulate files in the package is not good. The modules must be truly dynamic. You may take a look at how the similar task is solved in our unit tests, where we take the macro text from files, and create the Basic modules without making them any kind of “substorage”. I didn’t check if it’s all implemented in UNO API, or not.


Thanks, there may be something there.

All the Python that may be written into the PY() formula must be stored somewhere in the document. I can store it in the /Scripts/python folder (or a subfolder), or I can store it in the sheet via hidden controls, as I am doing for the sheet id.

However the code is loaded, it will need to be available across all sheets in a document.
If sheet1.A1 has code, then it also needs to be available to sheet2.A1 if a PY() function is placed in both.

So, in short, when a Calc document is opened, all Python code that goes with the PY() formula needs to be loaded. I am thinking that the best way to do this is to implement some sort of dynamic loader in the PY() function that loads the Python for that cell automatically when it is called on a recalculate, if it is not already loaded.

I think each Python snippet would have to be treated as a separate module and loaded into Python separately. So the unit test loading example may be a way of accomplishing this.

I have to do more testing, but I think that if a module is loaded it will not be necessary to add it to sys.path.

Any thoughts?

I do not know how PY() works in Excel. I was under the impression that the Python code is simply an argument to PY(); it seems it’s different. If you can record a screencast of managing that in Excel, it would help to understand the task.

Note that <br/> might be your friend when you want to separate paragraphs.

LOL I do not even have Excel. The video should convey the idea.

Thanks. What I see thus far just reinforces my perception that the formula in the cell is similar to

=PY("python_code_here")

So why store the "python_code_here" elsewhere? It’s just the cell formula argument?

One issue I see with that, off the top of my head, is that it is really hard to read and edit Python code in the formula bar.


Also, I would like to leave open the possibility of passing other parameters into the PY() method.
I have to consider this a little more.


Where the Python code is stored is not as important as how to dynamically load it. Whether I store it in the function args, as a hidden control, or in Scripts/python is not going to be the challenge, although there must be a most correct way.


I can see that editing multi-line python inside a formula could be very frustrating for users.

Also, I can see that escaping double and single quotes (" and ') would be a major issue if the code were entered into the formula bar.
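To make the escaping concern concrete: inside a double-quoted string in a spreadsheet formula, an embedded double quote is written by doubling it. A small sketch of what a user (or a helper) would have to do, where to_formula_string is a made-up name and PY() is the function under discussion:

```python
def to_formula_string(code):
    """Wrap code as a formula string literal: embedded double quotes are doubled."""
    return '"' + code.replace('"', '""') + '"'


snippet = 'print("Hello World!")'
formula = "=PY(" + to_formula_string(snippet) + ")"
print(formula)  # =PY("print(""Hello World!"")")
```

Even this one-liner is noticeably harder to read once escaped, and multi-line code would be worse.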

Not every document already has a file (path|URL):

desktop = XSCRIPTCONTEXT.getDesktop()
documents = desktop.Components
for doc in documents:
    print(f"{doc.RuntimeUID = }\n\t"
          f"{doc.Title = }\n\t"
          f"{doc.Namespace = }\n\t"
          f"{doc.StringValue =}")