Is it possible to convert vnd.sun.star.tdoc:/1/Scripts/python/MyFile
into the actual file path?
It would be very helpful for an extension that I am writing if this is possible.
I am working with Python, but I will take any example available.
There is no file path for anything inside the ODF package. The content of a package is not mounted as a filesystem, so you can’t have a file path to it.
I would not say “convert”, but you could work around the problem @mikekaganski named by “mounting” the zip file. There are already tools for this, like mount-zip, but placing the open file into the file system may not really be your intention.
Maybe a Python library that makes the contents available to Python, like ZipFS/PyFilesystem, is closer to what you are looking for, but remember the condition named by @karolus: there is not always a saved file…
https://docs.pyfilesystem.org/en/latest/reference/zipfs.html
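For a document that has already been saved, the standard library alone can read an entry out of the package, because an ODF file is a zip archive. The following is only a minimal sketch: the helper names and the member name Scripts/python/MyFile.py are assumptions for illustration, and the demo builds a small stand-in archive instead of opening a real document.

```python
import tempfile
import zipfile
from pathlib import Path


def read_package_member(document_path: str, member: str) -> str:
    """Return the text of one entry inside a saved ODF package (a zip file)."""
    with zipfile.ZipFile(document_path) as package:
        return package.read(member).decode("utf-8")


def build_demo_package(target: Path) -> Path:
    """Create a tiny stand-in archive so the sketch runs without a real document."""
    with zipfile.ZipFile(target, "w") as package:
        # Assumed member name; a real document may use a different layout.
        package.writestr(
            "Scripts/python/MyFile.py",
            "def say_hello():\n    return 'Hello World!'\n",
        )
    return target


with tempfile.TemporaryDirectory() as tmp:
    demo = build_demo_package(Path(tmp) / "demo.ods")
    source_text = read_package_member(str(demo), "Scripts/python/MyFile.py")

print(source_text)
```

This only sees what was last written to disk, so it does not help for unsaved documents or unsaved edits.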
Actually, you might want to avoid an XY problem by specifying what exactly this question tries to solve in your extension, so that alternatives could be explored, like using the TransientDocumentsContentProvider service and friends.
Not what you asked for, but given a URL like vnd.sun.star.tdoc:/1, this Basic code produces the document’s URL (if any):
sub getDocURL(tdoc_url)
  xUCB = CreateUnoService("com.sun.star.ucb.UniversalContentBroker")
  xId = xUCB.createContentIdentifier(tdoc_url)
  xContent = xUCB.queryContent(xId)

  dim args(0 to 0) as new com.sun.star.beans.Property
  args(0).Name = "DocumentModel"
  args(0).Handle = -1

  dim command as new com.sun.star.ucb.Command
  command.Name = "getPropertyValues"
  command.Handle = -1
  command.Argument = args

  result = xContent.execute(command, 0, Nothing)
  model = result.getObject(1, Nothing)
  MsgBox model.URL
end sub
And given a URL like vnd.sun.star.tdoc:/1/Scripts/python/MyFile, this code would give you an instance of the TransientDocumentsStreamContent service:
function getStream(tdoc_stream_url)
  xUCB = CreateUnoService("com.sun.star.ucb.UniversalContentBroker")
  xId = xUCB.createContentIdentifier(tdoc_stream_url)
  getStream = xUCB.queryContent(xId)
end function
And given a function to get the stream content, something like this Python code (untested!) could give you an instance of the XInputStream interface for the stream content:
import unohelper
from com.sun.star.ucb import Command
from com.sun.star.io import XActiveDataSink
from com.sun.star.ucb import OpenCommandArgument2
from com.sun.star.ucb.OpenMode import DOCUMENT


class MyDataSink(unohelper.Base, XActiveDataSink):
    """Receives the input stream that the "open" command hands back."""

    def __init__(self):
        self.inputStream = None

    def setInputStream(self, aStream):
        self.inputStream = aStream

    def getInputStream(self):
        return self.inputStream


def getInputStream(tdoc_stream_url):
    # getStream() is assumed to be a Python port of the Basic function above.
    tdocStream = getStream(tdoc_stream_url)
    sink = MyDataSink()
    arg = OpenCommandArgument2(DOCUMENT, 0, sink, (), ())
    command = Command("open", -1, arg)
    # Python has no Nothing; pass None for the command environment.
    tdocStream.execute(command, 0, None)
    return sink.getInputStream()
What I am trying to accomplish is dynamic Python creation and loading.
My extension aims to add features similar to what the PY mode brings to MS Excel.
See: YouTube Video.
I am still in the experimental stage. So far I can write a formula =PY() into a Calc sheet, and it then pops up a dialog.
In the dialog I can write Python code.
When I exit the dialog, it automatically saves the source code into a Python library for the current document.
In the Read and Write Python Code example below you can see that it is possible to write the code and read it back. At this point I could load the source directly into Python. In the Load source code Example below you can see that it is possible to load source code and get a result.
# Read and Write Python Code
from __future__ import annotations
import logging
import uno
from ooodev.calc import CalcDoc
from ooodev.loader import Lo
from ooodev.loader.inst.options import Options
from ooodev.utils.string.str_list import StrList


def main():
    loader = Lo.load_office(connector=Lo.ConnectPipe(), opt=Options(log_level=logging.DEBUG))
    doc = CalcDoc.create_doc(loader=loader, visible=True)
    try:
        psa = doc.python_script
        assert psa is not None
        code = StrList(sep="\n")
        code.append("from __future__ import annotations")
        code.append("import inspect")
        code.append("import sys")
        code.append()
        code.append("def say_hello(*args) -> None:")
        with code.indented():
            code.append('print("Hello World!")')
        code.append()
        code_str = str(code)
        assert psa.is_valid_python(code_str)
        # vnd.sun.star.tdoc:/1/Scripts/python/MyFile
        psa.write_file("MyFile", code_str, allow_override=True)
        psa_code = psa.read_file("MyFile")
        assert psa_code == code_str
        # pth = uno.systemPathToFileUrl("vnd.sun.star.tdoc:/1/Scripts/python/MyFile")
        # print(pth)
    finally:
        doc.close()
        Lo.close_office()


if __name__ == "__main__":
    main()
# Load source code Example
from typing import Any


def numpy_script() -> str:
    return """
import sys
import numpy as np
# Create a 2-D array, set every second element in
# some rows and find max per row:
x = np.arange(15, dtype=np.int64).reshape(3, 5)
x[1:, ::2] = -99
x.max(axis=1)
# Generate normally distributed random numbers:
rng = np.random.default_rng()
samples = rng.normal(size=2500)
samples
"""


def run_script(script: str) -> Any:
    lines = script.split("\n")
    if not lines:
        return None
    # pop trailing lines until a non-empty, non-comment line is found
    last_line = ""
    while last_line == "":
        try:
            last_line = lines.pop().strip()
            if last_line.startswith("#"):
                last_line = ""
        except IndexError:
            return None
    glbs = globals()
    # run everything but the last line, then evaluate the last line for its value
    exec("\n".join(lines), glbs)
    return eval(last_line, glbs)


def main():
    script = numpy_script()
    result = run_script(script)
    print(result)
    # even though sys and numpy are not imported until run_script() is called this still works
    assert "numpy" in sys.modules


if __name__ == "__main__":
    main()
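Splitting on the last text line breaks as soon as the final expression spans several lines. One possible alternative, sketched here with a hypothetical run_script_ast() helper, is to let the ast module separate the trailing expression from the rest; it also runs the code in its own namespace rather than in globals(). This is only an illustration of the idea, not the approach the extension uses.

```python
import ast
from typing import Any, Dict, Optional


def run_script_ast(script: str, namespace: Optional[Dict[str, Any]] = None) -> Any:
    """Run a script and return the value of its final expression, if it ends with one."""
    ns: Dict[str, Any] = {} if namespace is None else namespace
    tree = ast.parse(script, mode="exec")
    if not tree.body:
        # Empty script, or comments only.
        return None
    last = tree.body[-1]
    if isinstance(last, ast.Expr):
        # Execute everything before the final expression, then evaluate that expression.
        tree.body.pop()
        exec(compile(tree, "<cell>", "exec"), ns)
        expression = ast.Expression(body=last.value)
        return eval(compile(expression, "<cell>", "eval"), ns)
    # The script ends with a statement, so there is no value to return.
    exec(compile(tree, "<cell>", "exec"), ns)
    return None


result = run_script_ast("x = 2\ny = 3\n(x +\n y)\n# trailing comment\n")
print(result)
```

Trailing comments and blank lines need no special handling here, because they never appear in the parsed tree.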
So, I have not figured it all out yet, but the general idea is to be able to enter Python code into a cell that then returns a <> DataFrame string result. This will indicate to the user that the cell contains a DataFrame or data series or value or …
Once the Python code for the cell has been run, the user will see <> DataFrame in the cell, or similar. The user can then right-click the cell and choose to expand the results into a cell value or a cell range. With the contents expanded, the user can right-click and choose to contract the results, similar to how it is done in the MS Excel PY mode.
I need to consider how Calc recalculates cell values, and dynamically re-run the Python code that gives the results for the cell if needed.
Other Python cells need to be able to access the code from previous Python cells.
So if I create a =PY() in cell A1 that has a pandas DataFrame named df, then df needs to be available to cell A2 if it is a Python cell.
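One way to picture the A1/A2 requirement is a single globals dictionary per document that every cell's code is executed in. The sketch below is an assumption about how that could look: DocumentNamespace is a hypothetical helper, and a plain dict stands in for the pandas DataFrame so the example has no third-party dependency.

```python
from typing import Any, Dict


class DocumentNamespace:
    """Hypothetical helper: one shared globals dict for all Python cells of a document."""

    def __init__(self) -> None:
        self._globals: Dict[str, Any] = {"__name__": "__py_cells__"}

    def run_cell(self, address: str, code: str) -> None:
        # The cell address is used as a pseudo file name so tracebacks name the cell.
        exec(compile(code, f"<cell {address}>", "exec"), self._globals)

    def get(self, name: str) -> Any:
        return self._globals[name]


doc_ns = DocumentNamespace()
# A1 defines df (a plain dict here, standing in for a DataFrame).
doc_ns.run_cell("A1", "df = {'a': [1, 2, 3]}")
# A2 runs later and can see df because both cells share the same globals.
doc_ns.run_cell("A2", "total = sum(df['a'])")
print(doc_ns.get("total"))
```

With this design the order of execution matters: A2 only works if A1 has been run first, which is where Calc's recalculation order comes into play.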
This means the Python code must be dynamically loaded (and reloaded) as needed. In Python it is possible to reload modules.
Also, all the Python code needs to be saved with the document and reloaded when the document is opened.
Because it is easy to read and write Python code into the document library, I thought this might be a good way to import the modules when the document is opened. Maybe I could copy the Python module into a temp directory and add the temp directory to the Python sys.path when the document is loaded or when a new Python module is created.
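The temp-directory idea could look roughly like this. It is a sketch under assumptions: install_module() and the module name py_cell_a1 are made up for illustration, and in the extension the source text would come from the document rather than from string literals.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from types import ModuleType


def install_module(directory: Path, name: str, source: str) -> ModuleType:
    """Write source to directory/name.py, then import it, or reload it if already imported."""
    (directory / f"{name}.py").write_text(source, encoding="utf-8")
    if str(directory) not in sys.path:
        sys.path.insert(0, str(directory))
    # Make the import system notice files created after the directory was first scanned.
    importlib.invalidate_caches()
    if name in sys.modules:
        return importlib.reload(sys.modules[name])
    return importlib.import_module(name)


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    module = install_module(folder, "py_cell_a1", "def answer():\n    return 1\n")
    first = module.answer()
    # Same module name, new source: the second call takes the reload path.
    module = install_module(folder, "py_cell_a1", "def answer():\n    return 42\n")
    second = module.answer()

print(first, second)
```

One caveat: cached bytecode is validated by the source file's modification time and size, so a rewrite that lands within the same timestamp and keeps the same size may not be picked up by a reload.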
I am still working it out.
Most of the code I have on this so far can be found here
Thanks for the feedback so far everybody. I will be examining this more intently after coffee.
IMO, the idea to manipulate files in the package is not good. The modules must be truly dynamic. You may take a look at how the similar task is solved in our unit tests, where we take the macro text from files, and create the Basic modules without making them any kind of “substorage”. I didn’t check if it’s all implemented in UNO API, or not.
Thanks, there may be something there.
All the Python that may be written into the PY() formula must be stored somewhere in the document. I can store it in the /Scripts/python folder (or a subfolder), or I can store it in the sheet via hidden controls, like I am doing for the sheet id.
However the code is loaded, it will need to be available across all sheets in a document.
If sheet1.A1 has code, then it needs to also be available to sheet2.A1 if a PY() function is placed in both.
So, in short, when a Calc document is opened, all Python code that goes with the PY() formula needs to be loaded. I am thinking that the best way to do this is to implement some sort of dynamic loader in the PY() function that loads the Python for that cell automatically when it is called on a recalculate, if it is not already loaded.
I think each Python snippet would have to be treated like a separate module and loaded into Python separately. So the unit test loading example may be a way of accomplishing this.
I have to do more testing, but I think if a module is loaded it will not be necessary to add it to the sys.path.
Any thoughts?
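A loader along those lines can be sketched without touching the file system at all: build a module object from the source text, register it in sys.modules, and skip the work when the same code is already loaded. The helper name load_cell_module() and the module name py_sheet1_a1 are hypothetical, and this is only one possible shape for such a loader.

```python
import hashlib
import sys
from types import ModuleType
from typing import Dict

# Hash of the source each module was last built from (hypothetical bookkeeping).
_loaded_hashes: Dict[str, str] = {}


def load_cell_module(name: str, source: str) -> ModuleType:
    """Return the module for a cell, executing the source only if it is new or changed."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    module = sys.modules.get(name)
    if module is not None and _loaded_hashes.get(name) == digest:
        return module  # already loaded with identical code
    module = ModuleType(name)
    exec(compile(source, f"<{name}>", "exec"), module.__dict__)
    sys.modules[name] = module
    _loaded_hashes[name] = digest
    return module


mod = load_cell_module("py_sheet1_a1", "value = 10\n")
same = load_cell_module("py_sheet1_a1", "value = 10\n")
changed = load_cell_module("py_sheet1_a1", "value = 20\n")

# A plain import finds the module in sys.modules, no sys.path entry required.
import py_sheet1_a1

print(py_sheet1_a1.value)
```

Because the import statement consults sys.modules before searching sys.path, code in another cell or sheet can import a module registered this way by name.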
I do not know how PY() works in Excel. I was under an impression that the Python code is simply an argument to the PY(); it seems it’s different. If you can record a screencast of managing that in Excel, it would help understand the task.
Note that <br/> might be your friend when you want to separate paragraphs.
LOL I do not even have Excel. The video should convey the idea.
Thanks. What I see thus far just reinforces my perception that the formula in the cell is similar to
=PY("python_code_here")
So why store the "python_code_here" elsewhere? It’s just the cell formula argument?
One issue I see with that off the top of my head is that it is really hard to read and edit Python code in the formula bar.
Also, I would like to leave open the possibility of passing other parameters into the PY() method.
I have to consider this a little more.
Where the Python code is stored is not as important as how to dynamically load it. Whether I store it in the function args, as a hidden control, or in Scripts/python is not going to be the challenge, although there must be a most correct way.
I can see that editing multi-line Python inside a formula could be very frustrating for users.
Also, I can see that escaping " and ' (double and single quotes) would be a major issue if the code were entered into the formula bar.
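To make the quoting problem concrete: in a Calc formula, a double quote inside a string literal is written as two double quotes. Assuming PY() took the code as a single string argument (the reading suggested above, not how the extension currently works), a hypothetical to_formula() helper would have to produce something like this:

```python
def to_formula(code: str) -> str:
    """Wrap Python code as a string argument of a hypothetical =PY("...") formula."""
    # Calc string literals escape a double quote by doubling it.
    escaped = code.replace('"', '""')
    return f'=PY("{escaped}")'


formula = to_formula('print("hi")')
print(formula)
```

Even this one-line snippet becomes hard to read once escaped, and multi-line code would be harder still to edit in the formula bar.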
Not every document already has a file path or URL:
desktop = XSCRIPTCONTEXT.getDesktop()
documents = desktop.Components
for doc in documents:
    print(f"{doc.RuntimeUID = }\n\t"
          f"{doc.Title = }\n\t"
          f"{doc.Namespace = }\n\t"
          f"{doc.StringValue =}")
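To relate a tdoc URL to one of the documents listed this way, the URL can be split into its document id and the path inside the document. The sketch below is plain string handling with hypothetical names (TdocParts, split_tdoc_url). That the first path segment matches a document's RuntimeUID is an assumption here, not something verified in this thread.

```python
from typing import NamedTuple


class TdocParts(NamedTuple):
    doc_id: str      # first path segment, assumed to match doc.RuntimeUID
    inner_path: str  # path inside the document, empty for the document root


def split_tdoc_url(url: str) -> TdocParts:
    """Split vnd.sun.star.tdoc:/<id>/<path> into its id and inner path."""
    prefix = "vnd.sun.star.tdoc:/"
    if not url.startswith(prefix):
        raise ValueError(f"not a tdoc URL: {url!r}")
    doc_id, _, inner_path = url[len(prefix):].partition("/")
    return TdocParts(doc_id, inner_path)


parts = split_tdoc_url("vnd.sun.star.tdoc:/1/Scripts/python/MyFile")
print(parts.doc_id, parts.inner_path)
```

The document whose id matches could then be looked up in the loop above and checked for a URL, which is empty when the document has never been saved.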