The Python environment bundled with LibreOffice Base 6.3 and 6.4 does not encode emoji properly. The string encoding used by a separately installed Python 3.7.6 appears to differ from the one used by the Python 3.7.6 that ships with LibreOffice 6.3 or 6.4.
The question is: how do I configure the LibreOffice Python 3.7.6 to behave identically to a standalone install of Python 3.7.6?
Or, alternatively, can I configure LibreOffice to use a standalone Python instead of the one it ships with? I’ve read a good deal of documentation on this and believe it to be impossible, but it would be great if I were wrong.
If you update a SQLite3 table with:
cur = conn.cursor()
sql = ''' INSERT INTO saved_comments (project, comment_id, comment)
VALUES(?,?,?)'''
record = ('FS', 1, "SWIFTIE❤️")
cur.execute(sql, record)
conn.commit()
Using Python 3.7.6 installed separately from LibreOffice, the code works fine. A SQLite3 db shows the emoji.
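For comparing the two interpreters, here is a minimal self-contained sketch of the snippet above. It assumes an in-memory database and a hypothetical table layout (the column types are my guess, not the real schema), and it prints with ascii() so the check itself does not depend on the console's code page.

```python
import sqlite3


def roundtrip(comment):
    """Insert one comment into an in-memory table and read it back."""
    conn = sqlite3.connect(":memory:")  # assumption: stands in for the real db file
    try:
        cur = conn.cursor()
        # Hypothetical schema; only the column names come from the INSERT above
        cur.execute(
            "CREATE TABLE saved_comments "
            "(project TEXT, comment_id INTEGER, comment TEXT)"
        )
        cur.execute(
            "INSERT INTO saved_comments (project, comment_id, comment) "
            "VALUES (?, ?, ?)",
            ("FS", 1, comment),
        )
        conn.commit()
        cur.execute("SELECT comment FROM saved_comments WHERE comment_id = 1")
        return cur.fetchone()[0]
    finally:
        conn.close()


# The \u escape keeps the test independent of the source file's encoding
original = "SWIFTIE\u2764"
stored = roundtrip(original)
print(ascii(stored), stored == original)
```

If this reports a match in both interpreters, the database layer is probably not where the emoji is lost, and the place to look would be wherever the string is read, printed, or written to a file.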
Using the Python 3.7.6 accessed through a Python macro in LibreOffice 6.3 or 6.4, the code updates the database with all the text but replaces the emoji with question marks. It is difficult to run a debugger in that environment, but I got PyCharm to work. I get messages that the emoji can’t be processed with the current code page (the code page handles encoding in Windows 10, my environment); the message is too long to paste here. So that must be a difference.
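One way to pin down that difference would be to run the same small report in both interpreters and compare the results. This is only a diagnostic sketch: encoding_report is a hypothetical helper name, and which of these settings (if any) actually differs is something I have not confirmed.

```python
import locale
import sys


def encoding_report():
    """Collect the encoding-related settings of the running interpreter."""
    # sys.stdout can be None when there is no console attached
    stdout = getattr(sys, "stdout", None)
    return {
        "python": sys.version.split()[0],
        "default_encoding": sys.getdefaultencoding(),
        "filesystem_encoding": sys.getfilesystemencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
        "stdout_encoding": getattr(stdout, "encoding", None),
        "utf8_mode": sys.flags.utf8_mode,  # available from Python 3.7
    }


for name, value in encoding_report().items():
    print(f"{name}: {value!r}")
```

Inside a macro there may be no visible console, so there it may be easier to write the dictionary to a text file opened with encoding="utf-8" instead of printing it.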
This project is for a linguist, and the data source is social media comments, so it is full of text in many languages as well as emoji. I doubt that specifying a single encoding in the code will work because the input strings, given the many languages involved, more than likely use various encodings. Using cur.execute(sql.encode("utf-16"), record), for example, incorrectly translates the above emoji into wildly different shapes.
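A small sketch related to the single-encoding worry, assuming the comments have already arrived as Python 3 str objects (if they arrive as raw bytes in different encodings, this does not apply). A str holds Unicode code points, so one UTF-8 encoding covers every language and emoji in the list, while a legacy Windows code page such as cp1252 (chosen here only as an example) turns unencodable characters into question marks when errors="replace" is used. Whether that is what happens inside LibreOffice is an assumption on my part.

```python
def to_legacy(text, codepage="cp1252"):
    """Encode text to a legacy code page, replacing unencodable characters."""
    return text.encode(codepage, errors="replace")


# Sample strings are made up for illustration; escapes avoid file-encoding issues
comments = [
    "SWIFTIE\u2764",                          # heart symbol
    "\u3053\u3093\u306b\u3061\u306f",         # Japanese
    "\u041f\u0440\u0438\u0432\u0435\u0442",   # Russian
    "\U0001F600 ok",                          # emoji outside the Basic Multilingual Plane
]

for text in comments:
    utf8 = text.encode("utf-8")
    # UTF-8 round-trips every Unicode string without loss
    assert utf8.decode("utf-8") == text
    print(ascii(text), len(utf8), to_legacy(text))
```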
Any ideas? Thanks in advance.