What is proper encoding for emoji in BASE?

asked 2020-02-25 21:57:07 +0200

fishingCoder

The Python environment in LibreOffice 6.3 and 6.4 BASE does not handle emoji (😍🖤) properly. The string encoding used by a separately installed Python 3.7.6 appears to differ from the one used by the Python 3.7.6 bundled with LibreOffice 6.3 and 6.4.

The question is: how do I configure the LibreOffice Python 3.7.6 to behave identically to a standalone install of Python 3.7.6?

Alternatively, can I configure LibreOffice to use a standalone Python instead of the one it ships with? I've read a good bit of documentation on this and believe it to be impossible, but it would be great if I were wrong.

If you update a SQLite3 table with:

    cur = conn.cursor()
    sql = ''' INSERT INTO saved_comments (project, comment_id, comment) VALUES(?,?,?)'''
    record = ('FS', 1, "❤❤SWIFTIE❤❤")
    cur.execute(sql, record)
    conn.commit()

Using Python 3.7.6 installed separately from LibreOffice, the code works fine. A SQLite3 db shows the emoji.
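For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same insert against a throwaway in-memory database, so it can be run in both interpreters and the results compared. The column types and the helper name insert_and_read_back are assumptions for illustration; only the table and column names come from the statement above.

```python
import sqlite3


def insert_and_read_back(comment):
    """Insert one comment into an in-memory database and return what SQLite hands back."""
    conn = sqlite3.connect(":memory:")
    try:
        # Column types are assumed; the real schema is not shown in this post.
        conn.execute(
            "CREATE TABLE saved_comments (project TEXT, comment_id INTEGER, comment TEXT)"
        )
        sql = "INSERT INTO saved_comments (project, comment_id, comment) VALUES (?, ?, ?)"
        # The comment is bound as a plain str parameter, with no explicit encode() call.
        conn.execute(sql, ("FS", 1, comment))
        conn.commit()
        row = conn.execute(
            "SELECT comment FROM saved_comments WHERE comment_id = ?", (1,)
        ).fetchone()
        return row[0]
    finally:
        conn.close()


if __name__ == "__main__":
    original = "❤❤SWIFTIE❤❤"
    # Print only a boolean, so the check itself cannot fail on a console
    # that is unable to display emoji.
    print(insert_and_read_back(original) == original)
```

If this prints True in both interpreters, the sqlite3 module itself is probably not where the emoji are lost, and the remaining difference would be in how the database is reached from LibreOffice.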

Using the Python 3.7.6 accessed through a Python macro in LibreOffice 6.3 or 6.4, the code updates the database with all the text but replaces the emoji with question marks. It is difficult to run a debugger in that environment, but I got PyCharm to work. I get messages that the emoji can't be processed with the current code page, which handles encoding in Windows 10, my environment. (The message is too long to paste here.) So that must be a difference.
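One way to compare the two interpreters is to have each report the encodings it uses by default. The sketch below also shows how a lossy conversion to a Windows code page turns emoji into question marks. The helper names are hypothetical, and cp1252 is only an assumed example, since the actual code page from the error message is not given here.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings this interpreter uses by default."""
    # sys.stdout can be None in an embedded interpreter, so read it defensively.
    stdout_encoding = getattr(sys.stdout, "encoding", None)
    return {
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        "preferred_locale": locale.getpreferredencoding(False),
        "stdout": stdout_encoding,
    }


def through_code_page(text, code_page="cp1252"):
    """Round-trip text through a code page, replacing what it cannot represent."""
    # errors="replace" substitutes "?" for every character missing from the code page.
    return text.encode(code_page, errors="replace").decode(code_page)


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print(name, value)
    print(through_code_page("❤❤SWIFTIE❤❤"))  # prints ??SWIFTIE??
```

Running encoding_report() in both the standalone Python and a LibreOffice macro and comparing the values may show which setting differs. With the default strict error handling, the same encode call raises UnicodeEncodeError instead of substituting question marks.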

This project is for a linguist. The data source is social media comments, so it is full of text in many languages as well as emoji. I doubt that specifying a single encoding in the code will work, because the input strings, given the many languages in use, more than likely use various encodings. For example, using "cur.execute(sql.encode("utf-16"), record)" incorrectly translates the above emoji to wildly different shapes.

Any ideas? Thanks in advance.

1 Answer

answered 2020-03-02 19:54:22 +0200

fishingCoder

It turns out this is a problem with the ODBC driver.

See https://ask.libreoffice.org/en/questi... There, Ratslinger tracked it down to the ODBC driver and confirmed it by using a different ODBC driver. Please follow the discussion there. Thanks again to Ratslinger.

Comments

Hmm. Further research indicates that the ODBC driver does not seem to be the problem. You can directly edit a table row in LibreOffice BASE and press the Windows key plus period to bring up a grid of emoji. Select one and it is inserted into the field. Move off the field and it is saved to the database. This is with utf-8 chosen as the character set (Edit / Database / Connection, third screen, Character set listbox). We are only seeing the problem on Windows, and only with SQLite as the back end, not with HSQL or Firebird.
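To tell whether a row really holds the emoji or a literal question mark, the stored bytes can be inspected with SQLite's hex() function. This is only a sketch: stored_hex and make_demo_db are hypothetical helpers, the demo schema is assumed, and the expected values assume a database using the default UTF-8 text encoding.

```python
import sqlite3


def stored_hex(conn, comment_id):
    """Return the stored bytes of one comment as upper-case hex, or None if the row is missing."""
    row = conn.execute(
        "SELECT hex(comment) FROM saved_comments WHERE comment_id = ?", (comment_id,)
    ).fetchone()
    return None if row is None else row[0]


def make_demo_db():
    """Build an in-memory database with one emoji row and one question-mark row."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema, for illustration only.
    conn.execute(
        "CREATE TABLE saved_comments (project TEXT, comment_id INTEGER, comment TEXT)"
    )
    conn.executemany(
        "INSERT INTO saved_comments (project, comment_id, comment) VALUES (?, ?, ?)",
        [("FS", 1, "❤"), ("FS", 2, "?")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(stored_hex(conn, 1))  # E29DA4, the UTF-8 bytes of the heart
    print(stored_hex(conn, 2))  # 3F, a literal question mark
    conn.close()
```

Pointing stored_hex at the real database file after a macro insert and after a manual edit in BASE would show whether the two paths store different bytes.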

fishingCoder (2020-03-10 19:49:33 +0200)