Base not decoding Western charset correctly with sdbc:firebird

I am working with LibreOffice 6.0.7.3.

[Details:
Libre Office base (Linux/Ubuntu): Version: 6.0.7.3
Build-ID: 1:6.0.7-0ubuntu0.18.04.2
OS: Linux 4.15; UI-Render: Standard; VCL: gtk3;
language German (de_DE.UTF-8)]

I connect to an existing Firebird 3.0 database via the sdbc driver (direct connection, no jdbc or odbc). Connection string/database path: “file:///home/…/Databasename.fdb”. My firebird database is encoded in ISO8859_1 (“Western”, 8-bit charset), whereas my LibreOffice encodes in UTF-8.

Unfortunately, LibreOffice cannot read the special characters of the database (in German: the Umlauts). Instead it produces the question marks in black diamonds. It obviously does not translate between the ISO8859_1 charset of the database and its own UTF-8 charset.

Comment: With LibreOffice 5.x I had to connect via the JDBC driver (JAVA) to the database. More tedious to set up! But I could add “?encoding=ISO8859_1” to the connections string/database path and it would decode correctly. However, this does not work now.

How can I make it decode the ISO8859_1 character set correctly with the new sdbc connection? Or do I have to go back to the JAVA connection? I’d like to keep the new sdbc-connection, as it is easy to establish (no JAVA drivers necessary) and it is fast.

Thanks for an answer!

Xenon1

By the way - how do you connect to FB using SDBC? I thought it’s only possible using ODBC/JDBC. Or do you mean you open FB database file in file mode, not to an external FB server?

Mike, with LibreOffice 6.0 or maybe 6.1 you can connect directly: Open a new instance of LibreOffice Base: Choose Connect to an existing data base | open with a click on the downward triangle | Choose Firbird. There you are! It is really nice and quick.

I’d say - file a bug report,attaching a sample database (both ODB and FBD (or whatever extension the Firebird DB has)). That would allow to reproduce, check for possible solutions, or fix it if it’s a bug. Please create a new sample DB, or anonimize existing one before upload.

Unicode (UTF-8) is the Character Set used by the Internet and LibreOffice. It supports about 138,000 characters. Unicode contains US-ASCII which is supported by the first 127 characters and ISO8859-1, the first 255 characters. So characters such as ç, Ä, ä etc exist in Unicode in exactly the same way as genuine ISO-8859-1. You will see this if you INSERT > SPECIAL CHARACTERS as the first characters in the table grouped as BASIC LATIN (0-127)(X 1-7E) and LATIN-1 (128-255) (X A0-FF).

The main character outside ISO8859-1 that you may use in Europe is the EURO sign €. So if you have an error problem, check that it is not the €.

If you have the � (Replacement Character (U+FFFD)) in the text, this is the Unicode character to replace an unknown character. You will not see this to replace an ISO-8859-1 character. The problem is elsewhere. Why is the Firebird encoded in ISO-8859-1? This 1987 technology and restricted to Europe. Linux is based on Unicode, as is the Internet and supports all the major written languages and the € (euro)

If you are seeing the � character in LibreOffice, this is implying that you have some unknown or error characters in your ISO8859-1 data. It is this information that is being transferred and then being displayed as the � character. If the data is pure ISO8859-1 it will also be pure Unicode.

Thanks riosv and petermau. However, that was not the question. My Firebird database is already configured as ISO-8859-1. I want to connect to it as it is, not change my database’s character set to fit LibreOffice, but make LibreOffice work with my database. Thank you anyway.

If the data is pure ISO8859-1 it will also be pure Unicode

well - of course, it depends.
While the first 256 Unicode codepoints are exactly the same as in ISO8859-1, it doesn’t mean that all 256 ISO8859-1 characters are encoded exactly as in UTF-8 (which uses 2-byte sequences for ISO8859-1 characters 128-255). So if there is a mismatch in some settings (like LO treats byte sequence stored in DB as if it were UTF-8, while it’s in fact ISO8859-1), the string would contain invalid UTF-8 sequences.

If I’m not wrong Firebird can use UTF, take a look to this book Firebird character set

Hello,

Try adding ?charSet=ISO8859_1 to the end of the Datasource URL.

No, this does not work.