Cannot open document with non-English characters in filename

My locale output looks like this:

  • LANG=en_US.UTF-8
  • LC_CTYPE="en_US.UTF-8"
  • LC_NUMERIC="en_US.UTF-8"
  • LC_TIME="en_US.UTF-8"
  • LC_COLLATE="en_US.UTF-8"
  • LC_MONETARY="en_US.UTF-8"
  • LC_MESSAGES="en_US.UTF-8"
  • LC_PAPER="en_US.UTF-8"
  • LC_NAME="en_US.UTF-8"
  • LC_ADDRESS="en_US.UTF-8"
  • LC_TELEPHONE="en_US.UTF-8"
  • LC_MEASUREMENT="en_US.UTF-8"
  • LC_IDENTIFICATION="en_US.UTF-8"
  • LC_ALL=

Weird. The locale is fine. Does the problem appear when you try to open files from the LibreOffice dialog or from your file manager?

It happens in both

I don’t know the format you give your LibreOffice version in. By what means did you get it?
There isn’t a specific Writer version.
Can you confirm that your ö in the FileName actually is U+00F6 or U+00F8?
[Why did you use a German umlaut (French o with trema?) for a Swedish ø?]
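One way to settle the code point question is to print every code point in the name. A minimal Python sketch, assuming Python 3; the function name `describe_codepoints` and the sample strings are only illustrative and have nothing to do with LibreOffice itself:

```python
import unicodedata


def describe_codepoints(name):
    """Return one 'U+XXXX NAME' string per code point in `name`."""
    return [
        "U+{:04X} {}".format(ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in name
    ]


# Hypothetical sample names: a precomposed ö versus o followed by a
# combining diaeresis. Both look the same on screen.
for sample in ("m\u00f6bler.odt", "mo\u0308bler.odt"):
    print(len(sample), "code points:")
    for line in describe_codepoints(sample):
        print("   ", line)
```

To check a real file, feed the function the names returned by `os.listdir()` for the directory in question rather than typing the name by hand.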

It was Fedora's version; fixed it. The LO version is 6.4.6.

I think you got Swedish confused with Danish/Norwegian :wink:
It is without doubt U+00F6 ö and supposed to be so.

Out of ideas. Just for clarity: is the file in question located on an ext4 partition or on an NTFS/FAT partition?

The file is located on an ext4 partition inside a LUKS-encrypted volume. Can it be caused by LUKS?

Perhaps, I have no experience with LUKS. If you have an external disk or available USB stick, try to format a partition with ext4 without LUKS and copy your möbler.odt on it. Does it open? If so, reformat the partition with LUKS. Does it still work?

Quoting @TJRoh01: “I think you got … confused …”
You are surely right - not only concerning the Nordic languages (though I had nice days in Sweden and in Norway as well last year).

When did you create the file möbler.odt? It is a common problem if you had an ISO-8859* encoding in the past, created files, and then migrated to a UTF-8 encoded system. So the important question now is: What does ls -l or a file manager show? (I personally fought for years to get rid of files showing question marks in file names containing German umlauts or ß.)
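Leftover ISO-8859 names can be spotted by reading the directory entries as raw bytes and checking whether they decode as UTF-8. A rough Python sketch, assuming a Linux system; the helper names `is_valid_utf8` and `non_utf8_names` are made up for this example, and the Latin-1 decoding at the end is only a guess at the old encoding:

```python
import os


def is_valid_utf8(raw):
    """True if the raw filename bytes decode as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def non_utf8_names(directory):
    """Return the raw (bytes) entry names in `directory` that are not valid UTF-8."""
    # Passing a bytes path makes os.listdir return undecoded bytes names.
    return [
        raw for raw in os.listdir(os.fsencode(directory))
        if not is_valid_utf8(raw)
    ]


# Scan the current directory; replace "." with the directory to check.
for raw in non_utf8_names("."):
    # Latin-1 maps every byte to a character, so this guess never raises.
    print(repr(raw), "-> possibly", raw.decode("latin-1"))
```

An empty result means every name in that directory is at least valid UTF-8, which would point away from an old-encoding leftover.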

I have the same issue here on Fedora 33 with LibreOffice. This also happens even if the filename is purely English characters but is placed inside a directory with Swedish or Icelandic characters in the name. Every other application can open the files without any issue. I can even recreate this right now by creating a new directory and placing a file inside it. LibreOffice will not open that file.
This is on an ext4 partition that has never had anything but UTF-8 for filenames.

@Gulli: This can be easily explained, as the full path is considered when opening a file (or trying to). If any path component has conflicting characters, you get the error. Now, try to remember how the directory was created. Was it years ago? In the early Fedora Core (former distro name) days, there were such issues with file names. It was wise to avoid non-ASCII characters, but this has been fixed for a long time, unless this directory survived OS upgrades.

Is the directory on a separate partition? Which filesystem?

What happens if you create a fresh directory with non-ASCII chars like ä ö ü or ß? Preferably do it on the / partition to eliminate filesystem issues.
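The fresh-directory test can also be scripted, which rules out typing or keyboard-layout effects. A small Python sketch, with the caveat that it only exercises the OS and filesystem layer, not LibreOffice; the function name `nonascii_roundtrip` and the file name `test.txt` are arbitrary choices for this example:

```python
import os
import tempfile


def nonascii_roundtrip(base_dir, dirname="\u00e4\u00f6\u00fc\u00df"):
    """Create `dirname` under `base_dir`, write a file in it, and read it back."""
    target = os.path.join(base_dir, dirname)
    os.mkdir(target)
    path = os.path.join(target, "test.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("hello")
    with open(path, encoding="utf-8") as handle:
        content = handle.read()
    # The directory should also be listed under exactly the name we gave it.
    return content == "hello" and dirname in os.listdir(base_dir)


# Uses a throwaway temporary directory; pass a directory on the partition
# you actually want to test instead.
with tempfile.TemporaryDirectory() as base:
    print("round trip ok:", nonascii_roundtrip(base))
```

If this succeeds but LibreOffice still refuses a file placed in such a directory, the filesystem is probably not the culprit.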

Hi @TJRoh01,
I have exactly the same problem at least for a year now.
I have the German language pack installed.
Ext4 without LUKS

Another weird thing is that I cannot copy and paste from some websites into an .odf document. If I copy from deeply.com, for example: Möbler I get möbler in LO, super annoying. Is it the same for you?

This is my locale, if that helps anybody make a connection:
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=de_DE.UTF-8
LC_TIME=de_DE.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=de_DE.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_ALL=

Not completely out of ideas nonetheless:
You got two question marks in your error message.
There are the (strange) combining characters in Unicode: The trema U+0308, for example, combines with the character to its left. A U+006F (lowercase o) followed by a U+0308 (trema) combines visually into an ö, which would otherwise be coded as U+00F6.
If your actual file paths contain combining characters, some applications may handle that and others may not.
On the other hand, there remains the question of who (human or software) caused the issue.
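The difference between the two encodings of ö can be shown with Python's standard `unicodedata` module. This is only a sketch of the idea; the helper names `has_combining_marks` and `to_precomposed` and the sample file names are invented for illustration:

```python
import unicodedata


def has_combining_marks(name):
    """True if any code point in `name` is a combining mark (e.g. U+0308)."""
    return any(unicodedata.combining(ch) for ch in name)


def to_precomposed(name):
    """Normalize to NFC, which merges o + U+0308 into U+00F6 where possible."""
    return unicodedata.normalize("NFC", name)


composed = "m\u00f6bler.odt"     # ö as the single code point U+00F6
decomposed = "mo\u0308bler.odt"  # o (U+006F) followed by combining U+0308

print(composed == decomposed)                  # False
print(to_precomposed(decomposed) == composed)  # True
print(has_combining_marks(decomposed))         # True
```

Since ext4 by default compares names byte for byte, the two variants are different files there even though they look identical in a file manager.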

Interesting. Well I have a dedicated ö key on my keyboard. Will test around with character codes.

Just tested with filenames containing the sequence U+006F, U+0308 on Win 10.
The OS accepts that.
LibO V 7.0.1 (64bit) and LibO V 6.4.4 (x64) both handle it, too.
The behaviour of the cursor going over a U+0308 is slightly different, though. Anyway the effects may depend on the OS.

My old general and serious advice:
Use only syntactically clean names for folders and files: No spaces or special characters at all, whether the system and other software pretend to accept them or not. No use in localization spilling over!
Unfortunately we cannot undo the many pointless, stubborn “extensions” to the Latin alphabet invented over centuries. They simply are a plague.

@Lupp wrote:

My old general and serious advice:

Use only syntactically clean names for folders and files: No spaces or special characters at all, whether the system and other software pretend to accept them or not. No use in localization spilling over!
+1

If you work in the cloud of an association you cannot just go to everybody and tell them:
hey, btw I cannot open files with an “ü” so please never use an “ü” again and rename all files and folders containing one, because I can’t fix my LO.


And I can’t go to MS or any cloud provider and urge them to stop trying to turn the mess they made themselves into a weapon against free software.
What should be an accepted name in a technical context must clearly and bindingly be specified. A string used for a specific purpose must not be a letter to uncle أنور. Did you consider the big mess you would get as soon as right to left writing is mixed in?
Let’s tidy it up. “Asiatic” and “Complex” text layout may be needed in many places. An identifier is not the right place, and that’s not “Western arrogance”. I’m surely ready to start writing right to left as soon as there is a global agreement. Till then I can forswear special German characters not easily accessible from a majority of keyboards in the world when typing a technical name.