# Why do I get these strange symbols when uploading a file? [closed]

EDITED, as I have now concluded this is more a Chromebook issue than a LO issue. I did manage to copy and paste as RTF, but with a fresh download of LO the problem remained. I will add more here if I learn anything relevant on a Chromebook forum.

Edited to say I have now attached two files made in LibreOffice that show what happens. They give a partial answer, as I discovered that pasting as RTF works.

It still raises the question of why all the other ways of copying and pasting produce the strange symbols.

"Youâ€™ve achieved"

If I create a new ODT file and try to paste into that, the same thing happens. In addition, if I simply copy and paste, I get the problem and every sentence is numbered. When I try paste special, I lose the numbers but still get the symbols.

This happens when I upload a file. Please, can someone tell me why and what to do? It also happens when I try to copy and paste from one downloaded template to another.

Thank you.

I am using a Chromebook. I uploaded to Prowritingaid.com. (Cannot use the editor in LibreOffice, so have to use the web version.) Due to Chromebook limitations I can only get LibreOffice 5.2.

The problem happens even when copying from one LibreOffice file to another. The document where I first found the problem is a LibreOffice ODT file. I uploaded using the Chrome browser. The document was formatted, but the problem is in the body text, not the headers. I understand what you say about the apostrophe, but why would it happen in LibreOffice files when I am not uploading anywhere, only copying and pasting?

If the problem happens in ODT files, then is there something I can change in my version of LibreOffice? Reinstalling has not solved the problem. Chromebooks can only get LibreOffice 5.2.

(I do not know how to put a BOM mark before a document. I am lost there. For people like me, we need not just the information you give, but also the information as to how you would actually do it from the point of opening the file! I will ask another question about that!)


### Closed for the following reason: question is not relevant or outdated. Closed by Alex Kemp, close date 2020-07-24 13:34:41.647753

This seems unrelated to LibreOffice. Perhaps you could ask the website to which you are uploading.

( 2019-06-01 10:30:17 +0200 )

I second robleyd's comment. The issue seems to be irrelevant here, especially considering that you specify that

Cannot use the editor in libreoffice

Anyway, what kind of file are you trying to process?

( 2019-06-03 08:20:07 +0200 )

Please see my comment above. It IS relevant, as I find the same thing happens when I copy and paste into a new LibreOffice file. In fact, I edited the question to say where I had first encountered the problem, in answer to the person who kindly did give me information.

( 2019-06-04 06:21:32 +0200 )

Strange it happens in a LO-to-LO process. Can you attach a sample file if not confidential? Or at least an excerpt of it with the mishap.

( 2019-06-04 07:56:23 +0200 )

I suppose that OP ( @JackyAnn ) needs to re-write the question from scratch. I tried to read the question several times, and couldn't understand what happens when.

A good question would look similar to this: [OS version; LO version]

I open pre-existing ODT file (link here) (or create a new text document); write there these characters: "ABC" (using that keyboard layout); save it back; then close LO, and try to upload the file to resource XYZ using browser NNN. Then I open the file in web view, and see this: ... (screenshot). Or I select characters from here to there, press Ctrl+C to copy to clipboard, switch to another newly created text document, and Ctrl+V, and see this ...

The level of detail you give determines whether others, not standing behind you, will be able to understand your problem in the first place - the necessary step to be able to ...(more)

( 2019-06-04 08:28:39 +0200 )

The downvote here looks unfair - I upvote to compensate. If someone downvotes: please provide some explanation in cases there's nothing obvious.

( 2019-06-04 08:30:15 +0200 )

If the problem happens in ODT files

Exactly, if. I can see no proof of it. The problem may lie with the server, and we don't know how files are processed there. In such a case, you can't do much.

By the way, I tried to upload a sample ODT file produced with LO Writer 6.1.2.1 containing a single phrase:

You’re right.


It has been processed correctly.

I've created and uploaded another one with the same content using LO 5.4.2.7. Still processed correctly.

Can you share the file?

( 2019-06-04 08:33:54 +0200 )

I have reinstalled LibreOffice. I have made a new file with different text and tried to copy and paste every way there is, not uploading to anywhere, and the problem remains. It happens whenever I copy and paste or upload, but NOT if I copy from elsewhere and paste. There is no point telling a Chromebook user anything to do with LO 6 and beyond. We cannot get it. We are limited to LO 5. It cannot be the server, as the problem happens working in LO offline. I do not want to share a long file but will make and upload a short one. My current theory is that this is related to LO on a Chromebook, but I am away and cannot use other hardware to experiment.

( 2019-06-04 10:42:32 +0200 )

the problem happens working in LO offline

Really? As I understand, the problem happens when you upload a file to the server.

( 2019-06-04 10:53:58 +0200 )

@gabix: "and tried to copy and paste every way there is not uploading to anywhere and the problem remains"

( 2019-06-04 10:56:46 +0200 )


I get the strange characters when I copy from a web page and paste into LO. The character clusters represent the open and closed single and double quotes and the em dash. I have the AltSearch extension installed, and have saved a batch search-and-replace operation that returns these strange characters to what they should be. AltSearch has a Batch button that takes you to the Batch screen, which has an Edit button to edit the text file that holds saved batch operations. If you can insert these lines in that text file you can fix the pasted text in one operation.

Sorry but the following text should be on new lines and I can't seem to get it to show properly. A new line before every open square bracket. It should look the same as the other batch commands in the text file. The text file you have to edit is AltSearchScript.txt and it is in .config/libreoffice/4/user/config/ in your home folder if you are using Linux.

[Name] Fix Strange Characters
[Find]â€“
[Replace]—
[Parameters]   MsgOff  Regular  CurrSelection
[Command] ReplaceAll

[Find]â€™
[Replace]’
[Parameters]   MsgOff  Regular  CurrSelection
[Command] ReplaceAll

[Find]â€˜
[Replace]‘
[Parameters]   MsgOff  Regular  CurrSelection
[Command] ReplaceAll

[Find]â€œ
[Replace]“
[Parameters]   MsgOff  Regular  CurrSelection
[Command] ReplaceAll

[Find]â€
[Replace]”
[Parameters]   MsgOff  Regular  CurrSelection
[Command] ReplaceAll


(edited by ajlittoz for proper formatting)
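For readers without AltSearch, the same find-and-replace idea can be sketched in Python. This is only an illustration, not part of the extension: the function names `misdecode` and `fix_strange_characters` are made up here, and the sketch assumes the strange clusters come from UTF-8 bytes being read as Windows-1252 (`cp1252`). The broken sequences are generated rather than typed in, so the table cannot itself be damaged by another encoding mishap.

```python
# Typographic characters to restore: single quotes, double quotes, en dash, em dash.
TYPOGRAPHIC = "\u2018\u2019\u201c\u201d\u2013\u2014"


def misdecode(text):
    """Simulate UTF-8 bytes being read as Windows-1252 (assumed cause)."""
    chars = []
    for byte in text.encode("utf-8"):
        try:
            chars.append(bytes([byte]).decode("cp1252"))
        except UnicodeDecodeError:
            # A few bytes (such as 0x9D) are undefined in cp1252;
            # keep them as the control character with the same number.
            chars.append(chr(byte))
    return "".join(chars)


# Maps each broken three-character cluster back to the intended character.
REPAIRS = {misdecode(ch): ch for ch in TYPOGRAPHIC}


def fix_strange_characters(text):
    """Replace every known broken cluster with the character it stands for."""
    for broken, good in REPAIRS.items():
        text = text.replace(broken, good)
    return text


sample = misdecode("\u201cYou\u2019ve achieved\u201d")
print(fix_strange_characters(sample))
```

If the pasted text has lost the invisible control character after the closing double quote cluster, that cluster will not match, which is presumably why the batch above searches for the bare two-character form last.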


@GerardBuz: FYI use five spaces at start of line to disable line merging; lines will display as typed.

It is obviously a confusion between ISO-8859-x and UTF-8. What happens if you "paste special" as unformatted text?

( 2020-05-15 07:55:51 +0200 )

use five spaces at start

A small correction: four spaces at start have this special meaning.

( 2020-05-15 08:32:29 +0200 )

@Mike Kaganski: thanks, I did it from memory and preferred to play safe

( 2020-05-15 09:35:15 +0200 )

can someone tell me why and what to do?

I can’t tell why. I can tell what to do: wipe that ChromeOS and install Linux. Will work.


Actually, I have done that before. The results were not great compared to installing on my laptop (sadly the hardware is so old it failed). I am used to Ubuntu and have always used the latest install for it. When I find a laptop with 64ARM that is sound but needs an OS, I will go back to that.

I did get LO 6, but one night the cat sat on the keyboard and I awoke to find the cat had reset the entire Chromebook! Both the Linux install and the Chromebook OS were wiped as if I had just bought the Chromebook.

I did not repeat the exercise, as the results of installing Linux were not great. I decided to stick with the Chromebook Beta version of Linux.

Anyone taking your advice would need to be sure they were happy with voiding a guarantee. I may do it all again but decided ...(more)

( 2019-07-02 10:11:26 +0200 )

This looks like a confusion between Unicode and ISO-8859-x.

The source file is probably UTF-8 plain text. For some reason, the uploading process thought it was ISO-8859-x and converted it to Unicode giving the surprising text.

Edit your question to explain how you uploaded the text (which intermediate steps with which applications). Mention your OS, that could help to suggest tools.

EDIT 1

I'd like more technical details on the process of uploading.

How was the initial file content typed? Locally with a text editor (not a document processor like LO)?

Was this initial file uploaded via the Chrome browser? Or some other tool/protocol like ftp?

I guess that you had a plain text file (without any formatting such as bold or italics) which was uploaded using a web tool. The HTTP protocol uses "headers" to describe the exchanged data. One of these headers tells the recipient the character encoding used at the source. Historically it is ISO-8859-1 by default. When the source file is plain text, there is no marker inside it (*) to contradict this default. At the other end, this wrong encoding is remembered.

When the file is opened by LO, the byte stream is erroneously taken for ISO-8859-1 while it is in fact a UTF-8 stream. Your original text is probably "You’ve achieved" with a typographical apostrophe, U+2019 RIGHT SINGLE QUOTATION MARK, whose UTF-8 encoding is 0xE2 0x80 0x99. 0xE2 is "â" in both ISO-8859-1 and Unicode, which explains the first strange character. 0x80 and 0x99 are control characters in the C1 set in ISO-8859-1; in practice such bytes are usually read as Windows-1252, where 0x80 is "€" and 0x99 is "™", accounting for the other characters.
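The byte arithmetic in this explanation can be checked with a few lines of Python. This is a minimal sketch: it assumes the misreading codec is Windows-1252 (`cp1252` in Python), which matches the symbols in the question but is not something the thread establishes, and it shows strict ISO-8859-1 (`latin-1`) alongside for comparison.

```python
# The intended text, with U+2019 RIGHT SINGLE QUOTATION MARK as the apostrophe.
original = "You\u2019ve achieved"

# What is actually stored or transmitted: the UTF-8 byte stream.
utf8_bytes = original.encode("utf-8")
print(utf8_bytes[3:6].hex(" "))  # e2 80 99

# Reading those bytes with the wrong single-byte encoding.
as_cp1252 = utf8_bytes.decode("cp1252")
as_latin1 = utf8_bytes.decode("latin-1")

print(as_cp1252)        # Youâ€™ve achieved
print(repr(as_latin1))  # 'Youâ\x80\x99ve achieved'

# Reversing the mistake recovers the original text.
print(as_cp1252.encode("cp1252").decode("utf-8") == original)  # True
```

Each byte of the three-byte UTF-8 sequence becomes one visible character, which is why a single apostrophe turns into a cluster of three symbols.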

You must find a way to force the uploading mechanism to transmit the file as a UTF-8 stream. If you can't select the encoding in the utility, try to put a "BOM" (byte order mark or ZERO WIDTH NO-BREAK SPACE) at the start of your file.

BOM is U+FEFF, but written as the raw bytes 0xFE 0xFF in a UTF-8 stream, it may disrupt correct interpretation. Its UTF-8 encoding is 0xEF 0xBB 0xBF. This may be quite difficult to insert unless you have a hexadecimal editor.

(*) The only marker which can flag a plain text file is BOM appearing as the first character in the file.
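For anyone without a hexadecimal editor, a small Python script can prepend the three BOM bytes. This is a sketch for plain text files only (it would not help with an .odt, which is a zipped package); the function name `add_utf8_bom` is made up for illustration, and the demo writes to a throwaway temporary file rather than a real document.

```python
import codecs
import os
import tempfile


def add_utf8_bom(path):
    """Prepend the UTF-8 BOM (EF BB BF) to a file unless it already has one.

    Returns True if the file was changed, False if the BOM was already there.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    if data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as handle:
        handle.write(codecs.BOM_UTF8 + data)
    return True


# Demo on a temporary plain text file.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write("You\u2019ve achieved".encode("utf-8"))
    demo_path = tmp.name

add_utf8_bom(demo_path)
with open(demo_path, "rb") as handle:
    print(handle.read()[:3].hex(" "))  # ef bb bf
os.remove(demo_path)
```

When writing a new file from Python, opening it with `encoding="utf-8-sig"` has the same effect, since that codec emits the BOM automatically.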


I uploaded from Google Chrome. ODT file from LO 5 via my chromebook.

It occurred to me that the source file was originally created in LO in ODT form, then uploaded to Google Docs, as this was how I worked with my editor, then downloaded back to LO, where formatting of every kind was cleared so I could begin formatting the document. I don't understand all you say, as it is beyond my current technical understanding, but I now get that the download may have pulled in something from Google Docs. The same problem does not happen if the file is transferred to a different OS. I would assume that adding a BOM would help, but I would need instructions on how to do that. However, it is an ODT, not a plain text file. I have no idea if I have a hexadecimal editor.

( 2019-07-02 10:26:33 +0200 )

If the problem does not happen outside ChromeOS, forget my suggestion about BOM. It is valid only with plain text. And as you work with .odt, it will not help.

( 2019-07-02 11:11:55 +0200 )

Thanks. I found it interesting to learn about anyway!

( 2019-07-02 12:32:03 +0200 )

If you are using an editor or system that does not support Unicode, you will need to EXPORT your LibO file to a format with a reduced character set, for example .txt. This will allow you to select ISO-8859-1, which should be supported by Prowritingaid. If not, you could try US-ASCII, as the system appears to understand only English. (I think we in Europe would say English-US rather than English-GB.) You will probably have the same problem if you COPY/PASTE to go over to Chrome. What keyboard setting, language setup and data types does Prowritingaid support?

LibreOffice and the Internet default to Unicode, the international standard since the early 1990s, which supports about 138,000 characters. Unicode includes ISO-8859-1 (1987 vintage), which covers the first 256 characters, used in Western Europe including England and France. It also includes US-ASCII (1968 vintage), the first 128 characters, used in America.
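The nesting of these character sets can be illustrated with a short Python sketch that reports the narrowest of the three encodings able to hold a given text. The helper name `narrowest_charset` is made up for this example.

```python
def narrowest_charset(text):
    """Return the smallest of ascii, latin-1 (ISO-8859-1) or utf-8 that can encode text."""
    for name in ("ascii", "latin-1"):
        try:
            text.encode(name)
            return name
        except UnicodeEncodeError:
            continue
    return "utf-8"


print(narrowest_charset("You've achieved"))       # ascii
print(narrowest_charset("caf\u00e9"))             # latin-1
print(narrowest_charset("You\u2019ve achieved"))  # utf-8
print(narrowest_charset("\u20ac5"))               # utf-8
```

The last two lines show why typographic apostrophes and the euro sign cause trouble: neither exists in ISO-8859-1, so a text containing them cannot be exported to that character set without substitution.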

As I mentioned above, if you copy and paste, the system must support Unicode. If not, you will have this problem. Also, if you EXPORT to an RTF file, you must specify the character set your system understands. Otherwise the file will still be Unicode. Have you defined your Chrome system keyboard to support Unicode (UTF-8)?

It is important to understand that copy / paste from one file to another, even if both are LibreOffice .odt files, is controlled by the operating system settings, including language. The fact that both files use Unicode (UTF-8) is overruled by your operating system. It is a bit like having a colour camera and television but black and white film. I assume that one of your problems is there. Language is also important. If you are in Europe and use the € (Euro) sign, for example, you would have the same type of problem. This is why I asked what keyboard and language settings you have defined. You may not have set these things yourself, but they are important for sorting out your problem, which must be frustrating for you.

By the way, most of my users are not programmers.

========================== Updated to consider Apostrophe and Quotation Mark situation

One thing adds slightly to this conversion problem and shows why it is important to know the operating system language, the LibreOffice language setting and the language used for cutting and pasting. To put it another way, what you think you are typing is not actually what you get.
I quote from the Unicode Manual. “Most keyboard layouts support only the U+0022 ( " ) QUOTATION MARK therefore word processors commonly offer a facility for automatically converting the U+0022 ( " ) QUOTATION MARK to a contextually selected curly quote glyph.” These conversions are language dependent. So, for example, English and Dutch are not the same as Danish and Finnish or French. Also, the apostrophe used in publishing is often “improved” by upgrading it to a quotation mark. As the old US-ASCII and ISO-8859-1 support only 0022 and 0027, you can see this leads to the problems highlighted in this question.
Apostrophe U+0027 ( ' ) Quotation Mark U+0022 ...
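One way to sidestep the automatic conversion described here, when the receiving system understands only US-ASCII or ISO-8859-1, is to turn the curly glyphs back into U+0027 and U+0022 before exporting. The following Python sketch shows the idea; the names `STRAIGHTEN` and `straighten_quotes` are made up for illustration, and the table covers only the four English curly quote characters.

```python
# Map the curly single and double quotes back to their ASCII counterparts.
STRAIGHTEN = str.maketrans({
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK (typographic apostrophe)
    "\u201c": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',   # RIGHT DOUBLE QUOTATION MARK
})


def straighten_quotes(text):
    """Replace curly quotes with U+0027 and U+0022 so the text fits in US-ASCII."""
    return text.translate(STRAIGHTEN)


curly = "\u201cYou\u2019ve achieved\u201d"
plain = straighten_quotes(curly)
print(plain)                  # "You've achieved"
print(plain.encode("ascii"))  # b'"You\'ve achieved"'
```

The typographic quality is lost, but the result survives any system that handles plain ASCII.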
