

Issue with pasting text in writer

asked 2018-11-19 06:14:23 +0200

Karaquin

updated 2018-11-19 08:03:01 +0200

gabix

Hi, I'm trying to copy a few lines from a website and paste them into Writer, but I keep getting this weird pattern: image description

I've tried pasting it with all the different options; furthermore, I have also tried copying the text directly out of the website's source code. It always ends up looking like this, and only in LibreOffice Writer; the text looks fine when I paste it into editor. Can someone explain why this is happening and how I can fix it? Thanks in advance.



Hard to tell from just an image, but it seems you are getting some style information from the website. Have you tried Paste Special and choosing Unformatted text? Can you give a link to the particular web page? I tried a search but got too many results to look at them all.

robleyd ( 2018-11-19 06:42:00 +0200 )

Yes, I have tried that. It doesn't change it. Here is the website: link text

Karaquin ( 2018-11-19 07:06:06 +0200 )

My guess is that the website uses UTF-8 character set which doesn't work nicely with LibO as we have it configured. I did manage to successfully copy from the source - that gave me HTML tags as well but 'clean' text. I'm not sure what the definitive solution is, however.

robleyd ( 2018-11-19 08:20:35 +0200 )

It's the &shy; HTML entity inserted between each and every syllable by the web designer/developer. See my comments below the answer for more info.

PhLo ( 2018-11-19 08:40:54 +0200 )

A note to OP: besides screenshots, it is good to describe exactly what you find "weird". Omitting that makes people guess what it was, as above: was it the center alignment, or was it the soft hyphens? Please be more verbose next time.

Mike Kaganski ( 2018-11-19 09:48:14 +0200 )

Thank you for the answers. Sure, I'll try to be more specific next time. Truth be told, I didn't even know that those things were called soft hyphens.

Karaquin ( 2018-11-19 10:32:11 +0200 )

Truth be told, I didn't even know that those things were called soft hyphens

That was just advice for the future; and we naturally ask when we don't know ourselves. It's OK to write something like "those grey-shadowed thingies on the screenshot" :-)

Mike Kaganski ( 2018-11-19 10:37:46 +0200 )

2 Answers


answered 2018-11-19 08:51:05 +0200

PhLo

updated 2018-11-19 09:42:01 +0200

Below is an image showing easy steps to remove the soft hyphens within LibreOffice Writer. Open the Find & Replace dialog (CTRL+H or COMMAND+H) and follow the steps. In case it's hard to read in the image, the text in the Find box is \xAD. Yes, that's a backslash at the beginning.

Obviously your interface will vary slightly if you are working in German or another OS. Don't forget to uncheck Regular expressions after you are done removing the soft hyphens, otherwise you might get unexpected results on your next Find & Replace operation.

image description
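If you would rather clean the text before it ever reaches Writer, the same \xAD search can be sketched outside LibreOffice. This is only a minimal Python illustration of the idea, not part of the steps above; the function name and the sample word are made up for the example.

```python
import re

# U+00AD (soft hyphen), written the same way as in the Find box: \xAD
SOFT_HYPHEN_PATTERN = re.compile(r"\xAD")


def strip_soft_hyphens(text: str) -> str:
    """Return the text with every soft hyphen (U+00AD) removed."""
    return SOFT_HYPHEN_PATTERN.sub("", text)


# Hypothetical sample: a German word as it might arrive from the clipboard,
# with an invisible soft hyphen between its syllables.
pasted = "Stu\u00addi\u00aden\u00adgang"
cleaned = strip_soft_hyphens(pasted)

# The two strings look alike on screen but differ in length.
print(len(pasted), len(cleaned))
print(cleaned)
```

Replacing with an empty string is the equivalent of leaving the Replace box empty in the dialog.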

On your linked German website, right-click a paragraph and choose Inspect in Chrome's developer tools, and you can see the &shy; entities littering the code. Ew...

image description



More specifically, this website makes use of a jQuery plugin called Hyphenator 4.2.0, which inserts these soft hyphens programmatically. Again, I think it's a mistake, but to each his/her own I guess. Pointless overhead to cycle through EVERY word on the page when loading pages for a feature that doesn't serve an important or meaningful purpose. Perhaps the designer is REALLY into typography or something. :)

PhLo ( 2018-11-19 09:08:10 +0200 )
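To picture what "inserts these soft hyphens programmatically" means, here is a toy Python sketch. It is not how Hyphenator itself works: the syllable table is a hypothetical stand-in for the language-specific patterns a real hyphenation script would use, and all names are invented for the illustration.

```python
SOFT_HYPHEN = "\u00ad"

# Hypothetical syllable table, for illustration only. A real hyphenation
# script derives break points from language-specific patterns instead.
SYLLABLES = {
    "typography": ["ty", "pog", "ra", "phy"],
    "website": ["web", "site"],
}


def add_soft_hyphens(word: str) -> str:
    """Join the known syllables of a word with soft hyphens."""
    parts = SYLLABLES.get(word)
    if parts is None:
        return word  # unknown word: leave it untouched
    return SOFT_HYPHEN.join(parts)


def hyphenate_text(text: str) -> str:
    """Apply add_soft_hyphens to every space-separated word."""
    return " ".join(add_soft_hyphens(word) for word in text.split(" "))


original = "website typography"
hyphenated = hyphenate_text(original)

# Same visible text, but four extra invisible characters travel with it
# when it is copied to the clipboard.
print(len(original), len(hyphenated))
```

This is why every word pasted from such a page carries the extra characters, even though nothing looks different in the browser.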

Searching for regex \xAD does the job.

Mike Kaganski ( 2018-11-19 09:18:24 +0200 )

Good one, @Mike Kaganski. That would be just as easy if not more so, especially on a machine that doesn't support ALT+ codes. I'll update the answer to make it easier for others.

PhLo ( 2018-11-19 09:20:38 +0200 )

Thanks a lot for your comment! I checked the code with Firefox but it didn't show me those "&shy;" expressions; then again, maybe I just looked in the wrong place. Anyway, thanks.

Karaquin ( 2018-11-19 10:41:31 +0200 )

Indeed. I did the same. Chrome seems to work better for this particular oddity. I was disappointed that Firefox didn't show them because it is my preferred browser. Oh well. They are indeed in the code though. The jQuery library plugin adds them as/after the page loads, which is cool because then if viewing on a phone, you can lose a bit more battery life while JavaScript and CSS do their "invaluable" hyphenation :) The extra jQuery also adds to the page download size. Yay.

PhLo ( 2018-11-19 10:45:01 +0200 )

Sorry, don't mind me. I'm a verbose, former web designer/developer. :D

PhLo ( 2018-11-19 10:45:29 +0200 )

lmao, I like you

Karaquin ( 2018-11-19 11:10:02 +0200 )

I'm &shy; - don't make me blush.

PhLo ( 2018-11-19 11:28:57 +0200 )

answered 2018-11-19 08:00:29 +0200

gabix

The text pastes riddled with soft hyphens. You can't do much about this except for removing them (with a macro, for example). However, why bother?

looks fine when I paste it into editor.

Which editor do you mean? Most probably, the editor just can't display soft hyphens correctly.




This is an artifact of the HORRIBLE idea of the &shy; HTML entity that this website is using. The web coder has an unhealthy fixation on supporting hyphenation. Completely pointless since it ruins clean text for the sake of a feature that only a few browsers support anyway. Inserting invisible HTML entities between every syllable is awful and probably has numerous bad consequences! Nothing to do with LibreOffice; it's a problem specific to this website because of their coding practice.

PhLo ( 2018-11-19 08:29:33 +0200 )

It's about as sensible as wrapping every single word or letter in <span></span> tags just in case you want to do something to them via CSS. Bad coding approach.

PhLo ( 2018-11-19 08:32:34 +0200 )

I suppose you could recommend an "improvement" to LO: if such &shy; HTML entities are in the clipboard text, they would be stripped out automatically on paste. Here's the Unicode info on the character in case anyone is curious:

U+00AD : SOFT HYPHEN [SHY] {discretionary hyphen}

PhLo ( 2018-11-19 08:37:19 +0200 )
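For anyone who wants to check that character info themselves, here is a minimal Python sketch using only the standard library. strip_shy is a hypothetical helper name; it only illustrates the idea of stripping the entity from clipboard text and says nothing about how LibreOffice actually handles a paste.

```python
import html
import unicodedata

# Decoding the HTML entity yields the single character U+00AD.
soft_hyphen = html.unescape("&shy;")

print(f"U+{ord(soft_hyphen):04X}")         # code point
print(unicodedata.name(soft_hyphen))       # official Unicode name
print(unicodedata.category(soft_hyphen))   # general category


def strip_shy(clipboard_text: str) -> str:
    """Remove both the literal &shy; entity and the decoded character.

    Hypothetical helper: clipboard HTML may carry either form.
    """
    return clipboard_text.replace("&shy;", "").replace("\u00ad", "")
```

Handling both forms matters because the entity appears in the page markup, while the decoded character is what usually ends up in plain pasted text.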

I meant WordPad, not editor. I could paste it into a WordPad document without those soft hyphens showing or affecting the text in any way. I'm not sure why, but I couldn't find anything unusual in the HTML while "inspecting" the page with Firefox. Consequently, I was somewhat puzzled. I don't aim to fix the code on the website. It's a college website; I was summarizing/copying some parts of its content for my own personal notes.

Karaquin ( 2018-11-19 11:03:01 +0200 )

But just think, if you bring all these things to the university's attention, they might offer you a great job on their web development team! :D Don't point them my way though. I don't speak German, and I'm thoroughly burnt out with web design. I wish this forum had private messaging so I could not pollute the answers with all my stupid jokes. But then all the experts with mega karma would get spammed big-time with private message questions! :D (I'm not one of those experts, just an average user)

PhLo ( 2018-11-19 11:32:09 +0200 )

I could paste it into a WordPad document without those soft hyphens showing…

Precisely. WordPad does not show them and seems to be incapable of handling them correctly. But the soft hyphens are still there. This is not good, this is very bad and only means that micro$oft apes can't properly design even such a simplistic program.

gabix ( 2018-11-19 12:31:15 +0200 )
