Splitting words in Writer [closed]

asked 2013-09-03 13:51:19 +0200

johnaaronrose

updated 2014-08-08 20:41:00 +0200

manj_k

I have created a document by scanning pages from various printed documents. As a result, there are 'hard-coded' paragraph breaks where I do not want them. When I delete them, words split (see below for an example). I noticed that the style of the scanned text was Pre-formatted, so I changed it to Default using Edit>Replace. However, that made no difference to the word splitting. Any ideas as to why this happens and how to correct it so that the text word-wraps properly?

Continuations as over a 2♦ response. A no trump rebid by responder shows five spades in an otherwise balanced hand, a spade raise shows 6+ cards and a new suit is natural and game forcing.

PS I notice a previous question saying that paragraph breaks (generated by copying text from other documents) should be removed by Tools>Autocorrect but the solution there didn't work for me.

Closed by Alex Kemp for the following reason: question is not relevant or outdated
Close date: 2015-11-07 23:41:17.182370


I forgot to mention that there are line breaks (without splitting a word) in the middle of lines. This is also shown in the example quoted.

johnaaronrose (2013-09-03 13:54:55 +0200)

I've just installed LibreOffice and still have the same problem. To me, this seems like a bug. Is it?

johnaaronrose (2013-09-04 11:19:19 +0200)

The example (as initially displayed) does not indicate the splitting of words. To be clear, there is a carriage return (manual break) between "2♦" and "response." and when I edit your post it seems the word "cards" is split, such that it displays as "6+ card" and then "s and a new" on the next line. Is this correct? I will amend your post to improve the formatting if this is the case, but want to be certain I have this correct first.

oweng (2013-09-05 10:23:58 +0200)

I've attached to my answer a screenshot of LibreOffice Writer showing a small file that contains just one paragraph.

johnaaronrose (2013-09-08 09:03:39 +0200)

1 Answer

answered 2013-09-06 18:42:34 +0200

johnaaronrose

updated 2013-09-08 09:02:36 +0200


It's difficult to show the presence of non-breaking spaces on text pasted into this website's questions & comments. I've only just realised that files can be attached. I've attached a screenshot of LibreOffice Writer using a small file containing just one paragraph.

I think there is a bug in the way that non-breaking space characters cause arbitrary line feeds on very long lines. If they worked correctly, then offending paragraphs would only be on one line rather than multiple lines.

Please see fdo#68924 for more detail.
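
Since no-break spaces look identical to ordinary spaces once pasted into a post, one way to show them is to make them visible in the text itself. The following is only a minimal sketch, assuming the paragraph has been copied out of Writer as plain text; the function name `reveal_invisibles` is hypothetical and not part of LibreOffice.

```python
import unicodedata


def reveal_invisibles(text: str) -> str:
    """Replace every whitespace or control character other than the
    ordinary space (U+0020) with a visible <U+XXXX NAME> tag."""
    parts = []
    for ch in text:
        is_hidden = ch.isspace() or unicodedata.category(ch).startswith("C")
        if ch != " " and is_hidden:
            # Control characters such as line feed have no Unicode name,
            # so fall back to a placeholder label.
            name = unicodedata.name(ch, "UNNAMED")
            parts.append(f"<U+{ord(ch):04X} {name}>")
        else:
            parts.append(ch)
    return "".join(parts)


# Hypothetical sample: a no-break space hidden between "6+" and "cards".
print(reveal_invisibles("a spade raise shows 6+\u00a0cards"))
```

Pasting the tagged output into a question or comment shows exactly where the unusual characters sit, without needing a screenshot.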


You should now have enough karma to attach files to your posts. From the bug report, it is now obvious that the problem is one of the no-break space (U+00A0) being used extensively throughout the OCR text. There should be a setting in your OCR software where you can adjust this, but if not, you will be faced with performing a global find/replace. The Unicode line-breaking algorithm is a complex piece of logic with sometimes unpredictable results.

oweng (2013-09-07 01:35:48 +0200)
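
The global find/replace suggested in the comment above could also be done outside Writer, on the OCR output saved as plain text. This is a rough sketch under that assumption; the helper names `count_nbsp` and `replace_nbsp` and the sample string are made up for illustration.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE, the character reported in the OCR text


def count_nbsp(text: str) -> int:
    """Return how many no-break spaces the text contains."""
    return text.count(NBSP)


def replace_nbsp(text: str) -> str:
    """Return a copy of the text with every U+00A0 replaced by an
    ordinary space (U+0020), so that lines can wrap at those points."""
    return text.replace(NBSP, " ")


# Hypothetical sample resembling the OCR paragraph from the question.
sample = "a spade raise shows 6+\u00a0cards and a\u00a0new suit"
print(count_nbsp(sample), "no-break spaces found")
cleaned = replace_nbsp(sample)
print(cleaned)
```

Counting first gives a quick check that U+00A0 really is the culprit before any text is changed.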
