I have a sentence I would like to paste into a column in Calc, each word on its own row.
For example:
The
dog
jumped
over
the
cow
.
Is that possible? And if so, how?
Cheers
I have a sentence I would like to paste into a column in Calc, each word on its own row.
For example:
The
dog
jumped
over
the
cow
.
Is that possible? And if so, how?
Cheers
to keep the formula alive, copy, select A3 and choose menu Edit
- Paste Special
- Paste Unformatted Text
- OK
(the same that Ctrl+Shift+Alt+V
- Enter
).
If the original Text is in A1, enter into A2 the Formula:
=SUBSTITUTE(A1;" ";CHAR(10))
then select A2 and do →→Data→Text to Columns (without any settings) →→ok
***on windows its maybe CHAR(13) instead CHAR(10) … on Linux both Substitutions works the same way!
Better than my clunky method!
Awesome, thanks!
Another Variant with pure Monster-formulas:
A2 → =LEFT(A$1;FIND(" ";A$1))
A3 →
=IFERROR(LEFT(SUBSTITUTE(A$1;CONCAT(A$2:A2);"";1);
SEARCH("( |.$)";SUBSTITUTE(A$1;CONCAT(A$2:A2);"";1)));"")
#!!manual Linebreak for readability!!###
and pull down as far you need.
What about punctuation, quotation marks…?
A function TEXTSPLIT() might be the means of choice. Unfortunately it doesn’t exist.
The function REGEX(), available since V 6.2 can be used to do it by a formula under array-evaluation.
See attached example for details.
disask70908splitToWords.ods (12.5 KB)
Cool, thanks. Is there anyway to make it take longer sentences, or maybe paragraphs?
Is there a way to copy the resulting column?
Continue with Lupp’s solution (very nice!) but re-enter the REGEX column (B4 and down) as a longer Ctrl+Shift+Enter formula. Go as far down as you want to accommodate your longest expected sentence.
You change the size of a CSE formula by Ctrl+/, F2, Ctrl+a, Ctrl+c, Esc, Esc to copy the formula itself (not the range), delete what is there, then re-select the larger sized CSE range, then F2, Ctrl+v, Spacebar, Ctrl+Shift+Enter to reapply the formula as an array formula.
Now you’ll have ugly #N/A’s on the extra cells in column B below the last word of your sentence. Create a style based on the style you want to use (likely just Default) and make the text the same color as the background (likely white). Mark your array in column B again with Ctrl+/ then use conditional formatting and apply your new style for is error. The ugly #N/A’s are now invisible. Remember, they will still trigger page printing, etc., though.
You can copy the resulting column just selecting your sentence words (without the #N/A’s), then Ctrl+c to copy, then move to where you want the result and use Paste-Paste Special-Text… to create a text-only copy. (You might need another Paste Special choice if you have numbers in the sentence.)
I tried copying the cells in the column, but it won’t let me, there’s no option to copy either. If I try to click one of the cells, it says you can’t change the array.
If you don’t mind several steps, including a scratch sheet:
create a new sheet to work in & enter your text into a cell, say A1
With the cell selected, click Data > Text to columns and choose as separator Space. You will see in the preview that the text is separated into individual cells along a row
Select the cells with the text, Copy, select Cell A1 and click Edit > Paste Special > Paste Special, in the dialogue select Transpose and OK. You will see your words in a column.
You will have to sort out punctuation with Find and Replace
Thank you!
Yesterday I attached a demo only containing a solution by a formula with array-output.
There are a few unavoidable disadvantages, but in addition there are annoying changing issues which I judge to be bugs. For obvious reasons formulas with array-output aren’t used by many on a regular basis, and I also mostly avoid them. Issues are rarely reported as bugs - and won’t be fixed soon therefore.
However, the main purpose of the mentioned attachment was to show how the REGEX() function can be used in the given case. Accepting a probably insignificant loss in efficiency, I also made a not yet published demo how to get the results by single-cell formulas. It’s now attached here:.
disask70908splitToWordsE.ods (14.3 KB) (Lime colored column.)
Just had another glance on yesterday’s demo, and found it was unnecessarily complicated in some places. New version attached:
disask70908splitToWordsE2.ods (15.0 KB)
Well, the interactive solution (based on a formula needing to be applied in advance) by @karolus is well recommendable - but that’s case dependent.
Like with all the other suggestions, its correct working is based on the assumed assertion that the input isn’t a sentence in the ordinary sense of the term, but a space-separated list of words.
That’s your (the OQ’s) responsibility since you didn’t make things clear.
Assuming your “string” (subject) actually is meant to mean “sentence” (text of the question) I chose a more generalizing example in place of your dog/cow thing.lacking the full stop at the end.
In a real sentence words can be separated by spaces, but also by other non-word-characters. As opposed to the SUBSTITUTE() function which not can search for RegEx (even if the related option is enabled), REGEX() allows to address all the non-word-characters by simply using the shorthand group \W
.
Since human languages take pride in supporting and making mandatory nonsensical constructs, this can’t be a final solution. English, but also other languages use, e.g, the apostrophe -which is a non-word-character by general classification- as a functional word-character in some cases (Shakespeares 'tis
, genitive-infix and sometimes suffix and all the nonsense) where a purely syntactical recognition is complicated if not impossible in texts (plain ASCII e.g.) where the apostrophe also may be used as a second-level-quotation-mark…
@Lupp : I aggree, to cover all the odds and ends … Calc is defintly the wrong tool…but to some acceptable degree it is possible to normalize a given Text by ‘search&replace’ … if there is need for advanced Text-tokenizing we should use something like Natural-Language-Toolkit