# How to get a list of all unique values from two columns? [closed]

Hi! I have two columns containing values like "aaa", "aab", "aac", etc. They repeat n times in a random pattern and new values may emerge after few intervals. I need to create a list in a new column that would show all the unique values from both columns.

It would look something like this:

If I had just one column I could use Pivot table with the necessary Row Field, but since I have two columns, this does not work. Any suggestions?

edit retag reopen merge delete

### Closed for the following reason the question is answered, right answer was accepted by Aigars close date 2017-02-01 15:48:37.031419

Sort by » oldest newest most voted

Assuming the two columns are containing continuous cell ranges.

A
(1.) Collect some meta information first in a few available cells: Numbers of used cells for each column.
(2.) Dedicate two extra columns to calculate (MATCH or COUNTIF or ...) for each source column independently which elements occur there the first time (top down) inside the column .
(3.) Modify the formulae for the second source column to also return FALSE if the element already occurred in the first column.
(4.) Collect new meta info: How many distinct elements from each column.
(5.) Use one of my many examples demonstrating how to "compact" a selection based on such helper columns twice.
(6.) Make one column from the two results using OFFSET again in a final column.

B Use the obvious interactive means: Copy/Paste and Filtering.

For your convenience (and as a final demo concerning questions of the kind) I made this new example doing as described under A.
(Appended:) Please be sure to note that the data for this demo are generated randomly and will thus change on full recalculation.

(Editing with respect to the discussion redarding the answer by @rautamiekka)
For requests of the present kind it mostly essential to know whether the "data" to be processed are actually data (direct content of the cells; types 1 or 2) or probaly formula results. In the second case the suggestion to register a sheet as (a table of) a database may not apply, IMO. In addition a database will only accept sheets of a restricted structure as tables. If thie condition of only containing constants is met I would suggest to use interactive means as demonstrated in this second attachment.

more

Wow, thanks for going the extra mile and providing an example. I will look into it, might be what I needed.

As for the "B" option, it's no good, there are about 3k entries and it needs to be automated.

( 2016-09-14 12:44:45 +0200 )edit

Automation may require a bit of progranning user code. I will not encourage you to go this way, but if you do not feel the need to migrate your application to a database anyway, this is also an option in the specific case. I personally prefer solutions by formulae what is the "Spreadsheet-Do" IMO. However we have to consider formula solutions to get ineffective when applied to very large data collections.
Isn't there a way to block multiple entry of keys from the beginning?

( 2016-09-14 12:54:38 +0200 )edit

Sorry! The announced "second attachment" was not accepted by the askbot for unknown reasons. I will try again later.
As you see it finally worked after a few 'Refreshs' in my browser. This askbot really adds some additional waste of time.

( 2016-09-14 13:00:00 +0200 )edit

Ok, so I tested your spreadsheet with a small portion of my data, it seems to work very well. Although I don't yet know why and how :D (I have to run) Though, my spreadsheet is not meant to be infinite. Data fields have a fixed size, but I'm starting to wonder if it could in fact be turned into a continuous table...

The very foundation of my spreadsheet is to enter those values in these two columns with a random pattern. They are supposed to repeat.

( 2016-09-14 13:08:15 +0200 )edit

Hi @Lupp & @Aigars

For information, in this case it is no necessary to filter MERGER Does not begin with... ZzZzZz%.

In the field name just select - none - and of course, don't forget to tick OptionsNo duplications

Regards

( 2016-09-14 14:00:10 +0200 )edit

@pierre-yves samyn: Thanks for the reminder! Je me sens un peu faible ... now and then.

( 2016-09-14 15:02:09 +0200 )edit

@Lupp - I would not say this to someone who provides so many answers. You rock :)

( 2016-09-14 16:17:08 +0200 )edit

@Lupp Yes he does! :)

( 2016-09-14 18:39:05 +0200 )edit

(You made me flush. I am just after the karma.)

( 2016-09-14 21:25:28 +0200 )edit

( 2016-09-15 08:30:34 +0200 )edit

Something can be done with Menu/Data/Consolidate, but there is a bug and only the first range to consolidate it's preserved in the option when the file it's saved.

more

Worked fine for me, with no bugs, but I was looking for a more or less automated solution.

( 2016-09-15 08:29:45 +0200 )edit

I ran same test with this code and it works perfectly.

SELECT
"Column1"
FROM
"_TABLE_"
WHERE
"colum1"
NOT IN(
SELECT
"Column2"
FROM
"_TABLE_"
) UNION
SELECT
"Column2"
FROM
"_TABLE_"
WHERE
"Column2"
NOT IN(
SELECT
"Column1"
FROM
"_TABLE_"
)

more

This looks like SQL, but I'm working with a spreadsheet in calc.

( 2016-09-14 10:27:42 +0200 )edit

@Aigars: then it might not be possible without complex cell code or a macro. I seem to recall SQL can be used in Calc too.

( 2016-09-14 10:35:00 +0200 )edit

You can register the Calc file as a database and then use that SQL code.

( 2016-09-14 11:25:03 +0200 )edit

It's starting to sound promising, but I lack the knowledge in this area.

( 2016-09-14 11:32:17 +0200 )edit