Ask Your Question
3

How do I find the differences between two lists in LibreOffice? [closed]

asked 2013-06-01 16:53:45 +0200

graviton gravatar image

updated 2013-06-04 16:08:44 +0200

qubit gravatar image

Here's my problem: I need to take two lists, each consisting of many tens of thousands of items, most of them identical between the two lists, and find the relatively few new items in the second list.

I posed this question on a tech advice site, asking if any software existed that could do this. One answer I got included a YouTube video where a guy showed exactly how to do it in Excel, using three separate techniques. The first two were "match" and "vLookup".

So my question is: Do either of these two techniques have an equivalent in LibreOffice, and if so, where do I find explicit instructions on how to use them?

edit retag flag offensive reopen merge delete

Closed for the following reason the question is answered, right answer was accepted by Alex Kemp
close date 2015-11-02 02:32:13.440109

2 Answers

Sort by » oldest newest most voted
3

answered 2013-06-01 20:24:07 +0200

froz gravatar image

updated 2013-06-01 20:27:52 +0200

The best way of doing some comparisons on large amount of data would be database, but it is required that you understand database tables, SQLs etc. You probably don't, because if you do, then you would probably not ask this question.

Lets try to solve this problem using LibreOffice Calc. If I understand correctly you have two lists of data. First list (e.g. list A) is in column A and second list (e.g. list B) is in column B. You have some data in list B that does not exists in list A. But as I understand you have no item in list A that does not exists on list B. You would like to know which data from list B do not exists on list A. Is it?

Simple sample:
A1: AAA
A2: BBB
A3: CCC
B1: BBB
B2: DDD
B3: CCC
B4: AAA

In C1 type in formula: =IF(ISERROR(MATCH(B1;$A$1:$A$1000000;0))=1;B1;"") and copy formula down from C1 to C4.
The formula displays in column C the values that are in the list B and do not exists in list A.

Hope this helps. If not please describe your problem in more detail.

edit flag offensive delete link more

Comments

Works for me (4.0.3.3) in a little sample set. Took about 23 seconds thinking to work out how to isolate the "new" values displayed in C column, but quite effective, really. I'm not sure how now to turn those "reference" values in C into text values I can copy into List A. Care to help on that one? (I suppose the is the "match" approach mentioned by OP.) Thanks @froz!

David gravatar imageDavid ( 2013-06-01 20:39:12 +0200 )edit
1

@David, do you mean something like this?

oweng gravatar imageoweng ( 2013-06-02 02:57:45 +0200 )edit

@oweng - Yes, exactly like that. Many thanks!

David gravatar imageDavid ( 2013-06-02 14:44:57 +0200 )edit

Works perfectly. Thank you.

Gary gravatar imageGary ( 2014-08-18 06:24:05 +0200 )edit
2

answered 2013-06-04 16:35:06 +0200

qubit gravatar image

Hiya,

Unless the data is already in an ODT or otherwise requires LibreOffice, I would go old-school and use the command line.

(On an Ubuntu system -- might require tweaks for OSX or other *nixes. On Windows, install Cygwin or something)

  • Put your lists in two different files a.txt and b.txt
  • Run this command:

    cat a.txt a.txt b.txt | sort | uniq -u
    

The "uniq- u" invocation will remove all duplicate lines. We include the "A" list twice to make sure that we remove all of those items (even items that don't appear in list "B").

edit flag offensive delete link more

Comments

Comparing files on Linux: comm -13 <(sort -u a.txt) <(sort -u b.txt)

froz gravatar imagefroz ( 2013-06-05 08:24:37 +0200 )edit

Love this command line stuff! So powerful, so simple. Right tool for the right job.

David gravatar imageDavid ( 2013-06-05 23:07:09 +0200 )edit

Question Tools

Stats

Asked: 2013-06-01 16:53:45 +0200

Seen: 39,483 times

Last updated: Jun 04 '13