Extract unique names from table columns in L.O. Base

Greetings.
I have a Libre Office DataBase with movies. In the main table there is a field with the names of the main artists that play in the movie, separated by a comma sign.

Ex: “97 – Vantage Point - Dennis Quaid, Matthew Fox, Forest Whitaker, Sigourney Weaver, William Hurt”

I need to order the movies by artist name, like this:

Adam Sandler:
2 - Murder Mystery
790 - Just Go With It
809 - Funny People

Jennifer Aniston:
8 - Love Happens
9 - Friends
20 - Along Came Polly
790 - Just Go With It

Can I do this? Is it possible to extract unique names separated by comma in a column, and then order them alphabetically, showing the movies where they acted?

Thank your for your help.
UnikNames.odb (20.2 KB)

Yes you can do it

No. Or yes, but with very great difficulty. The fact is that in databases they try to make it so that you can enter a value in one place and then use it where needed. For example, you could add Julia Roberts to the Artists table and simply include her ID in the movie descriptions. Like that - UnikNames.odb (70.4 KB)

JohnSUN,
I thank you very much for your suggestion but it would change completely the structure of my database. I already have a search form in it that allows me to find movies by actors and actors by movies.

Also I’ve tried to add a new Title in your form “Films” (Movie_x) with “Actor_x”, “Actor_y” and “Actor_z” and I couldn’t do it in the subform “Actors”.
It only allows me to add names in use from table “Actors”, which means that whenever I have to add a new actor I should add it separately in that table.
This will complicate tremendously the simplicity of my database.

But one thing I’d like very much to know: how did you do to create your tables “Actors” and “Films” from my table “Registry”?

Best regards.

Yes, you are right - it will change the structure. This change is called Database normalization (the link leads to a Wikipedia article that explains why this is needed and how it is done).
Believe me, this really needs to be done. A simple example based on your data: a misspelling of an actress’s name resulted in two different people playing Blood Diamond and The Dilemma (from a database perspective, not in reality)

Oops, you misunderstood the proposed example - these forms are not intended to maintain a database, this is only intended to demonstrate the principle: data from different tables connected by relationships can be displayed in any order, “movies and their actors” or “actors and their movies”
I could just quote @Villeroy phrase This database is faulty. Forget about filter forms until you implemented your one-to-many relations (this link should take you to a convincing [Example] Relations reflected by list boxes in forms text - please read and try to understand it)

It wasn’t very difficult. First I copied your Registry spreadsheet into a new Calc spreadsheet. Then, using a formula like

=SUBSTITUTE(TEXTJOIN(CHAR(10);1;C2:C200);",";CHAR(10))

I extracted all the actors into one cell and using Data - Text to Columns expanded them into a long column.
Using the Standard filter option No duplication, I got rid of duplicates, sorted the list alphabetically, copied and pasted it into the database as a new table. Base created an ID field for this table and renumbered all the actors

The Films table was derived from the original Registry table by simply removing the Actors column and renaming the table.

All that remains is to create the “junction table” ActorsInFilms which is mentioned in the article [Example #1] Filter/Search with Forms (leveraging SubForms)
This was done using the same Calс tools - formula with TEXTJOIN(CHAR(10)…) and Text to Columns

Yes, but it tremendously simplifies later “searches”. As an actor is only added once, but reports and searches are usually repeated often database designers decided to optimize for the more common task…
.
I guess you are aware of other little problems like “Henry V” also matching for “Henry VI” and VII and you can’t imagine how many problems names of artist may also show (lets start with “the artist formerly known as”).

@Quaresma,
.
I fully endorse the comments made by JohnSUN and Wanderer, your database is poorly structured and unfit for purpose, a little bit of pre-planing and adherence to the basic rules of normalisation would eliminate many of your issues.
.
I used macro code to extract the data from your table called “Registry”, create the necessary tables with appropriate constraints and insert the extracted data.
.
we now have 3 tables “tActors”, “tTitle”, “tTitleID_ActorID”, 2 queries “qActor_Before_Movie_Setup”, “qActor_Before_Movie_Output”, 1 view “vActor_Before_Movie_Setup” (created from the query “qActor_Before_Movie_Setup”)
.
so you now have a normalised db and the query “qActor_Before_Movie_Output” outputs data in the desired format.
.
in summary:
the only essential objects are your table “Registry” and my 2 queries.
when the macro is executed my tables and view are dropped then recreated.
we now need to refresh the tables manually:

  1. from the Base window hit ‘Tables’ icon.
  2. menu:>View>Refresh Tables.

I do not refresh the Base tables via the macro because success is not guaranteed.
.
I actually built the attachment yesterday but did not wish to upload pending your response to JohnSUN.
SplitNames.odb (63.0 KB)

“JohnSUN”, “Wanderer” and “cpb”,

You are absolutely right in your comments. I fully agree with you, but I have to tell you that the database enclosed in this post is just a small sample for testing I gathered from my real database which has plenty other fields and searching possibilities.

In fact, as I asked for in my first post, I just need to order the movies by artist name, so that I can make a report to show the list of artists I have in my database and the movies where they played in, and I have to say a special thanks to “cpb” because he almost did it, and that is the reason I considered his contribution as the solution for my problem.

As you can see in the reports of the attached database, only one report is missing, by “artist name”… and thanks to “cpb” I will try to break my brains studying the macro he generousely made.

Thank you cpb, once again.
UnikNames - reports.odb (33.2 KB)

There must have been a misunderstanding - we were answering the wrong question. Did you need a text document (report) with regrouped data?
UnikNames_rpt.odb (24.0 KB)

1 Like

You absolutely solved this macro charade. I appreciate your commitment to help me solving this problem. A big hug.

@Quaresma,
while the attachment by JohnSUN is an impressive piece of coding it’s not really a practical long term solution to you issues.
.
I suggest that you consider using a modern database all of which have a function equivalent to the HSQLDB.2.x function GROUP_CONCAT.
.
take a look at this attachment it contains a HSQLDB 2.3.4 database comprising of the 3 tables we extracted from your table “Registry”, 3 simple almost identical queries and 3 reports (2 of yours & 1 of mine for Actors).
.
there is also 1 macro coded by DACM Create a new ‘split’ HSQL 2.x database which automatically sets up the connection on loading.
.
you will have access to a host of new functions and the ability to do date arithmetic.
.
I uploaded the attachment as an .odb file but actually it’s a zip file, to extract the files:

  1. change the file extension from ‘odb’ to ‘zip’.
  2. extract the files to a new folder inside a trusted folder.
  3. open the new folder.
  4. to download the jar file:
    4a) download ‘2.3.4’ hsqldb jar from here
    4b) hit ‘2.3.4’
    4c) next to ‘Files’ hit ‘jar (1.4 MB)’
    4d) rename ‘hsqldb-2.3.4.jar’ as ‘hsqldb.jar’ and drop into the new folder (item 3).
  5. double click the odb in the new folder and the database should open OK.
  6. hit the tables icon to ensure data is read to memory.

.
I am hoping to convince you that database normalisation should be adhered to and that using the correct tool is vital for success.
SplitNames_HSQL_2_3_4.odb (79.7 KB)

1 Like

image

2 Likes

I am speechless! And what can I say? You are both right and that is an honor to learn from you!
CPC - I followed every step of your guide and finish with great success. This is the cherry on top of the cake! Thanks!
Wanderer - This reports can also help me find any inconsistencies in actor’s and movies’ names.
Thank you guys!