Ask Your Question
0

Libreoffice Base: How to filter and search in multiple tables? [closed]

asked 2020-06-02 20:37:46 +0200

pmfs1987 gravatar image

updated 2020-08-18 13:20:44 +0200

Alex Kemp gravatar image

Hello.

Using the definition of Libreoffice Base manual (https://documentation.libreoffice.org... p. 198-201), I need to build a query for filtering/searching data in multiple tables, avoiding information repetition.

I am using version 6.0.7.3 of Libreoffice Base, with the default SQL language (HSQLDB).

DATABASE DESCRIPTION

I am building a database for some donations to an archive.

Each donation (“Incorporações”) has the following fields:

  • the donation’s primary key (“ID_incorporação”);
  • the name of the entity that donated the documents (“Nome_entidade”);
  • a brief description of the donation (“Descrição_incorporação”);
  • the oldest known year in which the documents of the donation were produced (“Ano_inicial_incorporação”);
  • the newest known year in which the documents of the donation were produced (“Ano_final_incorporação”).

Each donation is divided into series. The series itself are defined by a table (“Séries”) that has the following fields:

  • the serie’s primary key (“ID_série”);
  • the type of the serie (“Nome_série”).

The relation of the series with the donations (that has a many-to-many nature) is represented in a table (“Rel_incorporação_série”) that contains:

  • its unique primary key (“ID_rel_incorporação_série”);
  • the donation’s foreign key (“ID_incorporação”);
  • the serie’s foreign key (“ID_série”);
  • a full description of the donation’s documents belonging to that serie (“Descrição_série”).

Each series within the donation also have some keywords attached. The keywords, in an identical way to the series, are defined by a table (“Assuntos”) that has the following fields:

  • the keyword’s primary key (“ID_assunto”);
  • the name of the keyword (“Nome_assunto”);
  • the type of the keyword (“Temática”);
  • a description of the keyword (“Descrição”).

Finally, I have made a table for recording which keywords each series has (“Rel_série_assunto”) that contains only two fields, defining, together, the table’s primary key:

  • the foreign key regarding the relationship of the serie with the donation (“ID_rel_incorporação_série”);
  • the foreign key of the keyword (“ID_assunto”).

I have placed a picture of all these tables with all the relationships.

database scheme

PROBLEM DESCRIPTION

  1. I want to filter the various series that are contained in the donations (any field in "Rel_incorporação_série" table) by its keywords ("ID_assunto").
  2. I want to search for words in the series’ and in the donation’s descriptions ("Descrição_incorporação" in "Incorporações" table and "Descrição_série" in "Rel_incorporação_série" table), ideally using only one field as the search criteria.
  3. I want to search for words in the entities that donated the documents ("Nome_entidade" in "Incorporações" table).
  4. I don’t want any repetition of information.

Using a filter table, for storing the filter/search criterias, I created a form for this purpose. I am trying to make one big query, using the examples given in the refered pages of the manual, showing all the information on a subform through a single table. The problem with that solution is that the table captures all the repetitions, regarding how much series I have in a donation, but must important, how much keywords I have in each serie.

Here is the query I tried. "Tabela_filtro" refers to the filter table, while "Nome_entidade", "Descrições" and "ID_assunto1" to "ID_assunto4" to the fields contained in ... (plus)

edit retag flag offensive close merge delete

Comments

Relationship definition in the query search for that in your help link

m.a.riosv gravatar imagem.a.riosv ( 2020-06-02 22:29:15 +0200 )edit

2 Answers

Sort by » oldest newest most voted
0

answered 2020-06-03 16:42:50 +0200

cpb gravatar image

updated 2020-06-09 21:07:24 +0200

this answer replaces the original in order to reflect the objectives as i now understand them.

ok so user pmfs1987 wants to search the form to show only those series which contain ALL of the keywords (filter terms). if no filtration terms are specified then show all records.

this is much more complex than showing the series which contain any of the keywords.

i have moved the filtration code relating to the table "Rel_série_assunto" from the WHERE clause to the FROM clause (no point in doing everything twice).

the reason that i have focused on a single query rather than the data form is an attempt to clarify the necessary steps required.

once the user is able to grasp the concept then form design will be relatively simple.

hope it works! i obviously cannot test it.

select
    a."Nome_entidade", a."ID_CDI", a."Ano_inicial_incorporação", a."Ano_final_incorporação", a."Descrição_incorporação",
    b."Descrição_série", b."Ano_inicial_série", b."Ano_final_série",  d."Nome_série", e."Nome_assunto", e."Temática", e."Descrição"
from
    "Incorporações" a
join "Rel_incorporação_série" b on a."ID_incorporação" = b."ID_incorporação"
join
(   select *
    from "Rel_série_assunto"
    where
    (       (select "ID_assunto1" from "Tabela_filtro") is null
            and
            (select "ID_assunto2" from "Tabela_filtro") is null
            and
            (select "ID_assunto3" from "Tabela_filtro") is null
            and
            (select "ID_assunto4" from "Tabela_filtro") is null
    )
    or
        (
            "ID_rel_incorporação_série"
            in
            (
                select r."ID_rel_incorporação_série"
                from
                    "Rel_série_assunto" r,
                    (select "ID_assunto1", "ID_assunto2", "ID_assunto3", "ID_assunto4" from "Tabela_filtro") f
                where r."ID_assunto" in(f."ID_assunto1", f."ID_assunto2", f."ID_assunto3", f."ID_assunto4")
                group by r."ID_rel_incorporação_série"
                having count(r."ID_assunto") =
                (
                    case when f."ID_assunto1" is not null then 1 else 0 end
                    +
                    case when f."ID_assunto2" is not null then 1 else 0 end
                    +
                    case when f."ID_assunto3" is not null then 1 else 0 end
                    +
                    case when f."ID_assunto4" is not null then 1 else 0 end
                )
            )
        )
) c
on b."ID_rel_incorporação_série" = c."ID_rel_incorporação_série"
join "Séries" d on b."ID_série" = d."ID_série"
join "Assuntos" e on c."ID_assunto" = e."ID_assunto"
where 
(
    lower(a."Nome_entidade") like(select '%' || lower("Nome_entidade") || '%' from "Tabela_filtro")
    or
    (select "Nome_entidade" from "Tabela_filtro") is null
)
and
    ( 
        lower(a."Descrição_incorporação") like(select '%' || lower("Descrições") || '%' from "Tabela_filtro")
        or
        (select "Descrições" from "Tabela_filtro") is null
    )
edit flag offensive delete link more

Comments

Sorry, I had to disconnect for a few days. Your suggestion works, but I want the search form to show me the series that had ALL the of keywords selected (AND), not AT LEAST ONE (OR).

I found out about UNION. Using it with IN I've reached the following condition:

(c."ID_assunto" IN (SELECT "ID_assunto1" FROM "Tabela_filtro" UNION SELECT "ID_assunto2" FROM "Tabela_filtro" UNION SELECT "ID_assunto3" FROM "Tabela_filtro"  UNION SELECT "ID_assunto4" FROM "Tabela_filtro") OR ((SELECT "ID_assunto1" FROM "Tabela_filtro" IS NULL) AND (SELECT "ID_assunto2" FROM "Tabela_filtro" IS NULL) AND (SELECT "ID_assunto3" FROM "Tabela_filtro" IS NULL) AND (SELECT "ID_assunto4" FROM "Tabela_filtro" IS NULL)))

But this also shows the series that has AT LEAST ONE of the chosen keywords. But I think with UNION clause I could get a solution to what I want, because the UNION clause can transform the values of the 4 "ID_assuntosn" to a single column. Although I could ...(plus)

pmfs1987 gravatar imagepmfs1987 ( 2020-06-07 20:51:25 +0200 )edit

Also, I don't care if the solution to what I want uses one or various queries. I just want a solution, so if you know about one with more than one query it will be perfect.

pmfs1987 gravatar imagepmfs1987 ( 2020-06-07 21:07:23 +0200 )edit

That worked! Thanks a lot!

I just had to made these additions:

  • Added b.“ID_incorporação”, c.“ID_rel_incorporação_série” and c.“ID_assunto” in the first SELECT to be able to choose these fields for link form to subforms.
  • Added the condition removed OR LOWER (b."Descrição_série") LIKE (SELECT '%' || LOWER ("Descrições") || '%' FROM "Tabela_filtro") (because I want the words saved on “Descrições” to include any results either from a.“Descrição_incorporação” or b.“Descrição_série” fields).

In the mean time I discover another way to solve the problem and I wanted to ask you which one do you think it's faster. I will post that in my answer.

pmfs1987 gravatar imagepmfs1987 ( 2020-06-10 14:06:23 +0200 )edit

@pmfs1987.

it's gratifying to see that you have achieved your own solution.

query execution depends largely upon the size of the derived table.

while i have filtered out unnecessary data you have added to it.

if you can perceive a difference in execution time between the queries with what is likely to be a relatively small result set then the performance gap will be wide.

i do not think that it's possible to measure query execution speed in an embedded database.

you should split your database anyway because embedded databases are prone to corruption.

a split database enables the use of the most recent version of that database currently hsqldb 2.5.0.

if you do decide to upgrade then i think that the 'CASE' structures will require replacement with:

cpb gravatar imagecpb ( 2020-06-10 18:43:12 +0200 )edit

what an effort to post a comment!!! struggling to format this.

(       
    case when (select "ID_assunto1" from "Tabela_filtro") is null then 0 else 1 end
    +
    case when (select "ID_assunto2" from "Tabela_filtro") is null then 0 else 1 end
    +
    case when (select "ID_assunto3" from "Tabela_filtro") is null then 0 else 1 end
    +
    case when (select "ID_assunto4" from "Tabela_filtro") is null then 0 else 1 end
)
cpb gravatar imagecpb ( 2020-06-10 18:45:24 +0200 )edit

Thanks for all the help and all the feedback.

pmfs1987 gravatar imagepmfs1987 ( 2020-06-11 23:56:14 +0200 )edit
0

answered 2020-06-04 00:57:23 +0200

pmfs1987 gravatar image

updated 2020-06-10 16:30:01 +0200

Following @m.a.riosv 's tip, and @cpb 's answer, I have reached a solution. Thanks very much for your precious help!

FORM DESIGN

I have recorded the query in a View, a feature explained in the Libreoffice Base Manual pointed by @m.a.riosv ’s tip.

Then, in the search form I added two independent forms: one for the filter table (Form_Tabela_Filtro) and another fed by the big table created by the View feature (Form_Resultados). From this last form I have added subforms showing the information I needed to visualize without repetition, selecting distinct from the big View table and using the master/slave fields to connect the information together (SubForm_Incorporações, SubForm_Séries and SubForm_Assuntos).

The final form aspect The form's structure

SEARCH QUERY

I used the query suggested by @cpb 's answer with a few minor additions to customize it to my needs:

  • Added b.“ID_incorporação”, c.“ID_rel_incorporação_série” and c.“ID_assunto” in the first SELECT to be able to choose these fields for link form to subforms.

  • Added the condition removed "OR LOWER (b."Descrição_série") LIKE (SELECT '%' || LOWER ("Descrições") || '%' FROM "Tabela_filtro")" because I want the words saved on “Descrições” to include any results either from a.“Descrição_incorporação” or b.“Descrição_série”.

ALTERNATIVE SEARCH QUERY

Although the suggested query works, I have reached another one just by joining 4 duplicates of the "Rel_série_assunto" table and selecting 4 different "ID_assunto" fields (with the help of aliasing). Then I could make the AND conditions regarding the "ID_assunto" field, because now this field is repeated into 4 different fields. Here is the query:

SELECT DISTINCT
    "Nome_entidade", 
    "ID_CDI", 
    "Ano_inicial_incorporação", 
    "Ano_final_incorporação", 
    "Descrição_incorporação", 
    ris."ID_incorporação", 
    ris."ID_série", 
    "Descrição_série", 
    "Ano_inicial_série", 
    "Ano_final_série", 
    rsa."ID_assunto" AS "Assunto 1", 
    rsb."ID_assunto" AS "Assunto 2", 
    rsc."ID_assunto" AS "Assunto 3", 
    rsd."ID_assunto" AS "Assunto 4", 
    rsa."ID_rel_incorporação_série", 
    "Nome_série", 
    "Nome_assunto", 
    "Temática", 
    "Descrição" 
FROM
    "Incorporações" i
    JOIN "Rel_incorporação_série" ris ON i."ID_incorporação" = ris."ID_incorporação"
    JOIN "Rel_série_assunto" rsa ON ris."ID_rel_incorporação_série" = rsa."ID_rel_incorporação_série"
    JOIN "Rel_série_assunto" rsb ON ris."ID_rel_incorporação_série" = rsb."ID_rel_incorporação_série"
    JOIN "Rel_série_assunto" rsc ON ris."ID_rel_incorporação_série" = rsc."ID_rel_incorporação_série"
    JOIN "Rel_série_assunto" rsd ON ris."ID_rel_incorporação_série" = rsd."ID_rel_incorporação_série"
    JOIN "Séries" s ON ris."ID_série" = s."ID_série"
    JOIN "Assuntos" a ON rsa."ID_assunto" = a."ID_assunto"
WHERE 
    (LOWER (i."Nome_entidade") LIKE IFNULL ((SELECT '%' || LOWER ("Nome_entidade") || '%' FROM "Tabela_filtro"), LOWER (i."Nome_entidade"))) 
AND 
    (LOWER (i."Descrição_incorporação") LIKE IFNULL ((SELECT '%' || LOWER ("Descrições") || '%' FROM "Tabela_filtro"), LOWER (i."Descrição_incorporação")) OR (LOWER (ris."Descrição_série") LIKE IFNULL ((SELECT '%' || LOWER ("Descrições") || '%' FROM "Tabela_filtro"), LOWER (ris."Descrição_série"))))
AND 
    "Assunto 1" = IFNULL ((SELECT "ID_assunto1" FROM "Tabela_filtro"), "Assunto 1") AND "Assunto 2" = IFNULL ((SELECT "ID_assunto2" FROM "Tabela_filtro"), "Assunto 2") AND "Assunto 3" = IFNULL ((SELECT "ID_assunto3" FROM "Tabela_filtro"), "Assunto 3") AND "Assunto 4" = IFNULL ((SELECT "ID_assunto4" FROM "Tabela_filtro"), "Assunto 4")

In this solution I also had to get another query for the "SubForm_Assuntos", that grouped together all 4 columns that I have created again into a single one. Also, I had to use a View for this, because the links of master and slave fields didn't worked without activating "Analyze SQL command" option (and that option didn't recognize the UNION clauses in the query). Here ... (plus)

edit flag offensive delete link more

Comments

hello pmfs1987,

your code is fine. you have explicitly declared your table relationships in the 'WHERE' clause. there has been some debate over many years amongst developers as to where these declarations should be made but Fred Toussi the man behind HSQLDB suggests that the 'FROM' clause is the correct place and the reason given is that the 'FROM' clause is parsed and data is retrieved before the 'WHERE' clause is executed. this results in smaller rowsets which speed query execution.

the error you received (you did not describe it) is likely related to duplication of the field "ID_assunto" because you did not explicitly declare it as '"Rel_série_assunto"."ID_assunto"' or using the alias C 'C."ID_assunto"' e.g.

AND ( C."ID_assunto" = IFNULL( ( SELECT "ID_assunto1" FROM "Tabela_filtro" ), C."ID_assunto" ) )
cpb gravatar imagecpb ( 2020-06-04 15:19:14 +0200 )edit

I just realized that the multiple conditions for "ID_assunto" field are not working when I choose multiple different fields in "Tabela_filtro"."ID_assunto1", "Tabela_filtro"."ID_assunto2", "Tabela_filtro"."ID_assunto3" or "Tabela_filtro"."ID_assunto4". I have searched for solutions, but they suggest using IN clause, which, in this case, does not seems to be valid (because I can only use one SELECT statement in the IN clause). I could keep only one field in "Tabela_filtro" table to filter the "ID_assunto", but maybe there is a solution for that. Do you think there is a way to maintain these multiple fields with a different solution in the query (or even to reformulate the search form structure)?

pmfs1987 gravatar imagepmfs1987 ( 2020-06-04 21:16:51 +0200 )edit

i have updated my original answer yet again with regard to your filtration issue.

cpb gravatar imagecpb ( 2020-06-05 01:37:02 +0200 )edit
Login/Signup to Answer

Question Tools

1 follower

Stats

Asked: 2020-06-02 20:37:46 +0200

Seen: 246 times

Last updated: Aug 02