We will be migrating from Ask to Discourse on the first week of August, read the details here

Ask Your Question
0

More Efficient SQL for Nested Queries

asked 2021-01-25 03:26:40 +0200

Phrance QueTara gravatar image

updated 2021-01-25 03:30:32 +0200

LO_7.0.2.2, HSQLDB embedded & registered, Windows10_64bit

My query contains 3 nested sub-queries. Runtimes are ok, besides the very last nesting.

query_1                 Runtime: < 1 sec
query_2(query_1)            Runtime: < 1 sec
query_3(query_2(query_1)        Runtime: ~ 1 sec
query_4(query_3(query_2(query_1)    Runtime: < 7 min 30 sec

By "Runtime" I mean: Time from dbl-clicking the saved query icon to the time when the result is displayed.

QUESTION:

Is there a more efficient way of writing the query, so it runs faster and still return the same record set?

DETAILS:

1.) Create Table Definition for linked data-text-file.csv

CREATE TEXT TABLE "DOE-004" ("Recorded Time" VARCHAR(20), "Sampled Time" VARCHAR(20), "Pressure" VARCHAR(20), "Temperature" VARCHAR(20)

Menu/View/Refresh Tables ==> the new table is now showing up in tables container with the little squared-arrow logo

2.) Now link data-text-file.csv in same dir of this_LO_base.odb

SET TABLE "DOE-004" SOURCE "pressDataTest1.CSV;ignore_first=true;encoding=UTF-8";
  • Note: Data in this file is messy, column alignment parses fine, but fields can contain numbers, &/or text (see query_1 below)
  • The data-text-file.csv does not contain an index column, the 1st column contains a whacky timestamp, and with several header rows.

Saved query_1:

SELECT "Pressure" FROM "DOE-004" WHERE "Pressure" < 'A'

  • Runtime: Less than one second
  • Returns: 1019 records
  • Purpose: get all the pressure records which begin with a number, ignore records beginning with letters

Saved Query_2: (contains nested query_1)

SELECT MAX( "Pressure" ) FROM
( SELECT "Pressure" FROM "DOE-004" WHERE "Pressure" < 'A' )

  • Runtime: Less than one second
  • Returns: 1 record
  • Purpose: get record with maximum pressure record. NOTE: This max value can occur several times with different corresponding timestamps

Saved Query_3: (contains nested query_1, query_2)

SELECT MAX( "Recorded Time" ) FROM "DOE-004" WHERE "Pressure" =
( SELECT MAX( "Pressure" ) FROM
( SELECT "Pressure" FROM "DOE-004" WHERE "Pressure" < 'A' )
)

  • Runtime: around one second
  • Returns: 1 record
  • Purpose: get record with the latest timestsamp at maximum pressure record.
  • NOTE: Max pressure is likely to occur at several non-consecutive timestamps.

Saved Query_4: (contains nested query_1, query_2, query_3)

SELECT "Recorded Time", "Sampled Time", "Pressure", "Temperature"
FROM "DOE-004" WHERE "Recorded Time" =
( SELECT MAX( "Recorded Time" ) FROM "DOE-004" WHERE "Pressure" =
( SELECT MAX( "Pressure" ) FROM
( SELECT "Pressure" FROM "DOE-004" WHERE "Pressure" < 'A' )
)
)

  • Runtime: 7 minutes and 30 seconds
  • Returns: 1 record with "Recorded Time", "Sampled Time", "Pressure", "Temperature" at max pressure at its latest occurrence.
  • Purpose: get record with the latest timestsamp at maximum pressure record.
  • NOTE: Max pressure is likely to occur at several non-consecutive timestamps.
  • NOTE: This is the record set I actually want to extract.
edit retag flag offensive close merge delete

1 Answer

Sort by » oldest newest most voted
0

answered 2021-01-26 03:44:11 +0200

Ratslinger gravatar image

updated 2021-01-27 22:37:50 +0200

Hello,

Split into two. It is taking a long time (and will get much longer as records are added) because the last step needs to re-evaluate each record again to get the additional information.

Make this statement Query3:

SELECT MAX( "Recorded Time" ) As "MaxTime" FROM "DOE-004" WHERE "Pressure" =
( SELECT MAX( "Pressure" ) FROM
( SELECT "Pressure" FROM "DOE-004" WHERE "Pressure" < 'A' )
)

Then for the final run here is your SQL:

SELECT "Recorded Time", "Sampled Time", "Pressure", "Temperature" FROM "DOE-004" WHERE "Recorded Time" = ( SELECT "MaxTime" FROM "Query3" )

Very little time to run.

Edit 2021-01-27:

Can get very quick result with a single query:

SELECT "Recorded Time",
       "Sampled Time",
       "Pressure",
       "Temperature"
FROM (SELECT "Recorded Time",
             "Sampled Time",
             "Pressure",
             "Temperature",
             "MaxTime"
      FROM "DOE-004",
           (SELECT MAX("Recorded Time") AS "MaxTime"
            FROM "DOE-004"
            WHERE "Pressure" = (SELECT MAX("Pressure")
                     FROM (SELECT "Pressure" FROM "DOE-004" WHERE "Pressure" < 'A')))) A
WHERE "Recorded Time" = A."MaxTime"
edit flag offensive delete link more
Login/Signup to Answer

Question Tools

1 follower

Stats

Asked: 2021-01-25 03:26:40 +0200

Seen: 59 times

Last updated: Jan 27