Nested #databases in #Base

  • Issue

    Sometimes I want structured key-value pairs inside a database entry. In Python, the equivalent would probably be nested dictionaries. How do I achieve this?

  • Examples

    # Standards
    
    ["Dates"]     == ["ISO 8601"]
    ["Countries"] == ["UN M49",
                      "ISO 3166"]
    ["Cities",
     "Towns"]     == ["UN/LOCODE"]
    


PS /home/rokejulianlockhart> uname -a
Linux RQN6C6 6.3.9-1-default #1 SMP PREEMPT_DYNAMIC Thu Jun 22 03:53:43 UTC 2023 (0df701d) x86_64 x86_64 x86_64 GNU/Linux
PS /home/rokejulianlockhart>

of cpe:/o:opensuse:tumbleweed:20230701 (per cat /etc/os-release)

PS /home/rokejulianlockhart> snap info libreoffice
name:      libreoffice
publisher: Canonicalâś“
store-url: https://snapcraft.io/libreoffice
contact:   https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bugs?field.tag=snap
license:   unset
commands:
  - libreoffice.base
  - libreoffice.calc
  - libreoffice.draw
  - libreoffice.filebug
  - libreoffice.impress
  - libreoffice
  - libreoffice.math
  - libreoffice.writer
snap-id:      CpUkI0qPIIBVRsjy49adNq4D6Ra72y4v
tracking:     latest/edge
refresh-date: today at 11:49 BST
channels:
  latest/stable:    7.5.4.2 2023-06-12 (279) 1GB -
  latest/candidate: 7.5.5.1 2023-07-04 (281) 1GB -
  latest/beta:      7.5.5.1 2023-07-04 (281) 1GB -
  latest/edge:      ↑                            
installed:          7.5.5.1            (281) 1GB -
PS /home/rokejulianlockhart>

HSQLDB and Firebird.

Your specification is a bit terse to fully understand the need.

For scalar values, the SQL equivalent formulation would be something like:

CREATE TABLE kwpair
(  id TEXT UNIQUE
,  key TEXT NOT NULL
,  value TEXT
)

You reference a specific record through its id field.

When the value is a list you have a 1:n relationship and the value is accessed with an indirection through another TABLE.

Several variation are possible like:

  • keyword can’t change => put value directly in the “main” record
  • any number of keywords can tag the “main” record: create an indirection with a m:n relationship (through an intermediate TABLE)

For more detailed answer(s), edit your question to mention OS name (looks like it is some Linux distro), LO version and database backend.

@ajlittoz,

Edited.


I apologize since your response seems well laid out and concise, but I’m not much of an expert at databases. I simply want to be able to do something like

Standards = {"Cities": "ISO 3166",
             "Towns": "UN M49"}

where Standards is the database column key, and the dictionary keys and data are represent a nested version of

Cities = "ISO 3166"

Does that clear anything…? I’m used to being able to nest data structures as a programmer in order to create more complex data structures (XML, JSON, etc.) so this database stuff seems alien to me in that it doesn’t appear as extensible as I expected. I’m hoping this is due to my own incompetence.

And relational databases use one special model of tables, wich can be searched for very performant. So maybe a relational database is not the right tool to choose, or expect a lot of work to translate between two worlds…
.
You may put json strings as values in a table, but then you loose the efficient indexing and extraction, when you need to inspect the elements of every string. Some may be found using LIKE but this will not always be usable.
.
Compare it to a program like grep to find data in text-files. If you ask the program to work on archives (zip) it has to do a lot of work to uncompress data for the search, so this slows down seriously.
.
Some databases have added extensions to access especially JSON-Tokens but this is usually meant as help for importing/exporting.

Actually it is quite extensible, and your hierarchical data can maybe accessed with common-table-expressions.
Find some examples on the Sqlite-website (but dont ask me to explain this)

https://sqlite.org/lang_with.html#sudoku

This is not “simple”. On one side you have a Python expression involving a complex object called a dictionary. This data object is implemented (in memory) with hashing tables, a heap and various pointers. In short it is a “dynamic” mix of an algorithm + data. On the other side, a database is a way to store “statically” data in the long term. Data is recorded on disk in an elementary form which is described conceptually as columns (an array of identically typed basic data – numbers, strings or sometimes dates) in a table (a collection of columns where each row is a record).


As you see, there is no pointer within the basic data, with the consequence that no abstract data type, like a tree or “dictionary” is possible. There is even no array, making it impossible to store “naively” a list. However, workarounds allow to store any data architecture in a DB, thanks to the powerful and efficient query feature in the DB engine.


For example, contents of a list will be stored in a separate TABLE where each element consists of an id (a key) and the other columns making the element value. All members of the list share the same key. A different list uses a different key. In the “main” table, the column supposed to be list-valued contains the key specific to the list.

To retrieve the list, you query the DB on the “list table” with the key. You then get a collection of records = all members of the list.


You must understand that DBs and programs in any languages live in different “worlds”. You must translate concepts of one world into concepts in the other world. It is your programmer’s responsibility. In many languages, designers have proposed library to simplify the job. There are even “frameworks” with generic interfaces to DBs, but being generic and 'universal" they need to be understood mastered first.


There is also a quick “solution” avoiding the need to translate your abstract objects: BLOB (binary large object). A BLOB is yet another basic data in DBs. It was originally provided to store data which are neither numbers nor strings (think of images). The DB engine makes no assumption about what is inside (and therefore doesn’t try to optimise either for storage nor speed). Consequently, you can store directly any Python object in such a column. However, there is an important caveat: internally a Python object uses hardware pointers to manage its structure. You can’t store pointers, even in binary form, in a DB because the object will never be reloaded at the exact same address as where it was created. Consequently a Python object must be serialised (is “pickled” the Python term?) before being stored into the DB and unserialised on reload. This again must be done by yourself.

Anyway, using Base, you can’t spare the task of describing your data in a schema. This schema is written in SQL language. There are slight variations in SQL depending on the backend. After organising your data storage, you have to write queries. Each query addresses a particular aspect of your data. Some queries can be relatively complex likes ones retrieving records which contains lists or other high-level sub-records.

1 Like

Do you propose a different tool instead? I haven’t found any solution to nested data storage with reuse of keys except Nested SQL - Wikipedia, which was why I expected a database was the correct tool.

Surely they’re also used for dynamic data storage, @ajlittoz?

Base is not a database. It is a tool to work with relational database (SQL databases). SQL does not support the syntax you have in mind. If you want to work with JSON, apart from this extension LibreOffice is completely unrelated to JSON.

1 Like

Maybe read about NoSQL databases like CouchDB or MongoDB. These usually provide storage for json-like data.

1 Like