Dynamically integrate external data sources (comparable to Power Query in Excel)

Operating system: Linux Mint 22.2 Zara - LibreOffice Version 24.7.72 - new to Linux and LibreOffice

Hello, I have searched and found several posts that deal with either Power Query functionalities for LibreOffice or linking external data sources. However, I have not found a real answer to my question, so I am posting it here:

I regularly receive *.csv files with exactly the same structure in a shared directory. I want a table to query this data (all files - including newly added ones!) when called up, update it and display consolidated data it in a slightly modified form (columns hidden and rearranged).

I’m not sure whether it’s better to use Calc or Base and how to get started. I’ve already made a few attempts, but nothing has worked the way I would like it to or — more likely — I didn’t really get it.

In Excel, I created a query with Power Query in a table for this purpose – this query updates at the touch of a button, hides unnecessary content and rearranges the remaining columns the way I want.

Best regards, and I look forward to an exciting exchange here in the forum – unfortunately, I will probably be “taking” a lot at the beginning and will only be able to “give” something back in the medium term.

Well, LibreOffice’s default database is “embedded HSQL”. When you create a new database while the use of Java is enabled and the use of “experimental features” is disabled, then you get a Base document with an embedded HyperSQL database (see Tools>Options>Advanced). This database engine supports linked text tables. However, the GUI of LibreOffice Base ignores this feature.
You may import csv into Calc, and if you do it right, it is easy to transfer these data into some database.
Personally, I handle all my csv with HSQL because in the long run, it is far easier than struggling with spreadsheet import. Most of the setup is done in SQL, and a little macro helps to pick the right file and exchange the linked csv.
As always, your personal flavour of csv is the big unknown, which makes it impossible to suggest concrete solutions.

1 Like

Hello @Villeroy, I don’t have a particular preference for the *.csv file format. It’s just that I receive reports from a system several times a day in this format – in a predefined directory. In order to be able to analyse and evaluate this data over longer periods, I always need the current data in a table – preferably with the columns that are important to me at the beginning and without columns that are irrelevant for the analysis. So my primary concern is the process of reading all files from a defined directory, consolidating them in a table and, ideally, selecting the required content. If the problem is the *csv format of the sources, I could see if this can be changed in the supplying systems – which table format would be more suitable? My knowledge of SQL or macro languages is rather limited - what is the absolute minimum required for this reading process?

That would be brilliant. ISO date-times and decimal points do work with any flavor of SQL.

This requires either a true database (such as HSQL) collecting all the csv data in a single table or a macro based spreadsheet solution doing something similar. I have prepared a spreadsheet template with macros: Apache OpenOffice Community Forum - Merge sheets of spreadsheet documents - (View topic)
Fill out the setup form. For correct csv-import you need to open one of your files manually and then call a helper macro reading the right import options for your flavour of csv.
In order to import csv correctly into Calc, you have to check option “Detect special numbers”. Unfortunately, this option is off by default, but your changes are stored in the configuration.
Select “English (USA)” as import locale if you have M/D/Y dates and point decimals, “German (Germany)” for dates like “31.12.1999” and comma decimals. You don’t need to struggle with my macro code.

1 Like

Running a Unix system, you have the utilities to analyze any amount of text data without any office suite nor database engine.

1 Like

That may be true, but I have just switched from Windows to Linux and do not need any further challenges. Furthermore, IT is merely a means to an end for me – I do not need to understand every detail of how things work – the result is what matters to me. I’m willing to invest a certain amount of effort to achieve the same result that I got with Excel and Power Query in five minutes – but only a certain amount, not any amount. I’m currently testing your macro, but I keep running into errors.

:question:

1 Like

I’m also a bit ambitious – I’ll try a little more and if I can’t figure it out, I’ll report the error here – maybe you can help me then.

How could it be possible by any means without any info at all?

  • Tools>Options>Security, button [Macro Security…]
  • Select the highest level of security.
  • Specify some directory(ies) where you want to store documents able to call macro code. This may include your documents directory but NOT the downloads directory.
  • Store all your text files in such directory or any subdirectory of such directory.
  • Open my “SheetMerger” template and save the new file in the same directory.
  • Cell B3: empty (same directory as your files, click the […] button otherwise)
  • B4: shell pattern of the files in question, for instance *.csv
  • B5: empty for now
  • B6: empty or 0 (any csv file gives only one sheet)
  • B7: >0 if your csv inclueds any header row(s), empty or 0 otherwise
  • B8: empty or 0
  • B9: to be filled out by a helper macro
  • Open one of your text files. Do not ignore any option. Check “special cells”, some appropriate locale and the right sepeartors.
  • With that text file in the active window, call Tools>Macros>Run… navigate to the setup document, library “SheetMerge”, module “FilterOptions”, select “setFilterOptions” and click the [Run] button. The macro will fill out cell B9 for you.
  • Finally click the [Go] button on the setup sheet.
1 Like
  • The [Go] button calls a macro that opens all the specified files one by one.
  • Inserts as many rows as being found in a file (minus header rows).
  • Writes data into the inserted cells one row below and one column right of the cell named “Target”. The column directly below “Target” gets the name of the file. This is useful if someone needs to keep track of the origin.
  • The cells right of the target cell remain untouched. You can use it as a header row for column labels.

Because the macro inserts new rows, all references in formulas, charts etc. update automatically to the growing size of the data range.

1 Like
  • Thank you for your many suggestions, and perhaps we have misunderstood each other: I did not expect you to help me now – with an error that I am not prepared to describe in more detail.

  • I wanted to say that I am keen to try a few things out for myself. Only if I am unsuccessful will I contact you here and ask if you would be willing to help me again.

  • Yesterday I had to work late into the night and today I have to take care of my sick parents – so it may be the weekend before I can deal with trivial matters such as Calc macros again.

…und wenn ich mir deinen vollen Namen so ansehe: Wir könnten uns auch auf Deutsch unterhalten bzw. diesen Dialog in der deutschen Community weiterführen, oder?

See you on the other side.