Calc: Way to join/merge multiple data ranges to one

Hi there!

Assume we have defined multiple data ranges in Calc document. Data sources are LO Base queries against multiple Base databases, all ranges have identical structure - column headers, data types and so on, see image attached. Is there a pure Calc way to combine these multiple ranges into one new range? I have Macro and SQL based solutions for it, but I am just curious, if there is a Calc based alternative for SQL Union operator's functionality.

edit retag close merge delete

1

What do you want to do with the combined range? For example, to sum them, specify all three ranges like this: =SUM(B3:B6;D3:D6;F3:F6). Or do you want to view the results in a list, with no further calculations?

( 2018-06-10 01:21:17 +0200 )edit
1

Hi Jim!

Thanks a lot for your comment. Such combined range would be the final result with no further operations applied. So it would be just a list with data.

( 2018-06-10 01:38:55 +0200 )edit

Sort by » oldest newest most voted

The following is an overview of what can be done with Calc functions. If you have an SQL solution, then that might be better than these solutions, since databases do this naturally.

The union operator in Calc is a tilde (~), as explained here. This operator can be used in the INDEX functionto merge two or more ranges. The following formula was adapted from https://forum.openoffice.org/en/forum.... Enter it as an array formula.

=INDEX((B3:B6~D3:D6);{1;2;3;4;1;2;3;4};{1;1;1;1;1;1;1;1};{1;1;1;1;2;2;2;2})


However, this approach is of limited use because currently, there does not seem to be any way of generating the values for the arrays dynamically. For example, the formula below produces an error because ~ does not work in the context of the second parameter to INDEX.

=INDEX((B3:B6~D3:D6);(ROW(B3:B6)~ROW(D3:D6));{1;1;1;1;1;1;1;1};{1;1;1;1;2;2;2;2})


So, the preferred way of merging ranges is to use OFFSET instead. This requires several helper columns to determine the parameters for OFFSET based on the source columns.

A good example of the OFFSETapproach is at https://forum.openoffice.org/en/forum.... Here is a copy of the file, in case something happens to the original link: 201412011517.ods

more

Any single-formula solution would in addition require to be entered for array-evaluation and to calculate and lock its output range when called the first time. This range would be fix subsequently. Dynamic changes in the sizes of the primary ranges would not apply. A solution with helpers per row is likely to require a huge number of formulae, and bloat the file size making save/load operations slow... Directly resorting to SQL or "macro" solution better for the purpose.

( 2018-06-10 12:27:11 +0200 )edit

Dear @Jim K and and @Lupp, thanks a lot for your answers and comments. I have seen, downloaded and tested @Lupp solution from openoffice.org forum previously - it's smart and took me some time to figure out its logic. And unfortunately currently it seems the only way dealing with the task provided. Before opening this tread, I was pretty sure that there should be some simple joining syntax or new data range definition I was not aware of. Like Test1~Test2~Test3 or similar.

( 2018-06-10 21:48:59 +0200 )edit

But it's just fine, it is just my curiosity to find easier and more elegant way of doing things in Calc. SQL way is the best fit in this case, as @Jim K mentioned, it is natural task for SQL, not Calc. I am ok with it. Once again, thanks a lot for explaining things and for your time.

( 2018-06-10 22:03:54 +0200 )edit

=INDEX((B3:B6~D3:D6);...) should not work because "A reference list is not allowed inside an array expression.". https://help.libreoffice.org/Calc/Ope...

( 2018-06-10 22:27:11 +0200 )edit

It works though. Even more, if I replace ~ with ; and enter an array formula, LO changes semicolon back to tilde automatically. And if you use ~, you even do not need parentheses around Reference argument in Index function. Maybe this is because Index allows multiple ranges? Or array expression means {a,b,c,d...}, not an array formulae?

( 2018-06-10 23:12:52 +0200 )edit

Quoting @SM_Riga: "... but I am just curious, if there is a Calc based alternative for SQL Union operator's functionality."

I was also curious and made a demo similar to what the OQ presented unfortunately only as an image. (Always bad if not the question is specificallly about view issues.)

There was a total 95 rows in three groups of 3 columns each and the solution was designed to allow for a maximum of 1000 rows per group. (Using formulae you always need to specify such maximums.)
The solution based on standard functions was made using 3 helper columns and a few additional helper cells. Its final file size was about 200 KiB.

A solution with a slightly adapted custom function from my toolbox under array-evaluation made a file of 24 KiB. Of course it may be markedly slower if much more than 3000 rows of output are prepared. The file size would (next to) only grow for the data, not for the output range.

Edit1 regarding the comment below:

In case there is interest I attach this above mentioned demo containing the custom function I used.
(No idea for what reason the file size went up to about 40 Kib. I changed next to nothing.)
To pass and to evaluate an unknown number of parameters (for the same purpose) is a tricky thing and will only be possible at all if an upper limit for that number is assured.

Please note: The demo will not correctly work in AOO.

more

I was also considering to write user defined function for this task, but my knowledge of Basic is really basic :) And I stuck in writing VB command to join multiple arrays together. I managed to pass fixed number of ranges as function's arguments and that's all by now, not even passing array of unknown number of ranges to function. But hopefully, I will try to learn this later ;)

( 2018-06-10 22:36:18 +0200 )edit

Many thanks @Lupp I will take a look at this Demo a bit later.

( 2018-06-11 15:08:07 +0200 )edit

Amazing Work @Lupp. There is no easy way to STACK UP the data and "Data-->Consolidate" SUCKS. This is the main reason I had to move to Googlesheets where QUERY function does the same job almost effortlessly. I loved your approach to workaround this in Libreoffice Calc. This formula of yours ( COLLECTFROMRANGES ) should be made standard into libreoffice calc.

( 2020-01-05 08:28:58 +0200 )edit