How to merge two sorted colums in one single column

Hi,

I need a formula for the following.problem:
I have two columns, both with sorted integer values. The number of values is unknown but each list ends with #nv. E.g.:
Column 1: 3-4-5-23-31-44-#nv
Column 2: 1-6-7-15-17-66-67-72-112-#nv
The resulting column should look as follows:
1-3-4-5-6-7-15-17-23-31-44-66-67-72-112-#nv

I couldn’t find a standard formula and up to now I wasn’t able to construct terms that solve this task.

Maybe somebody has a good idea?
Thanx for helping!
Andy

V 25.8 comes with a package of new Calc functions allowing to do such a thing with a single formula based on LET(), SORT(), FILTERBY() [implemented since V 24.8].
The attached example was made with a pre-release version.
disask124104_playMergeWithNewFunctions_25_8.ods (21.0 KB)
disask124104_playMergeWithNewFunctions_25_8_B.ods (19.6 KB)

1 Like

Thank you very much for spendind time for my question.
Also thanks to kemel and fpy!

With your answers I realized that I forgot to mention important additional informations:

  1. The two input colums themselves are an output of a database query and are subject to change (this is why you find a # at the end of the data)
  2. The integer values in the two colums are unique (IDs) and the number of the entries in the two colums is changing with every change of the database data
  3. I need a dynamic solution based on functions of CALC and …
  4. The solution has to be compatible with EXCEL because the file will be changed with people that are not using CALC …

But I got good advices from all of you and will find out, if some of them will help to solve my problem.

(Have these people also acces to the database you use queries for, and is Excel capable of getting the results? The fact that the alleged need to be compatible with Excel blocks good solutions is an old misfortune.)
What I presented is dynamic if the functions used in column D including the new (25.8) VSTACK() are available. The used functions are either old or recently implemented for compatibility with Excel. The non-dynamic presentation in column F is only included to show the result to users not already having a V25.8 prerelease at hand.
Concerning the annoyances coming with the usage of array formulas, Excel's "dynamic arrays" may be able to help, but I have no Excel at hand, and the “explanations” offered by MS-help pages I don’t understand as usual. Anyway my own solution for the problem is well considered, and made for many purposes. Mainly it avoids the need to lock any ranges for array output.
The fact that the original attachment in the example above contained an unused UDF fragment was an mistake. I have since corrected this.

The obvious solution is to copy, paste and sort again, but I suppose you want some dynamic solution.

I guess you could use MATCH() to locate the #nv marker position, or COUNT() to determine how many numbers you have. OFFSET() can use the returned position value to set the size of the list in use. Concatenate two OFFSET() ranges and use SMALL() repeatedly down the new column to generate the resulting list.

For more explicit solutions, a sample file showing your data structure would be useful.

In procedural programming you would probably proceed down the lists with a compare/insert operation to merge two sorted lists, but for the “functional programming” of a spreadsheet I believe that this is the best strategy.

somehow an anti-pattern :face_with_thermometer:

your use case is simpler → 43pxk59bu - Python - OneCompiler

but you can get the idea from the one discussed at length here : Consolidate partial-matching rows - add unique cells to previous matching rows - #15 by fpy

a bizarre solution in 7 steps to approximate by auxiliary columns:
0_disask124104_bizarre successive approximation in 7 steps_093027.ods (31.9 KB)


I’ve chained all the inputs together into a mess. Then, because every number is unique and can appear anywhere, I search for the position of this unique number as a WORD string. Furthermore, I remove everything that isn’t a number WORD. This is explained by the fact that CALC/EXCEL doesn’t recognize real numbers, but only individual letters or digits. I use helper columns that can be hidden, and their short formulas are easy to change. A long-winded formula is far too exclusive and difficult to adapt.

As you need exchange with Excel, I would stick with xlsx-Format. Using that format hat the advantage, that you can use “structured references”.

Import your data as database range. Thereby enable “data has total row”. For your data it is not really a total row, but contains the #nv marker.

In my example I do not have a data range from import but have defined the database ranges “Data_A” and “Data_B” locally. This names define each a “structured table” in Excel.
The formula is then =SORT(VSTACK(Data_A[ValuesA];Data_B[ValuesB])). Thereby ValuesA and ValuesB are the column labels in the example. This formula is as Excel has generated it. A description of the “structured references” is in CalcTableRef - The Document Foundation Wiki.

The function VSTACK is new. It will be released in August, but you can already test it in a Beta-Version.

Excel has the feature “dynamic array”, that is, the result automatically spills into the needed area. That feature is not available in LibreOffice (tdf#127808). In LibreOffice it is imported as array formula. When the imported data change, you need to manually drag the range of the array formula of the combined data, so that all data are visible. The range for the imported data adapt to the current values, when you refresh the data base range.

CombineTables.xlsx (12.2 KB)

This short solution is not possible, when you use ods-format. When you want to use ods-format, than you need to find the position of #nv using the MATCH function. From that position you calculate the cell range as text. Then you can access the actual data by using function INDIRECT.

MergeSortedRanges.ods (12.5 KB)

2 Likes