XMATCH and MATCH in neighbour cells

ulrike · October 31, 2024, 11:06am

I have problems using XMATCH and MATCH functions in neighboured cells.

I have two sheets in 1 document called t01 and Saldo.

In “t01” project times are (ordered by day) collected. And if a project is finished it is marked with a special string.

Date	Start	End	TestValue	Text	hours
Jänner 2024
01.01.2024	07:00	08:00	P00000	test	1
	08:00	09:00	P15000	test	1
	12:00	13:00	P17001	test	1
	15:00	16:00	P24001	test	1
	17:00	19:00	P15000	test	2

				§§fertig P15000



02.01.2024	15:00	16:00	P24001	test	1
	17:00	19:00	P15000	test	2
				§§fertig P15000

In Saldo the sum of hours for each project calculated. In addition I wanted the date when a project is finished. During testing I got some strange results, which I summarized in row 14-20 in “Saldo”.
First column shows the functions used, second and third columns contain the result. And in forth column I wrote the expected result of column three.
For me it seems that the result in column three is not calculated correctly, but taken from column two, but I am not sure, wether I made a mistake or somehow misunderstood XMATCH function.

get year from table name	2024		18.10.2020

				MATCH	XMATCH ↑
		hours		closed	closed test
	Testvalues
	P00000	1
	P15000	5		01.01.2024	01.01.2024
	P17002	0
	P24001	2




MATCH	9	9
XMATCH ↓	9	9
XMATCH ↑	15	15
MATCH XMATCH ↑	9	9	15
XMATCH ↑ MATCH	15	15	9
XMATCH ↓ XMATCH ↑	9	9	15
XMATCH ↑ XMATCH ↓	15	15	9

For calculating I use following formulars:
MATCH: =MATCH(CONCATENATE("§§fertig “;$B$7);t01.$E$1:$E$50;0)
XMATCH downwards: =XMATCH(CONCAT(”§§fertig “;$B$7);t01.$E$1:$E$50;0)
XMATCH upwards: =XMATCH(CONCATENATE(”§§fertig ";$B$7);t01.$E$1:$E$50;0;-1)
I also send you the calc document for reproducing the problem.
LO-XMATCH-p_2024.ods (16.7 KB)

Kind regards, Ulrike

Villeroy · October 31, 2024, 6:59pm

This is very easy to accomplish because it is a more of a database task rather than a spreadsheet task. The most database-ish tool within a spreadsheet application is the pivot table.
https://ask.libreoffice.org/uploads/short-url/ocdHmzA2hJwz0yHPAjtom4RYUYs.ods calculates the sums for each month of a year, for each category and for each person.
Notice that the source table on the left has complete column contents without gaps. Dates are formatted numbers (yours are text). The source records are sorted in random order. Sort order and filtering have no effect on the resulting pivot table.
Pivot Table
After fixing your data, you could create a pivot table with project-IDs in row fields (column on the left side, one row per project), MAX(Date) in data fields in order to get the latest date for each project and SUM(Hours) in data fields in order to get the amount of hours. This gives a pivot table with the sum of hours and the latest date for each project-ID.

EDIT:
ask113282.ods (113.7 KB)

ulrike · October 31, 2024, 9:27pm

Thank you so much, it will take me a time to change/fix data and create pivot tables.
Nevertheless shouldn’t the result of a formular coded with absolute references always be the same, regardless of the position in a sheet? Especially astonishing for me is, that the result changes depending on the formular used in the lefthand neighbour cell. As shown in my example: Why does =XMATCH(CONCATENATE("§§fertig “;$B$7);t01.$E$1:$E$50;0;-1) return 15 if it is first formular in a row and if it is second formular in a row (after =MATCH(CONCATENATE(”§§fertig ";$B$7);t01.$E$1:$E$50;0) the result shows 9.
Kind regards, Ulrike

Villeroy · November 1, 2024, 8:07pm

Spreadsheet lookups are difficult. Far too many computer users try to mimic databases on spreadsheets.
The attached database draft tries to demonstrate why interrelated items such as projects, employees and work hours do not belong into spreadsheets.
ask113282.odb (126.4 KB) (2nd version, fixing an issue with input in the “Times” form, introducing a default date)

The user interface consist of 3 forms (input of interrelated table data) and 1 report (printable output).
The actual data content, working shifts in particular, make not much sense since I generated them randomly on a spreadsheet.

ulrike · November 2, 2024, 8:00am

So kind of you to implement a fully database. But I think with my sample data I misleaded you.
So may be I can explain my problem better with data provided by libreoffice (I found in PIVOT table sample). I took the data from the car inventory and did a search for car models. To get the first row of one model use match or xmatch. To get the last row use xmatch with reverse direction. Unfortunately the results differ, wether you have the formulars in neighbour cell of the same row or not.
Maybe the data or use of functions once again do not conform the usage rools of spread-sheets. But please, look at my technical question: why changes the result of MATCH and XMATCH?
LO-XMATCH-p_2024 - V2.ods (18.5 KB)
Kind regards, Ulrike

EarnestAl · November 2, 2024, 8:16am

Try Data > Recalculate > Recalculate Hard (Ctrl+Shift+F9)

Just editing the formula and letting AutoCalculate, or even F9, doesn’t seem to do the trick.

You could report a bug on recalculation for XMATCH, How to Report Bugs in LibreOffice - The Document Foundation Wiki

karolus · November 2, 2024, 9:55am

Hallo
its clearly a bug … XMATCH changes return-value dependend on evaluation from Formula-wizard or via ctrl shift F9

please verify|confirm bug: tdf#163736

karolus · November 2, 2024, 10:28am

done!
bug: tdf#163736

Villeroy · November 2, 2024, 4:17pm

Right now, I can not analyze your files because XMATCH has been introduced in the latest version 24.8 which I had to discard because of regressions (issues that did not exist in versions 24.2). I’m not surprised that the new “X-functions” introduce new issues.

However, (X)MATCH has nothing to do with this task because these functions return the position of one distinct value in a list. (X)MATCH can not aggregate sets of multiple matching values.
WIthin the database demo, the report and its source query “qSimpleReport” do the same as a pivot table would do. The query aggregates the min(Date), max(Date) and sum(Hours) for each finished project. Since the projects are stored in a separate table, one can simply mark a project as finished by means of a check box. In my spreadsheet sample, I use a separate list with the names of finished projects.

The source table on sheet #1 of your sample lacks a consistent structure. It requires dates in chronological order, otherwise you get wrong results. The dates are text and they are incomplete. Empty cells in column A are meant to refer to the latest date above. At some point you need to explain this meaning to the spreadsheet software or fill up the gaps (which is easy).
Then you have “exceptional rows” with human-readable info such as “§§fertig P15000” instead of a valid record. This complicates things even more. A report could be such a human-readable source of information in chronological order having groups with interspersed info. This structure is how info can be derived from data for presentation. Regardless of spreadsheet or database, this is not the way how data should be stored, if they need to be analyzed in a later process.

In the long run, a database is much easier to maintain. Apart from questions related to computing efficiency, you never need to care about the first and last row of some reference used in a calculation or lookup. You simply add new data without changing the database document or the database structure. You don’t need to store anything because every single record is stored to disk after editing.

ulrike · November 2, 2024, 7:03pm

Sorry, if I annoyed you with a bad structured sample. And many thanks for your advices.