Ask LibreOffice - RSS feedhttps://ask.libreoffice.org/en/questions/Questions and answers for LibreOfficeenThu, 20 Sep 2018 20:31:16 +0200Formula to find a row that uniquely matches multiple criteria, including a regular expressionhttps://ask.libreoffice.org/en/question/166273/formula-to-find-a-row-that-uniquely-matches-multiple-criteria-including-a-regular-expression/[C:\fakepath\MatchCriteria.ods](/upfiles/15374671682812127.ods)
The following formula counts the number of rows that match certain criteria. **Is there a formula that will find the row that matches those same criteria when there is exactly one such row?**
=IF(LEN(A2),
COUNTIFS(
$Sheet2.$A:$Sheet2.$A, "=" & A2,
$Sheet2.$C:$Sheet2.$C, "=" & COUNTIF($A:$A,"=" & A2),
$Sheet2.$D:$Sheet2.$D, ">" & SUMIF($A:$A,"=" & A2,$B:$B) - 0.5,
$Sheet2.$D:$Sheet2.$D, "<" & SUMIF($A:$A,"=" & A2,$B:$B) + 0.5
)
,"")
----------
I'd also like to be able to replace references to A2 in the formula with a regular expression string, which I can do in the formula above because COUNTIFS, COUNTIF and SUMIF all accept regular expressions.
The rest of this post is context for the above question
----------------------------------------
I have two spreadsheets (one sheet in each) both with data relating to the same list of people. The data relating to each person can be used to find which person matches which between the two spreadsheets. Only of the spreadsheets has an ID for each person. I need to add the each person's ID to the spreadsheet that doesn't have them.
To do this I've written a macro that
1) Imports both spreadsheets into a new spreadsheet. The first sheet (Sheet1) in the new spreadsheet is the one without IDs, the second sheet (Sheet2) has the IDs
2) Adds a column to Sheet1 ("Match Count") with a formula for each person which counts the number of rows in Sheet2 that the matches the person's name and other criteria (See below)
3) Adds a column to Sheet1 ("ID") with a formula that shows the ID from Sheet2 when the "Match Count" value is 1.
In Sheet1 there are one or more rows for each person, grouped together. Sheet1 has an "Amounts" field which has a numeric value in each row.
Sheet2 has one row for each person, and has the columns "Number of Amounts", which corresponds to the number of rows for the corresponding person in Sheet1, and "Total Amount" which corresponds to the sum of the "Amounts" values for the same person in Sheet1. For the purposes of matching people between the two sheets "Total Amount" can be up to 0.5 more or less than the sum of the corresponding "Amount" values in Sheet1.
There can be different people with the identical names but they will have different "Amounts".
This is the "Match Count" formula (without the nested SUBSTITUTEs for readability)
A is "Name" in both sheets
Sheet2: B = "ID", C = "Number of Amounts", D = "Total Amount"
=IF(LEN(A2),
COUNTIFS(
$Sheet2.$A:$Sheet2.$A, "=" & A2,
$Sheet2.$C:$Sheet2.$C, "=" & COUNTIF($A:$A,"=" & A2),
$Sheet2.$D:$Sheet2.$D, ">" & SUMIF($A:$A,"=" & A2,$B:$B) - 0.5,
$Sheet2.$D:$Sheet2.$D, "<" & SUMIF($A:$A,"=" & A2,$B:$B) + 0.5
)
,"")
That works great, but I want is a formula that gets data from the row that uniquely matches the same criteria when the "Match Count" is 1. **It seems to me that if I can determine that one row uniquely matches certain criteria then I should be able to find that row with a similar formula, but I don't see it. Maybe I'm missing something very simple.**
The first formula that I came up with works in most cases, but does not work when there are more than one rows in Sheet2 with the same name as the person being matched from Sheet1, unless the first row with the matching name in the Sheet2 happens to match all of the criteria.
A is "Name" in both sheets
Sheet1: E = "Match Count" (above)
Sheet2: B = "ID", C = "Number of Amounts", D = "Total Amount"
=IF(OR(ISBLANK(A2), E2<>1), "", IF(AND(
COUNTIF($A:$A,"="&A2) = INDEX($Sheet2.$C:$Sheet2.$C,MATCH(A2, $Sheet2.$A:$Sheet2.$A, 0)),
SUMIF($A:$A,"="&A2, $B:$B) - 0.5 < INDEX($Sheet2.$D:$Sheet2.$D,MATCH(A2, $Sheet2.$A:$Sheet2.$A, 0)),
SUMIF($A:$A,"="&A2, $B:$B) + 0.5 > INDEX($Sheet2.$D:$Sheet2.$D,MATCH(A2, $Sheet2.$A:$Sheet2.$A, 0))
),
INDEX($Sheet2.$B:$Sheet2.$B,MATCH(A2,$Sheet2.$A:$Sheet2.$A, 0)),"Failed") )
The other formula I came up with works but is VERY slow and uses the "MATCH(1, (...) * (...) * (...), 0)" version of the MATCH function which does not seem to allow regular expressions...
A is "Name" in both sheets
Sheet1: E = "Match Count" (above)
Sheet2: B = "ID", C = "Number of Amounts", D = "Total Amount"
=IF( OR(ISBLANK(A2), E2<>1), "", IFERROR(
INDEX($Sheet2.$B:$Sheet2.$B,
MATCH(1,
($Sheet2.$A:$Sheet2.$A=A2) *
($Sheet2.$C:$Sheet2.$C=COUNTIF($A:$A,"="&A2)) *
($Sheet2.$D:$Sheet2.$D < SUMIF($A:$A,"="&A2, $B:$B) + 0.5) *
($Sheet2.$D:$Sheet2.$D > SUMIF($A:$A,"="&A2, $B:$B) - 0.5)
, 0)
)
, "Failed"))
To get around the speed issue by using the Fast formula for all persons that have a "Match Count" of 1, and for any that the formula doesn't work for I can then use the Slow formula instead.
However I also need to be able to use regular expressions. Some of the rows with a "Match Count" of 0 (less than 1%) come about because of minor differences between the way a name is represented, like several spaces in the name in one sheet where there is only one space in the other, or a comma in one that does not appear in the other (e.g. "John Smith, Jnr"). I can handle those variations relatively easily by using a regular expression based on the literal name string, e.g, "John[, ]+Smith[, ]+Jnr". I can't to do that when the formula I'm using doesn't take take regular expressions. I have to use macro code to find the row with all the matching criteria. That works, but I feel that a formula similar to the "Match Count" one would be more efficient.
So - how do I find row that uniquely matches multiple criteria, including a regular expression?
[C:\fakepath\MatchCriteria.ods](/upfiles/15374671682812127.ods)Thu, 20 Sep 2018 20:31:16 +0200https://ask.libreoffice.org/en/question/166273/formula-to-find-a-row-that-uniquely-matches-multiple-criteria-including-a-regular-expression/