How to select row at random from among the rows that have a value of "1" in a particular column.

NightOwl · April 1, 2021, 7:28pm

I am trying to select one row randomly from among the rows that have a value of “1” in a particular column. Not sure how to represent the sub-array of those rows so I can select one randomly.

I calculated a random index for the elements that are set to 1 like this:
=1+INT(RAND()*COUNTIFS(C2:C177,"=1"))

This works, but I’m not sure how to use INDEX to select only from the rows where the C column value is 1. Please advise. Thank you.

JohnSUN · April 2, 2021, 7:35am

Not sure how to represent the sub-array of those rows

I read your question several times, but I could not understand - do you need all the rows (“sub-array”) that have 1 in column C? Or is one of them really enough for you? For the first task, you can use the usual Standard Filter. For the second case, may turn out to be a working array formula of the form

{=OFFSET(<first_row_from_data_range>;LARGE(IF(<range_with_1>=1;ROW(<range_with_1>);"");RANDBETWEEN(1;COUNTIF(<range_with_1>;1)))-1;0)}

For example, for a table that a respected colleague @Lupp has formed for you in his example, the formula will be as follows

{=OFFSET(A2:B2;LARGE(IF(C2:C177=1;ROW(C2:C177);"");RANDBETWEEN(1;COUNTIF(C2:C177;1)))-1;0)}

Lupp · April 2, 2021, 10:46am

@JohnSUN:Your more concise formula is well designed.
However, including the “check” for the content of column C (must be 1), I find that the formula aims one row below what was expected.
(Since I always expect errors in my formulas, I tend to include checks where possible.)

Rectified (You may replace the C by B again in the main range address):
=OFFSET(A2:C2;LARGE(IF(C2:C177=1;ROW(C2:C177)-ROW(C$2)+1;"");RANDBETWEEN(1;COUNTIF(C2:C177;1)))-1;0)
(I preferred -ROW(C$2)+1 over a “simple” -1 to emphasize the functionality of the correction.)

JohnSUN · April 2, 2021, 11:02am

@Lupp Yes, we recently mumbled about it - fixing minor bugs as a learning process

Lupp · April 2, 2021, 11:16am

Completely agree, my friend.

NightOwl · April 2, 2021, 6:18pm

Thanks for your suggestions. I am a newbie to spreadsheets, so a lot of this is confusing. The file submitted by Lupp finds a random row but not from those columns under the “1” heading that have value 1, but from those that are non-zero in the INDEX column. So it doesn’t seem to work.

The other formulas given above and the deprecated one cell formula in the file produce 504 errors when I try them in Lupp’s file. When I enter them in a cell of my spreadsheet they give me a #VALUE! result. Do I just enter them directly in any cell and press enter?

To restate what I want to do: select a row (or two particular cells in that row) at random from among the rows that have the value “1” in the C column. The other cells in that column have no values assigned. The data I am interested in is between A2 and C177.

Lupp · April 2, 2021, 10:01pm

The rows having a number in the (helper) column INDEX are exactly the rows having the value 1 in column C. The number in the INDEX column is the row-index into the data range (See A1) needed to get the chosen row. These indices are unique and each one can therefore be selected with equal probability with the help of the SMALL() function using RANDBETWEEN() for the rank (See field below TAKE).
The complicated one-cell-formula used for the second solution in H6:J6 may be seen as a kind of a joke. (I’m not a friend of complicated one-cell-formulas.) If you definitely want to solve your task this way, you should better use what @JohnSUN posted (with the correction).
All the three mentioned solutions work correctly for me.
If you don’t understand the usage of INDIRECT() or some other sub-expression, don’t hesitate to ask again.

Lupp · April 1, 2021, 8:37pm

As so often i spreadsheet it depends on the “size of the exotic problem” if a basically correct solution is actually applicable and efficient enough.
If you can assure to have no more than a few hundred rows, different solutions are
possible.
See attachment for two of them.
If the task is a reoccurring one for you, you may consider to resort to a bit of user code for it.
ask301860randomSelectRowConditional.ods