Hello Y’all!
To start:
I’m an average user of Linux and LibreOffice (know to do simple formulas, but not very complex; i can install software and understand simple principles, don’t know programming).
I’m trying to do a Monte Carlo simulation in Calc - in an .ods file. I don’t know data bases - so this is the only option for me.
I’ve done what I consider 90%-99% - need help with the last 1%-10%.
I think I’m missing just 1 formula, but can’t figure how to get to the end and is quite frustrating.
If you’re the Guru of LibreOffice & math and don’t need the whole BACKGROUND/PRESENTATION of the issue - just jump to the PROBLEM part. Help is very much appreciated!
BACKGROUND/PRESENTATION:
Me and a colleague are comparing probabilities of events happening, which are presumed to be random, but which we consider to be influenced by unforeseen factors (environment/hidden abilities/etc), each having his own idea / formula how to express that.
Events can be something like “how many times we flip heads in 100 coin tosses” - in theory it should be close to 50/50 - but in practice maybe there are people that do it 60/40 or 70/30 on a regular basis/in certain conditions (eg: if he is calm/concentrating, or with a specific coin, or if it is warm, etc)
So far we have input and analysed the data I have; my estimations so far; the other guy’s estimations so far; we’ve made the comparison between expected outcome by myself, expected outcome by him and actual outcome.
Important note here is that our guesses are probability style - (not true/false) so the results we compare are also probability style, both positive (>0) and negative (<0) - not just 1 vs 0 vs -1.
like with the coin: i say this guy has 0.55 chance for heads on a coin toss (makes it 55 of 100 tries); he says 0.44. The guy tosses and makes 52 = the event is 0.52. I win (because my choice was closest to 52) the amount of 0.03 (=0.55-0.52)
Since the set of events considered is quite small (only 80 so far) and the variation of success both against each other and against actual outcome is very high - we can’t draw an firm conclusion who is right more often (and by how much) as a general basis / on a very long term (consider years/lifetime) - which is the target of this project.
If we had a big number of events (like 1.000.000 or 10.000.000) maybe we could, but we don’t have that.
So we considered comparing the difference in expectations vs random numbers (this is the Monte Carlo part. First time I try it, hope I got it right)
What I’ve done so far:
-
used RAND function to generate random numbers so we can compare for each event (the difference between my success and his) vs the random numbers. The result so far is a column of 80 numbers (between 1 and -1), 1 number of them for each event. The number is between 0 and 1 if i was more right than him, between -1 and 0 if I was more wrong than him.
-
I add these numbers (use SUM function) - and the result shows if I am generally right (sum>0) or generally wrong (sum<0) versus the other guy.
In theory if we do a big number of simulations - let’s say 1.000.000 or 10.000.000 (basis on the data we have so far) and make an average - this should show who’s right more often and how much time more often.
Or it can show is totally random and we need to improve our ideas/formulas until we get it right.
This is the years part
THE PROBLEM 1:
Be able to do this - I need a formula to generate thousands of SUM values (which are different each time because of RAND numbers that are involved in getting the numbers we add up) for me to make an average of - this should show the viability of my formula vs events so far.
Something that maybe “sends” in a cell (or 10000 cells or 100 or 100000) just the value generated by of the formula I use, each time I use it?
THE PROBLEM 2:
Since I’ve tried many ways to bypass this. One way I’ve found is that i can generate a SUM - then I can copy it and by “Paste Special” - “Number” - i Can paste the VALUE of the sum in a new cell.
Then i generate a new sum again, and go through the whole process and get a second value.
And a third, 4th, etc.
The problem is that it’s not viable for the large numbers I need (thousands to millions of SUM values to make average of) - because it takes maybe 10-30 seconds (less if rested, more if tired) to get this done to a cell.
Multiply 30 seconds10.000 cells (minimum) = 500 hours to get a 10.000 cells sample to make an average of… Just for current formula & current events.
And each time I add a new event (and we try to add a few /daily) everything changes & for me to study the impact = need to do it again. So… 5 events in a day = 5500 = 2500 continuous hours of copy-paste - in a 24 hour day… and i also need to sleep, eat and shit…
So another way of doing it would be if we could automatize the “copy”+“paste special”+“number” in a formula that can be used at once on many, many boxes, calculating each time a new value and sending that new value to the new cell.
Any Ideas/solutions?
P.S. I know there is an EXCEL function called “What if” which does exactly what I need/would be a solution to PROBLEM 1 - but I love Linux & I don’t want to Have to go back to Windows (or dual boot) just for this…