Standard deviation spreadsheet - unexpected result

I have set up a spreadsheet to demonstrate to students how to calculate standard deviation. The result that I have calculated matches the result given with the data, from the source. However, it differs from the result of the spreadsheet formula...
=STDEV(B2:B20)

Is anyone able to explain why the function STDEV might not give the result expected?

I shall be happy to share my spreadsheet if attachments are possible on this forum?

The data set I am working with is available here:
https://www.rgs.org/CMSPages/GetFile....

My step by step spreadsheet calculations involve...
(1) Calculate the mean.
(2) Subtract the mean from each value in the set.
(3) Calculate the square of each value in (2).
(4) Sum the vales from (3).
(6) Take the square root of (5).

edit retag close merge delete

Do you expect potential contributors to figure out on their own, which data you have used from the link to a PDF, recreate formulas based on your description and compare with STDEV() result, while it could be so simple, if you'd upload your Calc sheet with the incorrect calculation (but may be you are looking for STDEVP())

( 2020-05-11 14:56:37 +0200 )edit

@Opaque: note that OP wrote:

I shall be happy to share my spreadsheet if attachments are possible on this forum?

... which means that only technical problem prevented OP from uploading the sample document.

( 2020-05-11 15:36:14 +0200 )edit

Oh sorry, I oversaw the statement
@Meneer: my apologies for the comment above.
@Mike Kaganski - Thanks to point me to that.

( 2020-05-11 15:39:25 +0200 )edit

Sort by » oldest newest most voted

Hello,

probably OASIS Standard documentation - 6.18.72 STDEV clarifies your mismatch (STDEV() is based on 1/(n-1), while your procedure divides by n in step (5))

Your procedure is how function STDEVP() works (see also OASIS Standard documentation - 6.18.74 STDEVP).

Hope that helps.

more

1

Good answer! Some background may still be in order.

There is a reason why the STDEV() "sample standard deviation" differs from the STDEVP() "population standard deviation".

"How much variation is there?"

STDEVP() is meant to be used if you have a dataset which encompasses the entire population in question. It is a determined value.

"What uncertainty is there to my statistical analysis"

STDEV() is meant to be used for a sample set. The (n-1) divisor gives a larger result, which signifies an added insecurity, represented numerically by what you may think of as an "inflated deviation". It is an approximation which depends on the selection of the sample set. The "inflating" diminishes if you take a larger set of samples.

( 2020-05-11 15:55:06 +0200 )edit

... I am aware of the difference between population and sample standard deviation. ...

Of course you would be aware of that, given that you are teaching a subject where this is an element. I must read my audience better, it seems. No offense was intended by my previous comment. My bad!

I still leave that elaboration as it stands, seeing that it just might be useful to other users searching for similar subjects.

( 2020-05-12 10:06:38 +0200 )edit

Thanks for a simple, clear and very swift answer. I am aware of the difference between population and sample standard deviation. I had looked for, but not found the alternative function. My bad.
=STDEVP()
Indeed, gives the result I was expecting.

more