Standard deviation


This time I will take the opportunity to talk about how to calculate the STANDARD DEVIATIONS functions. The general formula for calculating the STDEV value for population is

As you can see in it’s original format, the function is a two-pass operation. First calculate the average value and the subtract that average value from each row in the current dataset.

I don’t find that good in terms of performance, so I figured out how to rewrite the function to be a one pass operation.

Can we apply the same thinking again when calculating the STDEV value for sample? Yes, we can.

Now we have a one pass operation and we can test the formula like this

The interesting thing here is that if you performance test my rebuild against the built-in Microsoft function, the rebuild is 25-30 percent faster! So it seems that the built-in function is using the two-pass operation. If Microsoft would adopt the rebuild, the function would be even faster and since it is non-blocking it also would be a candidate for batch mode operations.