Rearranging
I was chatting with some other quant developers the other day, as you do, and the issue came up of calculating variances. Its a pretty common operation, and the naive approach does two passes, one to calculate the mean, , and one to gather the squares of differences from that. … but someone asked if it could be done in one pass, and of course it can, quite easily.
Recall, the population variance of a sequence is defined as -
We can expand this and see how the variance after n terms differs from the variance after m=n+1 terms, vis -
and we have in terms of
vis -
In other words , so theres very little we need to store to accumulate the variance as we traverse the sequence.
C++ Implementation
Expressing this in C++ code we’d have a functor that maintains state and gets handed each element of the sequence, in simplest form -

and call it like this -

Boost Accumulators
We could make the above accumulator class larger, and calculate all sorts of other stats whenever we do any pass – eg. mean, variance, nth order moments, skew, kurtosis, spectrum buckets, min, max etc.
A nicer solution is to use boost/accumulators/stats. Basically you decide which accumulators you want, and add them to an accumulators set. Each accumulator in the set is run as the single pass proceeds, even keeping track of dependencies between each other so they are run in the right order.
Vis -


1 comment
Comments feed for this article
February 8, 2009 at 4:21 am
computing kth moments - performance comparisons « quantblog
[...] February 8, 2009 in Uncategorized My previous blog articles describe approaches to calculating kth moments in a single pass – see compute kth central moments in one pass and variance of a sequence in one pass. [...]