
Continuing on the same topic as my previous post, it's nice to be able to gather all the kth-order moments in a single pass.

Last time I mentioned the boost/accumulators example, but you will have noticed two issues if you use it.  Firstly, the moment<k> tag gives you the kth simple moment taken about zero, whereas we often want the kth central moment of a sequence, taken about the mean.  Secondly, although Boost's accumulator is well written, it does seem to take a while to compile [~ 12 seconds for code using moment<12>].

After some playing around I've got a faster, simpler approach, where the inner loop accumulates the kth powers of each element.  After you've run the sequence through, you can easily extract the variance and all the kth central moments.  So in adding the more general case of kth moments, I've made the particular variance case simpler.  That often seems to happen in programming and math!

### algebra

First a bit of math, then the code.  We want to express the kth central moment in terms of the simple moments up to order k.

First, let's define the jth simple moment as –

$\displaystyle M_{n}^{j}= \sum_{i=1}^n {x}_i^{j}$

We rearrange the binomial expansion –

$\displaystyle nv_{n}^{k}= \sum_{i=1}^n({x}_{i}-\mu_{n})^k$

$\displaystyle = \sum_{i=1}^n \sum_{j=0}^k \binom{k}{j} {x}_{i}^j(-\mu_{n})^{k-j}$

$\displaystyle = \sum_{j=0}^k \binom{k}{j} (-\mu_{n})^{k-j} \sum_{i=1}^n {x}_{i}^j$

So we have the kth central moment given as a weighted sum of the kth simple moments –

$\displaystyle v_{n}^{k} = \frac{1}{n}\sum_{j=0}^k \binom{k}{j} (-\mu_{n})^{k-j} M_{n}^{j}$

which shows that all we need to accumulate as we walk across the sequence are the simple powers ${x}_{i}^{j}$ for $j = 0 \ldots k$.

Notice the variance is now handled as the special case k=2.  Likewise k=0 corresponds to n, the element count, and k=1 is the sum of the elements.

### c++ impl

Here's a basic impl of the above expression –

The embarrassment of riches truly astounds in the domain of, ahem, ‘very small’ lisp implementations… and who am I to refuse to fill an obvious missing link in the series?  We've got small covered…

Yes, I have in fact begun a super cut-down lisp implementation for vfuncs, which I think will be a nice way to express such idioms as map [hey, lisp invented map, right? – maybe not, better check].

I’ve uploaded the new files atto.c, atto.h, attotest.c to the vfuncs project page – grab v0.6.7.

Some of the earlier blog comments around vfuncs got me wondering: how would you implement closures and currying in straight C so they could be used in useful array verbs?  It seems what you need is to replace some verb params with values from the ‘environment’.  That way the verb can use both runtime args and design-time args easily… I did hack up something, but it began to look more and more like I was writing an interpreter… so I decided the ‘honest, unenlightened hacker’ way to proceed was to actually _write_ an interpreter for as simple a language as I could think of…

The core guts of lisp is, well – calling functions with args, where any of those args can be a const, a variable or another function call.  So the atto-kernel of lisp is something like this grammar [a comment pasted from the C source] –

//      expr : left func args right // (func arg0 arg1 arg2 )
//      args : arg args | empty // args is any length 0..n
//      arg  : ident | const | expr // each arg can be an expr itself – nested calls

In the spirit of ‘start small, get it working and add features’, this cut-down grammar allows you to parse expressions like the following –

//     “(sqrt (add (sqr 3.0) (sqr 4.0)) )”

This is the ‘lisp way’ – everything is in prefix notation, i.e. (* a b) not (a * b).  You get used to it, and it is very consistent [which is one reason lisp macros are more a part of the language than, say, C/C++ macros].

After seeing some discussion about kernel tracing tools at the recent Linux Kernel Summit… I found this great tech talk.

Dtrace is like ‘TiVo for the Kernel’… it's been included in FreeBSD and Solaris, but Linux is unlikely to embed it directly due to the incompatibility of the GPL with Dtrace's licence.

Great Dtrace google talk video by one of the developers, Bryan Cantrill [his blog here]…

Hilarious discussion of seeing into all the abstract layered levels that make up software, plus a nice demo tracing both kernel- and user-space programs.

At the end he talks for a while about how to view into scripting languages – normally you'd just see rbeval [for ruby] or the JVM implementation – but he describes how to extend Dtrace further, into scripting environments.  Superb!

He makes some frank comments about the political situation of porting Dtrace to Linux.

Perhaps the solution is to make a clean-room GPL impl from the external API… watch this space.

What's immediately apparent from this talk is how useful Dtrace is for developers and sysadmins.

[ fyi, I'm working on an ‘attolisp’ scripting implementation for vfuncs… hmm… now how to instrument it?  Can you believe the nanolisp, picolisp and femtolisp project names were all taken!  What an embarrassment of riches! :]

Just a note that I've uploaded the initial version of vfuncs to google code.  I've released it under a BSD license so you can use it in your commercial and noncommercial code easily.

Download from here [I’ll import to SVN sometime soon]. See my previous post for a description of vfuncs.

This version contains an example of a digital filter.  This can be used to smooth the series data, or to apply other signal-processing operations.  If you're familiar with applying a blur filter in photoshop or gimp using a gaussian filter kernel, this is exactly the same idea (except in one dimension).  A gaussian filter is basically just a weighted moving average of the data.

Think of the algorithm as applying a sliding window across the data – the sliding window contains the filter weights, and at each position you apply the weighted average [dot product] of the filter weights against the data points in the window.

If the filter contains a single element of weight 1.0, then the result is just the input (the filter is just the Dirac delta function in that case).  If the filter contains [0.25 0.50 0.25], it's going to mix each element with its neighbours and take a weighted average, thus smoothing the data.

I want to describe a simple experiment I've just done: a direct way to write code with medium-level verbs in a semi-functional style, in pure C.

All of this can be done in C++, and there's certainly more syntactic sugar there, but I wanted to explore the idea in C… C is close to the metal [but not too close, like assembler], compilers generate fairly good machine code for it, and the language supports a minimalist way to define functions [no lambdas, but we can use function pointers and context pointers to get the same effect, if not in a type-safe way].

Another approach would be to do it in C++ with operators and templates, reusing much of the STL and boost… yet another would be ansi C with heavy use of macros… but my experiment is to make simple, readable C code that's fairly quick.

In the K (terse) and Q (less terse) languages of KDB+, one can express something like this –

drawfrom:{[spec; vals; n] mon: 0, sums nmlz spec; idx: mon bin n?1.0; vals[idx] }

//     function drawfrom(spec, vals, n)
//         mon = 0 followed by the partial sums of nmlz spec  [the cdf after normalizing to 1.0]
//         idx = for each of n random numbers uniform in [0,1], its index into mon  [via bin]
//         return vals[idx]