You are currently browsing the monthly archive for September 2008.

Just a note that I've uploaded the initial version of vfuncs to Google Code. I've released it under a BSD license so you can use it easily in your commercial and noncommercial code.

Download from here [I’ll import to SVN sometime soon]. See my previous post for a description of vfuncs.

This version contains an example of a digital filter. This can be used to smooth the series data, or apply other signal processing operations. If you're familiar with applying a blur filter in Photoshop or GIMP using a Gaussian filter kernel, this is exactly the same idea (except in one dimension). A Gaussian filter is basically just a weighted moving average of the data.

Think of the algorithm as applying a sliding window across the data – the sliding window contains the filter weights, and at each position you apply the weighted average [dot product] of the filter weights against each data point in the window.

If the filter contains a single element of weight 1.0, then the result is just the input (the filter is just the discrete version of the Dirac delta function in that case). If the filter contains [0.25 0.50 0.25] it's going to mix each element with its neighbours and take a weighted average, thus smoothing the data.
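To make the sliding window concrete, here is a minimal sketch in plain C. It is not the vfuncs code: the name `filter1d` and its signature are made up for illustration, and clamping the window at the ends of the series is just one assumed edge policy among several.

```c
#include <stddef.h>

/* Hypothetical helper, not part of vfuncs.
 * Slides an odd-length kernel across in[0..n-1] and writes the weighted
 * sum (dot product of weights and window) for each position to out[0..n-1].
 * Assumption: at the edges the window is clamped to the nearest valid sample. */
void filter1d(const double *in, double *out, size_t n,
              const double *kernel, size_t klen)
{
    size_t half = klen / 2;
    for (size_t i = 0; i < n; i++) {
        double acc = 0.0;
        for (size_t k = 0; k < klen; k++) {
            /* index of the sample under kernel tap k, clamped to [0, n-1] */
            long j = (long)i + (long)k - (long)half;
            if (j < 0) j = 0;
            if (j >= (long)n) j = (long)n - 1;
            acc += kernel[k] * in[j];
        }
        out[i] = acc;
    }
}
```

With a kernel of [1.0] the output is a copy of the input, and with [0.25 0.50 0.25] a single spike gets spread over its two neighbours.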


I want to describe a simple experiment I've just done: a direct way to write code with medium-level verbs in a semi-functional style in pure C.

All of this can be done in C++, and there's certainly more syntactic sugar there, but I wanted to explore the idea in C… C is close to the metal [but not too close, like assembler], compilers generate fairly good machine code, and the language supports a minimalist way to define functions [without lambdas, but we can use function pointers and context pointers to get that, if not in a type-safe way].
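As a rough illustration of the function pointer plus context pointer trick, here is a small sketch. The names `unary_fn`, `vmap` and `scale_by` are hypothetical and not taken from vfuncs; the point is only to show where the "captured" state lives and where type safety is lost.

```c
#include <stddef.h>

/* A "verb" applied to each element: the value plus an opaque context. */
typedef double (*unary_fn)(double x, void *ctx);

/* Apply f to every element of in[0..n-1], writing the results to out. */
void vmap(const double *in, double *out, size_t n, unary_fn f, void *ctx)
{
    for (size_t i = 0; i < n; i++)
        out[i] = f(in[i], ctx);
}

/* Example callback: scale by a factor carried in the context.
 * The conversion from void* is exactly where type safety is lost:
 * nothing stops a caller from passing a pointer to the wrong type. */
double scale_by(double x, void *ctx)
{
    const double *factor = ctx;
    return x * (*factor);
}
```

A call such as `vmap(in, out, n, scale_by, &k)` then plays the role a lambda capturing `k` would play in other languages.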

Another approach would be to do it in C++ with operators and templates, where much is reusable from the STL and Boost… yet another way would be to do it in ANSI C and use macros heavily… but my experiment is to make simple, readable C code that's fairly quick.

In the K (terse) and Q (less terse) languages of KDB+, one can express something like this –

drawfrom:{[spec; vals; n]
mon: 0, sums nmlz spec; idx: mon bin n?1.0; vals[idx] }

Basically this reads –

function drawfrom(spec, vals, n)
mon = partial sums of spec (the cdf after normalizing to 1.0)
generate n random numbers uniformly in [0,1]
idx = array of indexes of each random sample into mon
return the values indexed by idx

So basically, this semi-functional zen koan simply generates n random samples from the spectrum supplied. Think of spec as the weights that determine how often each of vals appears – spec is a histogram or discrete pdf. Actually this is the readable version, closer to Q than K, as I've defined nmlz and used the verbose style – in K it can be much more ‘terse’ [proponents would say succinct].

At first this style of programming is annoying if you're from a C++ background, but once you get used to it, you begin to think directly in terms of verbs working on vectors – in the same way that std::vector with algorithms like accumulate lets you think at a higher level and avoid many for() loops, as the foreach construct does in other languages…

So how does this look in C?  Try this –
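The vfuncs listing itself is not part of this excerpt. As a stand-in, here is a rough sketch of the same steps in plain C. The names `cdf_from_spec`, `bin_index` and `draw_from` are invented for illustration and are not the vfuncs API, and `rand()` is used only for brevity.

```c
#include <stdlib.h>
#include <stddef.h>

/* mon = 0 followed by the running sums of spec normalised to total 1.0.
 * mon must have room for n + 1 doubles; the total weight must be positive. */
void cdf_from_spec(const double *spec, size_t n, double *mon)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += spec[i];
    mon[0] = 0.0;
    for (size_t i = 0; i < n; i++)
        mon[i + 1] = mon[i] + spec[i] / total;
}

/* Modelled on Q's bin: index of the last element of sorted a[0..n-1]
 * that is <= x. Assumes n >= 1 and a[0] <= x. */
size_t bin_index(const double *a, size_t n, double x)
{
    size_t lo = 0, hi = n;  /* a[lo] <= x; a[hi] > x or hi == n */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] <= x) lo = mid; else hi = mid;
    }
    return lo;
}

/* Draw count samples from vals[0..n-1], weighted by spec[0..n-1].
 * Requires n >= 1. Returns 0 on success, -1 if allocation fails. */
int draw_from(const double *spec, const double *vals, size_t n,
              double *out, size_t count)
{
    double *mon = malloc((n + 1) * sizeof *mon);
    if (!mon) return -1;
    cdf_from_spec(spec, n, mon);
    for (size_t s = 0; s < count; s++) {
        /* uniform sample in [0,1) */
        double u = rand() / ((double)RAND_MAX + 1.0);
        /* search only the first n entries so the index is always < n */
        out[s] = vals[bin_index(mon, n, u)];
    }
    free(mon);
    return 0;
}
```

It takes three functions and a scratch buffer to say what the Q version says in one line, which is the gap the verb style is meant to close.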


I'm normally a kdb+ kind of guy when it comes to managing huge amounts of streaming and historical tick data: the performance is great, the app is small and clean, and the language Q is terse, with just enough to get the job done. On the downside, Q is a bit cryptic, and the documentation is brutally terse… though readable.

I decided to do a bit of googling to see whether another product was out there that might be useful, and came across StreamBase, which is the commercial outgrowth of some research projects at MIT – Aurora, Borealis and Medusa. These projects were led by Michael Stonebraker, who invented [or discovered?] the Ingres and PostgreSQL databases in their original form. His short blurb on stream processing – Data Torrents and Rivers – is a worthwhile introduction.

The StreamSQL language spec seems to be independent of StreamBase, as it has its own site which describes the language – StreamSQL.org.

StreamSQL does seem to fall short of being a fully independent spec, and I wanted to comment on this… because the world really does need an accessible stream processing language that acts in the same way as SQL – I love Q, but I just don't see your average quant developer having time to grok it when they already have to learn C++/Perl/Python/Matlab/R and I guess soon Ruby [until Lisp becomes the 100-year language].

Here's my Open Letter to the StreamSQL people –
