I want to describe a simple experiment Ive just done, a direct way to write code with medium level verbs in a semi-functional style in pure C.
All of this can be done in C++ and theres certainly more syntactic sugar there, but I wanted to explore the idea in C… C is close to the metal [but not too close, like assembler], compilers generate fairly good machine code, while the language supports a minimalist way to define functions [without lambdas, but we can use function pointers and context pointers to get that, if not in a type safe way].
Another approach would be to do it in C++ with operators and templates, much of it is reusable from STL and boost… yet another way would be to do it in ansi C and use MACROS heavily… but my experiment is to make simple, readable C code thats fairly quick.
In the K (terse) and Q (less terse) languages of KDB+, one can express something like this -
drawfrom:{[spec; vals; n]
mon: 0, sums nmlz spec; idx: mon bin n?1.0; vals[idx] }
Basically this reads -
function drawfrom(spec, vals, n)
mon = partial sums of spec (the cdf after normalizing to 1.0)
generate n random numbers uniformly in [0,1]
idx = array of indexes of each random sample into mon
return the values indexed by idx
So basically, this semi-functional zen kaon simply generates n random samples from the spectrum supplied. Think of spec as the weights that determine how often each of vals appears – spec is a histogram or discrete pdf. Actually this is the readable version, closer to Q than K, as Ive defined nmlz and used the verbose style – in K it can be much more ‘terse’ [proponents would say succinct].
At first this style of programming is annoying if your from a C++ background, but once you get used to it, you begin to think directly in terms of verbs working on vectors – In the same way that std::vector allows you to think at a higher level and avoid many for() loops by using accumulate and other languages the foreach construct…
So how does this look in C? Try this -
V val = vnewseq(0.0, 1.0, n);
V pas = vnmlz(vpasc(vnew(n+1),n));
V draws = vnewspecdraw(pas, val, n);
Where vpasc makes a pascals triangle and vnmlz makes sure the array sums to 1.0. Hmm.. we’ve kind of cheated though – we really need to build the last function from verbs also…
Expanding vnewspecdraw() we have the following C code -
V mon=vsums(vnmlz(vpasc(vnew(ns+1),ns)));
V draws=vnewrand(n);
I ix=inewbin(draws, mon);
vidx(draws, ix, vals);
Thats more in the spirit of constructing a useful script from simpler building block verbs… This is actually compilable once the simpler vfuncs are defined, and its a reasonably close translation of the K kaon we saw above.
Other useful vfuncs are vavg, vdev, vwavg [for mean, deviation, weighted average etc]. Its nice to build these from smaller blocks such as each(v, lambda_func) but they are common, and Ive coded them outright for the moment.
Ill upload an implementation of the vfuncs and sample source code soon… its fairly quick, sub-second to generate and do stats on 4million doubles… although the above code does create an extra copy of the array. In fact I re-implemented it after looking at the memory allocations – but thats a valid development paradigm… get it working, test it, then optimize.
At some stage well look at doing a convolution filter using similar primitives… until then happy quant coding!

8 comments
Comments feed for this article
September 24, 2008 at 5:53 am
brainyoga
You might find
http://www.smartarrays.com
interesting
September 24, 2008 at 12:55 pm
Ryan
Neat article! Seeing radically different ways of using the same language is very interesting. Looking forward to the source code!
September 24, 2008 at 2:02 pm
quantblog
Ryan,
thanks… I should have the vfuncs code on sourceforge soon [I think it takes them a couple days to vet projects]
brainyoga,
grazias, I didn’t know about smartarrays, seems like there are several good commercial offerings out there now…although K seems to have an edge. Interesting to see smartarrays has grown up with APL. Theres definitely something about APL, J, K and other functional languages that helps you think about the problem, rather than managing indices.
Both with K and the C code, it is nice to see small executables again, in this day and age. That may be a minor reason performance is fairly good.
fyi. Ive implemented a general sliding window primitive and am using that to do a digital filter… it seems to filter data at over 5million a sec. I intend to reuse the same primitive for a convolution – which is pretty similar logically, as one of the series is reversed and travels left over the data [ http://en.wikipedia.org/wiki/Convolution has nice pics]
[ off topic, I found an old but digestible intro on map / reduce at Joel On Software here -
http://www.joelonsoftware.com/items/2006/08/01.html. Its a very mild segway into lisp.
This recent article on Map Reduce in Hadoop might be more hip for Java afficionados -
http://www.javaworld.com/javaworld/jw-09-2008/jw-09-hadoop.html ]
cheers
September 24, 2008 at 8:40 pm
Attila
>Theres definitely something about APL, J, K and other functional languages that helps you think about the problem, rather than managing indices.
Definitely it is much more important that most would think – and they would argue against it based on the cost of maintanance…
Although APL/J/K has it much more ‘naturally’ (beautifully?) than other functional languages (even Mathematica – which is quite a different beast though)
Check out Ken Iverson’s Turing Award lecture http://elliscave.com/APL_J/tool.pdf (you can find some connecting articles as well)
You can write C much closer to the simplicity/brilliance of K, for example http://www.kx.com/q/c/c/odbc.c and http://www.kx.com/q/c/c/k.h
September 24, 2008 at 9:33 pm
quantblog
Now thats literate programming taken to an extreme :]
Great link – the odbc implementation first defines its own DSL-like terse language, then proceeds to implement an ODBC driver effectively in that K-like DSL.
Id find it a bit hard to remember all the abbreviations – but some seem quite natural like E and F for different precision reals…
I particularly like Arthur’s incantation defining the variant type –
typedef struct k0{I r;H t,u;union{G g;H h;I i;J j;E e;F f;S s;struct k0*k;struct{I n;G G0[1];};};}*K;
As a self imposed constraint of my own experiment, I avoided macros,but I did notice MIN() MAX() and _count_of(array) creeping into my code. I’m a big fan of macros in lisp – they just seem more a part of the language.
I happily admit that “DO(n)” is more readable than
int i;for (i=0;i<n;i++) ...
which appears a lot in my vfuncs implementation
September 25, 2008 at 7:39 am
Attila
Which one do you find hard to remember? I think they are very carefully chosen.
kdb+/k/q are about – at least for me – the absolutely minimal set of features one needs to achieve the desired result… For example C’s simple text substitution macros are enough to write code close to the spirit of APL/K. And that is why kdb+ does not have macros : you don’t need it (however beautiful they are ) – and I don’t want to be ignorant – but what remarkable software did it make possible?
September 25, 2008 at 9:03 am
quantblog
hmm.. C macros are powerful no doubt about it. I could have used C++ templates or C macros to achieve some functional style things in C.. but that was not this particular experiment…
Using terse names is a personal thing.. it can make a lot of sense for the domain. I might find it hard to remember someone elses short abbreviations, but use my own easily. I know of another excellent db implementation that used names like v4x v4r v4c. it looked unreadable at first, but if your working on it everyday its a great time saver. KDB naming style is another example… For most of my own programming 4-7 letter names seem to be good mnemonics.
vfuncs is a experiment to get some of the abilities of dedicated functional languages, without resorting to too many tricks…thats why I ruled out macros.. you can do magic with them for sure.
I’m not too religious about it… I agree K Q of KDB+ are very focussed and suit the job very well… my experiment is to see if one can get some percentage of the same power from C. Im still doing the experiment…
Q is somewhat a verbose version of K, which makes it easier for people who may come from an SQL background for example. My verbs are vlonger still but_not_this_long or MicrosoftVerboselyLong.
I guess I’m still impressed by C as a language – just having function pointers is very powerful. But certainly I’m not sure yet how I would do curring and closures in C without macros… maybe theres a nice way to do it.
October 3, 2008 at 10:53 am
obiettivoefficacia
thak’s!
very interesting