
For experimenting with some ideas on improving blockchain scaling :

https://github.com/justgord/blksimjs

It currently simulates the growth of a crypto-currency blockchain ( similar to Bitcoin ), looping through a cycle of :

  • spend – pick addresses at random and spend to another address
  • mine – put waiting transactions into the block, mine the cryptographic hash, add to chain

Features / Limitations :

  • uses a single SHA-256 hash for ids
  • uses an easy proof-of-work [ 256 tries on average to get a block hash with a ‘00’ leading byte ]
  • uses node.js byte buffers for transactions and blocks
  • runs around 7000 tps on an i5 laptop

Motivation

I wanted to simulate the growth of a blockchain with unspent transactions spread somewhat sparsely in the early, older parts of the blockchain, and more densely at the top [ as more recent transactions haven’t had time to be spent yet ].

The reason for this is to test the feasibility of reducing the size of the data needed to bootstrap a new node. E.g. in Bitcoin the whole dataset is :

  • around 150GB of transactions [ 250Mn txns ]
  • utxo of around 2GB [ ~50Mn txns ]
  • so unspent ‘utxo’ set is around 20% of transactions

Bring UTXO set forward

We can use much less data [ 5x smaller ] when spinning up a new node, by bringing the utxo set forward nearer the front of the chain. The sim gathers old utxos and injects them into the blockchain in batches of ids, so they are stamped into the block at block creation.

These ‘utxo catchup sections’ are read when starting a new processing node – i.e. it only needs a provable list of utxos, not the complete history of all spent transactions.
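Gathering a catchup batch is simple in principle – collect the ids of unspent outputs older than some cutoff, and stamp that list into the block being created. The names below are illustrative, not the actual blksimjs api :

```javascript
// gather the ids of old unspent outputs into a batch, to be stamped
// into a new block's extension area at block creation time
function makeCatchupSection(utxos, olderThanBlock) {
    return utxos.filter(function(u) { return u.block < olderThanBlock; })
                .map(function(u) { return u.id; });
}

var utxos = [
    { id : 'a1', block : 10 },      // old utxo -> brought forward
    { id : 'b2', block : 500 },     // recent   -> left where it is
    { id : 'c3', block : 42 }       // old utxo -> brought forward
];

// stamp utxos from before block 100 into the block being created
var block = { txns : [], catchup : makeCatchupSection(utxos, 100) };
console.log(block.catchup);
```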

Skip links [ todo ]

Using block extension areas, we can also include skip links to blocks much earlier in the chain – the process of walking back through the links of transaction inputs and outputs all the way to the initial ‘genesis’ block is like walking a DAG, as un-needed areas of the blockchain are skipped over.  These skip links are validated at block creation time by other nodes.

This is useful for clients which want to traverse the chain to make a better proof of validity than SPV, and for nodes that use the utxo bring-forward above, so they can trace PoW back to an arbitrary depth, all the way to the genesis block.

Been ultra busy lately on two distinct web projects, both of which really need a good architecture so they can scale easily.

One nice way to do this is with services which message each other – if you have a fast, persistent, reliable message queue you can have many processes grab jobs off the queue and thus scale out over many cores.  I feel this is a natural way to scale out node.js apps.

I immediately discarded SQS [ Amazon’s scalable queue service ] as it basically polls a web url to check for messages, so it really is not a message queue at all.

Another nice option is Mongo tailable cursors, which is the message-queue approach Mongo uses internally to replicate between instances.  This worked sort of OK, but didn’t strike me as ultra high performance.  See the mongoMQ npm module for a simple api wrapper.

I was very impressed with Redis Publish/Subscribe semantics… which are very fast, but don’t have persistence.

In the end I decided to use a combination of Redis pub/sub for notifications, and Redis list operations rpush/lpop to store/retrieve the actual message data.  I wrapped this in a simple node.js api, which I’m calling redpill.
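The pattern is : a channel carries only notifications, while a list holds the actual payloads. Here is a tiny in-memory stand-in for those two Redis primitives, just to show the shape – the real redpill code would of course call rpush/lpop and publish/subscribe on a live Redis client :

```javascript
// in-memory stand-in for the two Redis primitives combined here :
// a list holds the message data [ rpush/lpop ], a channel carries
// the notifications [ publish/subscribe ]
var lists = {};
var subscribers = {};

function rpush(key, val)     { (lists[key] = lists[key] || []).push(val); }
function lpop(key)           { return (lists[key] || []).shift(); }
function subscribe(chan, fn) { (subscribers[chan] = subscribers[chan] || []).push(fn); }
function publish(chan, msg)  { (subscribers[chan] || []).forEach(function(fn) { fn(msg); }); }

var received = [];

// consumer : on each notification, pop the actual message off the list
subscribe('jobs', function() {
    received.push(lpop('jobs:data'));
});

// producer : store the payload first, then notify subscribers
rpush('jobs:data', JSON.stringify({ task : 'resize', id : 1 }));
publish('jobs', 'new');

console.log(received);
```

Because the payload sits in a list rather than the pub/sub channel, a message survives even if no subscriber is listening at the moment it is pushed – that is the persistence pub/sub alone lacks.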

I have an initial implementation of a typical web app which has users and info items.  This performs rather well, with web request response times under 50ms while 1000 items/sec are being inserted as a background task.

The nice thing is that using message queues allows you to scale up by running any number of servers.

See code + comments here :

redpill : persistent messaging using Redis primitives

gorgon : demo web server architecture with messaging between server components

Lots of things to optimise, but basically a good proof of concept and initial working demo code for this approach.

Lokenote

Lokenote is a very simple way to share notes based at a particular location… a kind of geo PostIt note taker.

There are great apps out there such as Gowalla, FourSquare, Yelp but I felt the need for something ultra simple with low overhead where you don’t need to register or sign up – just drop a note where you are, and leave it for other people to find.

I still see notices pasted up on supermarkets and on lampposts for Missing Cat,  Part-time Work,  Flatmate Needed or Garage Sale.. so I think there is a need for this kind of utility-belt app.  I specifically wanted a tool to jot down nice graffiti snippets from around St Kilda where I live, and mention handy places a non-local might not know about, such as the well hidden laundry shop.  I’d like to give a virtual nod to some of the superb out-of-the-way food places that exist in my version of Melbourne.

Next

In developing Lokenote I took the ‘Fire!… Ready? Aim’ approach – I had a rough idea, started as simply as I could, implemented feverishly and only added things I thought were absolutely essential.

Not being able to entertain any extra features had the perverse effect of generating many more ideas.. but these had a more organic character growing out of reality.  Beware ideas that have clean academic edges, they tend to not fit the world.

The process of building Lokenote gives me a furtive and voyeuristic sense of the kind of realtime app which I think is just around the next bend.

We can safely assume Next apps are :

  • mobile/web hybrids with touch UI
  • reactive in realtime, via flowing data feed
  • location aware, fulltext searchable
  • online/offline robust
  • built on graph style data models

But what might they actually do?

  • live auction or product sales [ the last 15mins of an eBay auction without the 6 day lead-up ]
  • convergence of blog, web page and chatroom with live comment feeds
  • realtime automated sentiment, trend summary
  • flexible links between any kinds of data
  • scrolling realtime chat, tethered to a location, keyword/topic, group or event

Feedback

Feedback is the best word I have to describe the qualitative difference of realtime apps.

An example – rather than go to a conference and wait for surveys to come in from attendees and adapt in time for the next event, the feedback loop is immediate enough to customise as it progresses.  This already happens, with some presenters saying ‘tweet me if you want more or less on this topic’ – a tweet is more anonymous and less impolite than interrupting the speaker, and feedback is current.

We might see more prices that are changing moment to moment, or other micro-optimisations –  the cost of a flight might be offered within a range, and be fixed only once the aircraft leaves the gate.

Realtime Dating?

Another example of a realtime next app is for dating.  I envision this as a kind of randomised, localised topic-chat :

  • nominate a topic and post a comment, or join an active topic that looks interesting
  • chat away anonymously for a while
  • notice someone interesting, share your profile
  • get a nudge back or an invite for a one-on-one chat
  • if things progress, decide to meetup at a cafe on neutral ground

Most dating apps use the profile photo as the initial filter.. but I’m not so sure that is the initial filter in Life.. sometimes people with unremarkable looks win you over and in fact become more attractive over time, as conversation unveils their personality.  So conversation as the initial filter might actually work.

The very same app could be a great way to generate ideas in business or science or political activism… it just seems the old chat room needs to be upgraded for the realtime web, so that it resides next to all the other things I do on the web.  I might want to attach a web page or doc or graphic or photo or video to my realtime comment.  Parties might agree to go private with some comments.  You might want to limit the audience to a group or post anonymously then go back on-record.

Let’s Build

We have all the plumbing to do this feasibly – technologies such as nginx, node.js, Mongo, Couch, Riak, Redis, Web Sockets, JSON, HTML5 are really at the point of becoming the normal way to write dynamic data-driven responsive web/mobile apps.

It’s about taking some risk to walk over the local maxima and build these things that will make life simpler, leaving more time for people to enjoy the roses.

Well, those are my thoughts for now… enjoy, gord.

Been using git more and more for public hacking and private consulting work.

Some impressions / notes –

  • github.com is superb!  Radically better/simpler/easier/nicer than sourceforge or google code
  • I lurrve code snippets, aka GISTs hosted on github. Sane blogs support inline gists [ not wordpress.com, yet ]
  • Found a clear readable tutorial on setting up a remote git repo (with git and gitosis) for private or public use
  • Github is so nice, it’s really tempting to just pay them money for some private repos rather than step thru the above
  • A handy git meta- cheat sheet here

NPM Module

So that we can reuse the simple serialq code from the previous blog post, I have tidied things up and packaged it into an NPM module.  Apparently it’s now installable using “npm install serialq”.

Creating the module was a breeze –

  • within your directory run ‘npm init’ and fill in the questions
  • npm adduser, npm publish
  • test with npm install

After publishing, the module magically shows up on the extremely handy npm.mape.me module-search site, under keyword ‘serial’.  See isaacs’ article ‘How to Module’ for an overview.

Code

The code is a bit simpler to read – as you can see, it’s a very short implementation :

// SerialQueue : runs queued functions one at a time -- each function
// is handed a callback it must invoke to start the next one
exports.SerialQueue = function()
{
    var sq =
    {
        funcs : [],                 // pending work functions
        next : function()
        {
            var Q = this;
            var f = Q.funcs.shift();            // take the next function off the queue
            if (f)
                f(function() { Q.next(); });    // run it, passing a 'done' callback
        },
        add : function(f)
        {
            this.funcs.push(f);     // f has signature function(done)
        },
        run : function()
        {
            this.next();            // kick off the first function
        }
    };
    return sq;
};

Usage :

    var SerialQueue = require('serialq').SerialQueue;

    var Q = SerialQueue();
    Q.add(fn_first);        // each fn has signature function(done)
    Q.add(fn_second);
    Q.run();

[ For a more readable version of the code snippets above, see this github Gist. Would be nice if wordpress.com supported Gists; Posterous does.. ahh, maybe time to move my blog. ]

Thoughts

I found this module handy for serializing access to a mysql database.  Breaking out this boilerplate made the rest of the code clearer.  Code is up on github.

Surprisingly, Javascript + Node.js is a real workhorse.  I actually prefer it to Perl/PHP and even Ruby/Python for data plumbing tasks.  You have hashmaps and regex handling built in, garbage collection, and a superb general-purpose data format in JSON.  Perhaps Javascript is the hundred-year language?

Be aware this is ‘cooperative’ sequencing.. each function gets passed a done or next argument, and has to invoke it to signal completion [ causing the next function to be run ].
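Concretely, a work function in this scheme looks like the two below – a minimal inline queue [ the same shape as the SerialQueue module above ] runs them in order.  The steps here are synchronous for brevity; in practice each would do some async work [ a db query, say ] before calling next() :

```javascript
var order = [];

// a work function in this scheme : do the work, then invoke `next`
// to hand control to the following function in the queue
function fn_first(next) {
    order.push('one');
    next();                     // forgetting this call stalls the queue
}

function fn_second(next) {
    order.push('two');
    next();
}

// a minimal inline queue, same shape as SerialQueue above
var funcs = [fn_first, fn_second];
(function run() {
    var f = funcs.shift();
    if (f)
        f(run);                 // pass `run` itself as the `next` callback
})();

console.log(order);             // [ 'one', 'two' ]
```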

Node is Async by Default

The whole crux of Node.js is that everything is done async by default – you fire off something now and get a callback at some later time.  It’s a beautiful paradigm and means that you can get great performance, because it fits so closely with the underlying operating system calls [ libevent, completion ports, sockets etc. ]

However, there are times you do need to enforce serial processing… for example checking for valid user/password must return a result before getting sensitive data and displaying it on the web page.

There are now sophisticated serial modules for Node, which I’d recommend you look at for real work.  For example Conductor can mark sequential dependencies and will allow as much async processing as possible while honoring those sequential constraints – the best of both worlds.  Another nice approach is this fork() primitive via stackoverflow.com.

Let’s have a look at the simplest case, to see what’s under the hood…

Demo Code

I made a test program to compare sync versus async.  This illustrates a very direct approach to serial processing, using a queue of work functions.  Code on github, here : async_vs_serial.js


Been hacking in Node.js and am really enjoying the saneness of this dev environment.

Some handy links before I forget  :

  • howto.no.de articles – especially Part I, II and III of ‘Learning Javascript with Object Graphs’
  • Joyeur blogs – people working for Joyent on Node / DTrace / Solaris
  • no.de Joyent Node hosting [ built atop – ‘open’ solaris, ZFS, DTrace ]
  • NodeJitsu blog
  • npm.mape.me – searchable Node.js Modules list
  • connect-it guide – web framework with chained middleware layers
  • express guide – article on express web framework for Node
  • eventserver – Tom Lee’s internet tee piping for notifications

In other news.. I’m hacking over ssh via a long thin pipe to my linode server – using a very erratic mobile broadband connection, arrgh!

Can’t wait for ADSL to _finally_ be connected here, so I can watch Bryan Cantrill talk about Cloud Analytics :]

As an aside.. why Javascript?  Consider

  • Javascript is a totally distinct language from Java
  • Javascript deserves its bum rap.. to mis-quote Dame Judi Dench, its bad parts are “arse-clench-ingly” bad :]
  • The good parts of Javascript feel very nice, like a modern lisp inspired language, fairly concise, many valid idioms
  • V8 js engine is fast
  • Javascript callback mechanism fits async event IO really well
  • Node.js turns the above points into a fine server development environment
  • JSON, the Javascript native data format, is all the good things of XML with none of the bad
  • feels like a unix-like web-plumbing philosophy
  • you can keep the same language hat on when writing front-end web apps and back-end servers