Async Ascendance in Javascript

What & Why Asynchronous?

When you write a program, a lot of time is spent waiting for something. In fact, most programs spend most of their time idling, waiting for something to happen, and then burst into life in short spurts whenever things need to happen. The most common wait is for user input - for a user to type something into a form, or to press a button - but there are many other things that aren't user-related that programs typically have to wait for: reading or writing files on disk, network latency, and many, many more.

Now, if your program was extremely selfish, it could simply hog all the resources on your computer while waiting on these things. The problem is that users don't like selfish programs, because they "hang" the computer - that is, they make the computer unresponsive.

Adding asynchronous execution capabilities to your program makes it a much less selfish consumer of the computer's resources, and thus the computer doesn't "hang", and users are happy. This is pretty much standard practice, and you'll be hard pressed (although it is not impossible) to find programs that do not incorporate some form of asynchronous code execution.

The "standard" way, and the Javascript way

Most programs are able to execute their code asynchronously by employing threads. Threads are a mechanism whereby a single process defines more than one execution path, leaving the operating system to time-slice and share resources between each of them.

This is not how Javascript environments do it, because Javascript is single-threaded. Instead, Javascript achieves asynchronous code execution using its own run-time engine to store events (asynchronous execution entry points) and execute them in an event loop. Essentially this is also time slicing, except that since it doesn't operate at the operating system's kernel level, as a developer you need to reason about it differently than you would when programming with threads.

Footnote: There are various proposals for Javascript to support multiple threads. Web Workers exist in browsers, but they communicate via message passing rather than shared memory, so they are not threads in the traditional sense.
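To see the event loop's effect in miniature, consider this short sketch (an illustration of my own, not from any library): even a timeout of zero milliseconds does not run immediately - the scheduled function waits until the currently executing code has finished.

console.log('first');
setTimeout(function() {
  // runs only after the call stack is empty,
  // when the event loop picks this function off its queue
  console.log('third');
}, 0);
console.log('second');
// output: first, second, third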

How to Asynchronous

Say you call Sync Sam, as well as Async Alice, and give each of them a task to do. Usually that works out in one of two ways:

1) You can say, "I'll stay on the line," and wait while they do the task, twiddling your thumbs in the mean time.

2) You can say, "How about you call me back at this number, when you're done?" Then hang up, and wait for them to call you back - and you can do whatever you feel like until they do.

Both options are valid in different scenarios.

Let's say that the task was for Sync Sam to retrieve someone's address and give it to you. He has it on a sticky note somewhere at his desk, so he can get back to you almost right away. In this scenario, the first option, where you stay on the line, is the valid one.

Let's say, instead, that the task for Async Alice was a bit more involved. It involves her picking up something from the store around the corner. That would take much longer than would make sense to sit waiting on the phone for. So the latter option, where you ask Async Alice to call you back when she has completed the task, would make much more sense.

In other words, when you expect the task to be completed immediately, you should simply wait for the task to be done. When you expect the task to take a while to complete, you should do something else instead, and wait to be notified before resuming; and this is precisely what a callback does. Callbacks are the basic building block of asynchronous programming in Javascript. Currently, there are four mechanisms: callbacks themselves, and three others, each of which builds upon callbacks.

  1. callback functions,
  2. promises,
  3. generator functions, and
  4. async functions.

Let's take a look at each of them.

Callback Functions

Callbacks are the most basic, and easiest to grasp, way to do things asynchronously. You simply provide a function, and tell the Javascript engine to call you back when it is done doing whatever it is doing - by calling that function. Hence the name "callback".

Now let's take a look at callbacks in action:

callerFunction();

function callerFunction() {
  console.log('I am going to call three nested callback functions');
  asyncUsingCallbackFunction('foo', function errBackFunction(err, result) {
    if (!!err) {
      console.log('There was an error:', err);
    }
    else {
      secondAsyncUsingCallbackFunction(result, function errBackFunction(err, result) {
        if (!!err) {
          console.log('There was an error:', err);
        }
        else {
          thirdAsyncUsingCallbackFunction(result, function errBackFunction(err, result) {
            if (!!err) {
              console.log('There was an error:', err);
            }
            else {
              console.log('Here is the result:', result);
            }
          });
        }
      });
    }
  });
}

function asyncUsingCallbackFunction(input, callback) {
  setTimeout(function() {
    callback(undefined, input+'bar');
  }, 300);
}

function secondAsyncUsingCallbackFunction(input, callback) {
  setTimeout(function() {
    callback(undefined, input+'baz');
  }, 300);
}

function thirdAsyncUsingCallbackFunction(input, callback) {
  setTimeout(function() {
    callback(undefined, input+'meh');
  }, 300);
}

Run this using NodeJs. You should get the output: Here is the result: foobarbazmeh, but you will have to wait for just under a second before it pops up. Try tracing the control flow as the program executes.

  • The callerFunction simply runs two lines of code and then returns
  • This would normally mean that the program has finished execution, and exits
  • However, it does not, because in the second line, asyncUsingCallbackFunction is called, and its second parameter happens to be a callback function.
  • Within asyncUsingCallbackFunction, setTimeout schedules work for later; when the timer fires, the anonymous function passed into setTimeout gets added to the event loop's queue.
  • Once the call stack is empty, the event loop takes that function off the queue and executes it, which in turn invokes the callback with the result.
  • The same thing happens with secondAsyncUsingCallbackFunction, and thirdAsyncUsingCallbackFunction in turn, each nested one level deeper.

Here is a great animation demonstrating Javascript's event loop, which I like very much.

Observe how the callback functions are "nested" within each other.

Promises

While it is possible to write virtually any sort of asynchronous code in Javascript using only callbacks, they really only work well for simple cases, where there are one or two asynchronous tasks to be done.

In more complex programs, where there are many asynchronous tasks, if you were to write your code using just callbacks, you would wind up with an anti-pattern known as "callback hell" or the "pyramid of doom".

Promises emerged to solve this problem. The main benefits that you get from promises are:

  1. You can write your asynchronous code in a "flat" way.
  2. Instead of multiple error handlers (one for each asynchronous function), you can have a single error handler which handles errors across them all.

Promises first came out as Javascript libraries - Q, RSVP, and bluebird are amongst the most popular - and these exposed functions that would wrap callbacks in clever ways. This evolved into a specification, and now, with ES6, the latest edition of Javascript, promises are built into the language itself.

Let's take a look at promises in action:

callerFunction();

function callerFunction() {
  console.log('I am going to call three chained promise functions');
  asyncUsingPromise('foo')
    .then(secondAsyncUsingPromise)
    .then(thirdAsyncUsingPromise)
    .then(function(result) {
      console.log('Here is the result:', result);
    })
    .catch(function(err) {
      console.log('There was an error:', err);
    });
}

function asyncUsingPromise(input) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(input+'bar');
    }, 300);
  });
}

function secondAsyncUsingPromise(input) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(input+'baz');
    }, 300);
  });
}

function thirdAsyncUsingPromise(input) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(input+'meh');
    }, 300);
  });
}

Run this file using NodeJs and you'll get the same result as before, when using callbacks. The main difference lies in the syntax. Each of the functions, asyncUsingPromise, secondAsyncUsingPromise, and thirdAsyncUsingPromise, is now slightly more complex than before: we have to wrap them with the new Promise() syntax. However, doing this allows us to write the asynchronous code in a "flat" way - avoiding deep nesting - and also means that we only have to write the error handling code once. This makes it easier to reason about the code while writing it.

Generator Functions + yield

Generator functions, used together with yield, are yet another way in which you can do asynchronous programming, and they have also landed in ES6 Javascript.

Let's take a look at generator functions in action:

var co = require('co');

callerFunction();

function callerFunction() {
  co(function * () {
    console.log('I am going to yield three consecutive generator functions');
    try {
      var firstResult = yield asyncUsingGenerator('foo');
      var secondResult = yield secondAsyncUsingGenerator(firstResult);
      var thirdResult = yield thirdAsyncUsingGenerator(secondResult);
      console.log('Here is the result:', thirdResult);
    } catch (err) {
      console.log('There was an error:', err);
    }
  });
}

function asyncUsingGenerator(input) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(input+'bar');
    }, 300);
  });
}

function secondAsyncUsingGenerator(input) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(input+'baz');
    }, 300);
  });
}

function thirdAsyncUsingGenerator(input) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(input+'meh');
    }, 300);
  });
}

Run npm install co to install the co library before running the above file with NodeJs. You'll get the same result as before, when using promises or callbacks.

Note that each of the functions, asyncUsingGenerator, secondAsyncUsingGenerator, and thirdAsyncUsingGenerator, has not changed at all from the promise-based code, apart from being renamed. These functions still return promises.

What has changed is the way in which they are called, from callerFunction. Firstly, the contents of callerFunction have now been wrapped in an anonymous function, which has been, in turn, wrapped in a co function call, like so:

co(function * () { /* do stuff */ });

You'll also notice that the anonymous function has an asterisk between the function keyword and the parentheses for its parameters. This is what marks a function as a generator function. Generator functions are allowed to use the yield keyword within their body.

Now that we have got the syntax/ semantics out of the way... let's look at the main benefit: you can write asynchronous code using syntax that is almost identical to the way you would write synchronous code.

  • With callback functions, the "return value" of the asynchronous function would come as the parameter of the callback function
  • With promises, the "return value" of the asynchronous function would come as the parameter in the next function in the promise chain
  • With generator functions, the "return value" would actually be a return value, and you can assign it to a variable in line using var (or let) ... just like you would when writing synchronous code.

A parallel exists for error handling as well.

  • With callback functions, errors are handled within the callback function by testing a parameter of the callback function
  • With promises, errors are handled by the catch function in the promise chain
  • With generator functions, errors are handled using a try ... catch block ... just like you would when writing synchronous code.

Being able to write asynchronous code with syntax that is so close to synchronous code is brilliant. Promises are an improvement over callback functions because they remove the mental overhead of looking at deeply nested syntax. Generator functions are an improvement on promises - in fact they make use of promises - because they remove the mental overhead of thinking in terms of promise chains.

Footnote: Generator functions and yield, on their own, are not language constructs designed specifically for asynchronous code execution. They are more general purpose than that. For example, you can use them to output infinite sequences of numbers, which is useful for mathematical applications and pure functional programming, but has nothing to do with asynchronous programming techniques. This is what the co library is used for - to take generator functions, and adapt them for the specific purpose of asynchronous code execution. Think of co as a "runner" for generator functions.
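For instance, here is a short sketch of my own of a generator function producing an infinite sequence of natural numbers - no asynchrony involved:

function * naturalNumbers() {
  var n = 0;
  while (true) {
    // pause here, handing the current value to the caller,
    // and resume from this point on the next call to .next()
    yield n;
    n += 1;
  }
}

var sequence = naturalNumbers();
console.log(sequence.next().value); // 0
console.log(sequence.next().value); // 1
console.log(sequence.next().value); // 2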

async Functions + await

async functions, used together with await, are an even newer way in which you can do asynchronous programming. So new, in fact, that they have yet to become "officially" part of Javascript. They are slated for ES7, the next edition of Javascript. You can, however, use them today via your transpiler/ polyfiller of choice, e.g. babel.

Let's take a look at async functions in action:

callerFunction();

async function callerFunction() {
  console.log('I am going to await three async functions');
  try {
    var firstResult = await asyncUsingPromise('foo');
    var secondResult = await secondAsyncUsingPromise(firstResult);
    var thirdResult = await thirdAsyncUsingPromise(secondResult);
    console.log('Here is the result:', thirdResult);
  } catch (err) {
    console.log('There was an error:', err);
  }
}

function asyncUsingPromise(input) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(input+'bar');
    }, 300);
  });
}

function secondAsyncUsingPromise(input) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(input+'baz');
    }, 300);
  });
}

function thirdAsyncUsingPromise(input) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(input+'meh');
    }, 300);
  });
}

Run npm install babel to install the babel transpiler before running the above file with babel's interpreter: ./node_modules/.bin/babel-node. You need to do this because ES7 Javascript is not out yet - we're currently on ES6 - and so we'll need a transpiler to get it running. You'll get the same result as before, when using generators, promises, or callbacks.

Note that each of the functions, asyncUsingPromise, secondAsyncUsingPromise, and thirdAsyncUsingPromise, has not changed at all from the promise-based code, or (names aside) the generator-based code.

What has changed from the generator-based callerFunction is that instead of using co as the "runner" that wraps a generator function, the entire callerFunction itself is now marked with the async keyword. This allows callerFunction to use the await keyword - in a manner very similar to the way that generator functions use yield. Essentially, there is no need for a "runner" function any longer - this behaviour is supported natively.

Review

Let's recap the different methods of asynchronous programming available in Javascript.

  1. Callback functions
    • The most primitive method
    • "Pyramid of doom"
    • Available since the first version of Javascript
  2. Promises
    • Flattens the "pyramid of doom"
    • Consolidates error handling
    • Available as libraries in ES5, and natively in ES6
  3. Generator functions + yield
    • Can write using synchronous code syntax
    • Need to use co (or other similar) library as a "runner"
    • Available natively in ES6
  4. async functions + await
    • Can write using synchronous code syntax
    • Available in the future in ES7, and via transpilers in ES5 & ES6

Each of these methods has its own pros and cons, and it is up to you to pick which is appropriate. My own approach at the moment is to use callback functions for the basics, as they are hard to beat there. For more complex things, on the other hand, I currently use generator functions with promises. When working with older packages that only expose callbacks, use the es6-promisify NodeJs package to "promisify" their callback functions. If you're willing to live on the bleeding edge, go for async functions; however, keep in mind the risk that their syntax could change considerably before they officially become part of ES7.
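For example, here is a rough sketch of "promisifying" NodeJs' callback-based fs.readFile. (The exact import may vary between versions of es6-promisify - treat this as illustrative.)

var promisify = require('es6-promisify');
var fs = require('fs');

// wrap the callback-based function so that it returns a promise instead
var readFile = promisify(fs.readFile);

readFile('package.json', 'utf8')
  .then(function(contents) {
    console.log(contents);
  })
  .catch(function(err) {
    console.log('There was an error:', err);
  });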

It is nice to see the evolution of Javascript in general as well, going from primitive constructs to more advanced and expressive ones over time. Writing asynchronous Javascript code is much, much easier now than it used to be, and as this trend continues, the future's bright!

Let's Encrypt TLS certificates for NodeJs servers

Let's Encrypt is the new certificate authority in town, enabling developers to generate their own TLS certificates - which are necessary for running servers over HTTPS - and it just went into public beta a week or so ago.

By default it is all set up and ready to go for Apache servers. However, for other varieties of servers - NodeJs included - a little more leg work is involved. That's what this post looks at. Parts of it are specific to NodeJs, while the remainder is applicable to any other platform.

We'll cover the following steps:

  1. Install Let's Encrypt
  2. Initialise a NodeJs project
  3. Serving HTTP + HTTPS
  4. Authorisation challenge
  5. Respond to the authorisation challenge
  6. Update TLS certs shell script
  7. Set up npm run scripts
  8. cron job to automate cert renewal

Install Let's Encrypt

First, ensure that Let's Encrypt is installed, and is available on your PATH:

mkdir -p "${HOME}/code/"
git clone https://github.com/letsencrypt/letsencrypt "${HOME}/code/letsencrypt"
echo 'export PATH="${PATH}:${HOME}/code/letsencrypt"' >> "${HOME}/.bashrc"

Initialise a NodeJs project

Next create a new project folder (or use an existing one) for your NodeJs server:

mkdir -p "${HOME}/code/project"
cd "${HOME}/code/project"
npm init
# create your NodeJs server as you normally would
touch ./update-tls-certificates.sh
chmod u+x update-tls-certificates.sh

Serving HTTP + HTTPS

Feel free to use your own implementation.

For the sake of completeness, here is a sample implementation in koa. (If you wish to serve over HTTPS only, just remove the HTTP parts)

// index.js
'use strict';

var fs = require('fs');
var path = require('path');
var http = require('http');
var https = require('https');

var koa = require('koa');
var server = koa();

// add main routes

// the following routes are for the authorisation challenges
// ... we'll come back to this shortly
var acmeRouter = require('./acme-router.js');
server
  .use(acmeRouter.routes())
  .use(acmeRouter.allowedMethods());

var config = {
  domain: 'example.com',
  http: {
    port: 8989,
  },
  https: {
    port: 7979,
    options: {
      key: fs.readFileSync(path.resolve(process.cwd(), 'certs/privkey.pem'), 'utf8').toString(),
      cert: fs.readFileSync(path.resolve(process.cwd(), 'certs/fullchain.pem'), 'utf8').toString(),
    },
  },
};

let serverCallback = server.callback();
try {
  var httpServer = http.createServer(serverCallback);
  httpServer
    .listen(config.http.port, function(err) {
      if (!!err) {
        console.error('HTTP server FAIL: ', err, (err && err.stack));
      }
      else {
        console.log(`HTTP  server OK: http://${config.domain}:${config.http.port}`);
      }
    });
}
catch (ex) {
  console.error('Failed to start HTTP server\n', ex, (ex && ex.stack));
}
try {
  var httpsServer = https.createServer(config.https.options, serverCallback);
  httpsServer
    .listen(config.https.port, function(err) {
      if (!!err) {
        console.error('HTTPS server FAIL: ', err, (err && err.stack));
      }
      else {
        console.log(`HTTPS server OK: https://${config.domain}:${config.https.port}`);
      }
    });
}
catch (ex) {
  console.error('Failed to start HTTPS server\n', ex, (ex && ex.stack));
}

module.exports = server;

Authorisation challenge

Let's Encrypt provides several different authorisation mechanisms for certificate issuance and renewal. Essentially you have to prove that you control the domain in order to be issued a certificate for it. If your server is NodeJs (or anything other than Apache), you have 3 options:

  • manual - not scriptable, you have to do it by hand each time
  • standalone - scriptable, but requires your server to be down for a certain amount of time
  • webroot - scriptable, and you can keep your server running

We're going to pick the last option, webroot, because it is both scriptable, and allows for zero server downtime during cert generation. The webroot option requires a webroot-path directory to be specified. If it doesn't exist, it will be created - but letting that happen is not a good idea, because the files will be owned by the root user, creating permission problems later on. We'll pre-empt this by creating the folder ourselves. We also create an empty file in there so that a version control system like git keeps the folder when cloning it.

mkdir -p certs/webroot/.well-known/acme-challenge/
touch certs/webroot/.well-known/acme-challenge/.gitkeep

Respond to the authorisation challenge

The letsencrypt command line client will generate a temporary authorisation token, and place it in the webroot-path that we just created above. The Let's Encrypt Certificate Authority server will then make an HTTP request to your server at this path: /.well-known/acme-challenge/$SOME_CHALLENGE_HASH

(substituting $SOME_CHALLENGE_HASH with an actual hash of course.)

Thus we need to add a new route to serve up static files from this folder when such a path is requested. This is a fairly common thing for web servers to do, so, again, feel free to use your own implementation. For the sake of completeness however, here's one implementation using koa.

// acme-router.js
'use strict';

let fs = require('fs');
let path = require('path');

let koaRouter = require('koa-router');

let router = koaRouter({});

// point to the middleware we wish to serve
router
  .get(
    'getWellKnownAcmeChallenge',
    '/.well-known/acme-challenge/:challengeHash',
    getWellKnownAcmeChallengeRoute);

function *getWellKnownAcmeChallengeRoute() {
  try {
    let key = this.params.challengeHash;
    let val = yield getAcmeChallengeData(key);
    this.response.type = 'text/plain';
    this.response.body = `${val}`;
    this.response.status = 200;
  }
  catch (ex) {
    console.error(`Error: ${ex}`);
    console.error(ex.stack);
    this.response.body = {
      error: 'Failed to obtain challenge hash',
    };
    this.response.status = 500;
  }
}

function getAcmeChallengeData(key) {
  return new Promise((resolve, reject) => {
    let challengeFilePath = path.resolve(process.cwd(), `certs/webroot/.well-known/acme-challenge/${key}`);
    fs.readFile(challengeFilePath, 'utf8', (err, data) => {
      if (!!err || !data) {
        return reject(`No challenge for key ${key}`);
      }
      let val = data.toString();
      return resolve(val);
    });
  });
}

module.exports = router;

Update TLS certs shell script

We are now set up for renewing certificates, and next we shall automate that, so that you can fire-and-forget - and focus on something other than busywork!

We start by creating a shell script that takes in two parameters, first the domain, and second the email address that should be associated with the certificates for this domain.

First, it calls letsencrypt-auto to generate a new certificate, or renew an existing one, using the webroot authorisation method that we have prepared previously.

Second, it takes the generated keys and copies them into the certs folder.

Third, and finally, it executes the refreshcerts run script in your NodeJs module for your server.

#!/bin/bash
# update-tls-certificates.sh
# bguiz @ 20151215
# Thanks: https://community.letsencrypt.org/t/node-js-configuration/5175/4

DOMAIN=${1}
EMAIL=${2}
SERVERDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
USERNAME="$( id -u -n )"
GROUPNAME="$( id -g -n )"

if [ -z "${DOMAIN}" ]; then
    echo "DOMAIN must be specified"
    exit 1
fi
if [ -z "${EMAIL}" ]; then
    echo "EMAIL must be specified"
    exit 1
fi

# generate keys
cd "${SERVERDIR}"
sudo chown -hR "${USERNAME}:${GROUPNAME}" certs
letsencrypt-auto certonly \
  --webroot \
  --webroot-path "${SERVERDIR}/certs/webroot/" \
  --domain "${DOMAIN}" \
  --email "${EMAIL}" \
  --server "https://acme-v01.api.letsencrypt.org/directory" \
  --renew-by-default \
  --agree-tos

# copy keys
cd "${SERVERDIR}"
rm -f certs/{cert,chain,fullchain,privkey}.pem
sudo cp "/etc/letsencrypt/live/${DOMAIN}/"{cert,chain,fullchain,privkey}.pem certs/
sudo chown -hR "${USERNAME}:${GROUPNAME}" certs

# restart server (so that new keys are used)
cd "${SERVERDIR}"
npm run refreshcerts

Set up npm run scripts

In the package.json file for your NodeJs server, put in a refreshcerts script.

{
  "scripts": {
    "refreshcerts": "# do something here that tells server to use the new certificates"
  }
}
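What goes in there depends entirely on how you run your server. As a purely hypothetical example, if you happened to manage the server process with the pm2 process manager (having started it with pm2 start index.js), it might look something like this:

{
  "scripts": {
    "refreshcerts": "pm2 restart index"
  }
}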

cron job to automate cert renewal

The final piece of the puzzle is to set up a cron job to invoke the shell script that renews your certificates, and tells your server to refresh them. Let's Encrypt's certificates expire after 90 days, and therefore you should renew them at least that often. They recommend doing so once per month.

sudo crontab -e

In the text editor, add the following line:

0 0 15 * * /home/user/code/project/update-tls-certificates.sh "example.com" "admin@example.com"

This will run our update script on the 15th of every month - change it to match what you want.

Fin

That's all there is to it!

If you want to take this further, you can consider adding additional things to this routine, such as emailing yourself whenever a certificate is renewed (successfully or unsuccessfully).

Many thanks to Jonne Haß from the Let's Encrypt community forums for patiently helping me troubleshoot the problems I encountered along the way.

The 3 Traits of Coders

Asking about coding skill

"What are the levels of skill of a coder?"

The answers that I get are usually something along the lines of: "fresh grad, mid-level developer, and senior developer", or "software developer, software engineer, and software architect". My guess is that this is based upon how the companies they work for have structured their development teams.

Organisation chart, crossed out, 'does not apply to coders'

This is an easy means of classification, of course - to simply use the classification already thrust upon you by your management.

Here I'll posit an alternative way of thinking about coding skill - one based not so much on levels of skill in writing code, but rather on traits and behaviours when writing code. It is going to be highly subjective, but then again, you probably already got that from the title!

What is code?

To think about coders, you must of course think about their main craft: the code or the software that they write!

... so - what is software, or what is code, really?

At a fundamental level all software is data, and decisions about that data.

spreadsheets, electronics circuit symbols, more spreadsheets coming out at the other end

Back to coders

That is the day-to-day challenge faced by all coders. For any coder, no matter what software they write, no matter what project they are working on, no matter what company they work for - it always boils down to data, and decisions made about that data. This fundamental nature, of course, is a double-edged sword: it is a simple philosophy, but simultaneously extremely open-ended.

The possibilities are endless - let's consider the following:

Binary tree extending forever, with a fading gradient applied at the bottom to imply it is infinite

How do you structure your data, where do you store it, and in what format? What does the data actually mean? What operations do you perform on the data, in what order, and how do you optimise them? How does this data relate to this other data, or influence it? The sheer number of ways you can answer each question - for any given task - creates a great many decisions that need to be made. In many cases it can seem infinite.

Guy with crazy eyes

... and it's mind boggling.

Coping mechanisms

The coders who have to make these decisions, of course, do not have infinite time - or patience - to deal with all of them. Coders are human, after all!

... so coping mechanisms are employed.

Coping mechanisms are means by which people deal with things that are beyond their capability/ control. They are a way to rein in those things, and still be an effective person. In that sense, they are great; but of course, the drawback is that they are inexact, and at times are short cuts taken instead of the proper route.

(But who's to say what the proper routes are anyway? Just take a gander at programmers forums, and you find plenty of debate on anything from the fundamental - like functional versus object-oriented - to the highly specific - like automatic semicolon insertion in Javascript. We won't be going anywhere near those here, I promise!)

Digression on art & science

This is a big part of why, IMHO, it is not possible to classify the act of writing code as either an art or a science. Its logical and mathematical nature, plus its roots in electrical engineering, make it a scientific endeavour. However, the expressiveness of the languages, each with its own flavour of spelling and grammar rules, and their open-endedness, with a myriad of ways to express the same thing, certainly make it an artistic endeavour as well.

Venn diagram, with science icon (chemical beaker) on the left, art icon (painting) on the right, and code in the intersection

... but enough pre-amble!

Copy-pasta

At the early stages, the most common trait in writing code is simply to copy what someone else has written. I call this the copy-pasta.

Spaghetti bolognese

Doing something for the first time, and stuck figuring out how to do it? Why not just Google the relevant key words, and chances are, you'll find that someone has already asked a very similar question on Stackoverflow, and since the question was so similar, the answer most likely is as well!

Stackoverflow --> Ctrl+C, Ctrl+V / Cmd+C, Cmd+V --> sublime text

A couple of well-worn key-stroke combinations later, you have copied the relevant snippet into your own code, and run it to test if that has indeed solved the problem.

Here's the thing: after a while you get very good at this skill of knowing exactly what to search for, and copying and pasting the relevant parts into your code. Sprinkle in a few modifications here and there to adapt the snippet to the rest of your code, a few trial-and-error runs, and you usually get the job done, and solve the problem. You can get very far as a coder, simply by mastering this skill.

"I don't believe that copy-pasta coders should be used anywhere, maybe if they are copy-eat-cook-pasta coders: You may read, but you may not paste until you understand what you've read! :-)"

  • Taco Kemna

Just doing this alone isn't enough though. You might be able to maintain an existing code base, fixing bugs and adding minor features here and there. That is totally fine, because that is precisely what most line-of-business software needs.

Code works in peaks and troughs

By default, software is in an almost perpetually broken state. If you are the end user of any software - this includes apps, websites, et cetera - and think you have it bad when software is buggy, spare a thought for the coders who wrote it: What you experience is way, way, better than what the coders deal with when coding it!

After furiously coding for several hours at a stretch, there is a brief moment where the software actually works as it should, and in that brief moment, we capture that precise state, tag it, build it, release it, and ship it or deploy it to customers. This doesn't last for very long - the code is broken again, because you started coding the next thing.

graph showing peaks and troughs, with arrows pointing to working and not working

The code sitting in your text editor, the code that you, as a coder, stare at on the screen, is in this perpetual cycle of peaks and troughs. 1% peaks, and 99% troughs. The truth is that as a coder, you spend less time actually writing code than you do stepping through code in a debugger, inspecting the data, and inspecting the output logs. All of this to figure out where something went wrong.

guy banging his head against the keyboard

We have all been there!

So actually getting things to work at all is an amazing feeling. When something has been broken for several hours, or sometimes even several days, and you have been plugging away at it furiously, trying all manner of things to get the code to do what you want it to do, and you finally crack the problem, and type that final line of code that fixes it, you hold your breath while you verify that it works, and unlike all the previous times... lo and behold, it works!

That feeling is amazing.

A very happy looking puppy

I liken it to the feeling that you get after putting down the heavy objects that you have been lifting at the gym. You feel lighter - almost weightless even - because it is suddenly less effort to stand up than immediately before; and at the same time you get this post-exercise high with this all-round feel good feeling in your muscles and in your head. Both of those wear off after a couple of minutes of course, and then you go in for your next set of lifting heavy objects.

guy doing deadlifts

Copy-pasting code is what gets you to that feeling of gratification quicker and more frequently. After a while though, the elation that you get when you actually make things work is no big deal any more. After all, software is supposed to just work, right? What's the big deal that you got it to work for just that tiny sliver of time, for just that fleeting moment?

Libraries, frameworks, and design patterns

At that point, the realisation hits you: you need to do something more than this. You need to get consistently good results for the effort that you put in. You start to think about your craft in a more "meta" way - what is the over-arching theme to the software, how can I make this code better, and how can I make myself a better coder?

This realisation culminates in a shift in approach. A shift away from ad-hoc solutions to ad-hoc problems; and a shift toward meta solutions/ generalised solutions to repeatable problems. Instead of copying or following individual snippets of code, you start copying or following patterns.

Frankenstein, caption "what your code becomes after too much copy-pasta"

Patterns in code manifest themselves in a number of different ways, most commonly in the forms of: software design patterns, software libraries, and software frameworks.

Remember that code is data and decisions about that data.

spreadsheets, electronics circuit symbols, more spreadsheets coming out at the other end

Well, by following patterns - be it by incorporating design patterns, or utilising a library or framework - you as the coder have made a conscious choice to have many of those decisions made for you. Let's take a look at a couple of examples:

Design pattern: Singleton.

By using a Singleton, you do not need to pass around an object instance everywhere that needs it; you simply need to refer to it by type.
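Here is a minimal sketch of my own of the Singleton pattern in Javascript:

var Logger = (function() {
  var instance;
  return {
    getInstance: function() {
      if (!instance) {
        // the first call creates the single shared instance
        instance = {
          log: function(message) {
            console.log('[app]', message);
          },
        };
      }
      return instance;
    },
  };
})();

// every caller gets the same instance - no need to pass it around
Logger.getInstance().log('hello');
console.log(Logger.getInstance() === Logger.getInstance()); // true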

Library/ framework: UnderscoreJs.

By using UnderscoreJs, you do not need to implement your own means with which to do various common functional programming tasks; you simply need to invoke the functions defined in this library.
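For instance, here is a quick sketch of a common functional task, assuming the underscore package has been installed via npm install underscore:

var _ = require('underscore');

// no need to hand-roll the filter/map plumbing yourself
var doubledEvens = _.chain([1, 2, 3, 4])
  .filter(function(n) { return n % 2 === 0; })
  .map(function(n) { return n * 2; })
  .value();

console.log(doubledEvens); // [ 4, 8 ]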

Library book shelf

These things just got elevated to best-idea-since-sliced-bread status, because they solve your code problems not just on an ad-hoc basis, but in a repeatable way, and often in ways that pre-empt problems, resulting in many fewer problems encountered to begin with.

Someone else has already solved these various common problems in a general manner, and for the most part in a robust manner to boot! All you have to do is to re-use the solutions to those common problems by adapting them to the specifics of your software, leaving you to focus on the rest. You are guided on how to structure your data, and guided on the decisions that you have to make on the data. This reduces your workload, leaving you to focus on a narrower part of your software, making you more effective as a developer.

Recycling logo

Restraint

There are so many design patterns out there, and there are so many libraries and frameworks too. A new one seems to crop up so often that you barely have enough time to learn and use one of them properly before its replacement comes along. It is super easy to get sucked into this cycle of permanently trying out the next new thing, of course, because of a couple of fundamental psychological reasons:

  • The grass is always greener on the other side, and
  • Curiosity

animal looking over the fence, which has grass on both sides

However, throwing more patterns and libraries at your code base is not necessarily always a good idea. The more design patterns used in your code, the more difficult it becomes to interpret. The more libraries used in your code, the larger the API surface area that you have to be across in order to understand how to work with it. The over-arching theme here is that your code gets too "meta": everything has been over-generalised or over-abstracted in order to fit into particular patterns' or libraries' requirements. Sometimes a combination of them is outright contradictory, but most of the time they simply increase the mental overhead, by being hard to grok when put together, and feeling "forced" when used together.

Cue for the next realisation to hit home.

அளவுக்கு மிஞ்சினால் அமிழ்தமும் நஞ்சு (In excess, even nectar is poison)

pot of honey (many), arrow, poison

While using good patterns and libraries in your code means that you get all the good stuff that comes with them, when used in excess they start to harm the project, as you will soon come to realise.

You soon start asking yourself the following questions, whenever considering whether to use a new library, or add a new design pattern:

  • How much does this increase the code's complexity?
  • How long will it take for a fresh pair of eyes to get up to speed on this?
  • If the person that wrote X is not around, will we still be able to figure out how to progress?

You'll subsequently become more restrained when writing code: by scaling back on doing new things merely because they work - now they must not only work, but also yield some additional benefit; by scaling back on doing things just because they are new and shiny and you feel the compulsion to try them out; and by understanding that code written in the past influences the way code is written now, and therefore has a ripple effect on the way future code is written as well.

Git history of branches in gitg, captions for each of the commit messages

Code itself may be transient, but its effect is semi-permanent. Many a time you will have come across a project that you need to work on, but get that feeling of dread - a feeling that you'd prefer to avoid touching this code if possible. Well, that feeling is usually because your restraint antennae have kicked in, and they've sensed that there's a history to this code. It's all gnarly, ugly, and unpleasant to work with, having grown into what it is right now because of excessive use of libraries and patterns in the past.

Gnarly old tree grown around some obstacle in its way

You become more restrained by becoming more selective. More picky, and less likely to OK the use of a new pattern or library. You would have been burned already, several times over in the past, by the after-effects of letting too many of these things creep into the code. You'd have had to deal with code whose complexity has gone out of whack, code that even a competent person looking at it for the first time would not be able to pick up and dive into right away, code that would be hard to modify in any way in the absence of its original author. Scratch that - code that befuddles its original author too, after some time has passed!

Filter funnel with all the libs and pattern names at the top, and just a few of them getting through out at the bottom

The restraint derives from the irony that all of these patterns and libraries were added with the initial intent of making the data simpler and reducing the complexity of the decisions about that data. However, through overuse, the data actually gets more complex, and the decisions to be made about the data actually become harder to understand.

graph plotting complexity level against number of patterns+libs, like a quadratic curve

Classifying coders

These traits and behaviours of coders heavily influence the type of software that they write. You've got the copy-pasta coder who solves problems ad-hoc by copy-pasting snippets of code from the Internet; then you've got the easily-excitable coder who attempts to use as many design patterns + libraries + frameworks as possible; and finally you have got the restrained coder who is picky and says no to adding most new things.

Three steps, each with a stick man sitting on top of it at a desk, each labelled 'copy-pasta', 'easily-excitable', and 'restrained'

There's a natural progression as well: a total newbie quite often starts off as a copy-pasta, then becomes an easily-excitable one, and finally becomes a restrained one; in chronological order.

How about teams?

Most coders have a dominant trait amongst these three, the dominant one usually being the one acquired most recently.

  • One cannot rely on copy-paste too much: General solutions in terms of abstractions and code re-use are necessary
  • One cannot be too easily excitable about new design patterns, libraries, and frameworks: Too much of that makes the code too complex to reason about
  • One cannot be too restrained: Too much restraint stifles creativity and innovation

Professional software development is rarely ever done as a solo pursuit - usually we are organised into teams (or scrums, or whatever label teams are given). A good team needs people with all of the above traits, and they need to keep each other in check in order to balance them.

This balance is sufficient to create a team that works well in writing good code. Better yet, we could all strive to be balanced within ourselves, and wear the different hats at different times as appropriate. After all, if each coder within a team was themselves already balanced, then the team would be balanced by default - and no-one needs to feel awkward about telling another person what they should change about the way they code.

Door hinge + WD-40 = Smiley Face

If the team is comprised of coders with very, very strong bents toward any one of these traits, the teamwork is going to be akin to rusty door hinges: the door can still open and shut, but it is going to make a lot of noise and irritate anyone nearby. If each coder in that team can balance the traits within themselves, that's like spraying some WD-40 on the hinge.

It's a wrap

Old school silent black and white film with "Fin!"

tl;dr=

The act of writing code is both an art and a science, and at a fundamental level it boils down to data and making decisions about that data. This can be infinitely complex, because of the sheer number of ways to do each thing, and the permutations and combinations thereof.

Coders are human beings, and human beings cannot handle infinite complexity. Thus they seek ways to still be effective and productive at writing code by managing this infinite complexity. This manifests itself in the form of three common traits & behaviours: (1) the copy-pasta, (2) the easily-excitable, and (3) the restrained; and most coders pick up these traits in that order.

Code is often written in teams, and those teams are most effective when they are comprised of coders who exhibit a balanced combination of these traits - wearing the different hats at the appropriate times. Better yet, individual coders can balance these traits within themselves; teams comprised of such coders are the most effective of all.

Happy coding!

Haxe for Javascripters

tl;dr= I'm giving a talk on Haxe at a Javascript conference, here are the slides

Haxe for Javascripters

Do you use a compile-to-Javascript language? Have you gotten stuck when running Javascript on different devices? Why am I talking about not-Javascript at a Javascript conference?

Haxe is a language that can compile to Javascript, in addition to 8 other languages, and even compiles natively for some targets. It is heavily inspired by ECMAScript, so it can be picked up in less than a day if you already know Javascript.

(as an aside, if you like Typescript, you'll love Haxe)

It is a mature language to boot - it has been around for a decade now.

... The only thing is, hardly anyone has heard of it, and I think that's quite a shame. My goal for the talk: to get this virtually unheard-of language the attention & headspace I think it deserves.

In Haxe for Javascripters, I'll be covering:

ECMAScript

The history of ECMAScript, the set of standards implemented by Javascript, and how this relates to Haxe.

slide

Community

What the Haxe community is like to be a part of.

slide

Haxe to JS

Speaking at a Javascript conference about ... not Javascript? ... Well, not quite - I'm not that crazy! In fact Haxe has a Javascript target, and it will be the focus of the talk!

slide

Pros/ cons

Highlight some of the pros & cons of Haxe, and compare them to other compile-to-Javascript kits. (I've stolen some material here from Andy Li - thanks Andy!)

slide

Demo

Cap it all off with a demonstration of the Haxe to Javascript target. Time permitting, perhaps even the other targets that Haxe compiles to.

slide


Many thanks to Chris Decoster for reviews and feedback on this presentation.

CampJs tickets are going quick, so get your tickets while they're still hot!

PostgreSql now both a relational and document database

Postgres 9.2 added support for a new JSON data type, and 9.4 followed with JSONB; the latter is very similar to the BSON format used in MongoDb.

However, I was very disappointed that, while it was possible to Create/Read/Update/Delete entire JSON objects, it was not possible to Create/Update/Delete individual keys within a JSON object - it was all or nothing. I was not too happy about this, even pestering PostgreSql's developers intermittently about it. Many others wanted to be able to do this as well.

postgres mailing list on JSONB create/update/delete

The half-baked solution at the time was to use HSTORE instead of JSON/ JSONB. HSTORE is essentially a hashmap which, unlike JSON, only supports a single level of hierarchy, so using it was rather limiting.

Even with JSONB in 9.4, there was no progress on per-key operations, but it looks like Postgres 9.5, still in alpha, has added support for per-key Create/Update/Delete operations.
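Here is a rough sketch of what a per-key update could look like from NodeJs, using the pg package. (The people table and its profile JSONB column are hypothetical, purely for illustration.)

var pg = require('pg');

var client = new pg.Client('postgres://localhost/exampledb');
client.connect(function(err) {
  if (!!err) {
    return console.error('Connection error:', err);
  }
  // jsonb_set updates a single key within the JSONB document,
  // leaving all of the other keys untouched
  client.query(
    "UPDATE people SET profile = jsonb_set(profile, '{address,city}', '\"Sydney\"') WHERE id = $1",
    [42],
    function(err, result) {
      if (!!err) {
        console.error('Query error:', err);
      }
      else {
        console.log('Rows updated:', result.rowCount);
      }
      client.end();
    });
});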

The implications of this are profound, as I believe that this makes PostgreSql the first database to properly support both relational (table based) and document database paradigms.

Ever needed to do a join on two documents in MongoDb? You probably had to work around this by doing the join logic client side, and sending more than one database query. Ever needed to store unstructured data in MySql? You probably had to use a text field or a blob, and write custom scripts to serialise/ deserialise.

Well, soon we should be able to do both at the same time easily!