
Deploying an ember-cli app to Heroku - Demo Apps Only!

Deploying to Heroku is easy... if you can figure out all of the hidden gotchas!

No dev dependencies

Heroku runs npm install --production when it deploys your app, which means that devDependencies are not installed; nor can you depend on any globally installed npm packages.

Since ember-cli installs itself locally by default, the only global package you would otherwise need is bower, so add it as a regular dependency instead:

npm install --save bower

Except ember-cli

... which must be in both dependencies and devDependencies.

This is because the ember command inspects the package.json of the current project, looking for ember-cli, to determine whether that project is indeed an ember-cli app. If it does not find it there, it will display an error saying that you need to run the command from within a folder containing an ember-cli app.
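
As a rough sketch, the relevant parts of package.json might then look something like this (the version numbers here are purely illustrative):

"dependencies": {
    "bower": "^1.3.0",
    "ember-cli": "0.0.40"
},
"devDependencies": {
    "ember-cli": "0.0.40"
}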

If this is too much trouble for what it is worth, simply issue this command instead:

heroku config:set NODE_ENV=staging

... so that Heroku will run npm install instead of npm install --production when it spins up the dyno.

Server on web Proc only

The process that runs the server must be called web. Do not call it main or anything else. If you want to access a server running on a Heroku dyno from port 80 externally, that server must be declared under a process type named web in your Procfile. I wish Heroku's documentation actually stated this explicitly.

web: npm run start

Use scripts in package.json

NodeJs packages may define an optional scripts section in their package.json file. For ember-cli apps, use scripts.postinstall to do a bower install, and use scripts.start to run ember serve.

Use the PORT environment variable

When running ember serve, do not use a default port number. Whenever Heroku spins up a dyno (which happens at least once per deploy), it will assign a new port number (among other things), and this is the port that Heroku will forward to from port 80.

"scripts": {
    "start": "./node_modules/ember-cli/bin/ember serve --environment=production --port=${PORT}",
    "build": "./node_modules/ember-cli/bin/ember build",
    "test": "./node_modules/ember-cli/bin/ember test",
    "postinstall": "./node_modules/bower/bin/bower install"
},

Note that npm install is not necessary in scripts.postinstall - Heroku does that automatically for all NodeJs projects.

A Word of Caution

You should not use ember serve to deploy production apps, as there are potential security and performance problems that this entails. But of course, sometimes you simply want to deploy a demo app, and in these cases deploying an ember-cli app like this works quite well.

New to Heroku? - Quick Run-down

Heroku is a cloud hosting service which allows you to spin instances up and down on the fly. You can operate it entirely via the command line by installing the Heroku toolbelt, and deployment happens by pushing to a git remote hosted on Heroku.

If deploying to Heroku for the first time, you will need to set up the prerequisites on your computer:

wget -qO- https://toolbelt.heroku.com/install-ubuntu.sh | sh
# for other OS'es: https://toolbelt.heroku.com/
ssh-keygen # save to id_rsa_heroku
echo "Host heroku.com" >> ~/.ssh/config
echo " IdentityFile ~/.ssh/id_rsa_heroku" >> ~/.ssh/config
chmod 600 ~/.ssh/config
heroku keys:add

To get a NodeJs app up and running on Heroku, first create the app, and when ready for deployment:

git init # if you have not done so already
git add . && git commit -a # commit whatever should be deployed
heroku create name-of-your-app
git push heroku master

Heroku's git repository has a hook that runs upon each push, which will attempt to (re)install and (re)deploy your app, and the push will only succeed if the installation and deployment succeed.

Migrating from Tumblr and Wordpress to Docpad - Extract and Transform

In the previous post, I made the case for static site generation. Let us take a look at how to extract data from tumblr and wordpress blogs, and transform it for docpad, a static site generator.

Get Your Node On

mkdir blog-extract
cd blog-extract
npm init #accept all the defaults, it is not very important
npm install --save request mkdirp moment tumblr.js
touch index.js

Edit index.js, and add the following:

var fs = require('fs');
var url = require('url');
var path  = require('path');
var mkdirp = require('mkdirp');
var moment = require('moment');
var request = require('request');
var tumblr = require('tumblr.js');

Now we have a shiny new NodeJs project ready to go, with batteries (dependencies) included.

Wordpress Posts API

Wordpress exposes a JSON API that allows you to extract your posts. There is almost no setup required, as no form of authentication is needed.

In order to get our posts, we can follow these instructions.
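
Before automating anything, it can help to see what a single request against this API looks like. Here is a minimal sketch using the request module we installed earlier - the site name is a placeholder, and the fields logged are the ones we will make use of below:

var request = require('request');

var wordpressSite = 'yourblogname.wordpress.com'; // replace with your own
var reqUrl = 'https://public-api.wordpress.com/rest/v1/sites/'+wordpressSite+'/posts/?number=1';
request(reqUrl, function(err, resp, body) {
    if (err || resp.statusCode !== 200) {
        console.log(err);
        return;
    }
    body = JSON.parse(body);
    // total_posts is the total number of posts on the blog,
    // and posts is the array containing the requested page of posts
    console.log(body.total_posts, body.posts[0].title, body.posts[0].URL);
});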

Extract and transform

With the API documentation in hand, we can now write some code to automate the process - we certainly do not want to be issuing multiple wget or curl calls, and then copying the results into new files by hand. I might do that for a couple of posts, but since I am dealing with about 80 posts here, that would be far too time consuming an endeavour!

var pos, step, total;

var wordpressSite = 'yourblogname.wordpress.com'; // replace with your own
pos = 0;
step = 20;
total = 0;
do {
    /*
     * Here we perform the paginated queries, and be sure to set total
     * (from the API response) so that the loop runs more than once.
     * The looping is necessary because you cannot download all posts at once,
     * so we must paginate the requests.
     */
    pos += step; // advance to the next page of posts
} while (pos < total);

That is the basic run loop. Within the run loop, we perform the requests to the wordpress API server:

    var reqUrl = 'https://public-api.wordpress.com/rest/v1/sites/'+wordpressSite+'/posts/?number='+step+'&offset='+pos;
    request(reqUrl, function(err, resp, body) {
        if (err || resp.statusCode !== 200) {
            console.log(err);
            return;
        }
        body = JSON.parse(body);
        if (body.total_posts > total) {
            //set total count, should only happen the first time
            total = body.total_posts;
        }
        //parse each of the posts in the response
        body.posts.forEach(function(post) {
            //transform the post into the format required by docpad
            //and write to file
        });
    });

We can take a look at what the API response for each blog post looks like in these instructions.

The format that we need to translate to consists of two important parts:

  • Directory and file name
  • Metadata

The third part is the post's contents, but that can be copied verbatim without any transformation.

For a default docpad blog configuration, this would usually be: src/documents/posts/slug-for-this-post.html

We can check this by looking at docpad.coffee, and inspecting docpadConfig.collections.posts:

`@getCollection('documents').findAllLive({relativeDirPath: 'posts'}, [date: -1])`

We are, however, not going to put our extracted files in the posts folder. Instead, we will create a separate wordpressposts folder for all the Wordpress posts, and configure docpad to look there as well. This configuration will be covered at the end, so if you want to test things out right away, skip to the bottom of the post.

I am using the docpad-plugin-dateurls plugin, so that the URL path of each post will match the default Wordpress URL paths. Here, we want the directory and file name to follow this pattern: src/documents/wordpressposts/YYYY-MM-DD-slug-for-post.html

        var postUrl = url.parse(post.URL);
        var pathname = postUrl.pathname;
        if (pathname.charAt(pathname.length - 1) === '/') {
            pathname = pathname.slice(0, -1);
        }
        pathname = pathname.slice(1).replace( /\//g , '-');
        var filename = path.normalize('src/documents/wordpressposts/'+pathname+'.html');

For the metadata, we use moment to format the date and time:

        var title = post.title && post.title.replace(/"/g, '\\"');
        var date = moment(post.date).format('YYYY-MM-DD hh:mm');
        var tags = Object.keys(post.tags).join(', ');
        var contents = '---\n'+
            'layout: post\n'+
            'comments: true\n'+
            'title: '+title+'\n'+
            'date: '+date+'\n'+
            'original-url: '+post.URL+'\n'+
            'dateurls-override: '+postUrl.pathname+'\n'+
            'tags: '+tags+'\n'+
            '---\n\n'+post.content;

Finally, write the output to file:

        var dirname = path.dirname(filename);
        mkdirp.sync(dirname); // ensure the directory exists before writing the file
        fs.writeFile(path.normalize(filename), contents, function(err) {
            if (err) {
                console.log('Error', filename, err);
                return;
            }
            console.log('Written', filename);
        });
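
Putting it all together, one of the generated files - say src/documents/wordpressposts/2014-06-01-example-post.html - might look something like the following. All of the values here are made up purely for illustration:

---
layout: post
comments: true
title: Example post
date: 2014-06-01 10:30
original-url: http://yourblogname.wordpress.com/2014/06/01/example-post/
dateurls-override: /2014/06/01/example-post/
tags: example, blogging
---

<p>The post content, copied verbatim from the API response.</p>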

Tumblr Posts API

Tumblr is a little more involved than Wordpress: in order to query any of its API, you will need to have a tumblr account (which you probably already have, since you are extracting your posts from it), and register a tumblr app to obtain an API key. Copy your "OAuth Consumer Key", and you are good to go.

Once that is done, we simply need to follow this section in the documentation. The upside of this slightly higher complexity is that tumblr provides a NodeJs client library, which makes it easier to call the tumblr API and avoids having to deal with making raw HTTP requests, like we did for the Wordpress API.

Extract and transform

var tumblrSite = 'bguiz.tumblr.com'; // replace with your own
var client = tumblr.createClient({
    consumer_key: 'sfsdfsdfsdfjkjksjdfhkjkjhkjshdfkjhkjhskdjfhkjhkjhd' //replace with your own
});
pos = 0;
step = 20;
total = 0;
do {
    /*
     * Perform the paginated requests, and be sure to set total
     * (from the API response) so that the loop runs more than once
     */
    pos += step; // advance to the next page of posts
} while (pos < total);

Performing the requests:

    client.posts(tumblrSite, {
        offset: pos,
        limit: step,
    }, function(err, data) {
        if (err || ! data) {
            console.log(err, data);
            return;
        }
        if (data.total_posts > total) {
            //set total count, should only happen the first time
            total = data.total_posts;
        }
        data.posts.forEach(function(post) {
            //transform the post into the format required by docpad
            //and write to file
        });
    });

Here, we want the directory and file name to follow this pattern: src/documents/tumblrposts/YYYY-MM-DD-slug-for-post.html

        var ts = moment(post.timestamp*1000);
        var postUrl = url.parse(post.post_url);
        var dateStr = ts.format('YYYY-MM-DD hh:mm');
        var filename = 'src/documents/tumblrposts/'+ts.format('YYYY-MM-DD')+
            '-'+postUrl.pathname.split('/').slice(-1)[0]+'.html';

For the metadata, we want to set the dateurls-override property. Note that this feature is not yet available in docpad-plugin-dateurls, and you will need my patch for this to work. To get this, modify package.json in your root folder, replacing the version number of the plugin with an explicit git URI, like so:

"docpad-plugin-dateurls": "git+ssh://git@github.com:bguiz/docpad-plugin-dateurls.git#exclude-option",

This tells npm to install a NodeJs package, not from the default npm registry, but instead by cloning a git repository. Unfortunately, this also means docpad will not be able to run the plugin yet, as npm installing from a git URL does not run prepublish. To work around this, for now, you need to do the following:

npm install
docpad run # fails "Error: Cannot find module 'node_modules/docpad-plugin-dateurls/out/dateurls.plugin.js'"
cd node_modules/docpad-plugin-dateurls
cake compile && cake install
ls out #you should see dateurls.plugin.js
cd ../..
docpad run # success!

For tumblr posts, the default URL path follows the format /post/12345678/slug-for-this-post, and if we migrate posts from the old blog to the new blog, any links, especially external ones, to the site will be broken. That will make for a really annoying experience for those visiting your sites, so it is best to preserve URLs where possible; hence the need to override the default URLs.

        var title = post.title && post.title.replace(/"/g, '\\"');
        var tags = post.tags.join(', ');
        var contents = '---\n'+
            'layout: post\n'+
            'comments: true\n'+
            'title: '+title+'\n'+
            'date: '+dateStr+'\n'+
            'original-url: '+post.post_url+'\n'+
            'dateurls-override: '+postUrl.pathname+'\n'+
            'tags: '+tags+'\n'+
            '---\n\n'+post.body;

Finally, write the output to file:

        var dirname = path.dirname(filename);
        mkdirp.sync(dirname); // ensure the directory exists before writing the file
        fs.writeFile(path.normalize(filename), contents, function(err) {
            if (err) {
                console.log('Error', filename, err);
                return;
            }
            console.log('Written', filename);
        });

Docpad Configuration Changes

We edit docpad.coffee, in the root directory of the docpad project, and modify docpadConfig.collections.posts to look like this instead:

@getCollection('documents').findAllLive({relativeDirPath: {'$in' : ['docpadposts', 'tumblrposts', 'wordpressposts']}}, [date: -1])

All the Wordpress posts should be in src/documents/wordpressposts, and tumblr posts in src/documents/tumblrposts. When writing any new docpad posts, save them in src/documents/docpadposts.

If you have any docpadConfig.environments configured, be sure to modify each of their collections.posts accordingly too.

That is all there is to do for now. Execute docpad run, and visit the newly extracted blog in a browser!

Where to from here?

One task in blog extraction that we have not covered here is that of static assets, most notably images, that may have been hosted on your previous blogs. If you have hosted these on CDNs, they will continue to work. Otherwise, you will need to extract them too.

Another extraction task that we have not covered is links between posts. Since we have preserved the path for each post's URL here, this should not pose a problem.

The solution to both of these involves parsing the URLs in each post's content, be they href attributes in <a> tags or src attributes in <img> tags, and downloading and saving the referenced assets too.

Migrating from Tumblr and Wordpress to Docpad - Static Site Generation

I currently write my blog using tumblr, and previously I blogged using wordpress. While both of these are great platforms, they share common pitfalls when it comes to giving you control over your writing.

I wanted to have a copy of all the assets that comprise my blog, in its entirety, on my hard disk, and to be able to modify and publish them as I pleased. I also wanted to be able to include fancier things in my pages - like embedding a Github gist, creating my own d3 visualisation, or, why not take it to an extreme, running an AngularJs app within one of my posts - and I wanted to be able to do all of these things without having to log into some website hosted in a far away country, and wait for all those bytes to fly across several oceans and back each time.

Flexibility and control - that is key.

Enter Static Site Generators

For a blog, the contents are almost static. The server only needs to send a different response for a page when that page has been modified by the author. The exception to this is comments, but with the advent of disqus, that is no longer even a consideration.

A content management system, including both tumblr and wordpress, builds each page upon demand, which can be an expensive operation, as it involves database queries, assembly of templates, et cetera. Quite often, when a CMS-driven site receives a lot of concurrent visitors, its response times start to lag noticeably. To work around this, it has become common practice to cache the results of each dynamically generated page, using tools like memcached.

Static site generation is all about taking caching to the next level. The author of the site knows exactly when the previous cache needs to be invalidated - when they write a new post or update an existing one. Why not, at that point in time, generate the cache contents, and upload them directly to the server? Well, that is exactly what static site generators do; the static files are the cache.

What about collaboration?

One of the big advantages of a CMS is that it enables collaboration. If everyone just logs into the same website, be it wordpress.com or tumblr.com, and makes their edits on the site, then there is only one copy of the site, and therefore it is easy to manage collaboration on its contents.

Indeed, that is a very direct and simple solution that addresses collaboration. We do, however, have a more sophisticated solution that is already readily available: distributed version control systems. Tools such as git and mercurial have solved the distributed collaboration problem in a rather elegant way. All collaborators get to keep a copy of the site that they are contributing to on their own computers, and thus get the benefits that come along with that. When they are done writing a post, they simply have to push their latest contributions to the master copy. There are built-in mechanisms to resolve any conflicts, for example, if two collaborators edit the same file.

Docpad

After reviewing the top few in this humungous list, I have decided that Docpad suits my needs the best, and I should be able to hit the ground running. I will give it a go, and the best part is, if I do not like it, my data is not stuck on some server somewhere - it will all be on my computer, and easily moved to a different static site generator.

In the next post, I will be tackling that very problem: With hosted CMSs, like tumblr and wordpress, getting your data out can be a little tricky; as can be transforming it such that it can be used in a static site generator.

File Download with HTTP Request Header

In a website which uses session-based authentication, when a file needs to be downloaded, and that file should only be accessible by the currently logged in user, making that work client side in a web page is extremely easy. That is because the session credentials are typically stored inside cookies, and the browser automatically adds the cookies to every HTTP request's headers for that domain.

When you create an anchor tag, set its URL to point to the route that responds with the file to be downloaded, and that anchor tag is clicked, the file gets downloaded, as the authentication requirement for that route is satisfied by the cookie that gets automatically added to the HTTP request headers by the browser.

However, it is not quite so simple if the website uses token-based authentication. This is because browsers do not have any mechanism by which they can be told to add the token to every HTTP request's headers across the board.

Let us say that you do the same thing as before: create an anchor tag, and set its URL to point to the route that responds with the file to be downloaded. The only difference is that this time, that route requires a token in the header, and there are no cookies involved. Now when you click on this anchor tag, the authentication requirement is not met, and the file does not get downloaded.

My instinctive reaction to this was to find out a way to add the token to the header of the HTTP GET request that gets sent upon clicking the anchor link. It turns out, however, that there is no way to do this; there is simply no way to intercept that request and modify it before it gets sent.

So I asked this question on Stackoverflow.

The only way to add a header to any HTTP request is using AJAX - by creating an XMLHttpRequest. However, the catch is that you simply get the data in a JavaScript variable in the callback function when the AJAX response arrives. It does not trigger a file download, like clicking an anchor tag would.

How do we get around this? Turns out that there are a couple of rather creative solutions to the problem.

When the anchor tag is clicked, intercept the event, and initiate an AJAX request, being sure to add the appropriate token in the request header:

    var id = 123;
    var req = ic.ajax.raw({
        type: 'GET',
        url: '/api/downloads/'+id,
        beforeSend: function (request) {
            request.setRequestHeader('token', 'token for '+id);
        },
        processData: false
    });

When the response is returned, we use a temporary anchor tag when handling it:

    req.then(
        function resolve(result) {
            var str = result.response;

            var anchor = $('.vcard-hyperlink');

            /* transform the response into a file */

        }.bind(this),
        function reject(err) {
            console.log(err);
        }
    );

Depending on the size of the response, and whether the browser is modern enough to support HTML5 File APIs, we either use base64 encoding or temporary files.

Using HTML5 temporary files:

            var maxSizeForBase64 = 1048576; //1024 * 1024
            var windowUrl = window.URL || window.webkitURL;
            if (str.length > maxSizeForBase64 && typeof windowUrl.createObjectURL === 'function') {
                var blob = new Blob([result.response], { type: 'text/bin' });
                var url = windowUrl.createObjectURL(blob);
                anchor.prop('href', url);
                anchor.prop('download', id+'.bin');
                anchor.get(0).click();
                windowUrl.revokeObjectURL(url);
            }

Using base64 encoding:

            else {
                //use base64 encoding when less than set limit or file API is not available
                anchor.attr({
                    href: 'data:text/plain;base64,'+FormatUtils.utf8toBase64(result.response),
                    download: id+'.bin',
                });
                anchor.get(0).click();
            }

In both cases we set the anchor tag to a data URI or file URI, and then trigger a click event on it.
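
Note that FormatUtils.utf8toBase64 is not part of any library used above; it stands in for a small UTF-8 to base64 helper. A minimal sketch of such a helper, using the well-known encodeURIComponent/btoa idiom, might look like this:

function utf8toBase64(str) {
    // encodeURIComponent escapes the string as UTF-8 percent-encoded bytes,
    // unescape turns those escapes back into raw byte characters,
    // and btoa base64-encodes the result
    return window.btoa(unescape(encodeURIComponent(str)));
}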

The caveat, however, is that both of these approaches are going to be rather inefficient when downloading and processing large files - more so for the base64 encoding method than the HTML5 File API method.

One way of solving this problem is to modify the server such that the route that requires the token in the HTTP header does not respond with the file contents, but instead with the URL of another route, which does not require anything in the header at all, but expires very quickly. It is this second route which actually returns the file contents.
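
That approach is beyond the scope of this post, but here is a rough sketch of the idea, using Express purely for illustration - the requireToken middleware, the ticket store, and the file locations are all hypothetical:

var express = require('express');
var app = express();

// hypothetical in-memory store of short-lived download tickets
var tickets = {};

function requireToken(req, res, next) {
    // hypothetical check of the token header discussed above
    if (req.headers.token) { return next(); }
    res.status(401).end();
}

// token-protected route: responds with a short-lived URL instead of the file itself
app.get('/api/downloads/:id', requireToken, function(req, res) {
    var ticket = Math.random().toString(36).slice(2); // illustrative only, not secure
    tickets[ticket] = { id: req.params.id, expires: Date.now() + 30 * 1000 };
    res.json({ url: '/downloads/' + ticket });
});

// unauthenticated route: returns the file contents, but only while the ticket is valid
app.get('/downloads/:ticket', function(req, res) {
    var entry = tickets[req.params.ticket];
    delete tickets[req.params.ticket];
    if (!entry || entry.expires < Date.now()) {
        return res.status(404).end();
    }
    res.download('/path/to/files/' + entry.id + '.bin'); // hypothetical file location
});

app.listen(3000);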

In my case, however, I only needed to download rather small files (mostly under 1KB), and I wanted to find out whether there was a way to solve this problem client-side, so this approach worked very well. For large files, however, I would recommend considering a server-side solution.

How to Write a BroccoliJs Plugin

Recently, I released broccoli-sprite. I was just a week into using BroccoliJs for the first time, and writing a plugin for a build system that I had barely used was understandably tricky.

While writing it, I googled quite a bit for how to write a BroccoliJs plugin, but there really has not been much written about it. I would like to make it easier for others doing the same thing, so here is a quick overview of the process of creating a BroccoliJs plugin.

Basics

What is BroccoliJs? Think GruntJs, but different. Different how? Well, in a number of ways. Its creator, Jo Liss, explains its philosophy in her post on its release.

tl;dr: Plugins can chain their output to one another, and the built-in watch only rebuilds what has changed, rather than the whole lot.

There is one more thing to it: If you are building an app using ember-cli, you will need to use BroccoliJs.

Sold! Now Time to Write a BroccoliJs Plugin

The first thing to take a look at is the plugin API specification. It looks very straightforward: there are just two things that you need to implement, tree.read() and tree.cleanup(). The former, however, does not really do much that is useful, on its own at least.
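
As a rough illustration of that API (and nothing more), here is a sketch of a do-nothing plugin that simply passes its input tree through unchanged:

var BroccoliPassthrough = function BroccoliPassthrough(inTree) {
  this.inTree = inTree;
};
BroccoliPassthrough.prototype.read = function(readTree) {
  // readTree(tree) returns a promise for the directory containing the input tree's output;
  // returning that promise unchanged makes this plugin a no-op
  return readTree(this.inTree);
};
BroccoliPassthrough.prototype.cleanup = function() {
  // nothing to clean up, since no temporary directories were created
};
module.exports = BroccoliPassthrough;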

Getting started

All BroccoliJs plugins are NodeJs modules that should be installed inside a project:

cd my-project/
npm install --save-dev broccoli-my-plugin

… and thus the first step is to create an npm package:

mkdir broccoli-my-plugin #replace `my-plugin` with the name you would like
cd broccoli-my-plugin/
#if you plan to use version control (which is a good idea), do it now, e.g.
#git init && git flow init
npm init #this creates `package.json`
#be sure to specify one of the keywords as `broccoli-plugin`

Now you will need to edit index.js, which exports your plugin:

var BroccoliMyPlugin = function BroccoliMyPlugin() {};
module.exports = BroccoliMyPlugin;

Extending an Existing BroccoliJs Plugin

BroccoliJs has several plugins that are designed to be extended. The one that we will look at here is broccoli-writer.

Install it:

npm install broccoli-writer

Edit index.js:

var brocWriter = require('broccoli-writer');

var BroccoliMyPlugin = function BroccoliMyPlugin() {
  if (!(this instanceof BroccoliMyPlugin)) {
    return new BroccoliMyPlugin();
  }
};
BroccoliMyPlugin.prototype = Object.create(brocWriter.prototype);
BroccoliMyPlugin.prototype.constructor = BroccoliMyPlugin;
BroccoliMyPlugin.prototype.description = 'my-plugin';
module.exports = BroccoliMyPlugin;

Here we have simply extended the function exported by broccoli-writer using prototypical inheritance. At the moment it does not do anything at all, and we will add that next.

Adding functionality

Firstly, we should make the plugin able to accept some input parameters. All BroccoliJs plugins must accept an input tree as their first argument. Any subsequent parameters are completely up to you as the plugin developer. A common pattern, however, seems to be to accept just one additional parameter, an options hash, which is what we will do here.

var brocWriter = require('broccoli-writer');

var BroccoliMyPlugin = function BroccoliMyPlugin(inTree, options) {
  if (!(this instanceof BroccoliMyPlugin)) {
    return new BroccoliMyPlugin(inTree, options);
  }
  this.inTree = inTree;
  this.options = options || {};
};
BroccoliMyPlugin.prototype = Object.create(brocWriter.prototype);
BroccoliMyPlugin.prototype.constructor = BroccoliMyPlugin;
BroccoliMyPlugin.prototype.description = 'my-plugin';
module.exports = BroccoliMyPlugin;

We add the inTree and options parameters to the constructor function, and then save them in the instance. If you wish to specify default options, or other instance variables, this is where you would parse and set them.

Next, we can implement the main functionality - the part where we specify the thing that this plugin does. Since this plugin extends the broccoli-writer plugin, we do this by specifying a write function:

BroccoliMyPlugin.prototype.write = function(readTree, destDir) {
  var self = this;
  return readTree(this.inTree).then(function (srcDir) {
    /* use srcDir and information from self.options to figure out which files to read from */
    /* use destDir and information from self.options to figure out which files to write to */
    /* synchronously read input files, do some processing, and write output files */
  });
};

readTree is passed in as the first argument to the write function; it is a function that returns a promise, which you should return. Call then() on the promise, and do the processing in the callback function. Here you do whatever it is the plugin needs to do, but you have to do it synchronously - no callbacks allowed.

Asynchronous Plugins

Most of the time, however, we want to do things asynchronously - after all, that is the NodeJs way! See Mixu's article on control flow in NodeJs for an excellent introduction to asynchronous code in NodeJs. We need to get a little more advanced than this, however, and use promises instead of callbacks. Not to worry though, promises are actually much more straightforward to use than callbacks! In fact, we have already used the one returned by the readTree function previously.

We shall use promises implemented in the RSVP library, as that appears to be the most popular choice amongst Broccoli plugins, although you are free to use any other promise library.

Install RSVP:

npm install --save rsvp

Include RSVP:

var rsvp = require('rsvp');

Modify the readTree callback to create a promise and return it:

return readTree(this.inTree).then(function (srcDir) {
  /* use srcDir and information from self.options to figure out which files to read from */
  /* use destDir and information from self.options to figure out which files to write to */
  var promise = new rsvp.Promise(function(resolvePromise, rejectPromise) {
    /* asynchronously read input files, do some processing, and write output files,
       for example, here we have `someAsyncFunc` that does this */
    someAsyncFunc(function(err, asyncData) {
      if (err) {
        rejectPromise(err);
      }
      else {
        resolvePromise(asyncData);
      }
    });
  });
  return promise;
});

Here, since we return a promise, BroccoliJs knows to wait until it is either resolved or rejected. The more astute among you will notice that we actually have a promise within a promise here, as readTree itself returns a promise. We could possibly refactor this to chain the promises instead of nesting them; one way to do that is sketched below.
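
Assuming someAsyncFunc follows the usual NodeJs error-first callback convention, we can wrap it with rsvp.denodeify, which turns a callback-style function into one that returns a promise:

// wrap the callback-style function into one that returns a promise
var someAsyncPromise = rsvp.denodeify(someAsyncFunc);

BroccoliMyPlugin.prototype.write = function(readTree, destDir) {
  var self = this;
  return readTree(this.inTree).then(function (srcDir) {
    /* use srcDir, destDir and self.options as before */
    // returning this promise from the callback chains it onto the readTree promise
    return someAsyncPromise();
  });
};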

Fin

Now we have a functional BroccoliJs plugin, and it is ready to be published:

npm publish

… and now anyone can npm install it!

Going further

Besides broccoli-writer, there is also broccoli-filter, and broccoli-caching-writer, which I have not covered here.

Depending on what your plugin does, you might want to extend these instead. One great way to learn more about writing BroccoliJs plugins is to search for existing ones, and examine the source code of each one. Most of them are fairly simple, containing only a single index.js file, which means that you will likely find what you are looking for rather quickly. In fact, that is precisely what I did to get up to speed when writing broccoli-sprite.


Good luck with yours!