ÆFLASH

Make and Browserify

I've written a lot about front-end build tools over the past year, and I have one more build-related topic to write about before I go back to talking about functional programming and javascript: Make.

Make is over 30 years old and a standard tool on unix-ey systems (e.g. Linux and OSX). You probably already have it installed on your machine. While it is ubiquitous for C builds, it is flexible enough that you can use it to automate just about anything build-related.

I have been a strong proponent of Grunt for the past few years, but I feel like it is too complicated for what it does. I had found myself fighting with it a bit, especially when I needed to define a complicated sequence of processing steps with intermediate artifacts. It also made me uneasy to see Gruntfiles that were hundreds of lines long, or Grunt configuration folders that contained dozens of small javascript files for large projects.

Meanwhile, I was still using Make. It had become standard at my company to use Makefiles as a sort of "command rolodex" listing common operations to run on a project, such as make setup to run npm install and any other dependency management, make dev to start up Grunt or any other watchers, and make test or make ci to run unit or integration tests. But this was only a shallow use of Make -- I wasn't taking advantage of Make's killer dependency management features. I eventually came to realize (with the help of others) that Make is actually better than Grunt in many regards.
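
For illustration, a minimal sketch of that kind of "command rolodex" Makefile (the target names and commands are assumptions; adapt them to your project):

setup:
  npm install

dev:
  grunt dev

test:
  npm test

ci: test

.PHONY: setup dev test ci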

If you've ever done any C or C++ programming in linux, you've probably seen a makefile full of rules like this:

foo.o: foo.c foo.h
  gcc foo.c -o foo.o

foo.o depends on foo.c and foo.h, so whenever either of those is newer than foo.o, run the gcc command to build it. If foo.o is newer, do nothing. Seems pretty basic on the surface, but within it is a beautiful simplicity. Let's bring it to the javascript world:

dist/js/app.browserify.js: $(wildcard app/*.js app/**/*.js node_modules/*/package.json)
  browserify app/index.js --output $@

dist/js/app.min.js: dist/js/app.browserify.js
  uglifyjs --compress $< --output $@

We've just defined how and when to build our dev and production JS bundles. Our Browserify bundle depends on everything in the app/ folder plus the package.json of each of our node modules, so when any of those update, just run Browserify. ($@ is an automatic variable in Make that means "the name of the current target".) The minified version depends on the bundle as a prerequisite, so when the bundle updates, run Uglify. ($< is another automatic variable that means the first prerequisite in the list.) You can set up an alias for your JS files like so:

scripts: dist/js/app.browserify.js dist/js/app.min.js

# required so if you do have a file or folder called "scripts", it will
# always check the prereqs
.PHONY: scripts

Run make scripts and your JS will bundle and minify. If it needs to.

You can also build other things, and perform other tasks pretty easily:

dist/css/main.css: $(wildcard styles/*.scss styles/**/*.scss)
  node-sass --include-path styles/ styles/main.scss $@

styles: dist/css/main.css

static:
  mkdir -p dist
  cp -av public/* dist/

lint:
  jshint --config .jshintrc app/ test/
  jscs app/ test/

build: static styles lint scripts

.PHONY: scripts styles static lint build

Sass compilation and linting accomplished. We also set up a target that will copy any static images or html files we need to our dist/ folder, as well as a build target that will build everything with a single command.

It's also easy to make sure that you have the correct versions of your build tools on hand -- just install them normally as devDependencies with NPM, and add this line to the top of your Makefile to add the executables to your path:

export PATH := ./node_modules/.bin/:$(PATH)

This is the one caveat with a Make-based build system -- any tool will have to be its own separate program. However, you are not limited to javascript or node -- any program in any language will work as long as it is a unix process. This also means you can use any of the standard unix tools where applicable, including pipes!

dist/js/app.min.js: dist/js/app.browserify.js header.js footer.js
  cat header.js $< footer.js | sed 's/NODE_ENV/"production"/g' | uglifyjs -c -o $@

We added a header and footer around our bundle, did some variable substitution, and minified, all in one line. This is using standard unix tools that have existed for decades and are well understood. No need to use grunt-contrib-concat when you can just cat. With proper piping (and use of file descriptors, if things get complicated) you can also obviate the need for intermediate temporary files.

Watching

Make is awesome for defining a build, but it is still up to you to run make build after you modify a source file. For a nice development workflow, file system watching is still where Grunt shines. A very simple way to automate Make is to just have grunt-contrib-watch watch your entire project folder, and then have it invoke make with grunt-exec when anything changes. If you have your rules defined properly, with targets and prerequisites accurate, it will only build what it needs to.

grunt.initConfig({
  watch: {
    all: {
      files: ["app/**", "styles/**/*", "public/**/*", "node_modules/*/package.json"],
      tasks: ["exec:make"]
    }
  },
  exec: {
    make: {
      command: "make build"
    }
  }
});

A bit brute force, but it will work. I also like to hook up a dev server with livereload:

grunt.initConfig({
  watch: {
    all: {
      files: ["app/**", "styles/**/*", "public/**/*", "node_modules/*/package.json"],
      tasks: ["exec:make"]
    },
    livereload: {
      files: ["dist/**/*"],
      options: {
        livereload: 35729
      }
    }
  },
  exec: {
    make: {
      command: "make build"
    }
  },
  connect: {
    dist: {
      options: {
        base: ["dist", "node_modules"],
        port: 8888,
        livereload: 35729
      }
    }
  }
});

Browserify

The main drawback to the blunt-force-watch-everything strategy is that your browserify build will be slow. It will rebuild everything from scratch each time. This will add several seconds to your edit → save → rebuild → livereload → debug → edit cycle, so I would recommend adding grunt-browserify with watch: true to your Gruntfile, and have the make task run when anything except your source javascript changes. This will get you back to sub-second rebuilds. See my previous article for more details on how to set up the Gruntfile with Watchify.
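
A rough sketch of what that could look like (the paths and watch patterns are assumptions based on the earlier examples):

grunt.initConfig({
  browserify: {
    all: {
      files: {"dist/js/app.browserify.js": ["app/index.js"]},
      // watch: true makes grunt-browserify use watchify internally
      options: {watch: true}
    }
  },
  watch: {
    // everything except the app/ javascript -- watchify handles that itself
    all: {
      files: ["styles/**/*", "public/**/*", "node_modules/*/package.json"],
      tasks: ["exec:make"]
    }
  }
});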

This does mean you will have to duplicate your Browserify configuration in both the Gruntfile and the Makefile. It can be a pain, but if you are doing things right, your custom configuration should be minimal, if not nonexistent.

I strongly recommend against anything that requires custom Browserify configuration, or anything that deviates from the node.js module conventions. I've done too much mucking around with bizarre Browserify builds that become more and more fragile until they break. Aliases, browser overrides, shims, multiple bundles -- they all over-complicate things, make your build less robust, and are more headaches than they are worth. Manually place things in node_modules/, create your own modules where applicable, favor libraries that are CommonJS native, use the require("foo-browserify") wrapper if you have to, manually shim if there is no wrapper (and automate it with Make), and forget about multiple bundles unless they are absolutely necessary. The problems with the artifacts and build process are worse than the slightly increased ugliness while coding. A simple module setup also keeps the door open for the growing number of tools that can grok Node's require() logic; a complicated setup inhibits interoperability and closes your app off to those tools. I could write an entire article about this.

So define your Browserify build in both the Makefile and Gruntfile, but keep it simple.

Drawbacks

This won't work on Windows. You can probably get some basic things to work with MinGW, but you won't be able to easily use basic unix tools and pipes, and I foresee lots of ugliness with PATHs. Gulp is probably your best bet if you have to support Windows. You trade unix pipes for node streams and will have to write a lot more code.

Make has no built-in watching or dev-server capabilities, so you'll still have to use Grunt or another watcher for those features.

Make has ugly syntax: Bash-like variables, esoteric automatic variables, and strangeness with .PHONY targets. It's less structured than JS, and does take some getting used to. On the other hand, the language is all well documented, follows simple rules, and there are lots of examples.

Other ideas

  • You can make makefiles more readable by using variables:
BROWSERIFY_DEPS = $(wildcard app/*.js app/**/*.js node_modules/*/package.json)
JS_BUNDLE = dist/js/app.browserify.js

$(JS_BUNDLE): $(BROWSERIFY_DEPS)
  browserify app/index.js -o $@
  • Make supports includes. You can put common tasks in a small file and import it. Store it in a small npm module and you can make repetitive configuration portable:
include ./node_modules/build-tools/makefiles/CommonMakefile.mk
  • Here's an easy way to run npm install only if you need to:
setup: .last_install

.last_install: package.json
  npm install
  touch .last_install

Make sure to add .last_install to your .gitignore and delete it on make clean.
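
A minimal clean target for that might look like this (dist/ is the output folder from the earlier examples):

clean:
  rm -rf dist .last_install

.PHONY: clean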

  • Use disc to see a pretty analysis of your bundle's file size in your browser:
disc:
  browserify --full-paths app/index.js | discify -O

Summary

  • Define your targets, dependencies, and build tasks.
  • Use NPM to manage tool dependencies.
  • Add ./node_modules/.bin/ to Make's $PATH.
  • Write your own build scripts if you need to.
  • Use standard unix utilities and pipes where applicable.
  • Keep Browserify configs simple.
  • Use Grunt for watching and livereload, and have it just invoke make.
  • Use variables and includes to keep Makefiles clean.

Make, processes and pipes. Now get back to writing killer apps!

code javascript make makefiles commonjs browserify grunt Gruntfile livereload

Watchify and Grunt

One of the things that inevitably happens as your Browserify project gets large is that your build starts becoming slow. What starts out as a one-second build starts taking four to five to ten to fifteen seconds to build as you add more libraries and modules. This really can slow down your development process, as the edit → save → rebuild → livereload → debug → edit cycle starts taking longer and longer.

Enter Watchify

Luckily there is a tool called Watchify designed to work around this. It essentially is a tool that caches the incremental results of Browserify, sets up a watcher for every file in the dependency graph of your app, and quickly rebuilds when any of those files changes. A five-second Browserify build can be rebuilt in 100 milliseconds if only a single file changes.

What does this look like? Let's take the command from the end of my last article:

$ browserify app/lib/lib.js app/main.js -t browserify-shim -r lodash -o dist/js/app.js

After you npm install -g watchify, this simply becomes:

$ watchify app/lib/lib.js app/main.js -t browserify-shim -r lodash -o dist/js/app.js

Watchify has the exact same interface as Browserify. When it starts up, it will do the lengthy full-build process, but touch one of your source files in another terminal, and the rebuild will be instantaneous in comparison — usually around 100ms.

You'll notice that I'm only building a single bundle, rather than the lib/main build I've described in past articles. The only reason I recommended the dual-bundle strategy in the past was to shorten the rebuild step in the development cycle. However, Watchify is fast enough that the dual-bundle strategy is no longer necessary. This greatly streamlines your build configuration.

Integrating with Grunt

This is great and good, but how would you integrate Watchify in the Grunt workflow? Simple. First of all, Watchify is already included in grunt-browserify 2.x, so all you have to do is add a watch: true flag to the options.

With a dual-bundle strategy, you may have had a browserify config that looked something like this:

//...
browserify: {
  lib: {
    files: {
      "tmp/lib.browserify.js": ["app/vendor/lib.js"]
    },
    options: {
      transform: ["browserify-shim"],
      require: sharedModules
    }
  },
  main: {
    files: {
      "tmp/main.browserify.js": ["app/main.js"]
    },
    options: {
      external: sharedModules
    }
  }
}
//..

The require/external juggling between bundles no longer has to happen, so this can be condensed to:

browserify: {
  all: {
    files: {
      "public/app.browserify.js": ["app/vendor/lib.js", "app/main.js"]
    },
    options: {
      transform: ["browserify-shim"],
      watch: true
    }
  }
}

Much cleaner. Watchify plays nicely with transforms such as browserify-shim or es6ify. It's also fast enough that we don't have to worry about browserify-shim being run on every module. Source maps are also created properly: normally they would be lost in the concat step of a dual-bundle strategy, but since there is now a single bundle, that information doesn't get wiped out.

grunt-contrib-watch

You will have to remove the existing watch configuration for browserify. grunt-browserify with watch: true will handle file-system watching.

grunt.initConfig({
  watch: {
    lint: {
      files: ["lib/**/*.js", "test/**/*.js"],
      tasks: ["jshint"]
    },
    livereload: {
      files: ["public/**/*"]
      options: {
        livereload: true;
      }
    }
    // no configuration for browserify
    //...
  },
  connect: { // local server with livereload support
    all: {
      options: {
        base: "public",
        livereload: true
      }
    }
  }
  //...
});

However, the programmatic interface to watchify (which grunt-browserify uses) will only stay running as long as the parent process (in this case Grunt) stays open. You will need to rely on grunt-contrib-watch to keep the process running. To accomplish this, just run the browserify task before you run the usual watch task.

$ grunt browserify watch

I usually create a dev task that handles all of this, so you only have to type grunt dev to kick everything off.

grunt.registerTask("dev", ["build_non_js", "browserify", "connect", "watch"])

Doing a one-off build requires no configuration — watchify will just do its initial build, then exit.

$ grunt browserify

Closing thoughts

I am still in awe at how fast watchify is, even after using it for a month. I'll reiterate that the dual-bundle strategy as described in past articles is now pointless. It's much better to use a single bundle now.

Another observation: CommonJS/Browserify's one objective weakness as compared to something like RequireJS/AMD is that it does require a build step. With RequireJS, you can simply define a base directory, and all of your individual files can be loaded without any transformation. Just Ctrl+S, Alt+Tab, and Ctrl+R to test out a javascript change — very fast, very convenient. However, now that a Browserify bundle can be rebuilt in 100 milliseconds, and can be coupled with a livereload setup, that advantage is not as great as it used to be...

Enjoy your sub-second rebuilds!

code javascript modules commonjs browserify grunt Gruntfile livereload

Grunt-Browserify 2.x and Browserify-Shim

Much has changed with Grunt and Browserify over the past several months. Unfortunately, at the time of my last Grunt/Browserify guide, the versions of the plugins I was using were already out of date. Since then, Browserify, grunt-browserify, and browserify-shim have all gone through major version updates, and as semantic versioning implies, there are many backwards-incompatible changes. This guide will go over the changes needed to get something like what is described in the previous guide working. It is targeting browserify@3.44.2, grunt-browserify@2.0.3 and browserify-shim@3.4.1, although newer versions will likely work.

Major Changes

The largest change is that grunt-browserify no longer includes browserify as a dependency -- it is now a peerDependency. You have to npm install both browserify and grunt-browserify. Also, grunt-browserify no longer includes browserify-shim, so you will have to npm install that as well.
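
In practice that just means something like:

$ npm install --save-dev browserify grunt-browserify browserify-shim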

Much of the configuration now has to go directly in the package.json for everything to work. Unfortunately, the Gruntfile.js can no longer be the single point of configuration for your builds. On the upside, this does give more flexibility in how you can create bundles, and opens up the use of other tools besides Grunt.

Example Configuration

This example project will set up the build for a Backbone/Marionette application that uses jQuery and Lodash, as well as dustjs-linkedin for templating. It will also use the dual-bundle strategy of separating the third-party libraries from the main app for faster rebuilds.

package.json

Here are the relevant bits from the package.json:

  "devDependencies": {
    "browserify": "3.44.2",
    "browserify-shim": "3.4.1",
    "grunt": "0.4.4",
    "grunt-browserify": "2.0.3",
    //...
  },
  "dependencies": {
    "dustjs-linkedin": "2.3.4",
    "lodash": "2.4.1",
    //...
  },
  "browser": {
    "jquery": "./app/vendor/jquery-2.1.0.js",
    "lodash": "./node_modules/lodash/dist/lodash.compat.js",
    "backbone": "./app/vendor/backbone-1.1.js",
    "marionette": "./app/vendor/backbone.marionette.js"
  },
  "browserify-shim": "./shims.js"

We have updated the browserify plugins, and are including Dust.js and Lodash from npm. We are also simply keeping static versions of jQuery, Backbone, and Marionette in a vendor/ directory.

What is new here is the browser field. This was created so modules could direct Browserify to alternate versions of packages for use on the browser. It is very similar to the alias feature from previous versions of grunt-browserify. Here we are telling browserify where to find jquery, backbone, and marionette, as well as using the compatibility version of lodash, rather than the node-optimized version.

Also new is the browserify-shim field. Browserify-shim is currently only configurable through the package.json. This is similar to the shims field from previous versions of grunt-browserify. We could have placed the configuration here, but it is more useful to have it in a separate file.

shims.js

module.exports = {
  jquery: {exports: "jQuery"},
  lodash: {exports: "_"},
  backbone: {
    exports: "Backbone",
    depends: {lodash: "underscore", jquery: "jQuery"}
  },
  marionette: {
    exports: "Marionette",
    depends: {lodash: "underscore", jquery: "jQuery", backbone: "Backbone"}
  }
};

The browserify-shim config must be the long form described in the readme, and is just a simple JS module. It defines what each library's global export is, as well as what its dependencies are (if it requires them). We can specify simply "jquery" rather than "./path/to/jquery-2.1.js" due to the browser field in the package.json; browserify-shim also takes that field into account.

Another thing to point out is that browserify-shim is not as fast as it used to be, since it now parses each source file, rather than just appending a footer and a header. There actually is no speed benefit to shimming jQuery and Lodash, since they now support CommonJS out of the box, but I included them here for demonstration purposes. I would say the dual-bundle strategy is even more important, as builds for large applications can take several seconds.

Gruntfile.js

  //...
  var
    shims = require("./shims"),
    sharedModules = Object.keys(shims).concat([
      // place all modules you want in the lib build here
      "dustjs-linkedin"
    ]);

  grunt.initConfig({
    //...
    browserify: {
      lib: {
        files: {
          "tmp/lib.browserify.js": ["app/vendor/lib.js"]
        },
        options: {
          transform: ["browserify-shim"],
          require: sharedModules
        }
      },
      main: {
        files: {
          "tmp/main.browserify.js": ["app/main.js"]
        },
        options: {
          external: sharedModules
        }
      }
    },
    //...
  });

Many things are going on here. First of all, we are requiring the same config file that browserify-shim uses, to prevent some duplication. Each of the keys of that file's exports (the names of the modules) will need to be included in the lib build, and marked as external in the main build. We can also add more module names to that list to move them from the main build to the lib build, as we are doing with dustjs-linkedin.

In the lib build, we use the browserify-shim transform, and explicitly require each of the sharedModules. In the main build, we do not use the browserify-shim transform, since it significantly slows things down, and we mark each of the sharedModules as external. We also don't use the prelude option described in the previous guide to specify an un-minified module loader. There is now no way to override this, so debugging require()s in the browser will be a bit more cumbersome.

These two bundles will be concatenated together exactly as described in the previous guide. Don't forget the semicolon separator!
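
If you use grunt-contrib-concat for that step, a sketch might look like this (the plugin choice and final path are assumptions):

concat: {
  bundle: {
    options: {separator: ";"},
    src: ["tmp/lib.browserify.js", "tmp/main.browserify.js"],
    dest: "dist/js/app.js"
  }
}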

All in all, the browserify config is much simpler. However, there is still one more thing we need to do.

app/vendor/lib.js

var
  lodash = require("lodash"),
  jquery = require("jquery"),
  backbone = require("backbone");

require("marionette");

// Backbone and Marionette rely on $, _, and Backbone being in the global
// scope. Call noConflict() after the libs initialize above.
lodash.noConflict();
jquery.noConflict(true);
backbone.noConflict();

In the entry module for the lib build, we need to require each of the shimmed modules in order, and then call noConflict() on each of them to avoid polluting the global window object. This is because when a shimmed module depends on another, browserify-shim will insert a global variable with a reference to that dependency. This might not be a big deal for every app, but it is generally best practice to avoid polluting the global scope, especially if your app will live alongside other third-party scripts. It is a bit annoying and inelegant that we have to do this, but browserify-shim does this to ensure maximum compatibility with non-CommonJS-aware libraries. It's just one of the ways the Browserify abstraction leaks.

Conclusion

I hope this guide was helpful to those of you who are having issues getting the build to work as it did with prior versions of these tools. I do think a project set up in this way is in a better place than a pure Grunt approach. For example, since most things are configured through the package.json you can build your app entirely through the command line:

$ browserify app/lib/lib.js app/main.js -t browserify-shim -r lodash > dist/js/app.js

Granted, you lose the speed advantage of splitting into a slow, seldom-rebuilt lib bundle and a fast main bundle plus a concatenation, but it opens your app up to powerful analysis tools like colony and disc and anything else that respects the browser field. I would say these benefits offset the drawbacks of losing Gruntfile.js as the single source of configuration.

Happy Browserifying!

code javascript modules commonjs browserify grunt Gruntfile livereload

A Year with Browserify

NOTE: this guide is out of date. It is only valid for grunt-browserify@~1.3.0 and its included versions of browserify and browserify-shim. An updated guide is here.

It's been a year since I wrote the long Javascript Module Systems article, in which I concluded that CommonJS and Browserify was the best way to write modular javascript for the browser. Over the last year, I've had a chance to test that conclusion. I've used Browserify for every client side project since then, as well as the planned migration of a large application that previously used a concatenation-style build. Along the way, I've learned a lot about the whole Browserify process, some tricks, and some pitfalls.

Grunt is awesome

Although Browserify's command-line interface is pretty good, its programmatic interface is much more powerful. Grunt in conjunction with grunt-browserify is perhaps the best way to set up your configuration. While overkill for a simple case, it is invaluable as your config gets more complicated.

//...
browserify: {
  main: {
    "dist/main.js": ["src/main.js"],
    options: {
      transform: ["brfs"],
      alias: [
        "node_modules/lodash/lib/lodash.compat.js:lodash",
      ],
      external: ["jquery"]
    }
  }
}
//...

I will explain everything going on here in more detail later.

Grunt's watch mode -- where it monitors the file system for changes and runs tasks based on what has updated -- has revolutionized development. Coupled with live-reload, you could not have a more efficient front-end code/build/run/debug loop. Having your application refreshed automatically in the time it takes to alt-tab to the browser can't be beat.

Node Modules are awesome

The biggest advantage of Browserify is that you can use Node modules from NPM right out of the box. npm install module-foo --save and require("module-foo") is available to you. A lot of utility libraries, such as Lodash, Underscore, Async, and now even jQuery 2.x are available through a simple npm install, and require()able with no extra configuration. If bundling these heavyweight libraries isn't desirable, you can also include smaller, more-focused modules from Component. There is a decomponentify transform that can convert a Component module to a node-style module, but many components are dual-published to NPM as well. Since both Component and Browserify use the CommonJS module style they are very inter-operable.

The real advantage comes from writing and using your own node modules. Encapsulating some functionality in a module allows you to completely isolate it from the rest of your application. It helps you think about defining a clear API for whatever part of your system a module may cover. You can also test it in isolation from the rest of your system. You do not have to publish it to NPM to use it, if it is very specific to your application -- you can simply refer to it using a git url in your package.json:

//...
  "my-module": "git+ssh//git@bitbucket.org/username/my-module#0.3.0",
//...

You can then require("my-module") as you would expect. You can also refer to a specific branch or tag by using the hash at the end. While it does not use the semver rules, you can at least freeze a dependency to a specific version and upgrade when ready. If you do not care about semver and always want to use the latest version, you can just use my-module#master.

Managing the Menagerie

Scaffolding and Project Templating

Using many node modules can be cumbersome, but there are extra steps you can take to make the whole process more manageable. First of all, to make creating new modules as friction-free as possible, I'd recommend using a tool like ngen to be able to quickly create scaffolding for a project. Customize one of the built-in templates to your liking. You can also pull commonly-used Gruntfile snippets and other tools into a common build-tools repository that every project uses as a devDependency. For example, we have a template that includes:

  • a lib/.js
  • a lib/.test.js spec file
  • a README.md with a customizeable description and the basics on how to develop the module
  • a Gruntfile.js with watch, browserify (to build the spec files for testing), jshint, and release (for tagging)
  • .editorrc and .jshintrc files for maintaining code consistency.
  • a testem.json for configuring testem for running the tests continuously in multiple browsers.
  • a Makefile that serves as the catalog for all tasks that can be run and for bootstrapping development quickly. a make dev will install all the dependencies, build, and launch testem in watch mode with a single command. It also contains a make initialize command that will create the repository on Github/Bitbucket, and add all the hooks for things like Campfire and Continuous Integration.

You get all of this out of the box with a single ngen command. It is somewhat similar to Yeoman, except not opinionated. Instead, you can tailor your ngen templates to your own opinions.

Npm Link

Developing a sub-module concurrently with a parent module can be cumbersome. Oftentimes you do not want to have to push/publish a new version to test a child module in a parent module. This can be easily solved with npm link. If your child module is checked out in the same directory you can simply npm link ../child-module and it will create suitable symlinks for you.

Meta-Dependency Management

If you have dozens of modules it may become tedious to keep everything up to date with the latest versions. I would recommend writing a tool or script that reads the package.jsons of all your modules using the Github/Bitbucket APIs and detects out-of-date packages. If your project is open-source on Github, you can use David for free. A more advanced version of this tool would automatically update the dependencies, test the project, and publish a new version.

Continuous Integration is a must. Use something like Travis CI and put the badges on your README. I will also note that Bitbucket has more favorable pricing than Github for many small private repositories, and near feature parity otherwise. Bitbucket charges per collaborator, while Github charges per private repository.

Speeding up your build

So you are sold on Browserify. You set up a new project, and start including all your dependencies. You include jQuery, Lodash, Backbone, Marionette, Async, some snippets from Foundation, a couple jQuery plugins from Bower, a handful of private modules, some Components, and a few dozen classes from your app's core. You then notice that Browserify takes ten seconds to build and you spend an eternity waiting for your tests to re-run each time after you hit save in watch mode. How can you improve things?

Shim It

Browserify's default behavior when it encounters a source file is actually pretty involved. It has to parse the file and generate an AST. It then has to walk the AST to find any relevant require() calls as well as other Node globals. It then has to require.resolve each module and recursively parse those. However, this is not needed if you know a library doesn't contain any require() statements. Parsing the 100k lines of jQuery only to find it doesn't have any require()s is a waste. Instead, what you can do is use browserify-shim, which is automatically included in grunt-browserify:

//...
browserify: {
  main: {
    "dist/main.js": ["src/main.js"],
    shim: {
      jquery: {
        path: "node_modules/jquery/jquery.js"
        exports: "$"
      }
    }
  }
}
//...

Since jQuery exports a global variable in lieu of a module system, we can take advantage of this to avoid expensive parsing. We just create an on-the-fly module named jquery and make sure that its known global exports ($) end up being the module.exports. Every module in the bundle can just var $ = require("jquery"). Also note that window.$ and window.jQuery will still be created in this case unless you call jQuery.noConflict(true) somewhere in your code.

If a shimmed module relies on other modules you can just add a depends object to the config:

shim: {
  jquery: {path: "node_modules/jquery/jquery.js", exports: "$"},
  backbone: {
    path: "app/vendor/backbone-1.1.js",
    exports: "Backbone",
    depends: {
      lodash: "_",
      jquery: "jQuery"
    }
  }
}

The key of the object is the name of the module to Browserify, and the value is the global name the shimmed module expects for the dependency. In this example, Backbone expects Lodash and jQuery to be available as window._ and window.jQuery. window.Backbone is then captured and made available as if there was a node module named backbone.

Shimming is usually faster than using a transform like debowerify or decomponentify, since those both involve parsing. (This only works if those modules export a global as a fall-back.) If a module does not export anything (such as a jQuery plugin), set exports: null to not export anything. You will have to call require("jquery.plugin.foo") somewhere in your code for it to be mixed into the main jQuery module. Gluing together disparate module systems can get a bit ugly, I'm afraid.
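
For instance, a hedged sketch of a jQuery plugin shim (the plugin name and path are hypothetical):

shim: {
  "jquery.plugin.foo": {
    path: "app/vendor/jquery.plugin.foo.js",
    exports: null,
    depends: {jquery: "jQuery"}
  }
}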

Splitting up the build

You may also notice that re-bundling all your dependencies for every small change in your core app code is a bit inefficient. It is strongly recommended to create a separate build for your large, seldom-changing libraries.

One of the interesting features of the Browserify require() function is that it will defer to a previously defined require() function if a module can't be found in its bundle. Step through a Browserified require() using a debugger and you will see the logic. If you include two Browserify bundles on a page, modules that can't be found in the second will be looked up in the first. Very handy for splitting up the build and making everything work.

This is what the grunt config would look like:

//...
browserify: {
  libs: {
    "dist/libs.js": ["src/libs.js"],
    shim: {
      jquery: {path: "node_modules/jquery/jquery.js", exports: "$"}
    }
  },
  main: {
    "dist/main.js": ["src/main.js"],
    external: ["jquery"]
  }
},
//...

The shim makes the libs build register jquery, and the external parameter instructs the main build not to include jQuery. On its own, the main bundle would throw an error on load, but when you load both dist/libs.js and dist/main.js on a page, the main require function won't find the module named jquery, and will defer to the libs require function, where jquery will actually be found. Now you can configure your grunt-contrib-watch to only build browserify:main when your JS in src/ changes, as opposed to building everything all at once. This is actually quite speedy -- parsing a few dozen src/ files is generally 2 to 5 times faster than bundling all the libraries a typical application would include. This means your dev build and tests can be refreshed in one to two seconds.

Also, if you still want a single JS file in the end, you can just concatenate the libs.js and main.js -- it works equivalently to including two scripts.

Collapsing dependencies

Once you fall in love with private node modules, you may find it conflicts with your love for handy utility libraries like Lodash. You may find that your private-module-a depends on lodash@1.3.x and private-module-b depends on lodash@1.5.x, and your parent project depends directly on lodash@2.4.1. You inspect your bundle and find that three different versions of Lodash are included, needlessly adding to your app's file size.

While I would argue that this might be desirable in certain cases for certain modules (according to semver, a major version increment could include backwards-incompatible changes), you probably only want to include one version of Lodash in your final build. There is a clever and counterintuitive way to fix this in your browserify config:

//...
  alias: ["lodash:lodash"]
//...

By aliasing Lodash to itself, it guarantees that any require("lodash") encountered while parsing will resolve to the parent project's version of Lodash. We basically just short-circuit the module lookup process. Normally Browserify would use the version of Lodash local to private-module-a, but aliasing creates a global name that will override the default module lookup logic.

Circular Dependencies

In my previous article, I recommended temporarily using a global namespace or deferred require()s as a way to work around circular dependencies. However, I quickly came to realize that neither solution was ideal. Global namespaces are a form of JS pollution, leak your internals to 3rd party code, and can be overwritten. They also don't show up in the myriad of tools that can do dependency analysis on CommonJS modules.

Deferred require()s in theory can work, but you have to be certain that the second module in the cycle won't actually need the first module until the next tick in the JS event loop. If the second needs the first before that deferred require, it will be undefined and create a subtle bug.
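
To illustrate, a deferred require looks something like this (the module names are hypothetical):

// b.js -- a.js requires b.js, and b.js also needs a.js
module.exports = function doLater() {
  // deferred: resolve a.js only when this function is actually called,
  // by which point a.js has (hopefully) finished initializing
  var a = require("./a");
  return a.something();
};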

I concluded that it was better to re-factor away the circular dependencies than deal with these two problems. I eliminated them using a few techniques, depending on the nature of each dependency cycle: pulling shared functionality into a 3rd module, using event-driven programming to decouple components, and dependency injection.

A really common pattern I use is when something like a "controller" needs something from a main "application" class, but the main application needs to instantiate the controller. In this case, I just use dependency injection:

//app.js
//...
var controller = require("./controller.js")(app)
//...
module.exports = app;

//controller.js
//...
module.exports = function (app) {
  var controller = new Controller({
    // do something with `app`...
  });

  return controller
}
//...

Using non-relative paths

There are cases when you don't want to use relative paths for everything and just want to use a path relative to your project root. It would be nice to simply require("myApp/views/buttons/foo_button") from src/controllers/foo.js rather than figuring out how many ../s to add to the front of your path. Luckily you can do this by dynamically creating aliases for every module in the core of your application using grunt-browserify's aliasMappings option. Here's what it looks like:

aliasMappings: [{
    cwd: 'src',
    dest: 'myApp',
    src: ['**/*.js']
}]

What this tells Browserify to do is take every file in your src/ directory and create an alias based on the file path, prefixed with myApp. src/views/buttons/foo_button.js becomes available everywhere as require("myApp/views/buttons/foo_button"). Very handy.

However, I will say that if you need crazy relative paths or deeply nested folders, it's either a sign that parts of your app are becoming too tightly coupled, or your app might need to be split up into smaller, more autonomous and manageable modules. Some view needs to talk to a model way on the other side of the application? Rather than call it directly, use a global event bus/Pub-Sub. Another classic telltale sign is require("../../../utils/foo"). Just make utils/foo.js into its own private module, write some tests, and refer to it in your package.json. Then it's available everywhere as require("utils-foo").

Other tips and tricks

Don't have grunt-contrib-watch listen to your entire node_modules directory for changes to rebuild your libraries bundle. You can quickly run into file-system limits this way. Instead, only listen to the first-level package.json's -- they will be updated by npm installs. For your own npm linked modules, have those watchers touch their package.json's when their source files are changed -- as a way to signal that the parent needs to rebuild.
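
In grunt-contrib-watch terms, that looks roughly like this (the target and task names are assumptions):

watch: {
  libs: {
    // only the top-level package.json of each module, not all of node_modules/
    files: ["node_modules/*/package.json"],
    tasks: ["browserify:libs"]
  }
}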

Colony is a handy little tool for generating a dependency graph of your code. If your code is a spider web, it's time to decouple and re-factor. Colony was very helpful in detecting dependency cycles as well -- I was able to feed its intermediary JSON into graphlib. Some cycles were 10 links long. I never would have found them otherwise. One caveat of Colony is that it doesn't use your Browserify config, just the default node require() logic, so it can be slightly inaccurate if you use aliases. The author of Colony also has a tool called Disc that can monitor filesizes, albeit with stricter CJS module lookups.

The brfs transform in-lines fs.readFileSync() calls -- it replaces each call with a string containing the contents of the file. This is a convenient way to load templates. Keep in mind that it can only use static paths to files -- the only variable it can evaluate is __dirname.
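
A quick sketch of what that looks like (the template path is hypothetical):

var fs = require("fs");

// brfs replaces this call at build time with the file's contents as a string.
// The path must be static; __dirname is the only variable it can evaluate.
var template = fs.readFileSync(__dirname + "/templates/button.html", "utf8");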

Finally, here is an annotated Gruntfile for a sample project. It follows all of the recommendations laid out in this article, if you want to see what everything looks like in action.

code javascript modules commonjs browserify grunt Gruntfile livereload

Monads

After my last article, I've done some more research on what monads are. They seem to be one of the last core concepts of functional programming that is still a bit elusive to me. This is a good introductory video that finally made things start to click for me. Also, this article titled You Could Have Invented Monads made me realize that they really are not that complicated.

So what is a monad?

The simplest definition of a monad is "a container for computation" -- not very helpful. Monads apparently come from category theory in mathematics, and were introduced as a way to contain side effects in a pure functional language (such as Haskell). If you're writing in a pure functional style -- absolutely no side effects, no mutable state -- your programs can't be very useful -- they can't do anything except provide a return value. With monads, you can wrap these useful side effects (such as reading from a file, writing to the console, handling socket input), and still maintain a pure functional "core".

A more useful definition of a monad is a system such that:

  • There is a class of function that takes some sort of value and returns a monad-wrapped result. f = (a) → M b
  • There exists a "bind" function that takes a wrapped value, and a function that returns a wrapped value, and returns a wrapped value. This is a mouthful. The notation looks something like ((M b), ((b) → M c) → M c. This allows you to compose two monad functions together like so: ((a) → M b) >>= ((b) → M c) where >>= is the bind operator in this notation.
  • The bind operator is associative, e.g. (f >>= g) >>= h is equivalent to f >>= (g >>= h), where f, g, and h are these special monad functions (of type (a) → M b).
  • There exists a "unit" function that can be inserted into a chain of binds and have no effect on the end result. f >>= u equals u >>= f equals f.

This is a lot of notation, I'd recommend watching that first video to make things make more sense. One key point is that the definition of what M b actually equals is incredibly broad. It could be an object, a function, or a specific function signature. (Correct me in the comments if I'm wrong here.)

An Async Monad in Javascript

So what would a monad actually look like in Javascript? Can we actually do anything useful with them? Let's define our monadic type as a function that expects a single arg: a node-style callback. That's it.

function fooM (callback) {
    //... something happens here
}

fooM(function (err/*, results*/) {
    //... you can do stuff with results here
});

Any function that conforms to this signature would be a part of our monadic system. This means you can "lift" any node-style async function to be monadic through partial application.

function liftAsync(asyncFn/*, args...*/) {
  var args = _.rest(arguments);
  return function (callback) {
    asyncFn.apply(this, args.concat([callback]));
  };
}

/* example */

readFileM = liftAsync(fs.readFile, "./filename.json", "utf8");

/* or */

readFileM = fs.readFile.bind(fs, "./filename.json", "utf8");

This liftAsync function satisfies condition 1 above -- a function that takes something in and returns a monad-wrapped result. Now let's define the "bind" operation.

function bindM(fM, g) {
  return function (callback) {
    fM(function (err/*, results*/) {
      if (err) {return callback(err); }
      var results = _.rest(arguments)
      g.apply(this, results.concat([callback]));
    });
  };
}

/* example */

function asyncParse(text, callback) {/*...*/}

var readAndParseM = bindM(readFileM, asyncParse);

readAndParseM(function (err, data) {
  // parsed data from filename.json is here
});

(I call the function bindM to distinguish it from Function.bind. Same term, different operation.) It basically takes in a result, and a continuation, and specifies how to tie the two together. In the example, calling readAndParseM(callback) would be equivalent to:

fs.readFile("./filename.json", "utf8", function (err, text) {
  asyncParse(text, callback);
});

Condition 2 from the list is satisfied.

I'm going to gloss over point 3 a bit, but it's pretty easy to see that if you introduced a 3rd function uploadFields(data, callback) {}, these two snippets would be equivalent:

var parseAndUpload = function (text, callback) {
  return bindM(
    liftAsync(asyncParse, text),
    uploadFields
  )(callback);
}

bindM(readAndParseM, uploadFields);

/* equals */

bindM(readFileM, parseAndUpload);

Note that parseAndUpload is not a monad-wrapped function since it takes more than the callback argument. This is needed since we need to capture the value of data in a closure. Binding is not supposed to take in two monads, but a monad and a function that can be converted to a monad.

The "unit" function would be pretty simple:

function unitM(/*...args*/) {
  var args = _.toArray(arguments);
  return function (callback) {
    callback.apply(this, [null].concat(args));
  }
}

It just passes through what was passed in to it. You can easily see how binding this to any function, before or after, would have no effect on the result. Condition 4 satisfied.

So what?

So we have just defined a series of convoluted functions that allow us to tie together async functions. What use is it? It allows us to easily do higher-order operations with any function that expects a node-style callback.

We could use the unit function to make our example more consistent. We then could do:

var readFileM = bindM(unitM("./filename.txt", "utf8"), fs.readFile);

var readAndParseM = bindM(readFileM, asyncParse);

var readAndParseAndUploadM = bindM(readAndParseM, uploadFields);

We can also define a composeAsync function with bind:

function composeAsync(f, g){
  return function (/*...args, callback*/) {
    var args = _.toArray(arguments),
      callback = args.pop();
    return bindM(liftAsync.apply(null, [f].concat(args)), g)(callback);
  }
}

var readAndParseAndUploadM = bindM(
  readFileM,
  composeAsync(asyncParse, uploadFields)
);

Pretty cool. Our async functions become lego pieces we can combine together. It becomes tedious to progressively bind functions together, though. We could just reduce an array of operations:

[
  unitM("./filename.txt", "utf8"),
  fs.readFile,
  asyncParse,
  uploadFields
].reduce(bindM, unitM())(function (err) { /* ... */ });

...or define a helper:

function doM(functions, callback) {
  functions.reduce(bindM, unitM())(callback);
}

doM([
  unitM("./filename.txt", "utf8"),
  fs.readFile,
  asyncParse,
  uploadFields
], callback);

You may notice that the signature of doM is equivalent to async.waterfall. We have just recreated it in a monadic way! Calling our async functions is done in a purely functional manner, completely separated from the state each individual function may create.
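
For comparison, the same sequence written directly with async.waterfall (a rough sketch, reusing the hypothetical functions from above):

async.waterfall([
  function (cb) { fs.readFile("./filename.txt", "utf8", cb); },
  asyncParse,
  uploadFields
], function (err) { /* ... */ });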

This is but one of many possible monads in javascript -- the possibilities literally are endless. It's all in how you define your type signature, and your bind and unit functions. They don't always have to be asynchronous, but they work best when they wrap some modicum of external state change.

In my last article, I said that promises were also async monads, but object oriented. (i.e. the monad-wrapped value M b is an actual object.) It's pretty clear when you think about it. promisify(asyncFn) is similar to liftAsync(asyncFn). promise.then() becomes the "bind" operator, since promisify(fn1).then(fn2).then(fn3).then(done, error) is equivalent to a non-OO when(when(when(promisify(fn1), fn2), fn3), done, error) that looks a lot like our bindM operator above. Same thing, different style.

code javascript functional programming monads async callbacks

Async and Functional Javascript

Following in the spirit of my previous post, I realized that Functional Programming can help solve one of the problems that often arises in javascript: Callback Hell.

The Setup

Say there are some async functions we need to run in sequence. Here are their signatures for reference:

getConfig = function (filename, callback) {}

DB.prototype.init = function (config, callback) {}

DB.prototype.read = function (query, callback) {}

processRecord = function (data) {}

uploadData = function (data, destination, callback) {}

A bit contrived, but it at least resembles a real-world task. All the functions are asynchronous and expect Node-style callbacks, except for processRecord, which is synchronous. (By convention, a node style callback is of the form function (err, result, ...) {} where err is non-null in the case of an error, and the callback is always the last argument to an async function.) read() and init() are methods of a DB object.

The Problem

Let's naïvely combine these methods together into what I call the "async callback straw-man". You may also know it as the "nested callback pyramid of doom".

function (configFile, callback) {
  getConfig(configFile, function (err, config) {
    var db = new DB();
    db.init(config, function (err) {
      db.read("key1234", function (err, data) {
        uploadData(processRecord(data), "http://example.com/endpoint",
        function (err) {
          console.log("done!");
          callback(null);
        });
      });
    });
  });
}

Pretty ugly -- each operation increases the indentation. Reordering methods is extremely inconvenient, as is inserting steps in the sequence. Also, we are ignoring any errors that might happen in the sequence. With error checking, it looks like:

function (configFile, callback) {
  getConfig(configFile, function (err, config) {
    if (err) {return callback(err); }
    var db = new DB();
    db.init(config, function (err) {
      if (err) {return callback(err); }
      db.read("key1234", function (err, data) {
        if (err) {return callback(err); }
        var processed;
        try {
          processed = processRecord(data);
        } catch (e) { return callback(e); }
        uploadData(processed, "http://example.com/endpoint",
        function (err) {
          if (err) {return callback(err); }
          console.log("done!");
          callback(null);
        });
      });
    });
  });
}

Even uglier. Code like this makes people hate Node and Javascript. There has to be a better way.

Enter Async.js

After the Node developers standardized on their eponymous callback style, they recommended that developers write their own async handling libraries as an exercise -- learn how to aggregate, serialize and compose asynchronous functions in elegant ways to avoid the nested callback pyramid. Some people published their libraries, and the best and most-widely used ended up being caolan's async. It resembles an asynchronous version of Underscore, with some extra control-flow features. Let's re-write our example to use async.series.

function (configFile, callback) {
  var config, db, data, processed;
  async.series([
    function loadConfig(cb) {
      getConfig(configFile, function (err, cfg) {
        if (err) {return cb(err); }
        config = cfg;
        cb();
      });
    },
    function initDB(cb) {
      db = new DB();
      db.init(config, cb);
    },
    function readDB(cb) {
      db.read("key1234", function (err, res) {
        if (err) {return cb(err); }
        data = res;
        try {
          processed = processRecord(data);
        } catch (e) { return cb(e); }
        cb();
      });
    },
    function upload(cb) {
      uploadData(processed, "http://example.com/endpoint", cb);
    }
  ], function done(err) {
    if (err) {return callback(err); }
    console.log("done");
    callback(null);
  });
}

Not much of an improvement, but a small one. The pyramid has been flattened since we can simply define an array of functions, but only somewhat. The number of lines has increased. Since subsequent operations rely on data returned from previous steps, you have to store those values in the closure scope. This would make re-using any of these functions hard. async.series does short-circuit to the final callback (function done(err) {}) if any of the steps calls back with an error, which is convenient. However, you can see that the loadConfig step has to handle its own error as a consequence of having to modify the closure scope. Re-ordering steps is simple, but things are still pretty tightly coupled.

Waterfall

Luckily there is a better function: async.waterfall(). async.waterfall() will pass any callback results as arguments to the next step in the sequence. Let's see how this improves things:

function (configFile, callback) {
  var db;
  async.waterfall([
    function loadConfig(cb) {
      getConfig(configFile, cb);
    },
    function initDB(config, cb) {
      db = new DB();
      db.init(config, cb);
    },
    function readDB(cb) {
      db.read("key1234", cb);
    },
    function process(data, cb) {
      var processed;
      try {
        processed = processRecord(data);
      } catch (e) { return cb(e); }
      cb(null, processed);
    },
    function upload(processed, cb) {
      uploadData(processed, "http://example.com/endpoint", cb);
    }
  ], function done(err) {
    if (err) {return callback(err); }
    console.log("done");
    callback(null);
  });
};

A little bit flatter, and we don't have to manually handle any async errors. There is less reliance on the closure scope, but in its place, we have made the order matter, so the functions are still rather tightly coupled.

I also moved the synchronous processRecord() to its own step in the sequence for clarity. You can see that this would be a common operation for any synchronous function you wish to insert into a waterfall. Let's write a higher-order function for this change:

function asyncify(fn) {
  return function (/*...args, callback*/) {
    // convert arguments to an array
    var args = Array.prototype.slice.call(arguments, 0),
      // the callback will always be the last argument
      callback = args.pop(),
      result;
    try {
      // call the function with the remaining args
      result = fn.apply(this, args)
    } catch (err) {return callback(err); }
    callback(null, result);
  };
}

This would "asyncify" a function of any arity, and allow you to use it like an async function. Our waterfall becomes:

function (configFile, callback) {
  var db;
  async.waterfall([
    function loadConfig(cb) {
      getConfig(configFile, cb);
    },
    function initDB(config, cb) {
      db = new DB();
      db.init(config, cb);
    },
    function readDB(cb) {
      db.read("key1234", cb);
    },
    asyncify(processRecord),
    function upload(processed, cb) {
      uploadData(processed, "http://example.com/endpoint", cb);
    }
  ], function done(err) {
    if (err) {return callback(err); }
    console.log("done");
    callback(null);
  });
};

It cuts down on the number of lines, since the signature of the asyncified processRecord matches exactly what the waterfall expects.

What really makes this ugly in my eyes is the fact that we have to declare functions explicitly in sequence. I really like that processRecord became a single line in the waterfall. Could we transform the rest of the functions like this?

bind() and Partial Application

Function.bind() is a powerful addition to Javascript. Not only does it allow you to set this for a function call, it also allows you to partially apply functions. In other words, it lets you create functions that have certain arguments pre-bound.
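
As a quick illustration of partial application (a toy example, unrelated to the waterfall below):

function greet(greeting, name) {
  return greeting + ", " + name + "!";
}

// pre-bind the first argument; there is no `this` to care about, so pass null
var sayHello = greet.bind(null, "Hello");

sayHello("world"); // "Hello, world!"

Let's re-write our waterfall: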

function (configFile, callback) {
  var db = new DB();
  async.waterfall([
    getConfig.bind(this, configFile),
    db.init.bind(db),
    db.read.bind(db, "key1234"),
    asyncify(processRecord),
    function upload(processed, cb) {
      uploadData(processed, "http://example.com/endpoint", cb);
    }
  ], function done(err) {
    if (err) {return callback(err); }
    console.log("done");
    callback(null);
  });
};

Much simpler. We bind all the arguments we need except what is passed in by the waterfall. We have decomposed most of the steps to single-line expressions. Also worth noting is that we could not simply pass db.init to the waterfall -- we had to bind it to the db object, or else any references to this in the init() call would default to the global scope. (On the other hand, if the DB class bound all its prototype methods to itself in its constructor, we would not have to do this.)

The next problem is uploadData. It relies on an explicit argument (the endpoint) as well as one passed in by the waterfall. We cannot use bind() alone, because it can only bind arguments from the left, whereas the explicit argument sits in the middle of the function signature. We could redefine uploadData so that the destination is the first argument, but that would be too easy, and we might not have control over uploadData.
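To make the problem concrete, here is the signature uploadData presumably has (inferred from how we have been calling it), and why a plain bind() falls short:

// assumed signature, based on uploadData(processed, endpoint, cb) above
function uploadData(data, endpoint, callback) { /* ... */ }

// bind() fixes arguments from the left, so this would bind the *data*
// argument to the endpoint string -- not what we want:
var wrong = uploadData.bind(this, "http://example.com/endpoint");

Let's write another higher-order function to handle this: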

// partially apply a function from the right, but still
// allow a callback
function rightAsyncPartial(fn, thisArg/*, ..boundArgs*/) {
  // convert args to an array
  var boundArgs = Array.prototype.slice.call(arguments, 2);
  return function (/*...args, callback*/) {
    var args = Array.prototype.slice.call(arguments, 0),
      callback = args.pop();

    // call fn with the args in the right order:
    // incoming args, then the bound args, then the callback
    fn.apply(thisArg, args.concat(boundArgs, [callback]));
  };
}

A complicated function, due to handling variable numbers of arguments, but it basically re-orders the arguments to make things work. Study it until it makes sense.

We can now simplify our waterfall even more:

function (configFile, callback) {
  var db = new DB();
  async.waterfall([
    getConfig.bind(this, configFile),
    db.init.bind(db),
    db.read.bind(db, "key1234"),
    asyncify(processRecord),
    rightAsyncPartial(uploadData, this, "http://example.com/endpoint")
  ], function done(err) {
    if (err) {return callback(err); }
    console.log("done");
    callback(null);
  });
};

uploadData is now called with the outer this, the processed record from the waterfall, the bound endpoint, and the callback from the waterfall.
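In other words, that single line expands to roughly the same upload step we wrote by hand earlier:

// rightAsyncPartial(uploadData, this, "http://example.com/endpoint")
// produces roughly this step:
var uploadStep = function (processed, cb) {
  uploadData(processed, "http://example.com/endpoint", cb);
};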

One more step and our sequence is free of function declarations:

function (configFile, callback) {
  var db = new DB();
  async.waterfall([
    getConfig.bind(this, configFile),
    db.init.bind(db),
    db.read.bind(db, "key1234"),
    asyncify(processRecord),
    rightAsyncPartial(uploadData, this, "http://example.com/endpoint"),
    asyncify(console.log.bind(console, "done"))
  ], callback);
};

This is the same length as the first naïve implementation, and it even handles errors to boot. We do not have to declare any functions in the waterfall, nor modify any functions used. We did have to define a few helpers, but these helpers would be very reusable.

Refactoring

Even though this is a contrived example, you can see that there is an obvious optimization -- we don't need to initialize the database every time we run this sequence. We can use async.memoize. We could also use async.apply() (basically a simpler bind()) to make things more clear. We also could bind all methods to this in the DB object. The code changes slightly:

var db = new DB();
db.bindAllMethods();
var initDB = async.memoize(db.init);
function (configFile, callback) {
  async.waterfall([
    async.apply(getConfig, configFile),
    initDB,
    async.apply(db.read, "key1234"),
    asyncify(processRecord),
    rightAsyncPartial(uploadData, this, "http://example.com/endpoint"),
    asyncify(console.log.bind(console, "done"))
  ], callback);
};

All very simple. I really like this end result because the code is very sequential -- it's easy to see the steps involved.

Another thing you could do is combine reading from the database and processing the record into a single action, if you find yourself doing that often. You could do it with async.compose():

var readAndProcess = async.compose(
  asyncify(processRecord),
  async.apply(db.read, "key1234")
);

or with another waterfall:

var readAndProcess = async.waterfall.bind(async, [
  async.apply(db.read, "key1234"),
  asyncify(processRecord)
]);

// or

var readAndProcess = function (query) {
  return async.waterfall.bind(async, [
    async.apply(db.read, query),
    asyncify(processRecord)
  ]);
};

// and in the waterfall

    // ...
    initDB,
    readAndProcess("key1234"),
    rightAsyncPartial(uploadData, this, "http://example.com/endpoint"),
    // ...

async.compose is basically an asynchronous version of traditional function composition, just like async.memoize is an async version of _.memoize. There are also async versions of each, map, and reduce. They just treat the callback results as return values, and manage the control flow. Since in Node there is a standard way to define callbacks, you can re-write any traditional higher-order function to handle asynchronous functions this way. This is the true power of callbacks.
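For instance (a small sketch outside our running example; the file names are made up), async.map works like _.map, except each iteration reports its result through a callback:

var async = require("async"),
  fs = require("fs");

// stat three files in parallel; `stats` arrives in the same order as the names
async.map(["a.txt", "b.txt", "c.txt"], fs.stat, function (err, stats) {
  if (err) { return console.error(err); }
  console.log(stats.map(function (stat) { return stat.size; }));
});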

What About Promises?

Promises (a.k.a. Futures or Deferreds) are an alternative way to handle asynchronous actions in javascript. At the core, you wrap an operation in a "thenable" object -- a Promise. When the operation completes, the promise is resolved, and the function passed to promise.then() is executed with the result. promise.then() also accepts an optional error handler. Promises can be chained and composed, and there are many frameworks that allow you to do higher-order operations with promises, similar to async. They are also a way to make async programming look more like synchronous code.
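As a rough sketch (assuming an ES6-style Promise constructor, or a library that provides one; "config.json" is a made-up file name), wrapping our getConfig in a promise looks something like this:

function getConfigPromise(configFile) {
  return new Promise(function (resolve, reject) {
    getConfig(configFile, function (err, config) {
      if (err) { return reject(err); }
      resolve(config);
    });
  });
}

getConfigPromise("config.json")
  .then(function (config) {
    // use config...
  }, function (err) {
    // handle the error
  });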

I don't really have a strong opinion on promises; to me they seem like another solution -- just another style -- of async programming. There was a popular article written a few months ago titled Callbacks are imperative, promises are functional: Node’s biggest missed opportunity. I disagree with the title on two levels. First of all, promises are not functional -- they are Object Oriented. You are wrapping an operation in an object on which you call methods. It reminds me of the Command Pattern, whereas Node's callback style is reminiscent of Continuation-Passing Style. Callbacks only become imperative when you build the callback-hell straw-man. Second, saying that Node's biggest missed opportunity is not using promises in the core is a bit hyperbolic. At its worst it is just a quibble over coding style.

The author also claims that a Promise is Javascript's version of a Monad. Granted, monads are a pretty esoteric concept, and I'm only beginning to understand them myself, but Promises are not monads. Promises are objects that encapsulate operations, nothing more. Update: This is not true. Promises can be thought of as Object-Oriented Async Monads. They satisfy the basic properties of monads: a unit operation, a bind operation, and associativity. These operations end up being methods on the promise object, so you do lose functional purity. See the second half of this talk by Douglas Crockford for an explanation.

For functional async monads, see Deriving a Useful Monad in javascript (a strongly recommended read) for an example of what they would look like. Node-style async functions themselves could be thought of as monads, because they conform to a standardized type signature (the function (err, result) {} as the last arg). You only need to define unit() and bind() functions and they become fully-fledged monads (an exercise left to the reader). However, I will point out that the end result looks a lot like async.waterfall, and async.waterfall is a bit easier to follow.

I think Node made the right decision in deciding to use callbacks rather than promises for all their core methods. They offer the greatest flexibility, don't require an external library or extra type/class to use, and are dead simple. They're just functions. Node's callback style just requires a little more care to become elegant. If you want to use promises, just promisify(). I'm perfectly happy with functional programming techniques.

For more on promises vs callbacks, read this rebuttal to the "Promises are Functional" article. This discussion also talks about the pros and cons of each approach.

code javascript functional programming async callbacks

Functional Javascript

As I code more and more, I'm coming to find that traditional Object-Oriented techniques aren't well suited to Javascript. OO techniques really shine when you have a compiler to tell you that Foo does not have a method named bar(), and that bar() expects an object that implements interface Qux with methods baz() and bogo(), etc... In JS it is impossible to know what the properties of an object will be until runtime, or what its type will be. (This really frustrates a lot of my coworkers who like strongly-typed languages.) Tools and IDEs can make some fairly good assumptions, but they always seem to fall short -- either your code has to be written in a restrictive style so every property is detected, or you have to accept that certain dynamic properties will not be picked up.

This is not to say that static analysis of JS is useless, in fact I am a big fan of JSHint, especially the unused and undef options. Here is an example:

var
  Foo = require("./Foo"),
  Bar = require("./Bar"), // Bar is not used, so JSHint will complain
  Baz;

module.exports = Baz = function () {
  // constructor...
};

_.extend(Baz.prototype, Foo.prototype);

Baz.prototype.qux = function (something) { // something is unused
  this.foo = somehting.getFoo(); //"somehting" is undefined because I typo'd it

  return this.bar + this.foo; // no way to know if these properties exist until runtime
};

JSHint helps out with explicit variables and arguments, but falls short with properties of objects. (I also use this example to show just how clunky creating classes is in JS. ES6 will fix this, but for the time being, just replicating class Baz extends Foo is not obvious and there are a million ways to do it wrong.) Some JS IDEs are really clever in detecting methods and properties, but I don't see how any IDE could efficiently handle something like this:

_.each(JSON.parse(fs.readFileSync(configFile)), function (prop, key) {
  this["get" + key] = function () {
    return prop + this.foo;
  };
}.bind(this));

It literally would have to execute the module to know what those dynamic properties would be.

Situations like these have made me realize that it is better to write JS in more of a functional style, rather than try to shoe-horn traditional OO into javascript.

That being said, I'm not saying you should write JS like it is Lisp, and eschew objects altogether. There is a really cool thought experiment in JS: List out of Lambda. It is a good introduction to creating constructs from pure functional building blocks. However, the obvious thing to point out is that if you actually were to use those pure-functional lists, they would be terribly slow. Functional languages like Lisp, Haskell, or Clojure rely on the compiler to do optimizations that make things as fast as imperative languages like C or Java. Interpreted Javascript cannot make these optimizations (yet, at least).

Here are my recommendations, my list of currently unsubstantiated claims. Each one of these bullet points could be an article on its own:

  • Use built-in Objects and Arrays. Rather than creating lists out of lambda, using the built in "collection" types is a logical place to draw the functional line. A good tradeoff between speed and functional purity.
  • Use higher-order functions. Rather than explicit iteration, get used to each, map, filter, reduce, compose, and all the methods of Underscore. Also, write your own functions that mutate other functions when you find yourself writing the same code over and over.
  • Avoid using this. this is really an implicit argument passed to every function. It's hard to know what its properties are until runtime. However, if every variable or argument is explicit -- not contained within an object -- JSHint can detect problems statically.
  • Avoid state. Related to the previous point, if you're using this, you are probably creating state. If you do need state, pass it in as an explicit argument, or encapsulate it in as small of an object as possible.
  • Write pure functions. In a perfect world, calling a function should have no side effects. If a function does not modify its arguments or any external state, and returns a completely new result, it is considered "pure". Think const in C. Pure functions are also very easy to test.
  • Create functions that operate on data structures, rather than objects that encapsulate data structures. Think getNameFromConfig(jsonConfig) rather than var config = new Config(json); config.getName(). In this example, getNameFromConfig is a pure function (sketched just after this list).
  • Master call, apply, bind, partial application, and currying. These are all powerful functional programming techniques that allow you to use higher-order functions more effectively.
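Here is a quick sketch of that getNameFromConfig comparison (the config shape is made up for illustration):

var json = { name: "my-app", version: "1.0.0" };

// functional style: a pure function over a plain data structure
function getNameFromConfig(jsonConfig) {
  return jsonConfig.name;
}

getNameFromConfig(json); // "my-app"

// versus the object-oriented style:
// var config = new Config(json);
// config.getName();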

I will reiterate that these are more of guidelines than actual rules. For example, avoiding state completely is impossible: using JS to add a DOM Element is changing the state of the browser. However, the core of your system could strive to be as stateless as possible -- you could rely purely on events rather than reading state variables and flags.

There are three follow-up articles I will write in the future to expand on this topic:

  • Show how functional javascript solved a common problem: Fixing callback hell in Node.
  • Refactoring an object-oriented, imperative module into a more functional-styled module
  • Creating a functional Vanilla-JS example for TodoMVC

code javascript functional programming

Dependency Injection

I had heard the term "dependency injection" thrown around many times before, but hadn't really taken the time to research it. However, I recently had an epiphany and I realized what it is and why it is an important (and simple) idea.

Say you had a class or module; I will use javascript as an example:

var FooBar = function () {
    this.foo = new Foo();
    this.bar = new Bar();
    //...   
};

//...

FooBar is a class with two member variables, each an instance of another class. Both are instantiated and assigned in the constructor. However, we could rewrite this as:

var FooBar = function (foo, bar) {
    this.foo = foo;
    this.bar = bar;
    //...
};
//...

...and elsewhere, where FooBar is actually used, you have:

SomeFactory.createFooBar = function () {
    return new FooBar(new Foo(), new Bar());    
};

That's it. The dependencies Foo and Bar are simply passed in ("injected") to the FooBar constructor. The class FooBar does not need to explicitly state its dependencies; it just relies on what is passed to it during instantiation. If these were modules, FooBar would not have an explicit dependency on Foo and Bar -- it would be implicit.

Why is this important?

First of all, it enables easy testing. Say Bar was a module that interfaced with a remote service, and was slow. However, you want to be able to test the basics quickly in a unit test. You could then simply do:

TestFactory.createTestFooBar = function () {
    return new FooBar(new Foo(), new MockBar());    
};

MockBar would just be a module that implemented all the same methods as Bar. In a strongly typed world, Foo and Bar would be defined as interfaces, and you would simply pass in a concrete implementation depending on whether you wanted the real, mock, or otherwise alternate functionality. In a scripting language or dynamically typed language (or Go), you can just rely on duck-typing, and pass in any object that has all the requisite methods.
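For instance, if Bar exposed a single slow method, a hypothetical MockBar only needs to implement a method with the same name and signature (the method name here is made up):

// the real Bar talks to a slow remote service
var Bar = function () { /* ... */ };
Bar.prototype.fetch = function (id, callback) {
  // ...slow network call, eventually callback(err, result)
};

// MockBar duck-types the same interface and answers immediately
var MockBar = function () {};
MockBar.prototype.fetch = function (id, callback) {
  callback(null, { id: id, mock: true });
};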

It simplifies your app's dependency graph. Instead of FooBar depending on Foo and Bar, whatever depends on FooBar also depends on Foo and Bar, eliminating a level in the tree. The order in which FooBar, Foo, and Bar are included is also irrelevant.

This is also a way to simplify a module's explicit dependencies. This can be useful in the NodeJS/NPM world. A lot of NPM modules rely on the same modules, but slightly different versions. One module will depend on underscore@1.22, another on underscore@1.24, etc., and each will end up with a separate copy in its node_modules folder. If your app includes both of these modules, you will have some redundancy and duplication. (A concern if you are building for the browser!) However, if a module expected to have an underscore module passed in during initialization, this would remove the explicit dependency, and your app could rely on a single library instead of several.
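A sketch of what that could look like (the module and its contents are hypothetical): the module exports a factory that receives underscore, rather than require()-ing its own copy:

// greet.js -- no underscore dependency in its own package.json
module.exports = function (_) {
  return {
    greetAll: function (names) {
      return _.map(names, function (name) { return "Hello, " + name; });
    }
  };
};

// app.js -- the app owns the single copy of underscore
var _ = require("underscore"),
  greet = require("./greet")(_);

greet.greetAll(["Ada", "Grace"]); // ["Hello, Ada", "Hello, Grace"]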

Grunt plugins do this well -- every plugin expects to have a grunt object passed in to its main function. This avoids the problem of your 5 plugins each including a slightly different Grunt version, each of which would have to parse your Gruntfile and talk to the other Grunt versions. Really inelegant, and kind of a nightmare.
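The usual plugin shape looks roughly like this (the task itself is a made-up example):

// tasks/example.js -- the plugin never require()s grunt itself
module.exports = function (grunt) {
  grunt.registerMultiTask("example", "A made-up task", function () {
    grunt.log.writeln("using whatever grunt instance the project provided");
  });
};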

This did cause some issues when there were some breaking changes between Grunt 0.3.x and Grunt 0.4.x -- plugins had to be updated to support the API changes. However, it was simply up to the implementor to specify the versions of grunt plugins that were compatible with the relevant grunt version.

AMD can also be thought of as dependency injection at the module level. Your module factory function expects to have its dependencies passed into it. RequireJS even supports alternate configs where you can override modules -- in other words, swap out ModuleFoo for ModuleMockFoo behind the scenes for testing.
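In AMD terms, that looks something like this (module names and methods are placeholders):

define(["moduleFoo", "moduleBar"], function (foo, bar) {
  // the factory receives its dependencies as arguments,
  // so a test config can swap moduleFoo for moduleMockFoo
  return {
    doSomething: function () {
      return foo.something() + bar.somethingElse();
    }
  };
});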

All in all, it is a simple way to write more decoupled and modular code.

code javascript modules dependency injection