Back in March of 2016 there was a rather large hiccup in the npm ecosystem. Basically, an author unpublished a prominent package that many projects happened to depend on. Cue mass panic as builds around the world started to fail. This post is not about the situation leading up to the removal, or the reasoning behind it, or even how npm could solve the issue in the future. Instead of dwelling on how the situation came to be, I’d rather focus on some practical solutions for how we can prevent an event like this from affecting us in production with the tools and methods available to us today.
The root of the problem comes from having to rely on a very large dependency graph that can go dozens of levels deep and potentially receive updates without our explicit buy-in. How do we protect ourselves from a transitive dependency breaking eight levels down the dependency graph? What if it’s worse than merely a breaking change? What if a dependency was unpublished or, even more horrifying, pushed up a “patch” with malicious code? It could leave an entire local project in ruins.
The following are a few ideas to handle this kind of situation with npm specifically. Feel free to pick and choose from these suggestions, as not all the solutions presented here are right for every project. As always, it is important to do your own evaluation and vetting.
This is a native npm config setting that effectively adds a pseudo-offline mode to npm. There are definite benefits to this, chief among them being that once you have a version of a package on your machine, you have it for good. If a package author unpublishes or changes the specific version you have on your machine, you aren’t affected locally. You have to specifically run `npm cache clean` or manually remove the cached dependency if you want to grab the same version from the npm registry again. (Note: this can be helpful for those long plane rides without internet access.)
This is an easy, quick command that can ease some burden from the development process. However, it won’t help much when building in environments like a freshly generated Docker container or a continuous integration tool, as they generally won’t have an npm cache to begin with. (There are ways around that, though they are a little involved and out of scope for this blog post.)
If you take anything away from this article, please let it be this. Use npm shrinkwrap for production.
Shrinkwrap locks in your dependencies all the way down the dependency graph. This is useful for making sure some package down the line doesn’t mess up semver conventions and submit a breaking change as a feature or patch ‘fix’. It can also add some protection from the ‘malicious’ code situation, as you’ve locked down a specific version that is presumably not malicious. By itself, though, npm shrinkwrap doesn’t protect you if a package gets unpublished. This is because running `npm install` with a shrinkwrap file still hits the npm registry; it just makes sure that each dependency down the graph resolves to an exact version rather than a range.
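The workflow is short: install as usual, then generate the lockfile and commit it. A sketch (the commit message is, of course, just an example):

```shell
# Install dependencies as usual, then lock the full tree.
npm install
npm shrinkwrap

# npm-shrinkwrap.json now pins every transitive dependency to an
# exact version; commit it alongside package.json.
git add npm-shrinkwrap.json
git commit -m "Lock dependency tree with npm shrinkwrap"
```

Anyone who clones the project and runs `npm install` will now get the exact versions recorded in `npm-shrinkwrap.json` instead of whatever the semver ranges in `package.json` would currently resolve to.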
If you want some added security for your `npm install`s, add npm-seal to the shrinkwrap process and you’ll get some added protection. Npm-seal adds checksum verification to ensure you are always getting the exact files from the npm registry that you think you are.
Also worth mentioning is Mozilla’s npm-lockdown. It is a shrinkwrap alternative that is currently a bit more robust than native npm-shrinkwrap. It comes with checksum verification and addresses an optionalDependencies issue that shrinkwrap currently has problems with. The end goal of npm-lockdown, however, is to help move npm-shrinkwrap forward and get these features supported natively in npm.
The idea behind shrinkpack is similar to committing the node_modules folder to a project’s GitHub repository, an old practice that was actually endorsed by npm in the past, though it is no longer recommended. Committing the node_modules directory was beneficial in that it checked in a single ‘immutable’ source of dependencies for a particular project, and running a project didn’t require hitting the npm registry. Heck, it technically didn’t even require you to type `npm install`. But this technique brought some not-so-desirable side effects with it. The most commonly cited are the exploding line counts in pull requests and impossible-to-review diffs, which hinder high-quality development. Let’s not even get into packages that build operating-system-specific binaries during install.
Shrinkpack solves these issues by grabbing the tarball for each dependency in the graph and putting them in a folder called `node_shrinkwrap`, which you commit. Committing the `.tgz` assets and rewriting the `npm-shrinkwrap.json` file to point to them gives many benefits: it cleans up the git diffs, avoids the lines-of-code explosion, and makes everything local-by-default, thus guaranteeing the exact dependency code, every time. It even correctly installs binaries when switching from one operating system to another. This method prevents any call out to the npm registry, which means we can now build a Docker container and be extremely confident that the dependencies are correct, and that the build will succeed even if the npm registry is down.
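A sketch of the shrinkpack workflow, assuming it is installed globally: shrinkwrap first, then let shrinkpack rewrite the lockfile to point at local tarballs.

```shell
# One-time global install of the shrinkpack tool.
npm install -g shrinkpack

# Generate (or refresh) the lockfile, then repoint it at local tarballs.
npm shrinkwrap
shrinkpack

# Commit both the rewritten lockfile and the tarball directory.
git add npm-shrinkwrap.json node_shrinkwrap
git commit -m "Store dependency tarballs in the repo"
```

From here on, `npm install` resolves every dependency from the committed `node_shrinkwrap` folder rather than the registry. Rerun `npm shrinkwrap && shrinkpack` whenever dependencies change.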
This technique is perhaps geared a bit more toward library authors than application developers. The idea here is to use a tool such as webpack, browserify, or rollup to bundle all your code and your dependencies together when you publish your package. This has quite a few added benefits, chief among them being that it freezes your dependencies at specific versions and speeds up installs of your package. If you are a library maintainer and this method sounds intriguing to you, I highly recommend reading this article by Rich Harris, as he goes more in depth into the pros and cons of this technique.
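As a rough sketch of what publish-time bundling could look like with browserify (the entry point, library name, and output path here are hypothetical, not from the article):

```shell
# Bundle index.js and all of its dependencies into a single file.
# --standalone wraps the output in a UMD shim exposed as "mylib".
browserify index.js --standalone mylib -o dist/mylib.js
```

The published package then ships `dist/mylib.js` with zero runtime dependencies, so consumers install nothing from your dependency graph.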
Another option is to set up our own private npm registry. This definitely gives us more control over packages. It even allows us to publish our own versions of a package to the private registry, thus bringing trust in the package completely in-house. However, when running `npm install` we would hit our own registry, which means we still need to reach a server; there is still potential for registry downtime in this model. But when combined with some of the other methods above, this can really add some security and features to package publishing and installing. There are a handful of private npm server solutions out there with different features. Here are a few to get you started:
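Assuming a private registry is already running locally (sinopia, for example, defaults to port 4873; the URL below is an assumption), pointing npm at it is a one-line config change:

```shell
# Route all installs and publishes through the private registry.
npm config set registry http://localhost:4873/

# Confirm which registry npm will use.
npm config get registry
```

Most private registries proxy and cache the public registry, so packages you have installed once remain available even if npmjs.org is down or a package is unpublished upstream.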
Okay, so now we have our dependencies cached and locked down, and we know exactly which files we are getting every single time. We have explicitly set and checked specific versions of our transitive dependencies. If the npm registry goes down or a package gets unpublished, we are protected. Awesome.
But wait… what do we do if we want to update our dependencies? At some point or another, we will have to grab a new package from the official npm registry. We have to update eventually. What tools do we have available to test and trust package updates?
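Two tools worth knowing here (suggested by me, not named in the article) are the built-in `npm outdated` and the third-party `npm-check-updates`, which let you review available upgrades before committing to them:

```shell
# List dependencies whose registry versions are newer than the locked ones.
npm outdated

# npm-check-updates (ncu) goes a step further and can rewrite package.json.
npm install -g npm-check-updates
ncu        # preview available upgrades
ncu -u     # apply them to package.json
```

After applying upgrades, reinstall, run your tests, and regenerate your shrinkwrap (and shrinkpack, if you use it) so the new versions get locked down the same way as before.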
We now have a list of methods available for protecting ourselves from some of the issues a simple `npm install` invites. Armed with these tools, we can sidestep the npm registry, cache our dependencies, lock down our versions, and ensure dependency updates are safe.