Vendoring gems with style

A couple of months ago Ryan McGeary wrote an article about why @defunkt’s idea of “Vendor Everything” still applies. I’ve been playing with this for a while, and I came up with these best practices. First of all, we need a Ruby platform (yes, we need rvm):

$ echo rvm 1.9.2 > .rvmrc
$ rvm rvmrc load

We’re going to store project-related binaries in the bin folder, so why not add it to the search path? This line is a good candidate for .bashrc or .zshenv:

$ export PATH="./bin:$PATH"
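To see what a relative ./bin entry actually does to command lookup, here is a minimal sketch. The sandbox directory and the stand-in `ls` script are hypothetical and exist only for illustration; nothing outside the temporary directory is touched.

```shell
#!/bin/sh
# Sketch: show how a relative ./bin entry on PATH changes what `ls` resolves to.
# The sandbox and the fake `ls` below are hypothetical, for illustration only.
sandbox=$(mktemp -d)
mkdir "$sandbox/bin"

# A harmless stand-in for a binary somebody else dropped into bin/.
printf '#!/bin/sh\necho "not the real ls"\n' > "$sandbox/bin/ls"
chmod +x "$sandbox/bin/ls"

# Lookup with the normal PATH.
before=$(command -v ls)

# Lookup from inside the sandbox with ./bin prepended (in a subshell,
# so the PATH change does not leak into the rest of the script).
resolved=$(cd "$sandbox" && PATH="./bin:$PATH" && command -v ls)

echo "normal lookup:     $before"
echo "with ./bin first:  $resolved"

rm -rf "$sandbox"
```

The second lookup lands on the file inside the current directory's bin/, which is exactly what we want in a project directory and exactly what we do not want anywhere else.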

UPDATE: I have to admit this is the weakest part of the solution or, in other words, the least elegant way of doing things. However, we are using bundler anyway, so why not let it do the work? Open a development console with this command instead:

$ bundle exec $SHELL

This command drops you inside the bundle, with its gems, PATH settings, and other such funny things. It is to the point, and you no longer have to worry about malicious binaries in world-writable directories (funny things like /tmp/bin/ls, for example).

I have a couple of projects running in parallel, so why should we store all these gems multiple times?

$ mkdir -p ~/.bundle
$ ln -s ~/.bundle vendor/ruby
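The effect of the symlink can be sketched without touching the real home directory. Everything below lives in a throwaway sandbox, and the names `pool`, `app_a`, `app_b` and the `example-1.0.0` directory are hypothetical. Note that vendor/ has to exist before the link is created.

```shell
#!/bin/sh
# Sketch: two projects whose vendor/ruby both point at one shared directory.
# "pool", "app_a", "app_b" and "example-1.0.0" are hypothetical names.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/pool" "$sandbox/app_a/vendor" "$sandbox/app_b/vendor"

# Same idea as `ln -s ~/.bundle vendor/ruby`, once per project.
ln -s "$sandbox/pool" "$sandbox/app_a/vendor/ruby"
ln -s "$sandbox/pool" "$sandbox/app_b/vendor/ruby"

# Something placed there through app_a ...
mkdir -p "$sandbox/app_a/vendor/ruby/gems/example-1.0.0"

# ... is visible through app_b, because both paths are the same directory.
if [ -d "$sandbox/app_b/vendor/ruby/gems/example-1.0.0" ]; then
  shared=yes
else
  shared=no
fi
echo "visible from app_b: $shared"

rm -rf "$sandbox"
```

Since rubygems keeps each gem in its own name-version directory, different versions of the same gem can sit side by side in such a pool.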

Now, we can bundle our stuff:

$ bundle --path vendor --binstubs   # original version
$ bundle --path vendor              # after the update

(UPDATE: --binstubs is not required, since we use bundle exec $SHELL.) You have to keep bundled environments out of version control:

$ (echo bin/; echo vendor/ruby/) >> .gitignore   # original version
$ echo vendor/ruby/ >> .gitignore                # after the update
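If the project already has a .gitignore, the entry should be appended rather than written over it, and ideally only once. A small helper along these lines would do; the function name `ignore_once` is hypothetical, and the sandbox only stands in for a project directory.

```shell
#!/bin/sh
# Sketch: add an ignore entry only if it is not already present,
# appending rather than overwriting. `ignore_once` is a hypothetical helper.
ignore_once() {
  # $1 = pattern, $2 = ignore file (defaults to .gitignore)
  file=${2:-.gitignore}
  touch "$file"
  # -x: match whole lines, -F: treat the pattern as a fixed string
  grep -qxF -- "$1" "$file" || printf '%s\n' "$1" >> "$file"
}

sandbox=$(mktemp -d)
printf 'log/\n' > "$sandbox/.gitignore"   # pretend the project already ignores log/

ignore_once 'vendor/ruby/' "$sandbox/.gitignore"
ignore_once 'vendor/ruby/' "$sandbox/.gitignore"   # second call changes nothing

lines=$(wc -l < "$sandbox/.gitignore" | tr -d ' ')
cat "$sandbox/.gitignore"

rm -rf "$sandbox"
```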

That’s it (UPDATE: putting bin/ into .gitignore is unnecessary now). I found only one issue with this setup: I’ve been using guard for a while, and while nearly everything works well, I was not able to get guard-bundler working until a recent update. The problem was this: guard (just like all the commands living in ./bin) runs in bundler’s environment; however, in order to run the bundle command, you actually have to have it installed, and bundler is not installed inside the bundled environment. As it turns out, the answer is easy but not intuitive: let’s install bundler into the bundled environment:

$ bundle exec gem install bundler   # original version
$ gem install bundler               # after the update, inside the development console

(UPDATE: bundle exec is not needed in the development console.)

Now we have the executable inside, which guard-bundler can run in a Bundler.with_clean_env block (which resets the environment to what it was outside the bundle), and it can install gems, flowers, unicorns, double rainbows.

With these settings we get pretty much the same experience as a non-rvm environment, or a dedicated gemset for every project. Sometimes, when .bundle becomes too big, I just dump it and start over. It doesn’t take too long anyway.

To wrap up, we achieved the following:

  1. We use rvm
  2. However, we got rid of its bundler helpers
  3. We don’t rely on the bundler pool using executables / scripts
  4. … but we have a pool, and we don’t have multiple copies of the same gems (SSDs are still not cheap)
  5. Most shortcomings of vendoring gems are hidden

Enjoy.

  • Anonymous

    I’m not sure I follow your reasoning.  The article you linked to says rvm gemsets are unnecessary, but doesn’t say why.  They then say to vendor everything inside your application saying “disk space is cheap”.  I’m not sure why they don’t just go ahead and use an rvm gemset for that.

    In your example, you vendor everything in your application, but then symlink the directory to a shared directory.  This seems like the worst of all worlds.  If I do a regular `bundle install` bundler will look for the correct version in system gems and use that, or otherwise install it alongside whatever version I have.  I haven’t looked through how bundler does vendoring, but in your situation if you have two apps with different dependencies both symlinked to the same directory I think the apps will end up overwriting each other’s gems.

    If you don’t have the space to vendor things in your app, why not just use regular `bundle install` + binstubs and let bundler meet dependencies?

    Lastly, it’s fairly minor, but be aware that adding “./bin” to your path has security implications.  If I’m a lower privileged user I could add my malicious binary to /tmp/bin/ls and then when you cd into tmp and `ls` you’ll end up running my command.  The bigger concern is just general confusion with your PATH randomly picking up things in bin directories.

  • Anonymous

    RVM gemsets are not unnecessary, but using them may cause configuration issues. You can have three fellow developers. Maybe all of them use rvm, but maybe not. Maybe they like your naming conventions, maybe not. You can’t please them equally.
    You can have multiple deployments on the same server. You can’t use too specific an .rvmrc, because it would mess up your deployments, especially if you serve production too. This is why people tend to leave .rvmrc out of version control. Our point in these blog posts is to encourage everyone to add .rvmrc to source control, even if it means tradeoffs.

    By default, bundler installs gems to $GEM_HOME, which means your bundled items can be managed by gem command too.

    It might be convenient, but after a while, your gem list will be overcrowded with old versions, gathering dust. Therefore, you start cleaning it up. You can do it on your own machine, but you definitely should not do such a thing on a production server.

    Of course, you can separate bundled environments with gemsets, but if you start using them in multiple places (more developers, more deployment targets), this starts to become a burden. I don’t even want to mention gems which have .rvmrc in them.

    This means we have two issues to fight against: the resistance to putting .rvmrc into source control, and being too specific.

    Anyway, bundler does a good job of vendoring: it puts everything in $GEM_HOME, therefore it maintains the same directory structure as rubygems does. It also adds a `bundler` dir to hold gems referenced with git URLs.

    I understand the weaknesses of putting ./bin into PATH, but from a systems administrator’s perspective, putting executables into a world-writable area is dangerous, and therefore executing them should be restricted at the OS level.

    I tried the same with adding PATH settings into .rvmrc, but it’s not a robust solution.

  • Anonymous

    I’ve been thinking about the weakness of adding ./bin to PATH for a while now, and I think I found a solution: start a dev console (bundle exec $SHELL). It sets up the whole environment for your needs (e.g. $GEM_* variables, $PATH, $BUNDLER_* and such), without using $PWD/bin, which matters especially if it’s in a world-writable area.