The objective

We need to be able to create VMs that can run Sybil, typically to bring them into a customer environment and install them there. Packaging Sybil as a VM allows us to be reasonably independent of the customer's technical infrastructure.

The current situation

We create VMs using VirtualBox and provision them with a simple Bash script. While sort-of OK, this approach misses a lot of points:

  • It is not fully automated, so it is easy to miss some steps (for example, defining interfaces or bridges are manual operations on the VM itself)
  • It is not really extensible: sure, I can work on the resulting VM, but it will stay a “one time” job
  • It is not easily customizable if we need specific packages at a certain customer

The use case

Without going into boring details, our current scripts do many things, among them:

  • install various Linux packages
  • create a Linux user with a given group and password
  • install rvm
  • install a specific Ruby version
  • install Apache
  • install and enable Passenger inside Apache
  • create a virtualhost for Sybil
  • install and configure PostgreSQL (creating at least a user and DB for Sybil)
  • install Rails and various other gems inside an rvm gemset

In other words, it sets up the machine to be ready to receive a version of Sybil.

The goal

My objective would be something like:

$ <1. get VM building script>
$ <2. create VM with all needed package and config>
$ <3. deploy Sybil>

Now, those are not exactly commands, but two of them can be done easily with tools we already use, Git and Capistrano:

  • <1> is actually a git co of sybil_ops, a (for now imaginary) Git repository where I would store all my config
  • <3> is simply cap deploy (or something similar, as we use cap multistage)

So I only need my <2> command. Starting from a “config”, it should output a fully functional VM, ready to receive the cap deploy command.

On the road

My second step is actually two steps in one:

  • creating a VM
  • setting it up with all the needed packages, etc.

Vagrant or how to script a VM


Vagrant is very simple to describe and probably very complicated to implement. It is a command-line tool that manipulates VMs:

  • create them, given some parameters like the OS to use
  • destroy them
  • ssh to them

Think of it as a command line for the VirtualBox GUI: each operation you can do with VirtualBox, you can do with Vagrant. The beauty of Vagrant is that it can recreate a VM from a single configuration file named “Vagrantfile”. Here we go:

$ vagrant init

This creates a basic configuration file in the current directory. Opening it, there are some useful settings that I need:

Vagrant::Config.run do |config|
  # set VM name and memory inside VirtualBox
  config.vm.customize ["modifyvm", :id, "--name", "precise", "--memory", "512"]

  # set base box to use
  config.vm.box = "precise64"

  # VM host name
  config.vm.host_name = "precise"
end

The file is quite simple and almost self-explanatory. The “base box” needs some explanation: Vagrant uses a specific format for its boxes, allowing you to reuse “base” boxes for multiple purposes. You do not have to create those yourself (even if it's not that complicated): you can just fetch a ready-made box for the OS and version you need.

To download one, just use vagrant box add <name> <url>. You can then refer to your box using simply <name>.

Another option is to specify not the box, but the box URL. This has the additional advantage of removing the separate box add step.
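
Put together, the settings discussed so far plus a box URL could look like the sketch below. It assumes the Vagrant 1.0-style configuration API used in this post. The box URL is an assumption (point it to wherever your base box actually lives), and configure_sybil_vm is a hypothetical helper name, not something Vagrant provides.

```ruby
# Vagrantfile (sketch, Vagrant 1.0-style API as used in this post).

# Hypothetical helper, not part of Vagrant: it groups the VM settings in
# one place so they are easy to read and to reuse.
def configure_sybil_vm(config)
  # Base box to build the VM from.
  config.vm.box = "precise64"

  # Assumed URL: where Vagrant fetches the box if it has not been added
  # yet, which removes the need for a manual "vagrant box add".
  config.vm.box_url = "http://files.vagrantup.com/precise64.box"

  # Host name of the guest.
  config.vm.host_name = "precise"

  # Name and memory (in MB) of the VM inside VirtualBox.
  config.vm.customize ["modifyvm", :id, "--name", "precise", "--memory", "512"]
end

# Vagrant defines the Vagrant constant before loading this file; the guard
# only lets plain Ruby load the file too (for example, for a syntax check).
Vagrant::Config.run { |config| configure_sybil_vm(config) } if defined?(Vagrant)
```

Keeping the settings in a plain Ruby method is only a convenience here: the same four lines can sit directly inside the Vagrant::Config.run block.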

Now it is time to test it:

$ vagrant up

This will create and boot the VM. Once done (less than two minutes), you can ssh into it. So far so good: I can commit my modified Vagrantfile, but I still need to run my shell script, or find something better.

Puppet or how to provision a VM

About Puppet

A first note: Puppet can do much, much more than what I will show here, and my use of it is somewhat contrived. Puppet can manage multiple machines (“nodes”), pushing operations to each of them when needed. For this, Puppet uses a “client-server” architecture, with agents on the various machines talking to a central... Puppetmaster.

My use case is much simpler: I just want Puppet to provision the VM I’ve created with Vagrant, to get it in a state where I can actually deploy Sybil on it.

Puppet and Vagrant

Vagrant provides integration with provisioning tools (mainly Chef and Puppet). In this mode, Vagrant does not use an agent. Instead, Puppet (and Chef) come installed by default on its boxes, so the Puppet scripts are uploaded to the VM and executed locally in a temp dir (thanks Fabian for pointing this out). This distinction may appear as a detail for now, but it will be important a bit later when we talk about modules.

Using Puppet



Puppet can basically do anything you would do using Bash, from creating users to installing packages, using configuration files written in a sort of DSL, with a .pp extension. One of the more representative examples is probably the setup of the Apache service:

package { "apache2":
  ensure => present,
}

While Puppet can execute any shell command (using the “exec” resource), it has a lot of higher-level resources for typical operations. This package “resource” tells Puppet to install the apache2 package, and to continue only if it is present. While a bit more verbose than “sudo apt-get install apache2”, the package resource is platform independent: it works with yum as well as apt (only the package name may differ between distributions).

Another resource, service, lets us run Apache as a service:

service { "apache2":
  ensure    => true,
  enable    => true,
  subscribe => File["/etc/apache2/apache2.conf"],
}

The subscribe parameter tells Puppet to restart the service after applying any change to this specific file. Again, the service resource is distribution agnostic.

Puppet and Vagrant

Once satisfied with our Puppet script, we can have Vagrant run it each time we create or start the VM:

Vagrant::Config.run do |config|
  config.vm.provision :puppet
end

This simple line will tell Vagrant to look for a Puppet file in manifests/default.pp, from the directory where the Vagrantfile is (this is only the default value, any other can be passed as parameter to the config.vm.provision command).
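
To make those defaults explicit, or to change them, the provisioner can be given a block. This is a minimal sketch assuming the Vagrant 1.0-style API used in this post; configure_puppet is a hypothetical helper name, and the two values shown are simply the defaults described above.

```ruby
# Vagrantfile fragment (sketch, Vagrant 1.0-style API).

# Hypothetical helper: spells out where the Puppet manifest lives instead
# of relying on the defaults.
def configure_puppet(puppet)
  puppet.manifests_path = "manifests"   # directory, relative to the Vagrantfile
  puppet.manifest_file  = "default.pp"  # manifest to apply inside that directory
end

# Under Vagrant, hand the helper to the Puppet provisioner block. The guard
# only lets plain Ruby load this file as well.
if defined?(Vagrant)
  Vagrant::Config.run do |config|
    config.vm.provision(:puppet) { |puppet| configure_puppet(puppet) }
  end
end
```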

$ vagrant up

This gives us a VM with the apache2 service running. To confirm this (without needing to ssh into the VM), we can tell Vagrant to forward some ports:

Vagrant::Config.run do |config|
  config.vm.forward_port 80, 4567
end

This will forward port 4567 on the host to port 80 (used by Apache) on the VM. A quick look at localhost:4567 in a browser should show the default Apache page.
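
The same check can be scripted instead of done in a browser. The following is a minimal sketch in plain Ruby using the standard net/http library; http_ok? is a hypothetical helper name, and 4567 is the forwarded host port from the example above.

```ruby
require "net/http"

# Hypothetical helper: true when an HTTP server answers host:port with a
# 2xx status, false when nothing usable is listening there.
def http_ok?(host, port, path = "/")
  response = Net::HTTP.start(host, port, open_timeout: 2, read_timeout: 2) do |http|
    http.get(path)
  end
  response.is_a?(Net::HTTPSuccess)
rescue SystemCallError, SocketError, Timeout::Error, EOFError
  # Connection refused, unknown host, timeout, or connection dropped.
  false
end

if http_ok?("localhost", 4567)
  puts "Apache is answering on the forwarded port"
else
  puts "Nothing is answering on port 4567 (is the VM up?)"
end
```

Run on the host after vagrant up, it gives a quick yes/no answer without opening a browser or an ssh session.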

We can now go forward by adding more and more resources to the Puppet script, for example to set up our database (PostgreSQL for me). While it would be possible to do this “manually” (I could write a module that installs the package, fixes some config and starts the PostgreSQL service), I’m not the first to try setting up PostgreSQL using Puppet, which leads us to an interesting Puppet feature: modules.

Using Puppet modules with Vagrant

Puppet modules are self-contained “recipes” that can be reused. You use them in two main cases: to organize your own scripts (I do not want to have all my config in a very long default.pp file) and to use existing modules. PuppetLabs, the company behind Puppet, maintains a good list of modules for the typical cases, and many others are available on GitHub.

Including a module is as simple as:

include postgresql::server

If the module is installed on your machine (using the Puppet module tool), it will be found and loaded with this simple command, giving you a PostgreSQL installation with sensible defaults (you can always customize it further in the script).

Something important at this point, which did bite me quite hard, is the way Vagrant uses modules. Remember that Vagrant runs Puppet on the VM itself: it uploads your Puppet configuration to a temporary directory on the VM (/tmp/vagrant_puppet) and executes it there. This means that it will not find the modules installed on your host (like my PostgreSQL module), unless you tell it where they are in the Vagrantfile:

config.vm.provision :puppet, :module_path => ["modules","~/.puppet/modules"]

This asks Vagrant to look for modules at two locations: in the modules directory (relative to the Vagrantfile), where I’ll store my own modules, and in ~/.puppet/modules (where Puppet installs them by default). Vagrant will look in those two places and upload the modules to the VM along with the script, so they will be there when it applies the Puppet config.


This was not by any stretch a complete overview of either Vagrant or Puppet, but an account of using them to solve the particular problem of creating and configuring VMs in a stable and reliable way. I now have my three steps to Sybil:

git co
vagrant up
cap deploy

Of course, I still need to add a lot of stuff to the Puppet config, but this can be done one step at a time, backed by git. Due to the modular nature of Puppet, it is also easier to maintain than our large Bash script.

Added benefit, the day we need to manage “live” configurations, we’ll probably already have a good base.

Why not Chef?

Chef is another popular option for provisioning, and is also integrated with Vagrant. My (current) choice of investigating Puppet is not the result of a major study, and is mostly due to Puppet being the tool of choice in the excellent Deploying Rails book from the Pragmatic Programmers. My quick overview of the Internet's opinion on this subject may be summarized as: those are two good tools, use the one you feel comfortable with (reminds me of other difficult choices).

I may try the same procedure with Chef, but to be honest, as a startup, we do not have much time to evaluate other options once we have one that works (and of course there is Chef, but also Ansible, and Synapse and probably others I do not even know...).


I use Sublime Text for all my text editing needs, including these files. Vagrantfiles are just Ruby files, so it is easy to get Sublime Text to recognize them as such (just add <string>Vagrantfile</string> to the list of files in Ruby.tmLanguage). For Puppet, there is a plugin that can be installed via Package Control.

Never tried Sublime Text? Do it now! (disclaimer: I’ve no interest whatsoever in Sublime Text, except as a happy customer)
