
We have been following quite closely the raging debate in the Rails world about Object-Oriented Programming (OOP), about not confusing Rails with your application, and about how to achieve fast tests. While not exactly identical, all those discussions are interrelated: good testing implies small, well-identified pieces of code, and OOP means (among other things) classes and modules with well-defined (and not too large) responsibilities.

A case study

As we develop Sybil, its codebase keeps growing and evolving. Like good gardeners, we try to keep it as nimble as possible, carving and pruning it along the way. One of my own little itches was that everything about Sybil was bundled in a single Rails application.

Of course, part of Sybil is a series of views on a database-backed model, which fits Rails perfectly. But another (quite large) part of it is our analyzers: classes and modules able to parse code and extract the skills presented in the Web interface. Those have nothing to do with the Rails app.

The objective

With this statement in hand, I decided to extract the whole “analyzer” package from the Rails app. The origin was clear, and so was the destination: the analyzer was to become a standalone Ruby library, in other words, a gem. The gem could then be referenced as a dependency in the Rails application, alongside the dozen we already use.

The expected benefits were various:

  • Ensure proper decoupling of the application’s parts (with some kind of interface between them). I’ve learned the hard way that without boundaries, it is all too easy to create spaghetti code.
  • Ease testing of the analyzers, which is the part that needs it most
  • Speed up tests on the analyzers, by not requiring Rails to be loaded at all
  • Ease testing of the Rails app, as we could take the working analyzer code as a given (i.e. like any gem, it is supposed to be tested before it becomes a dependency of the Rails app. You are probably not unit testing ActiveSupport, and you should not be). This would allow us to really test the application (controllers, persistence, interaction with the user).

And now, to work!

The road

Initial situation

The analyzer classes were already sort of isolated in the app/lib directory (a rather debatable choice in itself. For us, as for many others, it was a way of saying: "we know this does not belong in app/models, but we are not sure where exactly it belongs").

Step I: A test extraction

My first move was to copy the analyzer code and put it in another directory on my machine. The objective of this step was not to keep the Rails app working, but to see whether the analyzers could work in isolation (they were supposed to).

That directory was the base of the future sybil-analyzers gem. As such, I started creating the structure required for a gem, following the excellent RubyGems guide "Make your own gem". As expected, I had to make some changes, notably making the requires explicit, as I no longer had the benefit of Rails' autoloading (not having to require anything is most often a boon, but sometimes a curse).
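Losing autoloading means every dependency, standard library included, must be required explicitly in the gem's entry file; forgetting one surfaces as a NameError at first use. A small sketch of the difference (the SomeAnalyzer constant is made up for illustration):

```ruby
# Without Rails autoloading, nothing is required for you: a constant
# that has not been explicitly required is simply absent.
def constant_available?(name)
  Object.const_get(name)
  true
rescue NameError
  false
end

puts constant_available?('SomeAnalyzer')  # hypothetical class, never required => false
require 'logger'
puts constant_available?('Logger')        # available once required => true
```

The cure is an entry file (e.g. lib/sybil_analyzer.rb) that requires every file of the gem, stdlib dependencies first.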

My basic test loop was quite simple: package the gem, install it, open an irb session, require the gem and play with the API. Being able to build the gem does not mean anything (only that your gemspec is well-formed), so the first tries were quite unsuccessful (even requiring it caused problems).

As I added more and more files to the future gem, listing them all in the gemspec became a hassle (you have to list all your files in your gemspec). Fortunately, a gemspec is ultimately a piece of Ruby code, meaning that I could replace:

s.files = ["analyzer1.rb", "analyzer2.rb", ...]

with:

s.files = Dir["{lib}/**/*.rb", "bin/*", "LICENSE", "*.md"]
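In context, a minimal gemspec along those lines might look like this (the name, version and metadata here are a hypothetical sketch, not the actual Sybil gemspec):

```ruby
# sybil_analyzer.gemspec - hypothetical, minimal sketch
spec = Gem::Specification.new do |s|
  s.name    = 'sybil_analyzer'
  s.version = '0.1.0'
  s.summary = 'Code analyzers extracted from the Sybil Rails app'
  s.authors = ['8thcolor']
  # Dir[] is evaluated when the gem is built, so newly added files
  # are picked up without editing the gemspec again:
  s.files   = Dir['lib/**/*.rb', 'bin/*', 'LICENSE', '*.md']
end

puts spec.name
```

Since the glob runs at build time, the only discipline required is keeping your files under the listed directories.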

Step II: Cleaning unwanted dependencies

Once my gem was done (all required files well bundled in), it was time to discover that my “clean separation” was not as clean as I thought. Trying to initialize an analyzer yielded several NameError (“uninitialized constant”) exceptions, the main culprits being the Rails logger and the Rails.root path.

Pluggable logging

I used the logger in many places and, until then, there was no reason to use a different one there than in the rest of the application. Now I needed a way to have the classes log (for testing purposes) without depending on (even a small part of) Rails, while keeping the option of using the Rails logger for everything in production. My solution was to have a single point of initialization for a logger, and to allow passing it in from the outside.

My answer is a (very simple) strategy pattern: as both loggers expose the same logging methods, I can test everything using the standard Ruby Logger, and plug in the Rails logger when appropriate.
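A minimal sketch of that single initialization point (the real Register class in Sybil may differ in detail): the logger defaults to Ruby's Logger, and the host application can swap in its own.

```ruby
require 'logger'
require 'singleton'

# Sketch: a singleton holds the logger. It defaults to Ruby's Logger,
# and the host application can replace it - Rails.logger in our case.
class Register
  include Singleton
  attr_writer :base_logger

  def base_logger
    @base_logger ||= Logger.new($stdout)
  end
end

# Standalone / test usage: nothing to configure, the default kicks in.
Register.instance.base_logger.info('analyzing...')

# In the Rails app, an initializer would do:
#   Register.instance.base_logger = Rails.logger
```

Since both Logger and Rails.logger respond to the same methods (info, warn, error...), the analyzer code never needs to know which one it got.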

Directories

The analyzers read some basic YAML configuration files, which were referenced from Rails.root. Again, I introduced a “base_dir” notion, letting the outside world set it before starting an analysis.
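The same pluggable idea works for paths. A sketch, with hypothetical class and file names, of config files resolved against a settable directory instead of a hardcoded Rails.root:

```ruby
require 'yaml'
require 'tmpdir'

# Sketch: analyzers resolve their YAML config against a settable
# base_dir, so no Rails.root is needed outside the Rails app.
class AnalyzerConfig
  attr_accessor :base_dir

  def initialize(base_dir)
    @base_dir = base_dir
  end

  def load(name)
    YAML.load_file(File.join(base_dir, "#{name}.yml"))
  end
end

# Standalone usage: any directory will do, no Rails involved.
result = nil
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'skills.yml'), { 'languages' => ['ruby'] }.to_yaml)
  result = AnalyzerConfig.new(dir).load('skills')
end
p result  # => {"languages"=>["ruby"]}
```

In the Rails app, the initializer simply sets the base directory to a path under Rails.root.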

I noted those two points as modifications that would be required.

Step III: Preparation work

Having noted the various problems, I went back to the Rails app and applied the modifications needed to make the analyzer code “ready for extraction”, effectively reproducing the changes I had made to get it working in isolation. This time, I committed and pushed my result to our git repo, testing the app along the way to be sure I was not breaking anything.

Step IV: Gem as a folder

I was now ready to extract the code into its own gem. But an important question at this stage was how the Rails app was going to use it. I was not going to upload my gem to rubygems.org (Sybil is not open source), and I wanted some flexibility to work on both the web and analyzer parts at the same time.

Bundler came to the rescue: in a Gemfile, you can declare the source of a gem as rubygems.org (the default), but also as a git repository or even a path on your machine. Let’s take a look at those alternatives.

Using a git repository looked like what I wanted: I could set up a second private repository on GitHub to host the analyzer gem. Better yet, I could host various gems in a single repository (which can be handy). The problem was, I wanted an intermediate step before going all the way.

All my work up to this point had (obviously) been done on a feature branch, and I fully intended to create a pull request from it when it was finished, to allow some discussion on something impacting our design quite heavily.

The local path option was quite handy: I could extract my gem and have my Rails app use it, without having to create another repository for now, simply using a folder on my computer. Better yet, I could define a gems/ directory in the webapp itself, and host the gem there temporarily:

gem 'sybil_analyzer', :path => "./gems/sybil_analyzer"

The benefit here was clear: this allowed for a lot of testing without changing anything in the structure or breaking any deployment script (as Capistrano deploys the whole Rails directory, it could even be deployed this way in production, though we did not do it). Even working on a branch, it was nice having the application working “as usual” - well, almost.

Step V: Initializers and integration

Finally, to make the Rails app work, I had to review some of the configuration. The parts of the config that were specific to the analyzers and not to the application ended up in the gem itself as a YAML file.

On the other side, Rails has to provide my new package with a logger and an appropriate directory to read and write from. This can be done in several different ways, the simplest being an initializer: Rails runs all files under config/initializers once at application start. I created a short analyzer.rb to set the logger and directory, and was done:

# Create the Register
register = Register.instance
# Logger to use (delayed_job's, as all analyzer work happens in jobs)
register.base_logger = Rails.logger
# Base directory for indexes: the name is configured in the app, the base depends on Rails.root
dir = Sybil::Application.config.index_dir
register.base_dir = Rails.root.join(dir)

This looked like a good time for a pull request.

Step VI: Gem as a git repo

After some exchanges and updates, it was time to merge the pull request and move the code to our brand new “8thcolor-gems” repository (another nice option would have been to create a private gem server using something like Gem in a Box, but GitHub allowed us to solve both the reference and the hosting in one clean shot).

Bundler can use a git repository as a gem source, with some rather nice options: you can specify a tag or a branch (so we can have a “bleeding edge” gem version on the “developer” branch, and use the last stable one on master). Better yet, a single repository can host several gems, as long as its root follows a structure (basically: one folder per gem, the folder name being the gem name), using this syntax:

git "git://github.com/8thcolor/8thcolor_gems.git" do
  gem "sybil-analyzer"
  gem "sybil-othermodule"
  gem "yet-another-module"
end

We can now move this code to its proper place, and begin reviewing and extending our test suite. Should we need to develop on the gem and the Rails app at the same time, we can always bring the gem back as a folder for a while.

Conclusion

Modularization is one of our best weapons against complexity, and a good design indicator (if you cannot extract anything from your code base, it probably needs some refactoring). Doing so in the Rails and Ruby world has been both efficient (less than a day) and pleasant (RubyGems and Bundler give you all the flexibility and options you need).

What are your experiences and techniques to manage your Rails application complexity?
