In the previous post, we saw how to split up and speed up a test suite with the Knapsack gem, taking it from ~1 hour to 11 minutes. We also cached our dependencies on CI and started looking out for randomly failing tests. This time we will take a look at how to set up DatabaseCleaner and use different strategies for wiping data between test runs, in order to squeeze even more out of the suite.

DatabaseCleaner gem

DatabaseCleaner is a set of strategies for cleaning your database in Ruby. It’s a simple yet powerful tool that you most probably already use. Even though it’s simple, its configuration can get tricky at times, and a messed-up configuration can cause extra failures related to database pollution. So let us see how we can set it up in order to have a stable test suite.

Basic setup

Add database_cleaner to the test group in your Gemfile (if it’s not there yet).

group :test do
  gem 'database_cleaner'
end

Make sure the spec/support helpers are loaded. Older RSpec setups do this by default, while newer generated spec/rails_helper files may ship the line commented out.

# depending on age of the project look into rails or spec helper file
Dir[Rails.root.join('spec', 'support', '**', '*.rb')].each { |f| require f }

Create spec/support/database_cleaner.rb with

RSpec.configure do |config|

  config.before(:suite) do
    DatabaseCleaner.clean_with(:deletion)
  end

end

This is a good starting point: it makes sure the database is clean before the whole suite starts, in case there are any leftovers from a previous run.

Transactions - fastest all-rounder

Transactions should be your default: when you need database calls, it’s the fastest strategy available. DatabaseCleaner opens a new transaction for each test case and rolls it back when the example is finished. This way it doesn’t need to truncate all the tables, even though you used just one.

RSpec.configure do |config|

  # ...

  config.before(:each) do
    DatabaseCleaner.strategy = :transaction
    DatabaseCleaner.start
  end

  config.after(:each) do
    DatabaseCleaner.clean
  end

end
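To make the start/clean lifecycle a bit more tangible, here is a toy in-memory model of it. This is only a sketch: `ToyTransactionCleaner` is a made-up class, a plain Hash stands in for the database, and a snapshot stands in for the real transaction rollback. It is not how DatabaseCleaner is implemented.

```ruby
# Toy model only: a Hash stands in for the database, and "rolling back"
# means restoring the snapshot taken in `start`.
class ToyTransactionCleaner
  def initialize(store)
    @store = store
    @snapshot = nil
  end

  # Called before each example: remember the current state
  def start
    @snapshot = @store.transform_values(&:dup)
  end

  # Called after each example: discard everything written since `start`
  def clean
    @store.replace(@snapshot)
  end
end

db = { projects: ['seed project'], users: [] }
cleaner = ToyTransactionCleaner.new(db)

cleaner.start
db[:projects] << 'created in example' # the example writes some data
cleaner.clean

db[:projects] # => ["seed project"]
```

Only the data written by the example is undone, and nothing else is touched. That is the property that makes the transaction strategy cheap.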

Deletion - rescue for after commit callbacks

Sometimes transactions are not enough, especially when you use after commit callbacks in ActiveRecord. As the name implies, those are executed after the transaction is committed. This can’t work with the previous strategy, as the transaction would still be open on the DatabaseCleaner level.

For this kind of unit test, where you touch the DB and need access to data after the transaction is committed, you can use the deletion strategy, which won’t be significantly slower than the transaction strategy for small data sets.

To fix this without switching all the tests to deletion, we can use RSpec example metadata to pick the correct strategy on demand.

  # ...

  config.before(:each) do |example|
    DatabaseCleaner.strategy = example.metadata[:strategy] || :transaction
    DatabaseCleaner.start
  end

  # ...

With this in place, whenever we need the deletion strategy, we just add strategy: :deletion, like:

# for whole block
context "executes after commit callbacks", strategy: :deletion do
  # ...
end

# or a single example
it "creates thumbnails on save", strategy: :deletion do
  # ...
end

Truncation - safe acceptance examples

Truncation is the slowest but also the safest and most stable strategy. There are no compromises when it comes to clean state here. It’s the best strategy for acceptance tests when using Capybara with a JavaScript driver, since the app server then typically uses its own database connection, which may not see data from the test’s uncommitted transaction.

Again, we can make use of metadata and use truncation by default for all JS-related tests.

  # ...

  config.before(:each) do |example|
    DatabaseCleaner.strategy =
      if example.metadata[:js]
        :truncation
      else
        example.metadata[:strategy] || :transaction
      end

    DatabaseCleaner.start
  end

  # ...
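The selection logic above can also be pulled out into a small helper, which makes it easy to test in isolation. A minimal sketch, where `cleaner_strategy_for` is a hypothetical name and not part of DatabaseCleaner or RSpec:

```ruby
# Hypothetical helper (not part of DatabaseCleaner or RSpec): picks a
# cleaning strategy from an example's metadata hash.
def cleaner_strategy_for(metadata)
  # JS-driven acceptance examples always get truncation
  return :truncation if metadata[:js]

  # Otherwise honour an explicit `strategy: ...` tag and fall back to
  # the fast transaction strategy
  metadata[:strategy] || :transaction
end

cleaner_strategy_for({})                                # => :transaction
cleaner_strategy_for({ strategy: :deletion })           # => :deletion
cleaner_strategy_for({ js: true, strategy: :deletion }) # => :truncation
```

In the before :each hook this would then read DatabaseCleaner.strategy = cleaner_strategy_for(example.metadata). Note that the :js tag wins over an explicit strategy, same as in the inline version.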

Truncation or Deletion?

Rule of thumb should be:

For big setups before the tests, when you create a lot of objects with associations that use foreign keys, etc., use truncation, as it takes a fixed time regardless of the amount of data. This also means that the more tables you have, the more time it will take, as it takes the same amount of time for empty tables as well. You can think of it as dropping and re-creating the table, indexes, etc.

On the other hand, when you have a simple setup with unit-like tests, use deletion: it’s faster for small datasets, as it won’t recreate tables or indexes.

If you are interested in details, then here’s a great answer for PostgreSQL on StackOverflow explaining it thoroughly.
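If you are not sure which side of the rule of thumb a given spec falls on, it may be easiest to just measure. Below is a rough sketch of a timing helper. `time_cleaners` is a hypothetical name, the cleaners are plain callables, and the `setup` callable (which seeds data and is not timed) is an assumption about how you would prepare a representative data set.

```ruby
# Hypothetical helper: times each named cleaner (any callable) and
# returns [name, average_seconds] pairs sorted from fastest to slowest.
def time_cleaners(cleaners, setup: -> {}, runs: 3)
  timings = cleaners.map do |name, cleaner|
    total = 0.0
    runs.times do
      setup.call # seed data before each run, not included in the timing
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      cleaner.call
      total += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    end
    [name, total / runs]
  end

  timings.sort_by { |_name, seconds| seconds }
end

# Inside a project with DatabaseCleaner loaded it could be used like:
#
#   time_cleaners(
#     { deletion:   -> { DatabaseCleaner.clean_with(:deletion) },
#       truncation: -> { DatabaseCleaner.clean_with(:truncation) } },
#     setup: -> { seed_some_records } # seed_some_records is hypothetical
#   )
```

The numbers will differ per schema and database, so treat the result as a hint for your own suite rather than a general answer.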

Bonus no.1: Cleaning on demand

Life is hard, and you will often find yourself with a really big setup of objects. This can happen especially when you test filtering or querying classes, which run read-only queries against data, without any side effects. In such cases, it would be great if we could disable DatabaseCleaner per example and run it only once, in before/after :all blocks. Again, this is easy to achieve with example metadata, so let’s see.

  # before :suite ...

  config.before(:all, :cleaner_for_context) do
    DatabaseCleaner.strategy = :truncation
    DatabaseCleaner.start
  end

  config.before(:each) do |example|
    next if example.metadata[:cleaner_for_context]

    DatabaseCleaner.strategy = # ...
    DatabaseCleaner.start
  end

  config.after(:each) do |example|
    next if example.metadata[:cleaner_for_context]

    DatabaseCleaner.clean
  end

  config.after(:all, :cleaner_for_context) do
    DatabaseCleaner.clean
  end

  # ...

Let’s go into detail about what we do here:

The before :all, :cleaner_for_context block runs once for the whole context/describe block. We use truncation, as it will usually be faster in such examples.
We skip the before and after :each blocks whenever the :cleaner_for_context metadata is set.
We clean the database once for the whole context/describe block with after :all.

Then in our test, we can do

RSpec.describe CoreDashboardQuery, :cleaner_for_context do

  before :all do
    @data = # ...
  end

  let(:data) { @data }

  # it ...
end

Note: Remember that whenever you use before/after all blocks, those are run once for the given context/describe block, so:

don’t do any data updates or additions within those blocks, unless you want to shoot yourself in the foot with a cannon
whenever you use this technique, create the setup on top of the file, don’t hide it somewhere nested on the bottom

Bonus no.2: Catch data pollution issues

Getting back to our test suite example, which is still dark and full of terrors: we had random failures that were related to test leftovers in the database. Those were caused by examples running with the :transaction strategy, where a before :all block was lurking in the depths of nested context blocks.

The easiest way to find those is to crash loudly whenever the database is still not clean after an example has been cleaned up. To know what to count, look out for errors like “Couldn’t create Project because of unique key validation” or similar. To catch those, drop this check into our setup.

# still in our spec/support/database_cleaner.rb file
RSpec.configure do |config|

  class DirtyDatabaseError < RuntimeError
    def initialize(meta)
      super "#{meta[:full_description]}\n\t#{meta[:location]}"
    end
  end

  # ...

  config.after(:each) do |example|
    next if example.metadata[:cleaner_for_context]
    DatabaseCleaner.clean
    raise DirtyDatabaseError.new(example.metadata) if Project.count > 0
  end
end

This will crash as soon as there are records left in the DB, and will print the example which caused it:

Failures:

  1) ProjectsQuery#build_query sorts projects by title
     Failure/Error: raise DirtyDatabaseError.new(example.metadata) if Project.count > 0

     DirtyDatabaseError:
        ProjectsQuery#build_query sorts projects by title
         ./spec/queries/projects_query_spec.rb:278

Of course, you don’t want to run it every time, so just drop it in whenever you see some suspicious errors.
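If you would rather keep the check around and switch it on only when needed, one option is to guard it with an environment variable and to check a few models at once. This is only a sketch: both helper names and the CHECK_DIRTY_DB variable are made up for illustration.

```ruby
# Hypothetical helper: returns the names of the given models that still
# have rows. Each model only needs to respond to `name` and `count`,
# which ActiveRecord classes do.
def dirty_models(models)
  models.select { |model| model.count > 0 }.map(&:name)
end

# Hypothetical switch: run the check only when asked to, e.g.
#   CHECK_DIRTY_DB=1 bundle exec rspec
def dirty_check_enabled?(env = ENV)
  env['CHECK_DIRTY_DB'] == '1'
end

# Stand-in for real models, just to show the behaviour
FakeModel = Struct.new(:name, :count)

dirty_models([FakeModel.new('Project', 2), FakeModel.new('User', 0)])
# => ["Project"]
```

In the after :each hook, the raise line would then be guarded with something like if dirty_check_enabled? && dirty_models([Project]).any?, so the suite behaves as usual unless the variable is set.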

Summary

Our final DatabaseCleaner configuration for RSpec.

class DirtyDatabaseError < RuntimeError
  def initialize(meta)
    super "#{meta[:full_description]}\n\t#{meta[:location]}"
  end
end

RSpec.configure do |config|

  config.before(:suite) do
    DatabaseCleaner.clean_with(:deletion)
  end

  config.before(:all, :cleaner_for_context) do
    DatabaseCleaner.strategy = :truncation
    DatabaseCleaner.start
  end

  config.before(:each) do |example|
    next if example.metadata[:cleaner_for_context]

    DatabaseCleaner.strategy =
      if example.metadata[:js]
        :truncation
      else
        example.metadata[:strategy] || :transaction
      end

    DatabaseCleaner.start
  end

  config.after(:each) do |example|
    next if example.metadata[:cleaner_for_context]

    DatabaseCleaner.clean

    # raise DirtyDatabaseError.new(example.metadata) if Record.count > 0
  end

  config.after(:all, :cleaner_for_context) do
    DatabaseCleaner.clean
  end
end

In our case, we were using :truncation for everything. In order to speed things up, we switched to the configuration above, which uses :transaction by default and leaves :truncation for JS tests only.

We also used :deletion instead of :truncation for unit-like tests that rely on after commit callbacks, e.g. attachment objects that process files when they are saved. In those cases, deletion took a fraction of the time of truncation.

On the way, we found out which tests were polluting the database, and we were able to fix them easily. When we finished the move, our suite runtime dropped by another 2 minutes and now stays stable at ~9 minutes per job.

Next time we will try making use of knapsack to run the test suite in parallel locally, so stay tuned and happy hacking!