Advanced database seeding in Rails applications


Rails ships with a built-in feature for seeding initial data into a database, and it is documented in the Rails guides. It’s a great way to set up initial data in development, in staging and even in some production cases. I believe developers often overlook it (we did), so this post outlines a few ways to make seeding easier to manage.



Why it’s important

Regardless of whether you are doing TDD, BDD or any other development practice, eventually you’ll need to run your application in development. At first, that may not be a problem, but as your application grows, it will need specific pieces of data in order to operate normally. For example, if you are working on an e-shop project, a blank database won’t do - you’ll need at least some products and perhaps some tax configurations. You can create those records manually, but as you test things out, the data will eventually get into a bad state and you’ll have to fix it over and over again. That quickly becomes a frustrating overhead.

Now, imagine that there are a couple of other developers working on different aspects of the system - it gets even messier. And when a new developer joins the team, should she automagically know how to set up all the development data needed?

We encounter these situations daily, and they are exactly what Rails seeds are meant to solve.



The basics

First, let’s look at the most basic usage. It looks something like this (from the Rails guide):

5.times do |i|
  Product.create(name: "Product ##{i}", description: "A product.")
end

You need to place that in the db/seeds.rb file and then run rails db:seed to populate the database. Sweet and simple. However, writing records by hand like that can get time-consuming, so let’s look at some other options.



seed_dump

seed_dump is probably my favorite seeds-related gem: it is straightforward and easy to use, yet very useful. The idea behind it is simple - run a single command and all the data from your development database is written to the seeds.rb file (or any other file you choose).

This would allow you to create the appropriate records from the interface of your application and then export them to the seeds file instead of creating them manually.

There is one thing to keep in mind with seed_dump: if you export records that have associations, the dump may list them in an order that does not respect those associations, so you may have to reorder them manually.
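
If you have more than a handful of models, working out a safe order by hand gets tedious. Below is a minimal sketch of one way to compute it with Ruby’s standard TSort library. It is not part of seed_dump; the model names and the dependency map are made up for illustration, and in a real project you would write the map from your own belongs_to associations.

```ruby
require 'tsort'

# Hypothetical map: each model lists the models it belongs to.
# Those have to be seeded first so that foreign keys resolve.
dependencies = {
  'OrderItem' => %w[Order Product],
  'Order'     => %w[User],
  'Product'   => %w[TaxRate],
  'User'      => [],
  'TaxRate'   => []
}

# Returns the model names ordered so that every model comes after
# the models it depends on. Raises TSort::Cyclic on circular dependencies.
def seed_order(dependencies)
  each_node  = ->(&block) { dependencies.each_key(&block) }
  each_child = ->(model, &block) { dependencies.fetch(model).each(&block) }
  TSort.tsort(each_node, each_child)
end

puts seed_order(dependencies).join(', ')
```

The resulting list tells you in which order the blocks in the dumped file should appear.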

Learn more about seed_dump on GitHub: https://github.com/rroblak/seed_dump



Multiple seed files

If you are working on a bigger project, the standard db/seeds.rb file can get a bit cluttered and messy so it would be a good idea to organize it into separate files. To do that, you can create a new /seeds folder within the /db one and add something like this to the original seeds.rb file:

Dir[File.join(Rails.root, 'db', 'seeds', '*.rb')].sort.each { |seed| load seed }

Now if you run rails db:seed, all seed files within the /seeds folder will be executed.
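
Because of the .sort call, the files are loaded in alphabetical order, so a numeric prefix is a simple way to control which seeds run first. The sketch below demonstrates that outside of Rails, using a temporary directory in place of db/seeds. The file names and the $loaded_seeds global are hypothetical and only there to make the load order visible.

```ruby
require 'tmpdir'

# Records the order in which the seed files ran (for demonstration only).
$loaded_seeds = []

# Same idea as the one-liner in seeds.rb: every .rb file, sorted by name.
def seed_files(seeds_dir)
  Dir[File.join(seeds_dir, '*.rb')].sort
end

Dir.mktmpdir do |dir|
  # Hypothetical seed files, deliberately written out of order.
  {
    '02_products.rb'  => '$loaded_seeds << :products',
    '03_orders.rb'    => '$loaded_seeds << :orders',
    '01_tax_rates.rb' => '$loaded_seeds << :tax_rates'
  }.each do |name, body|
    File.write(File.join(dir, name), body)
  end

  seed_files(dir).each { |seed| load seed }
end

puts $loaded_seeds.inspect
```

Keep in mind that the sort is a plain string sort, so 10_foo.rb comes before 2_bar.rb. Zero-padded prefixes avoid that surprise.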

That’s a good optimization, but it might still not be enough so let’s look at another option.



seedbank

Seedbank provides an even more advanced approach to segmenting seed files. Besides having individual files, you can also group them by environment. The folder structure would look something like this:

db/seeds/
    bar.seeds.rb
    development/
        users.seeds.rb
    foo.seeds.rb
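
To make the idea behind that layout concrete, here is a rough sketch of how the files could be picked for a given environment: the shared ones from the top level first, then the ones in the folder named after the environment. This is not seedbank’s actual code, only an imitation of the behaviour as I understand it, and seed_files_for is a made-up helper name.

```ruby
require 'tmpdir'
require 'fileutils'

# Shared seed files first, then the ones for the given environment.
def seed_files_for(seeds_dir, environment)
  shared = Dir[File.join(seeds_dir, '*.seeds.rb')].sort
  scoped = Dir[File.join(seeds_dir, environment, '*.seeds.rb')].sort
  shared + scoped
end

development_seeds = nil
production_seeds = nil

Dir.mktmpdir do |dir|
  # Recreate the folder structure from above with empty files.
  FileUtils.mkdir_p(File.join(dir, 'development'))
  %w[bar.seeds.rb foo.seeds.rb development/users.seeds.rb].each do |path|
    FileUtils.touch(File.join(dir, path))
  end

  # Strip the temporary directory so only the relative paths remain.
  relative = ->(files) { files.map { |file| file.delete_prefix("#{dir}/") } }
  development_seeds = relative.call(seed_files_for(dir, 'development'))
  production_seeds  = relative.call(seed_files_for(dir, 'production'))
end

puts "development: #{development_seeds.join(', ')}"
puts "production:  #{production_seeds.join(', ')}"
```

In this sketch an environment without its own folder simply gets the shared files, which is why the production list contains no users file.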

Learn more about seedbank on GitHub: https://github.com/james2m/seedbank

seed_dump together with seedbank or the simplified file splitting should make managing the seed files a lot easier. Another thing needed is a process in place that ensures developers always update the seed files accordingly. To achieve that, our PR templates have a check for making sure the seed files are always up to date.
