Seeding a database using the Rails command line


Ruby on Rails ships with very good tools for seeding a database, and thanks to community efforts there are several gems that make the task easier. Besides seeding, we also have useful tools to inspect the database and ways to better organize important seed data.

Creating a sample application

Let’s start by creating an application and typing rails inside its directory to check out all the available commands:

rails new sample
cd sample
rails

We will see the common commands first, and some additional ones below them. We are going to use several of them. In fact, we have already used one: new, which bootstraps a new application.

Creating a model

Next I’m going to generate a new model. The built-in help is very good: typing rails generate (or the rails g shortcut) displays all available generators.

rails g model Movie title director storyline:text watched_on:date

Here I’m setting title and director as strings (the default type if none is specified), storyline as text, and watched_on as a date (for dates, as opposed to datetimes, a common convention is to append _on to the field name).
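To make the "default type" rule concrete, here is a minimal plain-Ruby sketch of how name:type arguments can be split. The parse_fields helper is hypothetical and only imitates the convention; it is not the code Rails itself uses.

```ruby
# Hypothetical helper: splits generator-style "name:type" arguments,
# falling back to "string" when no type is given.
def parse_fields(args)
  args.map do |arg|
    name, type = arg.split(":", 2)
    [name, type || "string"]
  end.to_h
end

fields = parse_fields(%w[title director storyline:text watched_on:date])
fields.each { |name, type| puts "#{name}: #{type}" }
```

The resulting pairs line up with the columns we will find in the generated migration.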

Rails will generate a migration for us, adapted to the default database, which is SQLite. Migrations are saved in db/migrate. Let’s see what it looks like!

class CreateMovies < ActiveRecord::Migration
  def change
    create_table :movies do |t|
      t.string :title
      t.string :director
      t.text :storyline
      t.date :watched_on

      t.timestamps null: false
    end
  end
end

Very straightforward. The only noteworthy thing is the timestamps statement: it generates the created_at and updated_at fields automatically, which is very handy. Let’s run it.

$ rake db:migrate
== 20150731183607 CreateMovies: migrating =====================================
-- create_table(:movies)
   -> 0.0010s
== 20150731183607 CreateMovies: migrated (0.0010s) ============================

So now Rails has actually created the table. In case you did something wrong, you can always roll back:

rake db:rollback

This command accepts an optional STEP parameter to go back as many migrations as needed, for example rake db:rollback STEP=2.

Now that the table is created, let’s see what the schema looks like in db/schema.rb:

ActiveRecord::Schema.define(version: 20150731183607) do

  create_table "movies", force: :cascade do |t|
    t.string   "title"
    t.string   "director"
    t.text     "storyline"
    t.date     "watched_on"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
  end

end

Cool! This will contain the entire database schema as we run more migrations.

Rake commands

As a side note: how can you know about rake commands? By using the -T parameter you can see a list of them:

rake -T

You can even limit by namespace, such as db:

rake -T db

Creating some seeds

Let’s go for the interesting part of this article! Open the db/seeds.rb file, and paste this:

Movie.destroy_all

Movie.create!([{
  title: "Ant-Man",
  director: "Peyton Reed",
  storyline: "Armed with the astonishing ability to shrink in scale but increase in strength, con-man Scott Lang must embrace his inner-hero and help his mentor, Dr. Hank Pym, protect the secret behind his spectacular Ant-Man suit from a new generation of towering threats. Against seemingly insurmountable obstacles, Pym and Lang must plan and pull off a heist that will save the world.",
  watched_on: 5.days.ago
},
{
  title: "Pixels",
  director: "Chris Columbus",
  storyline: "When aliens misinterpret video feeds of classic arcade games as a declaration of war, they attack the Earth in the form of the video games.",
  watched_on: 3.days.ago
},
{
  title: "Terminator Genisys",
  director: "Alan Taylor",
  storyline: "When John Connor, leader of the human resistance, sends Sgt. Kyle Reese back to 1984 to protect Sarah Connor and safeguard the future, an unexpected turn of events creates a fractured timeline. Now, Sgt. Reese finds himself in a new and unfamiliar version of the past, where he is faced with unlikely allies, including the Guardian, dangerous new enemies, and an unexpected new mission: To reset the future...",
  watched_on: 10.days.ago
}])

p "Created #{Movie.count} movies"

First we destroy all movies to start from a clean state, and then add 3 movies by passing an array to the create! method. The seeds file has Rails ActiveSupport loaded, so we can use those handy days.ago helpers to define dates.
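For readers new to ActiveSupport: 5.days.ago returns a time, and a date column such as watched_on keeps only its date part. The plain-Ruby sketch below approximates that result with the standard date library. The days_ago helper and the fixed reference date are assumptions for illustration, not part of Rails.

```ruby
require "date"

# Hypothetical stand-in for ActiveSupport's `n.days.ago`, reduced to whole dates.
def days_ago(n, today = Date.today)
  today - n
end

# A fixed reference date keeps the output predictable.
reference = Date.new(2015, 7, 31)
puts days_ago(5, reference)   # 2015-07-26
puts days_ago(10, reference)  # 2015-07-21
```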

At the end I give some feedback about the total movies created. Let’s run it!

$ rake db:seed
"Created 3 movies"

You can run this command as many times as you need thanks to the destroy statement in the first line.

To check them you can use the Rails runner:

$ rails runner 'p Movie.pluck :title'
["Ant-Man", "Pixels", "Terminator Genisys"]

Using a custom rake task to seed important data

All our seeds are considered development data, not important data for production use. So don’t seed production the way we did here, especially because the first step deletes all movies!

To seed important data, it is better to create a custom rake task. Let’s create one to add genres:

rails g model Genre name
...
rake db:migrate
...
rails g task movies seed_genres
      create  lib/tasks/movies.rake

This creates a movies.rake file in the lib/tasks directory containing an empty seed_genres task (you could list more task names on the command line). After filling in the task body, it looks like this:

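namespace :movies do
  desc "Seeds genres"
  # Depending on :environment loads the Rails app, so Genre is available.
  task seed_genres: :environment do
    Genre.create!([{
      name: "Action"
    },
    {
      name: "Sci-Fi"
    },
    {
      name: "Adventure"
    }])

    # Report inside the task, so it only runs when the task is invoked.
    p "Created #{Genre.count} genres"
  end
end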

It’s listed in the rake commands list:

$ rake -T movies
rake movies:seed_genres  # Seeds genres

Time to invoke it!

$ rake movies:seed_genres
"Created 3 genres"
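Unlike db/seeds.rb, this task does not clear the table first, so invoking it twice would insert every genre twice. One common way around that is a find-or-create approach. The sketch below shows the idea in plain Ruby; InMemoryGenre and find_or_create_by_name are hypothetical stand-ins so that it runs without a database.

```ruby
# Hypothetical in-memory stand-in for the Genre model, so the sketch runs
# without a database.
class InMemoryGenre
  @records = []

  class << self
    attr_reader :records

    # Returns the existing record with this name, or stores a new one.
    def find_or_create_by_name(name)
      existing = records.find { |record| record[:name] == name }
      existing || { name: name }.tap { |record| records << record }
    end

    def count
      records.size
    end
  end
end

def seed_genres
  %w[Action Sci-Fi Adventure].each do |name|
    InMemoryGenre.find_or_create_by_name(name)
  end
end

# Running the seed twice must not create duplicates.
2.times { seed_genres }
puts "Genres after two runs: #{InMemoryGenre.count}"
```

In a real application, Genre.find_or_create_by!(name: name) is the ActiveRecord counterpart of the stand-in method used here.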

Loading seeds using the console

The console is very useful for playing around with your data. Let’s fire it up:

$ rails c
Loading development environment (Rails 4.2.3)

Did you know that you can load your seeds from inside the console? You don’t need to run rake anymore! Try this:

Rails.application.load_seed

Playing with data using the console sandbox

Sometimes you will need to run destructive commands in your development or production environment without affecting your real data. The sandbox is something like a safe mode where you can do whatever you like, and every change is rolled back to the previous state when you exit.

To access this mode just run rails c --sandbox. Then you can do something like this:

  • Movie.update_all(title: 'Foo') to update all movie titles.
  • Movie.first.title will then display Foo.

If you exit the console, and run it again to check the first movie title, it will have the original title.

This is very useful for production debugging, for example when a user reports a weird error while updating their profile name. We could try to reproduce that error directly in sandbox mode, without affecting the application.
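The sandbox behaviour can be pictured as "change anything, then restore on exit". Here is a minimal plain-Ruby sketch of that idea. MovieStore and its methods are hypothetical; the real console relies on a database transaction that is rolled back, not on an in-memory snapshot.

```ruby
# Hypothetical in-memory store imitating the sandbox idea.
class MovieStore
  attr_reader :movies

  def initialize(movies)
    @movies = movies
  end

  # Takes a deep copy first, lets the block change data freely, and always
  # restores the copy afterwards, even if the block raises.
  def sandbox
    snapshot = Marshal.load(Marshal.dump(@movies))
    begin
      yield self
    ensure
      @movies = snapshot
    end
  end

  def update_all(attrs)
    @movies.each { |movie| movie.merge!(attrs) }
  end
end

store = MovieStore.new([{ title: "Ant-Man" }, { title: "Pixels" }])

title_inside = nil
store.sandbox do |sandboxed|
  sandboxed.update_all(title: "Foo")
  title_inside = sandboxed.movies.first[:title]
end

puts title_inside               # Foo
puts store.movies.first[:title] # Ant-Man
```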

Loading more seeds using Faker

If you need, say, 100 movies, you can replace your seeds file with this:

Movie.destroy_all

100.times do |index|
  Movie.create!(title: "Title #{index}",
                director: "Director #{index}",
                storyline: "Storyline #{index}",
                watched_on: index.days.ago)
end

p "Created #{Movie.count} movies"

But that doesn’t look realistic at all:

$ rails runner 'p Movie.select(:title, :director, :storyline).last'
#<Movie id: nil, title: "Title 99", director: "Director 99", storyline: "Storyline 99">

Time to use Faker, a gem that generates random values. Add it to the development and test group inside your Gemfile:

group :development, :test do
  # ...

  gem 'faker'
end

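Run bundle install and use these seeds (the positional arguments below match the Faker releases available at the time of writing; Faker 2.0 and later expect keyword arguments instead):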

Movie.destroy_all

100.times do |index|
  Movie.create!(title: Faker::Lorem.sentence(3, false, 0).chop,
                director: Faker::Name.name,
                storyline: Faker::Lorem.paragraph,
                watched_on: Faker::Time.between(4.months.ago, 1.week.ago))
end

p "Created #{Movie.count} movies"

Check one out again:

$ rails runner 'p Movie.select(:title, :director, :storyline, :watched_on).last'
#<Movie id: nil, title: "Minus perferendis delectus", director: "Scot Jenkins", storyline: "Quisquam aut dicta similique est repellendus. Maxi...", watched_on: "2015-06-25">

Much better!
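One side effect of random seeds is that every run produces different data. If you want repeatable output, the usual trick is to drive the randomness from a fixed seed. The sketch below shows that idea in plain Ruby. The word list and the fake_title helper are made up for illustration and are not Faker's API.

```ruby
# Hypothetical word list and helper, not Faker's API.
WORDS = %w[minus perferendis delectus quisquam dicta similique repellendus].freeze

# Builds a short title from randomly sampled words, using the given
# random number generator so that results can be reproduced.
def fake_title(rng, word_count = 3)
  WORDS.sample(word_count, random: rng).join(" ").capitalize
end

first_run  = Array.new(3) { |i| fake_title(Random.new(42 + i)) }
second_run = Array.new(3) { |i| fake_title(Random.new(42 + i)) }

puts first_run
puts "Repeatable: #{first_run == second_run}"
```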

Conclusion

Seeding your application while developing it is very important because it makes the application feel like it holds real data. It is also a good way to see how your pages cope with content of varying length, since random data does not come in one tidy size.

Also, knowing the available tools to work with seeds makes us feel more comfortable and more productive, so it’s worth it to invest some time to learn and practice.

Do you use any other interesting seeding technique or tool worth trying?


About Me

David Morales


I'm David Morales, a computer engineer from Barcelona, working on projects using Ruby on Rails and training on web technologies.
