An experiment: Anonymized production data rather than custom seed data script

This blog post explains the experiment of trying out an alternative approach to seed development data in a typical Rails application.

Click here if you want to skip the commonly used approaches and read about an an alternative.

Two of the most commonly used patterns for development seed data that I’ve seen so far:

1) idempotent seed data script that doesn’t generate unnecessary DELETE/INSERT SQL queries, it doesn’t assume your database is empty before you run this script. It is tolerant of the existing data and updates it(if necessary) to match the desired state. Although this approach seems very purist and graceful there is a significant drawback that I experienced in real production projects. It is rather time-consuming to maintain that kind of seed data setup. That’s something that you always have to keep in mind everytime you make some model/db schema changes.

2) FactoryBot models that generate all the necessary associations starting with a clean database. This approach is much easier in maintenance in comparison with the first approach assuming your factories are up to date.

In one of my recent production project I’ve tried something else and greatly enjoyed the result.

Instead of spending time supporting manually written development seed data script, use an anonymized dump of a production database.

Since this is a rather uncommon and opinionated approach, and they always come in with their pros and cons, let’s review them all.

Read More

Rake: How to export PostgreSQL dump?

You may find this Rake task handy for generating PostgreSQL database dumps in a typical Rails project:

namespace :db do
  desc "Dumps the database to db/APP_NAME.dump"
  task dump: :environment do
    raise("Could only be executed in development environment") unless Rails.env.development?

    with_config do |app, host, db, user|
      cmd = "pg_dump --host #{host} --username #{user} -Fp --verbose --clean --no-owner --no-acl --format=c #{db} > #{Rails.root}/db/#{app}.dump"
      puts cmd
      exec cmd
    end
  end

  private

  def with_config
    yield Rails.application.class.parent_name.underscore,
     'localhost',
      ActiveRecord::Base.connection_config[:database],
      ENV['USER']
  end
end

Although this may not work in case you have different database access configuration.

How to find Git commits from the past week

You may find this little snippet useful for your weekly stand up reports.

This command counts all commits authored by “Joe” in the currently opened branch:

git log --since=7.days | grep Joe | wc -l

 

This command counts all commits authored by “Joe” in all branches:

git log --since=7.days --all | grep Joe | wc -l

Heroku command-line interface for running code snippets remotely

This blog post is an explanation about a method for executing one-off scripts and snippets on Heroku that might be better than other solutions.

Starting with drawbacks of alternatives

1) pasting prior written code into running heroku run rails console console. Depending on settings of your editor/IDE/clipboard-manager you may experience some issues trying to run this code. There is a chance of it being corrupted and not looking like a valid Ruby code, especially if it is a long multi-line snippet.

2) adding this code as a temporary method/class and running it in production like heroku run lib/temp_snippet.rb. It generates additional commit and a deploy(although these 2 might be important logging artifacts in some cases). What’s definitely not desirable is some downtime caused by that deploy(unless you use Heroku Preboot)

3) Running it as regular Rails db migration has the drawbacks of (2) plus additional maintenance cost.

An alternative is to run your snippet like so:

cat foo.rb | heroku run console --app=bar --no-tty

Read More

Bitbucket Gemfile git_source

Any private gems in Gemfile that are hosted on bitbucket?

Add separate git_source for bitbucket provider to clean up your Gemfile. Here’s an example:

git_source(:bitbucket) { |repo| "git@bitbucket.org:#{repo}.git" }

gem 'my_private_gem', bitbucket: 'username/my_private_gem'

Describe PostgreSQL table from command line

You may be familiar with annotate gem(GitHub, Rubygems). Neat library that adds comments summarizing the current DB schema to the top of each of your model/factory/spec. First time I’ve used it in production project(5+ yrs ago) it felt convenient, especially in big apps with lots of models and fields. I no longer needed to keep in my mind specific field names neither needed I to look them up in DB administration tools. That was big productivity boost.

After using this whole model annotation concept in plenty of projects and thinking about it I would prefer not to use it in future projects. Here’s why:

Read More

Expanding context in diffs

Whether it’s reviewing open source project, or using git on a private project, we all spend a lot of time looking at diffs. Although some prefer graphical interface clients I use the plain command line.

But to understand a code change, we sometimes need to see more than just the default three lines of context especially in cases when long method names.

That’s when this git option come in handy:

git diff -U15 app/models

It generates diffs with 15 lines of context instead of the usual three.

Depending on your commit you may set the most appropriate number for lines of context to get the exact amount of context info.