An experiment: Anonymized production data rather than a custom seed data script
This blog post describes an experiment with an alternative approach to seeding development data in a typical Rails application.
Click here if you want to skip the commonly used approaches and read about the alternative.
Here are two of the most commonly used patterns for development seed data that I’ve seen so far:
1) An idempotent seed data script that doesn’t generate unnecessary DELETE/INSERT SQL queries and doesn’t assume your database is empty before you run it. It tolerates existing data and updates it (if necessary) to match the desired state. Although this approach seems purist and graceful, it has a significant drawback that I have experienced in real production projects: it is rather time-consuming to maintain. You have to keep the seed script in mind every time you change a model or the database schema.
2) FactoryBot factories that generate the models and all the necessary associations, starting with a clean database. This approach is much easier to maintain than the first one, assuming your factories are up to date.
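The idempotency described in the first pattern can be sketched in plain Ruby. This is only an illustration under stated assumptions: the `seed_record` helper and the in-memory hash standing in for a database table are hypothetical and not taken from any real project. In a Rails app the same shape is usually expressed with ActiveRecord’s `find_or_initialize_by` followed by `update!`.

```ruby
# Ensures that the record stored under `key` matches the desired attributes.
# `store` is a plain Hash acting as a hypothetical stand-in for a table.
# Returns :created, :updated or :unchanged so repeated runs are observable.
def seed_record(store, key, desired)
  existing = store[key]

  if existing.nil?
    # Nothing there yet: this is the only case that "inserts".
    store[key] = desired.dup
    return :created
  end

  # Overlay the desired attributes on top of whatever is already stored.
  merged = existing.merge(desired)
  return :unchanged if merged == existing # no write needed

  store[key] = merged
  :updated
end

users = {}
seed_record(users, "admin@example.com", { role: "admin", name: "Admin" })
# Running the same line again changes nothing, which is the point.
seed_record(users, "admin@example.com", { role: "admin", name: "Admin" })
```

The maintenance cost mentioned above shows up here too: every new required column means revisiting the desired attributes for each seeded record.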
In one of my recent production projects I tried something else and greatly enjoyed the result.
Instead of spending time maintaining a manually written development seed data script, use an anonymized dump of a production database.
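To give a rough idea of what “anonymized” can mean at the row level, here is a minimal plain Ruby sketch. It is an assumption about one possible approach, not the exact tooling used in the project: the column names (`email`, `full_name`, `phone`), the `ANONYMIZERS` table and the `anonymize_row` helper are all hypothetical.

```ruby
# Hypothetical per-column rules. Each rule receives the whole row and
# returns a harmless replacement value for that column.
ANONYMIZERS = {
  # Deriving the value from the primary key keeps emails unique.
  email:     ->(row) { "user-#{row.fetch(:id)}@example.com" },
  full_name: ->(row) { "User #{row.fetch(:id)}" },
  phone:     ->(_row) { nil }
}.freeze

# Returns a copy of the row with sensitive columns replaced.
# Columns without a rule (ids, foreign keys, flags) are left untouched,
# so associations between tables keep working.
def anonymize_row(row)
  replacements = ANONYMIZERS.each_with_object({}) do |(column, rule), acc|
    acc[column] = rule.call(row) if row.key?(column)
  end
  row.merge(replacements)
end

production_row = { id: 7, email: "jane@corp.test", full_name: "Jane Roe", plan: "pro" }
safe_row = anonymize_row(production_row)
```

Because only the sensitive columns are rewritten, the shape and volume of the data stay close to production, which is what makes such a dump useful for development.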
Since this is a rather uncommon and opinionated approach, and such approaches always come with their own pros and cons, let’s review them all.