
Should we be using Faker in Rails Factories?


I love Faker, I use it in my seeds.rb all the time to populate my dev environment with real-ish looking data.

I've also just started using Factory Girl, which saves a lot of time too, but when I sleuth around the web for code examples I don't see much evidence of people combining the two.

Q. Is there a good reason why people don't use Faker in a factory?

My feeling is that by doing so I'd increase the robustness of my tests by seeding random - but predictable - data each time, which hopefully would increase the chances of a bug popping up.

But perhaps that's incorrect, and either there is no benefit over hard-coding a factory or I'm not seeing a potential pitfall. Is there a good reason why these two gems should or shouldn't be combined?


Solution

  • Some people argue against it, as here.

    DO NOT USE RANDOM ATTRIBUTE VALUES

    One common pattern is to use a fake data library (like Faker or Forgery) to generate random values on the fly. This may seem attractive for names, email addresses or telephone numbers, but it serves no real purpose. Creating unique values is simple enough with sequences:

    FactoryGirl.define do   
      sequence(:title) { |n| "Example title #{n}" }
    
      factory :post do
        title
      end 
    end
    
    FactoryGirl.create(:post).title # => 'Example title 1' 
    

    Your randomised data might at some stage trigger unexpected results in your tests, making your factories frustrating to work with. Any value that might affect your test outcome in some way would have to be overridden, meaning:

    Over time, you will discover new attributes that cause your test to fail sometimes. This is a frustrating process, since tests might fail only once in every ten or hundred runs – depending on how many attributes and possible values there are, and which combination triggers the bug. You will have to list every such random attribute in every test to override it, which is silly. So, you create non-random factories, thereby negating any benefit of the original randomness. One might argue, as Henrik Nyh does, that random values help you discover bugs. While possible, that obviously means you have a bigger problem: holes in your test suite. In the worst case scenario the bug still goes undetected; in the best case scenario you get a cryptic error message that disappears the next time you run the test, making it hard to debug. True, a cryptic error is better than no error, but randomised factories remain a poor substitute for proper unit tests, code review and TDD to prevent these problems.

    Randomised factories are therefore not only not worth the effort, they even give you false confidence in your tests, which is worse than having no tests at all.

    But nothing stops you from doing it if you want to.
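If you do go that route, the "fails once in a hundred runs" problem described in the quote gets easier to live with when the random source is seeded. The following is only a minimal plain-Ruby sketch of that idea, not Faker or FactoryGirl code: `NAMES`, `random_name` and `valid_name?` are hypothetical stand-ins for a fake-data generator and for the code under test.

```ruby
# Hypothetical stand-in for a Faker-style generator: picks a name at random.
NAMES = ["Ann", "Bob", "O'Brien", "Zoe"].freeze

def random_name(rng)
  NAMES.sample(random: rng)
end

# Hypothetical code under test with a latent bug: it rejects apostrophes.
def valid_name?(name)
  name.match?(/\A[A-Za-z]+\z/)
end

# Simulate many test runs: only the runs that draw "O'Brien" fail,
# so the failure shows up intermittently rather than every time.
rng = Random.new(1234)
failures = 1000.times.count { !valid_name?(random_name(rng)) }
puts "#{failures} of 1000 simulated runs failed"

# Two generators built from the same seed yield the same values,
# so a failing run can be replayed exactly once its seed is known.
first_rng  = Random.new(42)
replay_rng = Random.new(42)
first_run = Array.new(5) { random_name(first_rng) }
replay    = Array.new(5) { random_name(replay_rng) }
puts first_run == replay
```

Logging the seed on every test run is what makes the replay possible. Faker documents a comparable setting, `Faker::Config.random`, which accepts a seeded `Random` instance.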

    Oh, and recent versions of FactoryGirl offer an even easier way: you can declare a sequence inline in the factory. That quote was written for an older version.