
Rails: visiting a page inside Capybara.using_session(...) with ruby-debug installed makes the rspec suite hang after the run, even though all specs pass


Rails 7.0.4

Rspec 3.11.0

Capybara 3.37.1

selenium-webdriver (gem) 4.5.0

with Ruby-debug (gem 'debug', platforms: %i[ mri mingw x64_mingw ]) in Gemfile

Whenever I add a Capybara.using_session call, my suite hangs. It hangs after it completes all the specs, even when they all pass.

My driver is set to:

Capybara.default_driver = :selenium_chrome_headless

but it also happens if I change the driver to Capybara.default_driver = :selenium_chrome

here's the spec that reproduces it:

require 'rails_helper'

describe "client app", type: :feature  do
  describe "when starting the experience", type: :feature do
    # TODO: figure out why capy is hanging here

    it "can load the home page" do
      # THIS CAUSES AFTER-RSPEC HANG even though all specs pass
      Capybara.using_session("client session") do
        visit "/"
      end
    end

    it "loads with a client id" do
      visit '/'
    end
  end
end

I have isolated it to these THREE conditions :

  1. using Capybara.using_session and visiting a page using visit inside the using_session block (if I remove either the visit or the using_session, it works)

  2. In another test in the same suite, visiting any other page using visit (if I remove the visit call in the other spec, it works)

  3. the debug gem (ruby-debug) is in the Gemfile (if I remove gem 'debug', platforms: %i[ mri mingw x64_mingw ] from the Gemfile, it works)

This is a very strange bug.

The Capybara hang looks like this: [screenshot not preserved]

my spec/rails_helper.rb file is

require 'spec_helper'
ENV['RAILS_ENV'] ||= 'test'
require_relative '../config/environment'

abort("The Rails environment is running in production mode!") if Rails.env.production?
require 'rspec/rails'

require 'capybara/rails'
require 'capybara/rspec'

Dir[Rails.root.join('spec', 'support', '**', '*.rb')].sort.each { |f| require f }

RSpec.configure do |config|
  config.use_transactional_fixtures = true
  config.infer_spec_type_from_file_location!
  config.filter_rails_from_backtrace!

end

Capybara.default_driver = :selenium_chrome_headless

IMPORTANT: Although I have isolated it this far, I am strangely unable to isolate it any further or even able to reproduce it in a new Rails app, even though this Rails app is just a month old.

Here, I have taken the app where it is broken and removed everything else (really, I removed EVERYTHING and the hang still happens).

I have compared the remaining code carefully to a newly generated app and, bizarrely, I cannot reproduce the bug there (using the exact same spec and setup), so it must involve a 4th unknown variable I cannot see yet.

REPRODUCTION

Bare-bones reproduction app here: https://github.com/jasonfb/StrangeCapybaraHang

This app was created as a FORK of my app, then I removed everything non-essential. Notice that it contains almost nothing, and only 1 spec.

When I run rspec on this (broken) app, I see "DEBUGGER: Attaching after process 77726 fork to child process 77733" and the suite hangs after it runs.

Here, on this 2nd attempt to recreate the problem, I tried to go forward to create the bug: https://github.com/jasonfb/StrangeCapyHangForward

I set up a new Rails app and then attempted to recreate all of the conditions explained above to produce the bug... BUT... I can't reproduce the bug in this new app!!

So that means even though I've identified the 3 elements of the bug, there MUST be a 4th element I am not seeing yet.

Notice that this app has the exact same spec (see spec/system/test_capy_hand_spec.rb), the exact same rails_helper file, the exact same gems, etc. as the other repo where the bug manifests.
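
One way to test the "same exact gems" claim is to compare the versions each app actually resolved in its Gemfile.lock, since an unpinned gem in the Gemfile can resolve to different versions in two apps. The following is only a rough sketch: locked_versions and lockfile_differences are hypothetical helper names, and the parsing is a simplification that assumes resolved gems appear as lines indented exactly four spaces in the form "name (version)".

```ruby
# Hypothetical helper: map gem name => resolved version from Gemfile.lock text.
# Simplification: only lines indented exactly four spaces, shaped like
# "name (version)", are treated as resolved gems.
def locked_versions(lockfile_text)
  lockfile_text.each_line.with_object({}) do |line, versions|
    match = line.match(/\A {4}(\S+) \(([^)]+)\)\s*\z/)
    versions[match[1]] = match[2] if match
  end
end

# Return { gem_name => [version_in_a, version_in_b] } for gems whose
# resolved versions differ (nil means the gem is absent from that lockfile).
def lockfile_differences(text_a, text_b)
  a = locked_versions(text_a)
  b = locked_versions(text_b)
  (a.keys | b.keys).sort.each_with_object({}) do |name, diff|
    diff[name] = [a[name], b[name]] unless a[name] == b[name]
  end
end

# Usage: ruby compare_locks.rb path/to/first/Gemfile.lock path/to/second/Gemfile.lock
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  diff = lockfile_differences(File.read(ARGV[0]), File.read(ARGV[1]))
  diff.each { |name, (a, b)| puts "#{name}: #{a.inspect} vs #{b.inspect}" }
end
```

Any gem that shows up in the output is a candidate for the unknown 4th variable.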

However, on this app, I never see "DEBUGGER: Attaching after process 77726 fork to child process...." even when debug Gem is in the Gemfile. Why is that?

I also DO NOT see the hang.


2022-12-15 SOLVED!


Solution

  • TL;DR

    The issue is in the selenium-webdriver gem. Upgrade from 4.5.0 in your StrangeCapybaraHang project to 4.7.1 and it should solve your problem immediately.

    Reproducing from Scratch

    1. Fork the StrangeCapyHangForward repo.
    2. Run ./bin/setup && yarn build. Note, if esbuild is not globally installed, you'll need to add it with yarn add esbuild since it's missing from package.json.
    3. Downgrade the selenium-webdriver gem to the same version that is in StrangeCapybaraHang. For example, in your Gemfile's :test group: gem 'selenium-webdriver', '4.5.0'
    4. bundle install && rspec. It hangs.

    Example Gemfile with the non-working selenium-webdriver version added

    # Gemfile of StrangeCapyHangForward
    group :development, :test do
      gem 'debug', platforms: %i[ mri mingw x64_mingw ]
      gem 'rspec-rails'
      # add the older version of the gem
      gem 'selenium-webdriver', '4.5.0'
    end
    

    Steps to Fix

    1. Change the selenium-webdriver line in the Gemfile of StrangeCapyHangForward to gem 'selenium-webdriver', '4.7.1'.
    2. bundle install
    3. Run rspec. No more hang!
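
To confirm which selenium-webdriver version the test process actually loaded, a small check like the one below could go near the end of spec/rails_helper.rb. This is a sketch under assumptions: selenium_version_ok? and loaded_selenium_version are hypothetical helper names, and 4.7.1 is only the version reported as working here, not necessarily the first release without the hang.

```ruby
require 'rubygems'

# Assumption: 4.7.1 is the known-good version from this answer; the first
# release that avoids the hang was not identified.
KNOWN_GOOD_SELENIUM = Gem::Version.new('4.7.1')

# Hypothetical helper: true when the given version string is at least 4.7.1.
def selenium_version_ok?(version_string)
  Gem::Version.new(version_string) >= KNOWN_GOOD_SELENIUM
end

# Version of selenium-webdriver activated in this process, or nil if the
# gem has not been loaded.
def loaded_selenium_version
  spec = Gem.loaded_specs['selenium-webdriver']
  spec && spec.version.to_s
end

version = loaded_selenium_version
if version && !selenium_version_ok?(version)
  warn "selenium-webdriver #{version} is older than #{KNOWN_GOOD_SELENIUM}; the suite may hang on exit"
end
```

Gem::Version compares segments numerically, so a version such as 4.10.0 is correctly treated as newer than 4.7.1.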

    Note: You may have to grep and kill existing rspec processes. For example:

    ps -ax | grep rspec
    # take the process ids listed above and kill each one (substituting your own pids for 71529):
    kill -9 71529
    

    Example Gemfile with working selenium-webdriver gem added

    # Example Gemfile with working version of selenium-webdriver
    group :development, :test do
      gem 'debug', platforms: %i[ mri mingw x64_mingw ]
      gem 'rspec-rails'
      # pin the selenium-webdriver gem to 4.7.1
      gem 'selenium-webdriver', '4.7.1'
    end
    

    I believe the issue is related to concurrency and rspec not properly killing all the test processes, although this is difficult to confirm. We had big trouble with our test suite and concurrency, too, until we forcibly upgraded many of our gems (including dependencies).

    For example, before upgrading, run rspec a few times and grep your processes: ps -ax | grep rspec. You'll see lots of leftover ones there. I did check the selenium-webdriver changelog to see if there was anything obvious that would cause this, but I didn't see anything.
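
The process check above can also be scripted, which makes it easier to compare counts before and after the upgrade. This is a minimal sketch, assuming the PID is the first whitespace-separated column of the ps -ax output (as on macOS and Linux); rspec_pids is a hypothetical helper name.

```ruby
# Hypothetical helper: extract the PIDs of rspec processes from `ps -ax` output.
# Assumes the PID is the first whitespace-separated column of each line.
def rspec_pids(ps_output)
  ps_output.each_line.filter_map do |line|
    pid, rest = line.strip.split(/\s+/, 2)
    next unless pid.to_s.match?(/\A\d+\z/) && rest # skips the header and blank lines
    next unless rest.include?('rspec')
    next if rest.include?('grep') # skip the grep process itself

    pid.to_i
  end
end
```

Calling rspec_pids(%x(ps -ax)) returns the leftover PIDs. If they need to be stopped, Process.kill('TERM', pid) is a gentler first attempt than kill -9.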

    Anyways, hope this helps! Good luck!