Tags: ruby, ruby-on-rails-3, spreadsheet, xls, spreadsheet-gem

Spreadsheet Gem unbearably slow on Ruby 1.9.2


I'm building an Excel parser for my data team, and I've run into something of an issue with the Spreadsheet 0.6.5.1 gem.

In Ruby 1.9.2, calling Spreadsheet.open immediately jumps to 700 MB-1.3 GB of memory and hangs there indefinitely, even on small workbooks (1 sheet, 300 rows). Meanwhile, in Ruby 1.8.7, Spreadsheet.open is snappy and flawless.

Right now I'm doing a lot of my work in irb, so that I can keep the environment down to just the basics (rubygems and the spreadsheet gem), but I eventually need to move this parser into a Rails 3 project, so settling for 1.8.7 isn't an option.

I can't find any documentation on this issue, or even evidence of other people running into it. Whenever I abort the Spreadsheet.open call, I'm left with the same error every time:

gems/spreadsheet-0.6.5.1/lib/spreadsheet/worksheet.rb:181:in 'call'

I'd like to avoid monkey patching this, or diving directly into the gem to hack out a resolution. Has anyone else experienced this problem? Or anything similar?


Solution

  • Tweak your GC settings and see if that fixes anything:

    For REE:

    export RUBY_HEAP_MIN_SLOTS=1000000
    export RUBY_HEAP_SLOTS_INCREMENT=1000000
    export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
    export RUBY_GC_MALLOC_LIMIT=1000000000
    export RUBY_HEAP_FREE_MIN=500000
    

    Something similar should work on 1.9.x, YMMV.

    With these tweaks, a 25k-line Excel export using the spreadsheet gem went from 10+ minutes to about 2 minutes for us.
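
To check whether GC tuning actually helps in your case, it can be worth timing the call and counting GC runs before and after changing the settings. Below is a rough sketch, not anything from the spreadsheet gem itself: `measure_gc` is a hypothetical helper name, and the block at the bottom is a stand-in workload so the snippet runs without the gem installed. It assumes Ruby 1.9 or later, where `GC.count` is available.

```ruby
require 'benchmark'

# Hypothetical helper (not part of the spreadsheet gem): runs the given block
# and reports wall-clock time plus how many GC runs happened while it executed.
def measure_gc
  gc_before = GC.count
  result = nil
  seconds = Benchmark.realtime { result = yield }
  { :seconds => seconds, :gc_runs => GC.count - gc_before, :result => result }
end

# Stand-in workload: allocates many short-lived strings, roughly the kind of
# pressure a row-by-row parser puts on the garbage collector.
stats = measure_gc { Array.new(200_000) { |i| "row #{i}" }.size }
puts "took %.3fs, %d GC runs" % [stats[:seconds], stats[:gc_runs]]
```

In the real parser you would swap the stand-in block for something like `measure_gc { Spreadsheet.open('workbook.xls') }` (the file name is just a placeholder), run it once with the default environment and once with the exports above, and compare. If the GC run count is very high and drops sharply with the tuned settings, the GC is likely the bottleneck; if it barely changes, the slowdown is probably somewhere else.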