
Rake task runs twice on Heroku


I have an app on Heroku that scrapes pricing data from a website and saves it periodically. When I run the rake task locally it behaves as expected, but when I run it on Heroku it prints extra output and saves each price twice.

Output when I run locally

a,12.80 saved at 2012-10-26 03:36:17 UTC.
b,38.03 saved at 2012-10-26 03:36:24 UTC.
c,22.38 saved at 2012-10-26 03:36:31 UTC.

Output when I run on Heroku

a,12.80 saved at 2012-10-26 03:36:17 UTC.
b,38.03 saved at 2012-10-26 03:36:24 UTC.
c,22.38 saved at 2012-10-26 03:36:31 UTC.
#<Stock:0x000000047bb5e8>,12.80 saved at 2012-10-26 03:36:45 UTC.
#<Stock:0x000000047baaf8>,38.03 saved at 2012-10-26 03:36:48 UTC.
#<Stock:0x000000047b00a8>,22.38 saved at 2012-10-26 03:36:52 UTC.

Code

require 'open-uri'
require 'date'

namespace :data do
  desc "import current blah stock price data to database"
  task :importblah => :environment do

    # pass in a stock and price and save it to the database
    def save_stock(stock, price)
      # store stock data in database
      p = stock.prices.build
      p.price = price
      p.datetime = Time.now.utc.to_datetime

      if p.save
        puts "#{stock.symbol},#{price} saved at #{p.datetime.to_s}."
      else
        puts "#{stock.symbol} didn't save."
      end
    end

    actives = Parent.where("test1 = ?", true)

    actives.each do |m|
      stocks = m.stocks.where('test2 = ?', false)
      stocks.each do |stock|
        if stock.title.start_with?('blah')
          # grab stock price data from blah.com
          url = "blah" + stock.symbol
          doc = Nokogiri::HTML(open(url))
          price = doc.at_css(".value").text[/\d+\.\d+/]
          save_stock(stock, price)
        end
      end
    end
  end
end

I have an almost identical rake task for a different site, and it does not save the pricing data twice. I am using an Amazon RDS database, if that affects anything.


Solution

  • It turned out to be a git tracking problem: I had deleted some files locally but forgot to run

    git rm oldraketask.rake

    That old file contained a rake task with the same name as the one I was running. Because the file was still tracked, it was still deployed to Heroku, so Heroku ran both tasks consecutively: first the new one, then the one from the old file.
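The behaviour described above follows from how Rake handles repeated definitions: defining a task whose name already exists does not replace it, it appends another action to the same task. Here is a minimal sketch of that, assuming only the `rake` gem is available. The method name `run_duplicate_tasks` and the `:new_file` / `:old_file` markers are hypothetical stand-ins for the two .rake files, not code from the app in question.

```ruby
require 'rake'

# Defines the same task name twice, invokes it once, and returns the
# order in which the actions ran.
def run_duplicate_tasks
  # Use a fresh, isolated Rake application so no real tasks are touched.
  Rake.application = Rake::Application.new

  runs = []

  # First definition, standing in for the current .rake file.
  Rake::Task.define_task('data:importblah') { runs << :new_file }

  # Second definition with the same name, standing in for the stale file
  # that was still tracked by git. Rake appends this action to the task.
  Rake::Task.define_task('data:importblah') { runs << :old_file }

  # A single invocation runs every action attached to the task.
  Rake::Task['data:importblah'].invoke
  runs
end

puts run_duplicate_tasks.inspect  # prints [:new_file, :old_file]
```

If a task seems to run twice, `Rake::Task['data:importblah'].actions.size` can be used as a quick check: a value greater than 1 suggests the task is defined in more than one place.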