I asked nearly the same question in probably the wrong way, so I apologize for both the near duplicate and lousy original phrasing. I feel like my problem now is attempting to fight Rails, which is, of course, a losing battle. Accordingly, I am looking for the idiomatic Rails way to do this.
I have a table containing rows of user data that are scraped from a third-party site periodically. The old data is just as important as the new data; the old data is, in fact, probably used more often. There are no performance concerns about referencing the new data, because only a couple of people will ever use my service (I keep my standards realistic). But thousands of users are scraped periodically (i.e., way too often). I have named the corresponding models "User" and "UserScrape".
Table users has columns: id, name, email
Table user_scrapes has columns: id, user_id, created_at, address_id, awesomesauce_preference
Note: These are not the real models - user_scrapes has a lot more columns - but you probably get the point
At any given time, I want to find the most recent user_scrapes values for a given user, i.e., the data most recently retrieved from the external source for that user. I want to find out what my current awesomesauce_preference is, because lately it's probably 'lamesauce' but before, it was 'saucy_sauce'.
I want to have a convenient method that allows me to access the newest scraped data for each user in such a way that I can combine it with separate WHERE clauses to narrow it down further. That's because in at least a dozen parts of my code, I need to deal with the data from the latest scrape.
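To pin down the behaviour being asked for, here is a minimal plain-Ruby sketch of "newest scrape per user, then narrow further". It is not ActiveRecord: the Scrape struct, the SCRAPES sample rows, the integer created_at values, and the latest_per_user helper are all made up for illustration.

```ruby
# Plain-Ruby stand-in for a user_scrapes row (not an ActiveRecord model).
Scrape = Struct.new(:id, :user_id, :created_at, :address_id, :awesomesauce_preference)

# Hypothetical sample rows; created_at is an integer here to keep the sketch short.
SCRAPES = [
  Scrape.new(1, 1337, 1, 1234, "saucy_sauce"),
  Scrape.new(2, 1337, 2, 1235, "saucy_sauce"),
  Scrape.new(3, 1337, 3, 1234, "lamesauce"),
  Scrape.new(4, 42,   1, 1234, "saucy_sauce"),
  Scrape.new(5, 42,   2, 9999, "lamesauce"),
]

# Keep only the newest scrape for each user.
def latest_per_user(scrapes)
  scrapes.group_by(&:user_id).values.map { |rows| rows.max_by(&:created_at) }
end

# Narrow the "current" rows further, as an extra WHERE clause would.
current_at_1234 = latest_per_user(SCRAPES).select { |s| s.address_id == 1234 }
puts current_at_1234.map(&:id).inspect # prints [3]
```

User 42's scrape at address 1234 is dropped because it is not that user's latest scrape, which is the behaviour the extra WHERE clause is supposed to preserve.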
What I have done so far is this horrible hack that selects the latest user_scrapes for each user with a regular find_by_sql correlated sub-query, then I pluck out the ids of the scrapes, then I put an additional where clause in any relevant query (that needs the latest data).
This is already an issue performance-wise because I don't want to buffer over a million integers (yes, a lot of pages get scraped very often) then try to pass the MySQL driver a list of these and have it miraculously execute a perfect query plan. In my benchmark it took almost as long as it did for me to write this post, so I lied before. Performance is sort of an issue, but not really.
My question
So with my UserScrape class, how can I make a method called 'current', as in: UserScrape.find(1337).current.where(address_id: 1234).awesomesauce_preference when I live at addresses 1234 and 1235 and I want to find out what my awesomesauce_preference is at my latest address?
I think what you are looking for are scopes:
http://guides.rubyonrails.org/active_record_querying.html#scopes
In particular, you can probably use:
scope :current, -> { order("user_scrapes.created_at DESC").limit(1) }
Update:
Scopes are meant to return an ActiveRecord::Relation, so that you can continue chaining methods if you wish. There is nothing to prevent you (last I checked anyways) from writing this instead, however:
scope :current, -> { order("user_scrapes.created_at DESC").first }
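This returns just the one object, and is not chainable, but it may be a more useful function ultimately, for example:
UserScrape.where(address_id: 1234).current.awesomesauce_preference
One caveat: when no row matches, first returns nil, and as far as I know Rails scopes fall back to returning all records in that case, so a class method may be the safer way to write this version.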
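To illustrate the difference between the chainable and the non-chainable form, here is a small in-memory sketch. FakeRelation, Row, and ROWS are hypothetical stand-ins written for this example; they only mimic the chaining idea and are not how ActiveRecord is implemented.

```ruby
# In-memory stand-in for a relation; NOT ActiveRecord, just the same chaining idea.
Row = Struct.new(:id, :created_at, :address_id, :awesomesauce_preference)

class FakeRelation
  def initialize(rows)
    @rows = rows
  end

  # Returns another FakeRelation, so calls can keep chaining.
  def where(conditions)
    FakeRelation.new(@rows.select { |r| conditions.all? { |k, v| r[k] == v } })
  end

  # Relation-style "current": newest row, still wrapped so it stays chainable.
  def current
    FakeRelation.new(@rows.sort_by(&:created_at).reverse.first(1))
  end

  # Terminal call: hands back a single Row (or nil) and ends the chain.
  def first
    @rows.first
  end
end

# Hypothetical sample data; created_at is an integer for brevity.
ROWS = FakeRelation.new([
  Row.new(1, 1, 1234, "saucy_sauce"),
  Row.new(2, 2, 1235, "saucy_sauce"),
  Row.new(3, 3, 1234, "lamesauce"),
])

# Filter first, then take the newest of what is left.
puts ROWS.where(address_id: 1234).current.first.awesomesauce_preference # prints lamesauce

# Take the newest overall, then filter: the newest row is at 1234, so nothing is left.
puts ROWS.current.where(address_id: 1235).first.inspect # prints nil
```

The second call shows that the order of chaining matters: a "current" that keeps only one row overall is not the same thing as the latest row per user or per address.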