How to export scrubyt extractor?


I've written a scrubyt extractor using the 'learning' technique: I give it sample text that currently appears on the page and it works out the XPath expressions itself. However, I now want to export the extractor so that it keeps working even after the page has changed.

The documentation for scrubyt seems to be all over the place now, but from what I can find, adding the line extractor.export(__FILE__) should work. It doesn't: I just get an error saying that export was called with the wrong number of arguments and expects 0. Calling it without any arguments fails as well.

I would ask on the scrubyt forum, but it seems like no-one's been there for ages!

Any ideas what to do here?


Solution

  • I just had the same problem and tried "puts google_data.export()" (I was trying to extract some results from Google).

    This gave me the following:

     === Extractor tree ===

     export() is not working at the moment, due to the removal or
     ParseTree, ruby2ruby and RubyInline. For now, in case you are
     using examples, you can replace them by hand based on the output
     below. So if your pattern in the learning extractor looks like

     book "Ruby Cookbook"

     and you see the following below:

     [book] /table[1]/tr/td[2]

     then replace "Ruby Cookbook" with "/table[1]/tr/td[2]" (and all the
     other XPaths) and you are ready!

     [link] /body/div/div/div/div/div/ol/li/h3/a

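    The [link] line at the end gave me the XPath I was looking for, so export() does still print the learned XPaths even though it no longer writes out a new extractor.

    This was with scrubyt version 0.4.06.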
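
If the extractor has many patterns, copying each XPath by hand gets tedious. Here is a minimal sketch of a standalone helper that pulls the "[name] /xpath" lines out of the printed text and rewrites them in the form the message suggests. It uses only the Ruby standard library, not scrubyt itself. The name xpaths_from_export is hypothetical, and the line format is assumed from the output quoted above rather than from any scrubyt documentation.

```ruby
# Hypothetical helper (not part of scrubyt): collect "[name] /xpath" lines
# from the text printed by export() into a Hash of pattern name => XPath.
# Assumes each pattern appears on its own line, as in the output quoted above.
def xpaths_from_export(output)
  output.each_line.with_object({}) do |line, found|
    # Match lines of the form "[pattern_name] /some/xpath"
    match = line.strip.match(%r{\A\[(\w+)\]\s+(/\S+)\z})
    found[match[1]] = match[2] if match
  end
end

# Sample input: only the extractor tree part of the printed text.
sample = <<~OUTPUT
  === Extractor tree ===

  [link] /body/div/div/div/div/div/ol/li/h3/a
OUTPUT

xpaths_from_export(sample).each do |name, xpath|
  # Print each pattern with the XPath in place of the learning example
  puts %(#{name} "#{xpath}")
end
```

Note that the help text itself contains a "[book] /table[1]/tr/td[2]" line as an illustration, so pass in only the extractor tree lines (or delete that example line first); otherwise it will be picked up as if it were one of your patterns.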