Tags: javascript, mysql, json, phantomjs, casperjs

Saving table data obtained while scraping a webpage using casperjs


What is the best way to save table data scraped from a webpage with CasperJS?

  1. Serialize the data as a JSON object and store it in a file.

  2. Send the data to a PHP script via an AJAX request and store it in a MySQL database.


Solution

  • For simplicity's sake, treat CasperJS as a way to get the data, and handle that data afterwards in another language. I would go with option #1: get the data in JSON format and save it to a file to work on later.

    To do this, you can use the File System API that PhantomJS provides. You can also combine it with CasperJS's command-line interface, which lets you pass arguments into the script (for example, the path of a temporary file to write to).
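    As a rough sketch of that idea, the script below reads an output path from the first positional command-line argument with casper.cli.get(0) and writes the scraped rows as JSON with the PhantomJS fs module. The URL, the table#data selector, the file name scrape.js and the helper rowsToRecords are assumptions made for illustration, not details from the question.

```javascript
// scrape.js (hypothetical name). Run as: casperjs scrape.js /path/to/output.json

// Pure helper: combine header names and row cell arrays into an array of objects.
function rowsToRecords(headers, rows) {
    return rows.map(function (cells) {
        var record = {};
        headers.forEach(function (name, i) {
            record[name] = cells[i];
        });
        return record;
    });
}

// The `phantom` global only exists inside PhantomJS/CasperJS. The guard lets the
// helper above be loaded elsewhere (for example in a unit test) without scraping.
if (typeof phantom !== 'undefined') {
    var fs = require('fs');                  // PhantomJS File System module
    var casper = require('casper').create();
    var outputPath = casper.cli.get(0);      // first positional argument

    // Assumed URL and selector: replace with the page and table you are scraping.
    casper.start('http://example.com/table.html', function () {
        // evaluate() runs in the page context and returns serializable data.
        var table = this.evaluate(function () {
            function texts(nodes) {
                return Array.prototype.map.call(nodes, function (node) {
                    return node.textContent.trim();
                });
            }
            var rowNodes = document.querySelectorAll('table#data tbody tr');
            return {
                headers: texts(document.querySelectorAll('table#data th')),
                rows: Array.prototype.map.call(rowNodes, function (tr) {
                    return texts(tr.querySelectorAll('td'));
                })
            };
        });
        var records = rowsToRecords(table.headers, table.rows);
        fs.write(outputPath, JSON.stringify(records), 'w');
    });

    casper.run(function () {
        this.exit();
    });
}
```

    Keeping the row-to-object conversion in a plain function means the data shaping can be checked without launching a browser.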

    A wrapper script to handle all of this would do the following:

    1. Get a temporary file path (mktemp on Linux systems).
    2. Call your CasperJS script, passing in that temporary file path as an argument.
    3. In the CasperJS script, get your data, write it to that file using the File System API, and exit.
    4. Back in the wrapper, read the file, do your work with it (save to a database, etc.), and remove the temporary file.
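    The steps above could be sketched as a small wrapper. The answer leaves the wrapper language open, so Node.js is used here purely for illustration. The sketch assumes a casperjs executable on the PATH and a CasperJS script named scrape.js that writes JSON to the path it is given. The names runCasper and scrapeToRecords are hypothetical.

```javascript
var childProcess = require('child_process');
var fs = require('fs');
var os = require('os');
var path = require('path');

// Step 2 (default): run the CasperJS script and wait for it to exit.
// Assumes `casperjs` is on the PATH and `scrape.js` is in the current directory.
function runCasper(outputPath) {
    childProcess.execFileSync('casperjs', ['scrape.js', outputPath], { stdio: 'inherit' });
}

// Runs the whole workflow and returns the parsed records.
// `runScraper` is optional, so a different scraper command can be swapped in.
function scrapeToRecords(runScraper) {
    runScraper = runScraper || runCasper;

    // Step 1: create a private temporary directory and choose a file path in it
    // (the Node.js counterpart of calling mktemp).
    var tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'scrape-'));
    var outputPath = path.join(tmpDir, 'table.json');

    try {
        // Steps 2 and 3: the scraper writes its JSON to outputPath and exits.
        runScraper(outputPath);
        // Step 4: read the file back and parse it.
        return JSON.parse(fs.readFileSync(outputPath, 'utf8'));
    } finally {
        // Step 4 (cleanup): remove the temporary file and directory,
        // even if scraping or parsing failed.
        if (fs.existsSync(outputPath)) {
            fs.unlinkSync(outputPath);
        }
        fs.rmdirSync(tmpDir);
    }
}
```

    Calling scrapeToRecords() with no argument runs the real scraper. The returned array can then be passed to whatever storage code you prefer, such as a database insert.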