Web Scraping with Nokogiri::HTML and Ruby - Output to CSV issue

I have a script that scrapes HTML article pages of a webshop. I'm testing with a set of 22 pages of which 5 article pages have a product description and the others don't.

This code puts the right info on screen:

if doc.at_css('.product_description')
  doc.css('div > .product_description > p').each do |description|
    puts description
  end
else
  puts "no description"
end

But now I'm stuck on how to collect the found product descriptions in an array, from which I then write them to a CSV file.

I've tried several options, but none of them has worked so far. If I replace puts description with @description << description.content, all the descriptions end up in the top rows of the CSV, even though they do not belong to the articles in those rows.

When I also replace puts "no description" with @description = "no description", the first 14 lines in my CSV receive one letter of "no description" each. It looks funny, but it is not exactly what I need.

If more code is needed, just shout!

This is the CSV code I use in the script:

CSV.open("artinfo.csv", "wb") do |row|
    row << ["category", "sub-category", "sub-sub-category", "price", "serial number",  "title", "description"]
    ([email protected] - 1).each do |index|
    row << [


  • It sounds like your data isn't lined up properly. If it were, you should be able to do:

    CSV.open("artinfo.csv", "w") do |csv|
      csv << ["category", "sub-category", "sub-sub-category", "price", "serial number",  "title", "description"]
      [@categories, @subcategories, @subsubcategories, @prices, @serial_numbers, @title, @description].transpose.each do |row|
        csv << row
      end
    end
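
    One way to get the data lined up is to append exactly one description per page, whether or not the page has one. The following is only a rough sketch of that idea, not the asker's actual script: description_for is a hypothetical helper, the sample pages and titles are made up, and the inner arrays stand in for what something like doc.css('div > .product_description > p').map(&:content) would return for each page.

```ruby
require "csv"

# Hypothetical helper: collapse the paragraph texts of one page into a single
# cell, so every page contributes exactly one entry to the descriptions array.
def description_for(paragraphs)
  paragraphs.empty? ? "no description" : paragraphs.join(" ")
end

# Made-up sample data standing in for three scraped pages. With Nokogiri the
# inner arrays would come from something like
#   doc.css('div > .product_description > p').map(&:content)
pages = [
  ["First paragraph.", "Second paragraph."],
  [],
  ["Only paragraph."]
]
titles = ["Article A", "Article B", "Article C"]

# One description per page, in the same order as the titles.
descriptions = pages.map { |paragraphs| description_for(paragraphs) }

# Array#transpose raises IndexError if the columns differ in length,
# so misaligned data fails loudly instead of silently shifting rows.
rows = [titles, descriptions].transpose

csv_text = CSV.generate do |csv|
  csv << ["title", "description"]
  rows.each { |row| csv << row }
end

puts csv_text
```

    This would probably also explain both symptoms in the question. Pushing every paragraph separately makes the description array grow out of step with the other columns, so the descriptions pile up in the first rows. And @description = "no description" replaces the array with a String, so indexing it returns single characters; "no description" is 14 characters long, which matches the 14 rows with one letter each.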