How can I download a file using its generated filename with ruby and mechanize?


I'm attempting to download files from a website that uses a CDN for distribution. The URLs on the download page all end with file.pdf, but clicking a link in a browser downloads a file with a descriptive name (e.g. 'invoice1234.pdf'). Parsing the URL to get the file name therefore results in every file being named file.pdf. I would like to use the same file name that the browser uses. My code looks something like this:

  # The name is derived from the URL, so it always comes out as "file.pdf"
  filename = File.basename(mov_download_link.href)
  agent.pluggable_parser.default = Mechanize::Download
  agent.get(mov_download_link.href).save("#{path}/#{filename}")
  agent.pluggable_parser.default = Mechanize::File

Any ideas would be appreciated!


Solution

  • The browser most likely takes that file name from the Content-Disposition response header, which typically looks something like this:

    {'content-disposition' => 'attachment; filename="invoice1234.pdf"'}

    If so:

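    f = agent.get(mov_download_link.href)
    # Take the quoted name from the header; fall back to the URL if it is missing
    disposition = f.header['content-disposition'].to_s
    filename = disposition[/"(.*)"/, 1] || File.basename(mov_download_link.href)
    f.save("#{path}/#{filename}")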
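
The pattern `/"(.*)"/` only matches a quoted value, and some servers send the name unquoted. A slightly more defensive version could look like the sketch below. The helper name `filename_from_disposition` and the `"file.pdf"` default are hypothetical and not part of Mechanize. It is a minimal sketch that does not handle the extended `filename*=` form.

```ruby
# Hypothetical helper (not part of Mechanize): extract a file name from a
# Content-Disposition header value, returning `fallback` when none is found.
def filename_from_disposition(disposition, fallback = "file.pdf")
  return fallback if disposition.nil?

  # Accept either filename="quoted name" or filename=bare-token.
  match = disposition.match(/filename\s*=\s*(?:"([^"]*)"|([^;\s]+))/i)
  name = match && (match[1] || match[2])
  return fallback if name.nil? || name.empty?

  # Keep only the last path component so the header cannot point
  # the download at a different directory.
  File.basename(name)
end
```

With the code above it would be called as `filename_from_disposition(f.header['content-disposition'], File.basename(mov_download_link.href))`. It may also be worth inspecting `f.filename` first: as far as I know, Mechanize fills that attribute from the same header, but check this against the version you have installed.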