
Ruby - How to get the name of a file with open-uri?


I want to download a music file this way:

require 'open-uri'

source_url = "http://soundcloud.com/stereo-foo/cohete-amigo/download"

attachment_file = "test.wav"

open(attachment_file, "wb") do |file|
  # Kernel#open no longer opens URLs as of Ruby 3.0; use URI.open instead
  file.print URI.open(source_url).read
end

In that example I want to replace "test.wav" with the real file name (as the JDownloader program does, for example).

EDIT: I don't mean the temporary file. I mean the name of the file as stored on the web, the one JDownloader gets: "Cohete Amigo - Stereo Foo.wav"

Thank you for reading.

UPDATE:

I've tried this to store the name:

attachment_file = File.basename(open(source_url))

I don't think that makes sense, but I don't know how to do it, sorry.


Solution

  • The filename is stored in the header field named Content-Disposition. However, decoding this field can be a little tricky. See this discussion, for example:

    How to encode the filename parameter of Content-Disposition header in HTTP?

    With open-uri you can access all the header fields through the meta accessor of the returned object (a Tempfile or StringIO extended with OpenURI::Meta):

    require 'open-uri'

    f = URI.open('http://soundcloud.com/stereo-foo/cohete-amigo/download')
    f.meta['content-disposition']
    => "attachment;filename=\"Stereo Foo - Cohete Amigo.wav\""
    

    So in order to decode something like that you could do this:

    cd = f.meta['content-disposition']
    filename = cd.match(/filename=(\"?)(.+)\1/)[2]
    => "Stereo Foo - Cohete Amigo.wav"
    

    This works for your particular case, and it also works when the quotes " are absent. More complex Content-Disposition values can still cause trouble, for example non-ASCII (UTF-8) filenames sent through the percent-encoded filename*= parameter described in RFC 6266, which this regex does not decode. I'm not sure how often that form is used, or whether SoundCloud ever sends it, so you may not need to worry about it (neither confirmed nor tested).

    You could also use a more advanced web-crawling framework like Mechanize, and trust it to do the decoding for you:

    require 'mechanize'
    
    agent = Mechanize.new
    file = agent.get('http://soundcloud.com/stereo-foo/cohete-amigo/download')
    file.filename
    => "Stereo_Foo_-_Cohete_Amigo.wav"
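
If you would rather stay with open-uri, the regex approach above can be extended a little. The following is only a sketch, not a full RFC 6266 parser: it prefers the percent-encoded filename*= parameter when present, falls back to the plain filename= parameter (quoted or not), and strips any directory components from the result. The names filename_from_content_disposition, decode_ext_value and download_with_server_name, as well as the 'download.bin' fallback, are made up for this example and are not part of open-uri.

```ruby
require 'open-uri'

# filename*=charset'lang'percent-encoded-value (extended form, RFC 6266)
EXT_FILENAME   = /(?:^|;)\s*filename\*\s*=\s*([^';\s]+)'[^']*'([^;\s]+)/i
# filename="quoted value" or filename=token
PLAIN_FILENAME = /(?:^|;)\s*filename\s*=\s*(?:"((?:\\.|[^"\\])*)"|([^;\s]+))/i

# Hypothetical helper: percent-decode an extended value and convert it
# to UTF-8. Returns nil if the charset is unknown or the bytes are invalid.
def decode_ext_value(charset, value)
  bytes = value.b.gsub(/((?:%\h\h)+)/) { |hex| [hex.delete('%')].pack('H*') }
  bytes.force_encoding(charset).encode('UTF-8')
rescue ArgumentError, EncodingError
  nil
end

# Hypothetical helper: extract a filename from a Content-Disposition
# header value, or return nil when the header carries none.
def filename_from_content_disposition(header)
  return nil if header.nil? || header.empty?

  name = nil
  if (ext = header.match(EXT_FILENAME))
    name = decode_ext_value(ext[1], ext[2])
  end
  if name.nil? && (plain = header.match(PLAIN_FILENAME))
    # Quoted values may contain backslash escapes; tokens are used as-is.
    name = plain[1] ? plain[1].gsub(/\\(.)/, '\1') : plain[2]
  end
  return nil if name.nil? || name.empty?

  # Never trust a server-supplied path: keep only the last component.
  File.basename(name)
end

# Hypothetical helper: download url into dir under the server-supplied
# name, using fallback when no usable name is sent. Returns the path.
def download_with_server_name(url, dir: '.', fallback: 'download.bin')
  URI.open(url) do |remote|
    name = filename_from_content_disposition(remote.meta['content-disposition'])
    path = File.join(dir, name || fallback)
    File.open(path, 'wb') { |out| IO.copy_stream(remote, out) }
    path
  end
end

# Usage (performs a real HTTP request, so it is left commented out):
# download_with_server_name('http://soundcloud.com/stereo-foo/cohete-amigo/download')
```

Unlike the one-line regex, the plain-form pattern here stops an unquoted name at the first space or semicolon, so a server that sends an unquoted name containing spaces would get a truncated result.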