File.Stream for a file on S3?


Is it possible to stream a file from a URL, for example one hosted on Amazon S3?

I tried to do:

stream = File.stream!(public_s3_file_path)

I got an error, although the file exists and is public:

** (File.Error) could not stream "filepath...": no such file or directory
    (elixir) lib/file/stream.ex:78: anonymous fn/2 in Enumerable.File.Stream.reduce/3
    (elixir) lib/stream.ex:1240: anonymous fn/5 in Stream.resource/3
    (elixir) lib/stream.ex:785: Stream.do_transform/8
    (elixir) lib/stream.ex:1403: Enumerable.Stream.do_each/4
    (elixir) lib/task/supervised.ex:216: Task.Supervised.stream_reduce/10
    (elixir) lib/stream.ex:570: Stream.run/1
    (elixir) lib/task/supervised.ex:85: Task.Supervised.do_apply/2
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Function: &Stream.run/1
    Args: [#Function<0.55381211/2 in Task.Supervisor.build_stream/5>]

Here is the minimum code to try:

stream = File.stream!("https://s3.eu-central-1.amazonaws.com/trackware.staging.schools/school-2/uploads/process%3A15073648797260")
Task.start_link(Stream, :run, [stream])

The error:

16:19:49.569 [error] Task #PID<0.168.0> started from #PID<0.165.0> terminating
** (File.Error) could not stream "https://s3.eu-central-1.amazonaws.com/trackware.staging.schools/school-2/uploads/process%3A15073648797260": no such file or directory
    (elixir) lib/file/stream.ex:78: anonymous fn/2 in Enumerable.File.Stream.reduce/3
    (elixir) lib/stream.ex:1240: anonymous fn/5 in Stream.resource/3
    (elixir) lib/stream.ex:570: Stream.run/1
    (elixir) lib/task/supervised.ex:85: Task.Supervised.do_apply/2
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Function: &Stream.run/1
    Args: [%File.Stream{line_or_bytes: :line, modes: [:raw, :read_ahead, :binary], path: "https://s3.eu-central-1.amazonaws.com/trackware.staging.schools/school-2/uploads/process%3A15073648797260", raw: true}]

Any ideas? Maybe File.stream! only works for local files?


Solution

  • File.stream! only works with local files. To read from an HTTP server you need an HTTP client. The HTTPoison package, for example, supports streaming requests: the response is delivered to a process as messages, as and when the remote server writes to the connection socket. You can read more about it in HTTPoison's README.

    iex> HTTPoison.get!("https://github.com/", %{}, stream_to: self())
    %HTTPoison.AsyncResponse{id: #Reference<0.0.0.1654>}
    iex> flush()
    %HTTPoison.AsyncStatus{code: 200, id: #Reference<0.0.0.1654>}
    %HTTPoison.AsyncHeaders{headers: %{"Connection" => "keep-alive", ...}, id: #Reference<0.0.0.1654>}
    %HTTPoison.AsyncChunk{chunk: "<!DOCTYPE html>...", id: #Reference<0.0.0.1654>}
    %HTTPoison.AsyncEnd{id: #Reference<0.0.0.1654>}
    :ok
    

    (Adapted from an example in the HTTPoison README.)
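
To get something closer to what File.stream! gives you, the async messages shown above can be wrapped in Stream.resource/3. The following is only a minimal sketch under some assumptions: the module name RemoteFile, its functions, and the 15-second timeout are hypothetical, the "~> 2.0" version requirement is a guess at what you have installed, redirects are not handled, and the cleanup step relies on HTTPoison being backed by hackney.

```elixir
# Assumption: the :httpoison package (a wrapper around hackney) is available from Hex.
Mix.install([{:httpoison, "~> 2.0"}])

defmodule RemoteFile do
  @moduledoc "Hypothetical helper that lazily streams the body of an HTTP(S) URL."

  # Arbitrary illustrative limit on how long to wait for the next message.
  @timeout 15_000

  # Returns a lazy stream of raw body chunks (binaries of arbitrary size).
  # The request is only started when the stream is enumerated, so self()
  # refers to the process that consumes the stream.
  def stream!(url) do
    Stream.resource(
      fn -> HTTPoison.get!(url, [], stream_to: self()) end,
      &next_chunk/1,
      # Best-effort cleanup if the consumer stops early or an error is raised.
      fn %HTTPoison.AsyncResponse{id: id} -> :hackney.stop_async(id) end
    )
  end

  # Turns a stream of arbitrary chunks into a stream of lines.
  # Unlike File.stream!, the emitted lines do not keep their trailing "\n".
  def lines(chunks) do
    chunks
    |> Stream.concat([:eof])
    |> Stream.transform("", fn
      :eof, "" ->
        {[], ""}

      :eof, buffer ->
        {[buffer], ""}

      chunk, buffer ->
        # Everything before the last "\n" is complete; the tail is kept as buffer.
        {complete, [rest]} = (buffer <> chunk) |> String.split("\n") |> Enum.split(-1)
        {complete, rest}
    end)
  end

  # Waits for the next async message belonging to this request.
  defp next_chunk(%HTTPoison.AsyncResponse{id: id} = resp) do
    receive do
      %HTTPoison.AsyncStatus{id: ^id, code: 200} -> {[], resp}
      %HTTPoison.AsyncStatus{id: ^id, code: code} -> raise "unexpected HTTP status #{code}"
      %HTTPoison.AsyncHeaders{id: ^id} -> {[], resp}
      %HTTPoison.AsyncChunk{id: ^id, chunk: chunk} -> {[chunk], resp}
      %HTTPoison.AsyncEnd{id: ^id} -> {:halt, resp}
      %HTTPoison.Error{id: ^id, reason: reason} -> raise "request failed: #{inspect(reason)}"
    after
      @timeout -> raise "timed out waiting for the next chunk"
    end
  end
end
```

With that in place, something like `url |> RemoteFile.stream!() |> RemoteFile.lines() |> Enum.each(&IO.puts/1)` should process the remote file line by line without loading it all into memory. Note that plain `stream_to:` offers no back-pressure: the server's chunks queue up in the consumer's mailbox if it reads slowly.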