Is it safe to use Task.start() inside of a Task.async_stream()?


I'm learning the Elixir Task module. https://hexdocs.pm/elixir/1.12/Task.html

Is it safe to use Task.start() inside of a Task.async_stream()?

Will this cause a process leak?


Solution

  • Will this cause a process leak?

    You could write this:

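    defmodule A do
      # Returns a lazy stream; nothing runs until the stream is consumed.
      def go do
        Task.async_stream(
          [500, 100],
          fn x ->
            # Start a separate process that never finishes.
            {:ok, pid} = Task.start(fn -> Process.sleep(:infinity) end)
            IO.inspect(pid)
            Process.sleep(x)
          end
        )
      end
    end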
    

    In iex:

    ~/elixir_programs% iex a.ex
    Erlang/OTP 24 [erts-12.3.2] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1]
    
    Interactive Elixir (1.14.4) - press Ctrl+C to exit (type h() ENTER for help)
    
    iex(3)> stream = A.go                             
    #Function<3.111522547/2 in Task.build_stream/3>
    
    iex(4)> Stream.run(stream)                        
    #PID<0.121.0>
    #PID<0.123.0>
    :ok
    
    iex(5)> IEx.configure(inspect: [limit: :infinity])
    :ok
    
    iex(5)> Process.list      
    [#PID<0.0.0>, #PID<0.1.0>, #PID<0.2.0>, #PID<0.3.0>, #PID<0.4.0>, #PID<0.5.0>,
     #PID<0.6.0>, #PID<0.7.0>, #PID<0.10.0>, #PID<0.42.0>, #PID<0.44.0>,
     #PID<0.46.0>, #PID<0.47.0>, #PID<0.49.0>, #PID<0.50.0>, #PID<0.51.0>,
     #PID<0.52.0>, #PID<0.53.0>, #PID<0.54.0>, #PID<0.55.0>, #PID<0.56.0>,
     #PID<0.57.0>, #PID<0.58.0>, #PID<0.59.0>, #PID<0.60.0>, #PID<0.61.0>,
     #PID<0.62.0>, #PID<0.63.0>, #PID<0.64.0>, #PID<0.65.0>, #PID<0.66.0>,
     #PID<0.67.0>, #PID<0.68.0>, #PID<0.69.0>, #PID<0.70.0>, #PID<0.71.0>,
     #PID<0.78.0>, #PID<0.79.0>, #PID<0.80.0>, #PID<0.82.0>, #PID<0.85.0>,
     #PID<0.86.0>, #PID<0.87.0>, #PID<0.88.0>, #PID<0.89.0>, #PID<0.91.0>,
     #PID<0.92.0>, #PID<0.93.0>, #PID<0.94.0>, #PID<0.95.0>, #PID<0.96.0>,
     #PID<0.99.0>, #PID<0.100.0>, #PID<0.101.0>, #PID<0.102.0>, #PID<0.103.0>,
     #PID<0.104.0>, #PID<0.105.0>, #PID<0.114.0>, #PID<0.121.0>, #PID<0.123.0>]
    

    You can see that the two Task.start processes, 0.121.0 and 0.123.0, are still running after Stream.run(stream) has returned, that is, after every process started by Task.async_stream has finished its work.

    Any code can cause a process leak: if you create a process and lose its pid, you have no direct way to control that process, short of shutting down the node (the BEAM instance) in which it is running. Task.start() returns {:ok, pid}, and the process it starts is not linked to the caller, so nothing stops it when the caller finishes. If you discard the pid and the process runs forever, you have a process leak.
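
    To make the point about keeping the pid concrete, here is a minimal sketch, not the only way to do it. It runs the same stream as A.go, but each worker returns the pid it started so the caller can still reach the process later. The module name KeepPids and the functions go/0 and stop/1 are made up for this illustration.

    ```elixir
    defmodule KeepPids do
      # Same stream as A.go, but each worker returns the pid it started
      # instead of only printing it, so the caller keeps a handle on it.
      def go do
        [500, 100]
        |> Task.async_stream(fn x ->
          {:ok, pid} = Task.start(fn -> Process.sleep(:infinity) end)
          Process.sleep(x)
          pid
        end)
        |> Enum.map(fn {:ok, pid} -> pid end)
      end

      # Kills one of the returned processes and waits for confirmation
      # that it is gone (or gives up after one second).
      def stop(pid) do
        ref = Process.monitor(pid)
        Process.exit(pid, :kill)

        receive do
          {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
        after
          1_000 -> :timeout
        end
      end
    end
    ```

    In iex, pids = KeepPids.go() gives you the two pids, and Enum.each(pids, &KeepPids.stop/1) stops them, so nothing is left behind.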

    In Elixir/Erlang a process is very lightweight and needs few resources, so a node can run hundreds of thousands of processes (millions, if the process limit is raised) without overloading your system. As a result, leaking a thousand processes is not really a big deal. Of course, you wouldn't want to keep leaking thousands of processes over a long period, because that could eventually crash your system.
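
    If you would rather not track pids by hand, one possible alternative (an assumption on my part, not something the question asked for) is to start the inner tasks under a Task.Supervisor. The supervisor then remembers its children for you. The module name Supervised and the sup argument are made up for this sketch.

    ```elixir
    defmodule Supervised do
      # Like A.go, but the inner tasks are started under the given
      # Task.Supervisor instead of with Task.start/1, and the stream
      # is run right away.
      def go(sup) do
        [500, 100]
        |> Task.async_stream(fn x ->
          {:ok, pid} =
            Task.Supervisor.start_child(sup, fn -> Process.sleep(:infinity) end)

          Process.sleep(x)
          pid
        end)
        |> Stream.run()
      end
    end
    ```

    In iex, start a supervisor with {:ok, sup} = Task.Supervisor.start_link() and call Supervised.go(sup). Afterwards Task.Supervisor.children(sup) lists the tasks that are still running, and Task.Supervisor.terminate_child(sup, pid) stops any one of them.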