
Erlang: Read from an input stream in an efficient way


I'm writing a program that reads from an input stream, i.e.

erl -run p main -noshell -s erlang halt < input

The problem is that reading the input takes a long time (the input stream is huge) with this read function:

read_input(L) ->
    case io:get_line("") of
        eof ->
            lists:reverse(L);
        E0 ->
            read_input([E0|L])
    end.

I have been looking for more efficient alternatives, but I have found nothing. I have tried to read the file using

{ok, Binary} = file:read_file("input")

This is far more efficient. The problem is that I have to run this program on a platform where the name of the input file is unknown, so I need an alternative that reads from standard input. Additionally, I can't choose the flags used when running it; for example, the -noinput flag cannot be added to the command line.

Whatever help you can give will be welcomed.


Solution

  • Although Steve's solution is the fastest one I know of, a solution based on the file module also performs quite well:

    -module(p).
    
    -export([start/0]).
    
    -define(BLK_SIZE, 16384).
    
    start() ->
        do(),
        halt().
    
    do() ->
        Bin = read(),
        io:format("~p~n", [byte_size(Bin)]).
    
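    %% Switch standard_io to binary mode, then read it in blocks until eof.
    read() ->
        ok = io:setopts(standard_io, [binary]),
        read(<<>>).
    
    %% Read up to ?BLK_SIZE bytes at a time and append them to the accumulator.
    read(Acc) ->
        case file:read(standard_io, ?BLK_SIZE) of
            {ok, Data} ->
                read(<<Acc/bytes, Data/bytes>>);
            eof ->
                Acc
        end.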
    

    It can be invoked like this (-s p with no function name calls p:start/0):

    erl -noshell -s p < input
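
    If the individual lines are needed, as in the question's read_input/1, the binary returned by read/0 can be split afterwards. The following is only a minimal sketch: the module name p_lines and the function split_lines/1 are hypothetical names chosen for illustration, and it assumes lines are terminated by "\n". Unlike io:get_line/1, the resulting lines are binaries and do not keep their trailing newline.

    ```erlang
    -module(p_lines).

    -export([split_lines/1]).

    %% Split a binary holding the whole input into a list of line binaries.
    %% The trim option drops the empty trailing element that a final
    %% newline would otherwise produce.
    split_lines(Bin) when is_binary(Bin) ->
        binary:split(Bin, <<"\n">>, [global, trim]).
    ```

    With this in place, do/0 could call p_lines:split_lines(read()) instead of only printing the size.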
    

    Note that both approaches can be used for line-oriented input: use the {line, Max_Line_Size} option for the port, or file:read_line/1 for the file module solution. A performance bug I found in file:read_line/1 was fixed in version 17 (if I recall correctly), so it performs well now. Either way, you should not expect the performance and comfort of Perl.
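
    The line-oriented variant of the file module solution might look roughly like the sketch below. The module name p_readline and the helper read_lines/1 are hypothetical; the helper takes the I/O device as an argument so the same loop works for standard_io or for a file opened with file:open/2. Each line is returned as a binary with its trailing newline kept, and an {error, Reason} result is not handled here, so it would crash the loop.

    ```erlang
    -module(p_readline).

    -export([start/0, read_lines/1]).

    %% Entry point for: erl -noshell -s p_readline < input
    start() ->
        ok = io:setopts(standard_io, [binary]),
        Lines = read_lines(standard_io),
        io:format("~p~n", [length(Lines)]),
        halt().

    %% Collect every line from Device until eof, in input order.
    read_lines(Device) ->
        read_lines(Device, []).

    read_lines(Device, Acc) ->
        case file:read_line(Device) of
            {ok, Line} ->
                read_lines(Device, [Line | Acc]);
            eof ->
                lists:reverse(Acc)
        end.
    ```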