What is the best way to set up multiple I/O buffers in Julia?


I am experimenting with different methods of dealing with long text strings (e.g. book length) in Julia. Specifically I am looking at transposition ciphers and have been testing speed and memory usage using (1) string concatenation, (2) arrays and (3) I/O buffers. In the last case I need to be able to 'print' individual characters to different, indexable, I/O buffers. My first (simplified) attempt was as follows:

text = fill(IOBuffer(), 3)
print(text[1], 'a')
print(text[2], 'b')
print(text[3], 'c')
for i in 1:3
    println(String(take!(text[i])))
end

This produces:

"abc"
""
""

In other words, the first index returned the whole string "abc" rather than just the desired character 'a', and the other indices produced empty strings "" because the buffer had already been emptied by the first take! call.

My next attempt worked, but does not seem very sophisticated:

text = Vector{IOBuffer}(undef, 3)  # uninitialized vector; plain Vector(3) only works on Julia 0.6
for i in 1:3
    text[i] = IOBuffer()
end
print(text[1], 'a')
print(text[2], 'b')
print(text[3], 'c')
for i in 1:3
    println(String(take!(text[i])))
end

This produces the required output:

"a"
"b"
"c"

I'm still not entirely sure why the first method fails and the second works, but does anyone know a better method for setting up multiple I/O buffers that can be written to using indices?


Solution

  • The problem is that in text = fill(IOBuffer(), 3) the call IOBuffer() is evaluated only once, so all entries of text point to the same object. You can check this by running:

    julia> all(x->x===text[1], text)
    true
    

    Or you can see this when you run:

    julia> fill(println("AAA"), 3)
    AAA
    3-element Array{Void,1}:
     nothing
     nothing
     nothing
    

    to find that println was called only once, before its return value was passed to fill.
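
    The same aliasing appears with any mutable value passed to fill, not only with IOBuffer. A minimal sketch using a plain vector (the name shared is just for illustration):

    ```julia
    # fill evaluates Int[] once, so all three slots refer to one vector.
    shared = fill(Int[], 3)
    push!(shared[1], 42)              # mutate through the first slot only

    println(shared[2])                # the change is visible through the second slot
    println(shared[1] === shared[3])  # identity check on the first and last slots
    ```

    This is why printing to text[1], text[2] and text[3] in the question all ended up in one buffer.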

    The simplest way to solve it is to use a comprehension:

    julia> text = [IOBuffer() for i in 1:3]
    3-element Array{Base.AbstractIOBuffer{Array{UInt8,1}},1}:
     IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
     IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
     IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
    
    julia> map(x->x===text[1], text)
    3-element Array{Bool,1}:
      true
     false
     false
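
    As a rough sketch of how this could fit the transposition use case from the question, the comprehension can be wrapped in a small function. Note that deal_chars is a hypothetical helper, not part of Julia or of the original code, and the round-robin layout is only an assumption about what the cipher needs:

    ```julia
    # Hypothetical helper: deal the characters of `msg` round-robin into `n` buffers.
    function deal_chars(msg::AbstractString, n::Integer)
        buffers = [IOBuffer() for _ in 1:n]   # one independent buffer per index
        for (k, ch) in enumerate(msg)
            print(buffers[mod1(k, n)], ch)    # mod1 maps k onto an index in 1:n
        end
        # take! empties each buffer and returns its bytes, which String wraps.
        return [String(take!(b)) for b in buffers]
    end

    println(deal_chars("abcdef", 3))  # prints ["ad", "be", "cf"]
    ```

    Because each buffer is created by its own IOBuffer() call, writing through one index never affects the others.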
    

    or map (a bit less clean):

    julia> text = map(i->IOBuffer(), 1:3)
    3-element Array{Base.AbstractIOBuffer{Array{UInt8,1}},1}:
     IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
     IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
     IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
    
    julia> map(x->x===text[1], text)
    3-element Array{Bool,1}:
      true
     false
     false
    

    You could also fill the array with the IOBuffer type itself and then invoke it for each entry (this is not a recommended approach, but it shows the difference):

    julia> text = invoke.(fill(IOBuffer, 3), [Tuple{}])
    3-element Array{Base.AbstractIOBuffer{Array{UInt8,1}},1}:
     IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
     IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
     IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
    
    julia> map(x->x===text[1], text)
    3-element Array{Bool,1}:
      true
     false
     false
    

    Finally, one catch to remember: a comprehension calls a function once for each entry of the created array, but a macro (including a string macro such as r"a") is expanded only once, so every entry ends up being the same object. Here is a short example:

    julia> rx = [Regex("a") for i in 1:3]
    3-element Array{Regex,1}:
     r"a"
     r"a"
     r"a"
    
    julia> map(x->x===rx[1], rx)
    3-element Array{Bool,1}:
      true
     false
     false
    
    julia> rx = [r"a" for i in 1:3]
    3-element Array{Regex,1}:
     r"a"
     r"a"
     r"a"
    
    julia> map(x->x===rx[1], rx)
    3-element Array{Bool,1}:
     true
     true
     true
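
    The same catch would apply to buffers if they were created by a macro. A minimal sketch, assuming a made-up macro @shared_buffer (it does not exist in Julia and is defined here only to illustrate the point):

    ```julia
    # Hypothetical macro: the IOBuffer is created once, when the macro is expanded,
    # and that single object is spliced into the expanded code.
    macro shared_buffer()
        return IOBuffer()
    end

    # The comprehension body is expanded once, so every entry is the same buffer.
    bufs = [@shared_buffer() for _ in 1:3]
    print(bufs[1], 'a')
    print(bufs[2], 'b')

    println(bufs[1] === bufs[3])
    println(String(take!(bufs[3])))   # both characters landed in the shared buffer
    ```

    So if buffers ever come from a macro rather than a function call, it is worth checking identity with === before relying on them being independent.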