I am experimenting with different methods of dealing with long text strings (e.g. book length) in Julia. Specifically I am looking at transposition ciphers and have been testing speed and memory usage using (1) string concatenation, (2) arrays and (3) I/O buffers. In the last case I need to be able to 'print' individual characters to different, indexable, I/O buffers. My first (simplified) attempt was as follows:
text = fill(IOBuffer(), 3)
print(text[1], 'a')
print(text[2], 'b')
print(text[3], 'c')
for i in 1:3
println(String(take!(text[i])))
end
This produces:
"abc"
""
""
In other words, the first index returned the whole string "abc" rather than just the desired character 'a', and the other indices produced empty strings "" because the buffer was reset by the first take!() call.
My next attempt worked, but does not seem very sophisticated:
text = Vector(3)
for i in 1:3
text[i] = IOBuffer()
end
print(text[1], 'a')
print(text[2], 'b')
print(text[3], 'c')
for i in 1:3
println(String(take!(text[i])))
end
This produces the required output:
"a"
"b"
"c"
I'm still not entirely sure why the first method fails and the second works, but does anyone know a better method for setting up multiple I/O buffers that can be written to using indices?
The reason for your problem is that in text = fill(IOBuffer(), 3) the call IOBuffer() is evaluated only once, so all entries of text point to the same object. You can check this by running:
julia> all(x->x===text[1], text)
true
Or you can see this when you run:
julia> fill(println("AAA"), 3)
AAA
3-element Array{Void,1}:
nothing
nothing
nothing
Here println was called only once, before its return value was passed to the fill function.
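The same aliasing shows up with any mutable value, not only with IOBuffer. Below is a minimal sketch, assuming a reasonably recent Julia; the names shared and separate are hypothetical and chosen only for illustration:

```julia
# fill evaluates its argument once, so every slot holds the very same array.
shared = fill(Int[], 3)
push!(shared[1], 42)
# The pushed element is visible through every index: there is only one array.
@assert shared[2] == [42] && shared[3] == [42]
@assert all(x -> x === shared[1], shared)

# A comprehension evaluates its body once per element, giving distinct arrays.
separate = [Int[] for i in 1:3]
push!(separate[1], 42)
@assert separate[1] == [42]
@assert isempty(separate[2]) && isempty(separate[3])
```

With immutable values such as numbers or characters this sharing is harmless, which is why fill is usually safe for them.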
The simplest way to solve it is to use a comprehension:
julia> text = [IOBuffer() for i in 1:3]
3-element Array{Base.AbstractIOBuffer{Array{UInt8,1}},1}:
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
julia> map(x->x===text[1], text)
3-element Array{Bool,1}:
true
false
false
or use map (a bit less clean):
julia> text = map(i->IOBuffer(), 1:3)
3-element Array{Base.AbstractIOBuffer{Array{UInt8,1}},1}:
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
julia> map(x->x===text[1], text)
3-element Array{Bool,1}:
true
false
false
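To connect this back to the transposition-cipher use case in the question, here is a rough sketch of writing characters to indexed buffers created with a comprehension. The function name columnar_scramble and the round-robin scheme are assumptions made for illustration, not the asker's actual cipher, and the code assumes Julia 1.x:

```julia
# Hypothetical helper: deal the characters of `plaintext` round-robin into
# `ncols` buffers, then read the buffers out one after another.
function columnar_scramble(plaintext::AbstractString, ncols::Integer)
    buffers = [IOBuffer() for i in 1:ncols]   # one distinct buffer per column
    for (i, c) in enumerate(plaintext)
        print(buffers[mod1(i, ncols)], c)     # column index cycles 1, 2, ..., ncols
    end
    # take! empties each buffer and returns its bytes, which String converts
    return join(String(take!(b)) for b in buffers)
end

columnar_scramble("abcdefg", 3)   # returns "adgbecf"
```

Because each buffer is a separate object, take! on one of them leaves the others untouched.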
Actually, you could fill the array with the IOBuffer type and invoke it once per entry (this is not a recommended approach, but it shows the difference):
julia> text = invoke.(fill(IOBuffer, 3), [Tuple{}])
3-element Array{Base.AbstractIOBuffer{Array{UInt8,1}},1}:
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
julia> map(x->x===text[1], text)
3-element Array{Bool,1}:
true
false
false
Finally, one catch to remember, as discussed here, is that a comprehension calls a function once for each entry of the created array, but a macro (such as the r"..." string macro) is expanded only once, so every entry refers to the same object. Here is a short example, which the link explains in detail:
julia> rx = [Regex("a") for i in 1:3]
3-element Array{Regex,1}:
r"a"
r"a"
r"a"
julia> map(x->x===rx[1], rx)
3-element Array{Bool,1}:
true
false
false
julia> rx = [r"a" for i in 1:3]
3-element Array{Regex,1}:
r"a"
r"a"
r"a"
julia> map(x->x===rx[1], rx)
3-element Array{Bool,1}:
true
true
true