
Conditionally created array not seen after creation


Suppose I am writing a type of tac in Ruby that will reverse the lines of a file or stream given to it.

So Line 1\nLine 2\nLine 3\n [Ruby script] => Line 3\nLine 2\nLine 1\n

Here are some test files:

printf "f1, Line %s\n" $(seq 3) >f1
printf "f2, Line %s\n" $(seq 5) >f2
printf "f3, Line %s\n" $(seq 7) >f3

A straightforward way to write that is:

ruby -e ' # read each ARGF and reverse it
$<.each_line{|line| 
    lines=Array.new if $<.file.lineno==1 
    lines.unshift(line)
    p lines if $<.eof?
}'

However, with that version, I get the error:

-e:4:in `block in <main>': undefined method `unshift' for nil:NilClass (NoMethodError)

    lines.unshift(line)
         ^^^^^^^^
    from -e:2:in `each_line'
    from -e:2:in `each_line'
    from -e:2:in `<main>'

I can fix that by changing the script to:

ruby -e 'BEGIN{lines=[]}
$<.each_line{|line| 
    lines=Array.new if $<.file.lineno==1 
    lines.unshift(line)
    p lines if $<.eof?
}'

But why is the BEGIN block necessary? Isn't the lines array created on the first pass through the loop? The definition in BEGIN looks like a throw-away...


The final version there does work:

cat f1 | ruby -e 'BEGIN{lines=[]}
$<.each_line{|line| 
    lines=Array.new if $<.file.lineno==1 
    lines.unshift(line)
    p lines if $<.eof?
}' - f2 f3
["f1, Line 3\n", "f1, Line 2\n", "f1, Line 1\n"]
["f2, Line 5\n", "f2, Line 4\n", "f2, Line 3\n", "f2, Line 2\n", "f2, Line 1\n"]
["f3, Line 7\n", "f3, Line 6\n", "f3, Line 5\n", "f3, Line 4\n", "f3, Line 3\n", "f3, Line 2\n", "f3, Line 1\n"]

But why do I have to define lines in the BEGIN block only to define it again in the loop? It does not matter what lines is defined as in the BEGIN block; it can be numerical, boolean, hash, whatever -- but the name has to exist.

Ideas?

% ruby -v
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin23]

Thanks for the comments and answers. Here is the Python idiom that I think confused my muscle memory:

def f():
    # local li is created first iteration and used on subsequent... 
    # Similar to Ruby, li is local to this scope of f()
    for x in [1,2,3,4]:
        if x==1: li=[] 
        li.append(x)

    return li 

Solution

  • First of all, this has nothing to do with Unix pipes or input/output. You get the same error in a Ruby-only variant, e.g.

    [1, 2, 3].each do |i|
      ary = [] if i == 1 
      ary.unshift(i)
    end
    # undefined method `unshift' for nil:NilClass
    

    The exception is raised because ary is assigned only on the 1st iteration. Every later iteration starts with a fresh ary whose value is nil, so ary.unshift fails. In Ruby, a block creates a new local variable scope and

    [...] any local variables created inside it do not leak to the surrounding scope.

    This also applies to calling the same block multiple times:

    def foo
      yield
      yield
    end
    
    foo do
      p before: defined? a
      a = 1
      p after: defined? a
    end
    

    Output:

    {:before=>nil}
    {:after=>"local-variable"}
    {:before=>nil}
    {:after=>"local-variable"}
    

    As you can see, the variable scope is not retained between block invocations. The same applies to each, which also calls the block once per element.
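This also explains why the Python idiom from the question does not carry over to each. As a small sketch (the method names with_block and with_for are made up for illustration): Ruby's for loop, unlike a block, does not open a new local variable scope, so the same "create on the first iteration" pattern behaves differently there.

```ruby
# Block version: `li` is a fresh local on every invocation of the block.
def with_block
  [1, 2, 3, 4].each do |x|
    li = [] if x == 1
    li.push(x)          # raises NoMethodError once x == 2, because li is nil
  end
rescue NoMethodError => e
  e.class               # return the exception class so the failure is visible
end

# `for` version: no new scope, so `li` belongs to the method and survives.
def with_for
  for x in [1, 2, 3, 4]
    li = [] if x == 1
    li.push(x)
  end
  li
end

p with_block  # NoMethodError
p with_for    # [1, 2, 3, 4]
```

That said, each with a variable defined outside the block is the more common style in Ruby code.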

    To get the desired behavior, you can simply create the variable outside the block, e.g.:

    ary = []
    [1, 2, 3].each do |i|
      ary.unshift(i)
    end
    ary #=> [3, 2, 1]
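Applied to the per-file reversal from the question, one possible sketch looks like this. The helper name tac_each is hypothetical, and StringIO objects stand in for the real files so the example is self-contained. The array is created once per source, outside the block passed to each_line, so every line of that source lands in the same array. The BEGIN block in the question appears to work for the same reason: it introduces lines in the enclosing scope before the block runs.

```ruby
require "stringio"

# Hypothetical helper: returns the reversed lines of each source, one array per source.
def tac_each(sources)
  sources.map do |io|
    lines = []                                   # created once per source, outside the inner block
    io.each_line { |line| lines.unshift(line) }  # the inner block closes over `lines`
    lines
  end
end

# StringIO stand-ins for the test files f1 and f2 (shortened here).
f1 = StringIO.new("f1, Line 1\nf1, Line 2\nf1, Line 3\n")
f2 = StringIO.new("f2, Line 1\nf2, Line 2\n")

tac_each([f1, f2]).each { |reversed| p reversed }
# ["f1, Line 3\n", "f1, Line 2\n", "f1, Line 1\n"]
# ["f2, Line 2\n", "f2, Line 1\n"]
```

Real files could be passed the same way, for example as File objects opened from the paths in ARGV.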