
Organize values of unique elements based on occurrence time Ruby


Please help with this. I have the two arrays below. Array a contains hours, and array b contains the same hours, each followed by the values that occurred at that hour.

a = ["1015","1240","1732"]
b = ["1015","X|2","D|5","1240","B|11","F|8","X|7","1732","D|9","X|1","B|3"]

So in array b:

Elements "X|2","D|5" happened at hour 10:15

Elements "B|11","F|8","X|7" happened at hour 12:40

Elements "D|9","X|1","B|3" happened at hour 17:32

The first part of each element in b can repeat. For example, X occurs at all 3 hours with different values. In the output I'd like to print the hours and, as column headers, the unique first parts, which here are X, D, B and F.

The output I'm looking for is:

HOUR    X    D    B    F 
1015    2    5
1240    7         11   8
1732    1    9    3

The code I have so far is below, but I am still not able to organize the output in the desired order.

val=[]
headers=[]
b.each{|v|
if v.include? "|"
    headers << v.split("|")[0]
    val << v.split("|")[1]
else
    val << ["HOUR",v]
end
}

puts ["HOURS",headers.uniq].join(" ")
puts val

Current output of my code:

HOURS X D B F

HOUR
1015
2
5
HOUR
1240
11
8
7
HOUR
1732
9
1
3

Solution

  • I have assumed that a merely contains the times in b, sorted. As that can be computed there is no need to provide that information as an input.
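
    As a minimal sketch of that assumption, the times can be recovered from b alone. The variable name times is hypothetical, and the sketch assumes that every element without a '|' is a time-of-day.

```ruby
# Input array from the question.
b = ["1015", "X|2", "D|5", "1240", "B|11", "F|8", "X|7", "1732", "D|9", "X|1", "B|3"]

# Assumption: any element without a '|' separator is a time-of-day.
times = b.reject { |s| s.include?('|') }.sort

p times
```

    With the question's data this reproduces the array a.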

    Code

    def print_table(data, time_label, column_spacing)
      h = data.slice_before { |s| !s.include?('|') }.
               each_with_object({}) { |(t,*a),h|
                 h[t] = a.map { |s| s.split('|') }.to_h.tap { |g| g.default = '' } }
      row_labels = h.keys.sort
      column_labels = h.values_at(*row_labels).reduce([]) { |a,g| a | g.keys }
      image = [[time_label, *column_labels],
              *row_labels.map { |time| [time, *h[time].values_at(*column_labels)] }]
      row_label_width, *column_widths = image.transpose.map { |r| r.map(&:size).max }
      print_image(image, row_label_width, column_widths, column_spacing)
    end
    
    def print_image(image, row_label_width, column_widths, column_spacing)
      image.each do |time, *values|
        print time.ljust(row_label_width)
        values.zip(column_widths).each { |s,width| print s.rjust(width + column_spacing) }
        puts
      end
    end
    

    Example

    b = ["1240", "B|11", "F|8", "X|7",
         "1015", "X|2",  "D|5",
         "1732", "D|9",  "X|1", "B|3"]
    time_label = "HOUR"
    column_spacing = 2
    
    print_table(b, time_label, column_spacing)
    

    prints

    HOUR  X  D   B  F
    1015  2  5
    1240  7     11  8
    1732  1  9   3
    

    Note that the times-of-day in b are not in sorted order.

    Explanation

    For the values in the Example section, the first step is to group the elements of the array b into groupings (arrays) by time-of-day.

    groups = b.slice_before { |s| !s.include?('|') }
      #=> #<Enumerator: #<Enumerator::Generator:0x000000022b2490>:each>
    

    See Enumerable#slice_before. We can see the objects that will be generated by this enumerator by converting it to an array.

     groups.to_a
       #=> [["1240", "B|11", "F|8", "X|7"],
       #    ["1015", "X|2", "D|5"],
       #    ["1732", "D|9", "X|1", "B|3"]]
    

    Next, let's convert groups to a hash.

    h = groups.each_with_object({}) { |(t,*a),h|
      h[t] = a.map { |s| s.split('|') }.
               to_h.
               tap { |g| g.default = '' } }
      #=> {"1240"=>{"B"=>"11", "F"=>"8", "X"=>"7"},
      #    "1015"=>{"X"=>"2", "D"=>"5"},
      #    "1732"=>{"D"=>"9", "X"=>"1", "B"=>"3"}}
    

    See Enumerable#each_with_object, Array#to_h, Object#tap and Hash#default=. g.default = '' assigns the hash a default value of an empty string, so g[k] returns an empty string if g does not have a key k. For example, h["1015"]["B"] #=> "". The expression g.default = '' itself evaluates to '', not to g, which is why it is enclosed in a tap block: tap returns g with the default defined.
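
    The difference between the value of the assignment and the value returned by tap can be checked in isolation. This is only an illustrative sketch; the names assigned and with_default are hypothetical.

```ruby
g = { "X" => "2", "D" => "5" }

# An assignment expression evaluates to the assigned value, not to the receiver.
assigned = (g.default = '')

# tap yields the hash to the block and then returns the hash itself.
with_default = { "X" => "2", "D" => "5" }.tap { |hash| hash.default = '' }

p assigned               # the empty string
p with_default["B"]      # missing key falls back to the default
p with_default.key?("B") # reading the default does not add the key
```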

    This article provides an explanation of the use of the splat operator. (Here, in a nutshell: [1, *[2, 3]] #=> [1, 2, 3]).
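
    The block parameters |(t,*a),h| used above rely on the same idea: the parentheses destructure each group, and the splat collects everything after the first element. A small sketch with one group (the names group, rest and acc are illustrative only):

```ruby
group = ["1240", "B|11", "F|8", "X|7"]

# Plain multiple assignment: first element, then the remainder as an array.
t, *rest = group

# The same destructuring inside block parameters.
result = [group].each_with_object({}) { |(time, *items), acc| acc[time] = items }

p t
p rest
p result
```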

    For the column labels we have a few options. Regardless, we first need the unique keys in the values (hashes) of h corresponding to the keys in row_labels.

    row_labels = h.keys.sort
      #=> ["1015", "1240", "1732"]
    column_labels = h.values_at(*row_labels)
      #=> [{"X"=>"2", "D"=>"5"},
      #    {"B"=>"11", "F"=>"8", "X"=>"7"},
      #    {"D"=>"9", "X"=>"1", "B"=>"3"}]
    column_labels = column_labels.reduce([]) { |a,g| a | g.keys }
      #=> ["X", "D", "B", "F"]
    

    See Hash#values_at, Enumerable#reduce (aka inject) and Array#|. I have assumed this gives the desired column order, but the elements of column_labels could be reordered if desired. I present two possible options in the last section of my answer.
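
    To see why the reduce step yields labels in order of first appearance, here is a small sketch on the key lists alone. Array#| is a set union that keeps the receiver's order and appends unseen elements. The names key_lists, labels and alternative are hypothetical.

```ruby
# Keys of each row's hash, in sorted-row order.
key_lists = [["X", "D"], ["B", "F", "X"], ["D", "X", "B"]]

# Fold the lists together with set union; duplicates are dropped.
labels = key_lists.reduce([]) { |acc, keys| acc | keys }

# An equivalent formulation for flat lists of strings.
alternative = key_lists.flatten.uniq

p labels
p alternative
```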

    We next construct an array containing all the values in the table to be printed.

    image = [[time_label, *column_labels],
              *row_labels.map { |time| [time, *h[time].values_at(*column_labels)] }]
      #=> [["HOUR", "X", "D", "B", "F"],
      #    ["1015", "2", "5", "", ""],
      #    ["1240", "7", "", "11", "8"],
      #    ["1732", "1", "9", "3", ""]]
    

    Hash#values_at pulls out the values (strings) in h[time] that correspond to the column labels, in column order. A label that is missing from h[time] yields the hash's default value, ''.
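
    The default value matters at this step. As a sketch, compare values_at on a row hash with and without the default (the variable names are illustrative):

```ruby
column_labels = ["X", "D", "B", "F"]

plain = { "X" => "2", "D" => "5" }
with_default = { "X" => "2", "D" => "5" }.tap { |g| g.default = '' }

# Without a default, missing labels come back as nil.
plain_row = plain.values_at(*column_labels)

# With the default, missing labels come back as empty strings.
padded_row = with_default.values_at(*column_labels)

p plain_row
p padded_row
```

    The nil entries in the first result would raise NoMethodError when the column widths are later computed with map(&:size), which is one reason to set the default up front.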

    We may then print the table as follows.

    row_label_width, *column_widths = image.transpose.map { |r| r.map(&:size).max }
      # => [4, 1, 1, 2, 1]
    

    so that

    row_label_width
      #=> 4
    column_widths
      #=> [1, 1, 2, 1]
    
    image.each do |time, *values|
      print time.ljust(row_label_width)
      values.zip(column_widths).each { |s,width| print s.rjust(width + column_spacing) }
      puts
    end
    

    prints the table shown in the Example section.
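
    The printing loop pads the empty cells at the end of a row, so some printed lines carry trailing spaces. If that is unwanted, one possible variant (a sketch, not part of the method above) builds each line as a string and strips it. The name lines is hypothetical.

```ruby
image = [["HOUR", "X", "D", "B", "F"],
         ["1015", "2", "5", "", ""],
         ["1240", "7", "", "11", "8"],
         ["1732", "1", "9", "3", ""]]
column_spacing = 2

# Widest entry of each column; the first column holds the row labels.
row_label_width, *column_widths = image.transpose.map { |col| col.map(&:size).max }

lines = image.map do |time, *values|
  cells = values.zip(column_widths).map { |s, width| s.rjust(width + column_spacing) }
  # rstrip removes the padding left by trailing empty cells.
  (time.ljust(row_label_width) + cells.join).rstrip
end

puts lines
```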

    Column order

    As I said earlier, the elements of column_labels could be reordered if desired. One possibility is to sort the labels alphabetically.

    column_labels = h.values_at(*row_labels).reduce([]) { |a,g| a | g.keys }.sort!
      #=> ["B", "D", "F", "X"]
    

    Another is that we are given a desired order of all possible column labels.

    desired = ["Y", "F", "D", "Z", "B", "X"]
    

    Then compute the following.

    column_labels = desired & h.values_at(*row_labels).reduce([]) { |a,g| a | g.keys }
      #=> ["F", "D", "B", "X"]
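
    Array#& keeps the order of its receiver, which is why desired goes on the left. If the data could contain a label that is absent from desired, that label would be dropped silently. One hedged way to keep such labels, appended at the end, is sketched below; the label "Q" and the names found, ordered and leftovers are made up for illustration.

```ruby
desired = ["Y", "F", "D", "Z", "B", "X"]

# Labels found in the data; "Q" is a hypothetical label missing from desired.
found = ["X", "D", "B", "F", "Q"]

# Intersection, in the order given by desired.
ordered = desired & found

# Labels the desired list does not mention, in the order they were found.
leftovers = found - desired

column_labels = ordered + leftovers
p column_labels
```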