I am getting data as a string from a remote device. I need to parse the data. The data usually come like this:
MO SCGR SC RSITE ALARM_SITUATION
RXOTG-59 59 0 EK0322 ABIS PATH FAULT
RXOCF-59 EK0322 LOCAL MODE
RXOTRX-59-0 4 EK0322 LOCAL MODE
RXOTRX-59-1 EK0322 LOCAL MODE
RXOTRX-59-4 0 EK0322 LOCAL MODE
RXOTRX-59-5 1 3 EK0322 LOCAL MODE
RXOTRX-59-8 EK0322 LOCAL MODE
RXOTRX-59-9 EK0322 LOCAL MODE
I would love to have the data as an array of arrays or any other programmatically sensible structure.
I am splitting the data into an array using:
str.split("\r\n")
and then removing the extra space on each element in the array with:
tsgs.map! {|tsg| tsg.gsub(/\s+/, " ").split(" ") }
but this has a limitation: empty cells are not preserved. I expect each inner array to contain five elements, but for rows with empty cells it contains fewer than five.
Case 1: In this case, I get the expected result:
RXOTG-59 59 0 EK0322 ABIS PATH FAULT
converts to
["RXOTG-59", "59", "0", "EK0322", "ABIS PATH FAULT"]
Case 2: In this case, I get an unexpected result:
RXOTRX-59-9 EK0322 LOCAL MODE
converts to
["RXOTRX-59-9", "EK0322", "LOCAL MODE"]
def getCommandResult(tgdatas)
  tgdatas_arr = tgdatas.split("\r\n")
  tsgs = tgdatas_arr[5..tgdatas_arr.index("END")-2]
  tsgs.map! {|tsg| tsg.gsub(/\s+/, " ").split(" ")[0] }
  return tsgs
end
Your string¹, modified slightly:
data = <<END
MO SCGR SC RSITE ALARM_SITUATION
RXOTG-59 59 0 EK0322 ABIS PATH FAULT
RXOCF-59 EK0322 LOCAL MODE
RXOTRX-59-0 4 EK0322 LOCAL MODE
RXOTRX-59-1 EK0322 LOCAL MODE
RXOTRX-59-4 0
RXOTRX-59-5 1 3 EK0322 LOCAL MODE
RXOTRX-59-8 EK0322 LOCAL MODE
RXOTRX-59-9 EK0322 LOCAL MODE
END
This string looks very much like a CSV data structure, so it is tempting to convert it to a CSV string, which lets us bring to bear the methods provided by the CSV class.
Convert string to CSV string
Code
def convert_to_csv(data)
  cols = data[/.+?\n/].gsub(/ \S/).map { Regexp.last_match.begin(0) }
  data.each_line.map do |s|
    cols.each { |i| s[i] = ',' if s.size > i+1 }
    s.gsub(/ *, */, ',')
  end.join
end
Convert string
Now convert the string data to a CSV string.
csv_data = convert_to_csv(data)
puts csv_data
MO,SCGR,SC,RSITE,ALARM_SITUATION
RXOTG-59,59,0,EK0322,ABIS PATH FAULT
RXOCF-59,,,EK0322,LOCAL MODE
RXOTRX-59-0,4,,EK0322,LOCAL MODE
RXOTRX-59-1,,,EK0322,LOCAL MODE
RXOTRX-59-4,,0
RXOTRX-59-5,1,3,EK0322,LOCAL MODE
RXOTRX-59-8,,,EK0322,LOCAL MODE
RXOTRX-59-9,,,EK0322,LOCAL MODE
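Since the method relies on column alignment, it only works on input whose padding is intact. Here is a self-contained sketch that can be pasted and run as-is. The column widths in ROW_FORMAT (18, 6, 11 and 16) are my assumption, chosen so that the header offsets come out as [17, 23, 34, 50]; the real device output may be padded differently. The rows are a subset of the sample data.

```ruby
# Assumed column widths, for illustration only.
ROW_FORMAT = "%-18s%-6s%-11s%-16s%s"

rows = [
  %w[MO SCGR SC RSITE ALARM_SITUATION],
  ["RXOTG-59", "59", "0", "EK0322", "ABIS PATH FAULT"],
  ["RXOCF-59", "", "", "EK0322", "LOCAL MODE"],
  ["RXOTRX-59-0", "4", "", "EK0322", "LOCAL MODE"],
  ["RXOTRX-59-4", "", "0", "", ""]
]

# Pad every field to its column width, then drop trailing padding so that
# a line ends at its last value (this produces the "short line" case).
data = rows.map { |r| format(ROW_FORMAT, *r).rstrip }.join("\n") + "\n"

def convert_to_csv(data)
  # Index of the space that precedes each header word except the first.
  cols = data[/.+?\n/].gsub(/ \S/).map { Regexp.last_match.begin(0) }
  data.each_line.map do |s|
    # Put a comma at each column boundary the line is long enough to reach.
    cols.each { |i| s[i] = ',' if s.size > i + 1 }
    # Remove the padding around the commas.
    s.gsub(/ *, */, ',')
  end.join
end

csv_data = convert_to_csv(data)
puts csv_data
```

The spaces inside "ABIS PATH FAULT" and "LOCAL MODE" survive because only spaces adjacent to a comma are removed.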
Explanation
The steps are as follows.
s = data[/.+?\n/]
#=> "MO SCGR SC RSITE ALARM_SITUATION\n"
e0 = s.gsub(/ \S/)
#=> #<Enumerator: "MO ... ALARM_SITUATION\n":gsub(/ \S/)>
cols = e0.map { Regexp.last_match.begin(0) }
#=> [17, 23, 34, 50]
e1 = data.each_line
#=> #<Enumerator: "MO ... LOCAL MODE\n":each_line>
a = e1.map do |s|
cols.each { |i| s[i] = ',' if s.size > i+1 }
s.gsub(/ *, */,',')
end
#=> ["MO,SCGR,SC,RSITE,ALARM_SITUATION\n",
# "RXOTG-59,59,0,EK0322,ABIS PATH FAULT\n",
# ...
# "RXOTRX-59-9,,,EK0322,LOCAL MODE\n"]
a.join
#=> < return value above >
Let's have a closer look at the calculation of a. First, the block variable s is assigned to the first element generated by the enumerator e1:
s = e1.next
#=> "MO SCGR SC RSITE ALARM_SITUATION\n"
The block calculation is then performed:
cols.each { |i| s[i] = ',' }
s #=> "MO ,SCGR ,SC ,RSITE ,ALARM_SITUATION\n"
s.gsub(/ *, */,',')
#=> "MO,SCGR,SC,RSITE,ALARM_SITUATION\n"
The regular expression used with gsub reads, "match zero or more spaces followed by a comma, followed by zero or more spaces".
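To see that regular expression in isolation, here is a small sketch. The padded string is a made-up example of a line after the commas have been written in; the exact number of spaces is an assumption and does not matter.

```ruby
# Hypothetical line with commas already placed at the column boundaries.
padded = "RXOCF-59         ,     ,          ,EK0322         ,LOCAL MODE"

# Each run of "spaces, comma, spaces" collapses to a bare comma.
squeezed = padded.gsub(/ *, */, ',')
puts squeezed
```

Consecutive commas stay as they are, which is what produces the empty fields, and the space in "LOCAL MODE" is left alone because it does not touch a comma.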
When the short line (the row for RXOTRX-59-4) is passed to the block, the following calculation is performed.
s = "RXOTRX-59-4 0"
s.size
#=> 25
cols
#=> [17, 23, 34, 50]
cols.each { |i| s[i] = ',' if s.size > i+1 }
s #=> "RXOTRX-59-4 , ,0"
s.gsub(/ *, */,',')
#=> "RXOTRX-59-4,,0"
The remaining elements of e1 are processed similarly.
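If an array of arrays is all you need, as the question asks, a different technique (not the CSV conversion above) is to cut every line at the offsets where the header words start. The sketch below is only that, a sketch: slice_fixed_width is a name I made up, the column widths in fmt are assumed, and it presumes that header names contain no spaces and that every value starts at or after its header's first character.

```ruby
# Assumed column widths, used only to build aligned sample data.
fmt = "%-18s%-6s%-11s%-16s%s"
data = [
  %w[MO SCGR SC RSITE ALARM_SITUATION],
  ["RXOTG-59", "59", "0", "EK0322", "ABIS PATH FAULT"],
  ["RXOCF-59", "", "", "EK0322", "LOCAL MODE"],
  ["RXOTRX-59-4", "", "0", "", ""]
].map { |r| format(fmt, *r).rstrip }.join("\n")

# Hypothetical helper: returns one array per line, with nil for empty cells.
def slice_fixed_width(data)
  lines = data.lines.map(&:chomp)
  # Offset at which each header word begins.
  starts = []
  lines.first.scan(/\S+/) { starts << Regexp.last_match.begin(0) }
  lines.map do |line|
    starts.each_with_index.map do |from, idx|
      finish = starts[idx + 1]
      # The last column runs to the end of the line; slicing past the end
      # of a short line yields nil, which to_s turns into "".
      cell = finish ? line[from...finish] : line[from..-1]
      cell = cell.to_s.strip
      cell.empty? ? nil : cell
    end
  end
end

table = slice_fixed_width(data)
table.each { |row| p row }
```

Every inner array has five elements, so positions line up with the header row even when a line is short.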
Convert the CSV string to a hash
We may now make use of CSV methods. For example, suppose we wish to create an array of hashes whose keys are the header elements, downcased and converted to symbols, and in which the values of "SCGR" and "SC" are converted to integers. To do that we use the class method CSV::new, specifying appropriate values for its options.
Construct the hash
require 'csv'
CSV.new(csv_data, headers: true, header_converters: :symbol,
converters: :all).to_a.map(&:to_h)
#=> [{:mo=>"RXOTG-59", :scgr=>59, :sc=>0, :rsite=>"EK0322",
# :alarm_situation=>"ABIS PATH FAULT"},
# {:mo=>"RXOCF-59", :scgr=>nil, :sc=>nil, :rsite=>"EK0322",
# :alarm_situation=>"LOCAL MODE"},
# {:mo=>"RXOTRX-59-0", :scgr=>4, :sc=>nil, :rsite=>"EK0322",
# :alarm_situation=>"LOCAL MODE"},
# {:mo=>"RXOTRX-59-1", :scgr=>nil, :sc=>nil, :rsite=>"EK0322",
# :alarm_situation=>"LOCAL MODE"},
# {:mo=>"RXOTRX-59-4", :scgr=>nil, :sc=>0, :rsite=>nil,
# :alarm_situation=>nil},
# {:mo=>"RXOTRX-59-5", :scgr=>1, :sc=>3, :rsite=>"EK0322",
# :alarm_situation=>"LOCAL MODE"},
# {:mo=>"RXOTRX-59-8", :scgr=>nil, :sc=>nil, :rsite=>"EK0322",
# :alarm_situation=>"LOCAL MODE"},
# {:mo=>"RXOTRX-59-9", :scgr=>nil, :sc=>nil, :rsite=>"EK0322",
# :alarm_situation=>"LOCAL MODE"}]
Explanation
The steps are as follows.
csv = CSV.new(csv_data, headers: true, header_converters: :symbol,
converters: :all)
#=> <#CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:","
# row_sep:"\n" quote_char:"\"" headers:true>
a = csv.to_a
#=> [#<CSV::Row mo:"RXOTG-59" scgr:59 sc:0 rsite:"EK0322" alarm_situation:"ABIS PATH FAULT">,
# #<CSV::Row mo:"RXOCF-59" scgr:nil sc:nil rsite:"EK0322" alarm_situation:"LOCAL MODE">,
# ...
# #<CSV::Row mo:"RXOTRX-59-9" scgr:nil sc:nil rsite:"EK0322" alarm_situation:"LOCAL MODE">]
a.map(&:to_h)
#=> < array of hashes shown above >
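Once the rows are hashes, ordinary Enumerable methods can query them. The following is a small self-contained sketch: the CSV text is a shortened copy of the string produced earlier, CSV.parse is used here in place of CSV.new(...).to_a, and the two queries are just examples I picked.

```ruby
require 'csv'

# Shortened copy of the CSV string, including the short row.
csv_data = <<~ROWS
  MO,SCGR,SC,RSITE,ALARM_SITUATION
  RXOTG-59,59,0,EK0322,ABIS PATH FAULT
  RXOCF-59,,,EK0322,LOCAL MODE
  RXOTRX-59-4,,0
ROWS

# With headers: true, CSV.parse returns a CSV::Table whose rows convert to hashes.
rows = CSV.parse(csv_data, headers: true, header_converters: :symbol,
                 converters: :all).map(&:to_h)

# MOs currently in local mode.
local_mode = rows.select { |h| h[:alarm_situation] == "LOCAL MODE" }
                 .map { |h| h[:mo] }

# MOs whose SCGR cell was empty (parsed as nil).
missing_scgr = rows.select { |h| h[:scgr].nil? }.map { |h| h[:mo] }

p local_mode
p missing_scgr
```

Note that an empty cell and a cell missing from a short row both come back as nil, so the two cases cannot be told apart afterwards.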
¹ To run the code you will need to un-indent this heredoc (or change the first line to data = <<-END.lines.map(&:lstrip).join).
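Another option, assuming Ruby 2.3 or later, is the "squiggly" heredoc <<~, which strips the common leading indentation while keeping the spacing inside each line. A minimal sketch with made-up, shortened data (sample_data is a hypothetical method name):

```ruby
# The heredoc body is indented along with the method, but <<~ removes the
# indentation shared by all lines, so column alignment is preserved.
def sample_data
  <<~TEXT
    MO        SCGR
    RXOTG-59  59
  TEXT
end

puts sample_data
```

Unlike lstrip on every line, this keeps any extra indentation a line has relative to the others.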