Watir: Difference in time taken for reading using different DOM selection methods

I am using Watir to test my web application, and I use CSS selectors to access various elements. In the example below, I am trying to read all the data from a table.

In the first method, I get all table rows and then read the text from each cell within the rows.

In the second method, I read each cell directly with its own CSS selector. I was surprised to find that the second method was around three times faster than the first.

Method 1:

# index and all_bin_details are initialized earlier (not shown)
rows = $browser.table(:id, "bin_die_count").rows
while index < rows.count
    # Read each cell relative to the row
    speed_grade = rows[index][2].text
    die_count = rows[index][3].text
    bin_value = rows[index][0].text
    bin_details = {"speed_grade" => speed_grade, "die_count" => die_count}
    all_bin_details[bin_value] = bin_details
    puts bin_details
    index = index + 1
end

Method 2:

# index and all_bin_details are initialized earlier (not shown)
row_count = $browser.table(:id, "bin_die_count").rows.count
while index < row_count
    # Locate each cell with an absolute CSS selector
    row_css = "#bin_die_count > tbody > tr:nth-child(#{index})"
    speed_grade = $browser.element(:css, "#{row_css} > td:nth-child(3)").text
    die_count = $browser.element(:css, "#{row_css} > td:nth-child(4)").text
    bin_value = $browser.element(:css, "#{row_css} > td:nth-child(1)").text
    bin_details = {"speed_grade" => speed_grade, "die_count" => die_count}
    all_bin_details[bin_value] = bin_details
    puts bin_details
    index = index + 1
end

Method 1 took 46.555 seconds to finish and method 2 took 16.025 seconds. I expected method 1 to be faster because it reads text relative to rows it has already located, whereas method 2 locates every cell with an absolute CSS selector.

Why is this?

Solution

Problem

The largest factor in determining the performance is the number of wire calls, i.e., the number of times Watir asks Selenium to talk to the browser.

In the first approach, simplified here to rows[0][0].text, you see the following 8 wire calls:

2017-07-31 11:45:37 INFO Selenium -> POST session/a248a69dfd9ae930072e4a3dbe5a979f/elements
2017-07-31 11:45:37 INFO Selenium    >>> http://127.0.0.1:9515/session/a248a69dfd9ae930072e4a3dbe5a979f/elements | {"using":"tag name","value":"tr"}
2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":[{"ELEMENT":"0.9824074557261091-1"},{"ELEMENT":"0.9824074557261091-2"}]}
2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-1/enabled
2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":true}
2017-07-31 11:45:37 INFO Selenium -> POST session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-1/elements
2017-07-31 11:45:37 INFO Selenium    >>> http://127.0.0.1:9515/session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-1/elements | {"using":"xpath","value":"./th | ./td"}
2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":[{"ELEMENT":"0.9824074557261091-3"},{"ELEMENT":"0.9824074557261091-4"},{"ELEMENT":"0.9824074557261091-5"},{"ELEMENT":"0.9824074557261091-6"}]}
2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-3/name
2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"td"}
2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-4/name
2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"td"}
2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-5/name
2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"td"}
2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-6/name
2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"td"}
2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-3/text
2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"Product"}

In contrast, the second approach, again simplified to browser.td(:css,"#bin_die_count > tbody > tr:nth-child(1) > td:nth-child(1)").text, has only 3 wire calls:

2017-07-31 11:46:33 INFO Selenium -> POST session/f19ba6fd8cf948b590b36f3c77191624/element
2017-07-31 11:46:33 INFO Selenium    >>> http://127.0.0.1:9515/session/f19ba6fd8cf948b590b36f3c77191624/element | {"using":"css selector","value":"#bin_die_count > tbody > tr:nth-child(1) > td:nth-child(1)"}
2017-07-31 11:46:33 INFO Selenium <- {"sessionId":"f19ba6fd8cf948b590b36f3c77191624","status":0,"value":{"ELEMENT":"0.8228824382277831-1"}}
2017-07-31 11:46:33 INFO Selenium -> GET session/f19ba6fd8cf948b590b36f3c77191624/element/0.8228824382277831-1/name
2017-07-31 11:46:33 INFO Selenium <- {"sessionId":"f19ba6fd8cf948b590b36f3c77191624","status":0,"value":"td"}
2017-07-31 11:46:33 INFO Selenium -> GET session/f19ba6fd8cf948b590b36f3c77191624/element/0.8228824382277831-1/text
2017-07-31 11:46:33 INFO Selenium <- {"sessionId":"f19ba6fd8cf948b590b36f3c77191624","status":0,"value":"Product"}

The second approach makes fewer than half as many wire calls (3 versus 8), which is consistent with the roughly threefold difference in run time you measured.

It looks like the main reason the first approach takes longer is that, when getting the collection of td elements, each td element's tag name is verified. For example, a row with 3 td elements will make 3 wire calls for the name, a row with 4 td elements will make 4 wire calls for the name, and so on. In contrast, the second approach just needs to grab the one specific td element. The more cells each row has, the more time the second approach saves.
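
To make that growth concrete, the call counts can be read off the two logs above. The sketch below is a rough model, not part of Watir: it assumes a fixed overhead of 4 calls (find rows, enabled check, find cells, read text) plus one tag-name call per cell for rows[i][j].text, and 3 calls for the direct lookup. The method and constant names are made up for illustration.

```ruby
# Wire calls for one rows[i][j].text, as inferred from the first log:
# find rows (1) + enabled check (1) + find cells (1) + read text (1),
# plus one tag-name check per cell in the row.
def collection_lookup_calls(cells_per_row)
  4 + cells_per_row
end

# Wire calls for one direct lookup, as seen in the second log:
# find element (1) + tag-name check (1) + read text (1).
DIRECT_LOOKUP_CALLS = 3

[4, 25].each do |cells|
  calls = collection_lookup_calls(cells)
  ratio = calls.to_f / DIRECT_LOOKUP_CALLS
  puts format("%2d cells per row: %2d vs %d wire calls (%.1fx)",
              cells, calls, DIRECT_LOOKUP_CALLS, ratio)
end
```

For the 4-cell row in the logs this gives 8 versus 3 calls. For a 25-cell row the same model predicts 29 versus 3, assuming the fixed overhead does not change with row width.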

Solution - Using a few cells of a row

If you are just picking out a few cells of a row, you can avoid the CSS selector by locating the specific td without the collection call:

rows[0].td(index: 0).text

This gives similar performance to the CSS-selector. The following performance was seen for getting the text of a single td in a row with 25 td elements:

require 'benchmark'

rows = browser.trs
puts Benchmark.measure { 100.times { rows[0][0].text } }
#=> 45.781881

puts Benchmark.measure { 100.times { browser.td(:css,"#bin_die_count > tbody > tr:nth-child(1) > td:nth-child(1)").text } }
#=> 4.832999

rows = browser.trs
puts Benchmark.measure { 100.times { rows[0].td(index: 0).text } }
#=> 4.812138
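
As a sanity check on the wire-call explanation, the totals above can be converted into a cost per wire call. This is only a back-of-the-envelope sketch: the per-lookup call counts (4 fixed calls plus 25 tag-name checks for the collection lookup, 3 for the CSS lookup) are extrapolated from the logs in the Problem section rather than measured on the 25-cell table, and the constant and method names are hypothetical.

```ruby
# Benchmark totals from above, each covering 100 lookups (seconds).
COLLECTION_TOTAL = 45.781881 # rows[0][0].text
DIRECT_TOTAL     = 4.832999  # browser.td(:css, ...).text
LOOKUPS          = 100

# Assumed wire calls per lookup, extrapolated from the logs:
# 4 fixed calls + 25 tag-name checks for the collection, 3 for the direct lookup.
COLLECTION_CALLS = 4 + 25
DIRECT_CALLS     = 3

# Average time spent per wire call, in milliseconds.
def ms_per_wire_call(total_seconds, lookups, calls_per_lookup)
  total_seconds / lookups / calls_per_lookup * 1000
end

puts format("collection: %.1f ms per wire call",
            ms_per_wire_call(COLLECTION_TOTAL, LOOKUPS, COLLECTION_CALLS))
puts format("direct:     %.1f ms per wire call",
            ms_per_wire_call(DIRECT_TOTAL, LOOKUPS, DIRECT_CALLS))
```

Under these assumptions both approaches come out at roughly 16 ms per wire call, which suggests the difference in total time comes from how many calls are made rather than from any single call being slower.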

Solution - Using many cells of a row

If you are using many of the td elements in a row, you are better off getting the collection of elements. However, you should do it once per row instead of once per cell you need. For example:

row_tds = rows[index].tds
speed_grade = row_tds[2].text
die_count = row_tds[3].text
bin_value = row_tds[0].text
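
Applied to the whole loop from the question, the idea might look like the sketch below. It is written against anything that responds to tds and text, so FakeCell and FakeRow are hypothetical stand-ins that let it run without a browser; with Watir you would pass in the real rows instead. It assumes every row has at least four td cells, so a header row made of th cells would need to be skipped first.

```ruby
# Hypothetical stand-ins for Watir rows and cells, used only so the
# sketch runs without a browser.
FakeCell = Struct.new(:text)
FakeRow  = Struct.new(:tds)

# Builds { bin_value => { "speed_grade" => ..., "die_count" => ... } },
# fetching each row's cell collection exactly once.
def collect_bin_details(rows)
  rows.each_with_object({}) do |row, all_bin_details|
    cells = row.tds.to_a # one collection lookup per row
    all_bin_details[cells[0].text] = {
      "speed_grade" => cells[2].text,
      "die_count"   => cells[3].text
    }
  end
end

# Made-up sample data for illustration.
sample_rows = [
  FakeRow.new(%w[bin1 x fast 120].map { |t| FakeCell.new(t) }),
  FakeRow.new(%w[bin2 y slow 45].map { |t| FakeCell.new(t) })
]
p collect_bin_details(sample_rows)
```

Iterating with each_with_object also removes the manual index bookkeeping from the original while loop.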

As you can see in the performance results below, getting the entire collection at once is faster than accessing each cell individually:

rows = browser.trs
puts Benchmark.measure { 20.times { (1..24).map { |i| rows[0].td(index: i).text } } }
#=> 18.776798

rows = browser.trs
puts Benchmark.measure { 20.times { tds = rows[0].tds; tds.map(&:text) } }
#=> 13.478259