
Watir: Difference in time taken for reading using different DOM selection methods


I am using Watir to test my web application, and I use CSS selectors to access various elements. In the example below, I am trying to read all the data from a table.

In the first method, I get all table rows and then read the text from each cell within the rows.

In the second method, I read each cell directly through its own CSS selector. I was surprised to find that the second method was around three times faster than the first.

Method 1:

rows = $browser.table(:id,"bin_die_count").rows
while index < rows.count
    bin_details = {}
    speed_grade = rows[index][2].text
    die_count = rows[index][3].text
    bin_value = rows[index][0].text
    bin_details = {"speed_grade" => speed_grade, "die_count" => die_count}
    all_bin_details[bin_value] = bin_details
    puts bin_details
    index = index + 1
end

Method 2:

row_count = $browser.table(:id,"bin_die_count").rows.count
while index < row_count
    bin_details = {}
    speed_grade =  $browser.element(:css,"#bin_die_count > tbody > tr:nth-child(#{index}) > td:nth-child(3)").text
    die_count = $browser.element(:css,"#bin_die_count > tbody > tr:nth-child(#{index}) > td:nth-child(4)").text
    bin_value = $browser.element(:css,"#bin_die_count > tbody > tr:nth-child(#{index}) > td:nth-child(1)").text
    bin_details = {"speed_grade" => speed_grade, "die_count" => die_count}
    all_bin_details[bin_value] = bin_details
    puts bin_details
    index = index + 1
end

Here method 1 took 46.555 seconds to finish and method 2 took 16.025 seconds. I was expecting method 1 to be faster, because it reads each cell relative to an already located row, whereas method 2 looks up every cell from scratch with an absolute CSS selector.

Why is this?


Solution

  • Problem

    The largest factor in determining performance is the number of wire calls, i.e. the number of times Watir asks Selenium to talk to the browser.

    In the first approach, which I have simplified to just rows[0][0].text, you see the following 8 wire calls:

    2017-07-31 11:45:37 INFO Selenium -> POST session/a248a69dfd9ae930072e4a3dbe5a979f/elements
    2017-07-31 11:45:37 INFO Selenium    >>> http://127.0.0.1:9515/session/a248a69dfd9ae930072e4a3dbe5a979f/elements | {"using":"tag name","value":"tr"}
    2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":[{"ELEMENT":"0.9824074557261091-1"},{"ELEMENT":"0.9824074557261091-2"}]}
    2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-1/enabled
    2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":true}
    2017-07-31 11:45:37 INFO Selenium -> POST session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-1/elements
    2017-07-31 11:45:37 INFO Selenium    >>> http://127.0.0.1:9515/session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-1/elements | {"using":"xpath","value":"./th | ./td"}
    2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":[{"ELEMENT":"0.9824074557261091-3"},{"ELEMENT":"0.9824074557261091-4"},{"ELEMENT":"0.9824074557261091-5"},{"ELEMENT":"0.9824074557261091-6"}]}
    2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-3/name
    2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"td"}
    2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-4/name
    2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"td"}
    2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-5/name
    2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"td"}
    2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-6/name
    2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"td"}
    2017-07-31 11:45:37 INFO Selenium -> GET session/a248a69dfd9ae930072e4a3dbe5a979f/element/0.9824074557261091-3/text
    2017-07-31 11:45:37 INFO Selenium <- {"sessionId":"a248a69dfd9ae930072e4a3dbe5a979f","status":0,"value":"Product"}
    

    In contrast, the second approach, again simplified to browser.td(:css,"#bin_die_count > tbody > tr:nth-child(1) > td:nth-child(1)").text, has only 3 wire calls:

    2017-07-31 11:46:33 INFO Selenium -> POST session/f19ba6fd8cf948b590b36f3c77191624/element
    2017-07-31 11:46:33 INFO Selenium    >>> http://127.0.0.1:9515/session/f19ba6fd8cf948b590b36f3c77191624/element | {"using":"css selector","value":"#bin_die_count > tbody > tr:nth-child(1) > td:nth-child(1)"}
    2017-07-31 11:46:33 INFO Selenium <- {"sessionId":"f19ba6fd8cf948b590b36f3c77191624","status":0,"value":{"ELEMENT":"0.8228824382277831-1"}}
    2017-07-31 11:46:33 INFO Selenium -> GET session/f19ba6fd8cf948b590b36f3c77191624/element/0.8228824382277831-1/name
    2017-07-31 11:46:33 INFO Selenium <- {"sessionId":"f19ba6fd8cf948b590b36f3c77191624","status":0,"value":"td"}
    2017-07-31 11:46:33 INFO Selenium -> GET session/f19ba6fd8cf948b590b36f3c77191624/element/0.8228824382277831-1/text
    2017-07-31 11:46:33 INFO Selenium <- {"sessionId":"f19ba6fd8cf948b590b36f3c77191624","status":0,"value":"Product"}
    

    The second approach makes fewer than half as many wire calls (3 versus 8), which accounts for most of the difference in execution time.
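
    If you capture the Selenium log as text, the calls can be counted mechanically instead of by eye. The following is a minimal sketch, assuming each outgoing request is logged on a line containing "INFO Selenium -> " as in the logs above; count_wire_calls is a hypothetical helper name, and the sample lines are copied from the second log:

```ruby
# Count Selenium wire calls in a captured log. Assumes each outgoing
# request appears on a line containing "INFO Selenium -> ".
def count_wire_calls(log_text)
  log_text.each_line.count { |line| line.include?("INFO Selenium -> ") }
end

# Lines copied from the second approach's log (payload lines omitted).
sample_log = <<~LOG
  2017-07-31 11:46:33 INFO Selenium -> POST session/f19ba6fd8cf948b590b36f3c77191624/element
  2017-07-31 11:46:33 INFO Selenium -> GET session/f19ba6fd8cf948b590b36f3c77191624/element/0.8228824382277831-1/name
  2017-07-31 11:46:33 INFO Selenium <- {"sessionId":"f19ba6fd8cf948b590b36f3c77191624","status":0,"value":"td"}
  2017-07-31 11:46:33 INFO Selenium -> GET session/f19ba6fd8cf948b590b36f3c77191624/element/0.8228824382277831-1/text
LOG

puts count_wire_calls(sample_log) # prints 3
```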

    It looks like the main reason the first approach takes longer is that, when getting the collection of td elements, each td element's tag name is verified. For example, a row with 3 td elements makes 3 wire calls for the name, a row with 4 td elements makes 4, and so on. In contrast, the second approach only needs to grab the one specific td element. The wider the rows and the larger the table, the more time the second approach saves.
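
    The arithmetic can be sketched as a rough model. This is only an extrapolation from the two logs above, not something read out of Watir's source, and both method names are hypothetical:

```ruby
# Wire calls for one `rows[0][0].text` read, modeled on the first log:
# find rows + enabled check + find cells + one tag-name check per cell + text.
def collection_read_calls(cells_in_row)
  1 + 1 + 1 + cells_in_row + 1
end

# Wire calls for one direct CSS lookup, per the second log:
# find element + tag-name check + text. Independent of row width.
def direct_read_calls
  3
end

[4, 25].each do |cells|
  puts "#{cells} cells: collection=#{collection_read_calls(cells)}, direct=#{direct_read_calls}"
end
# prints:
# 4 cells: collection=8, direct=3
# 25 cells: collection=29, direct=3
```

    Under this model a 25-cell row needs about 9.7 times as many calls for the collection read, which is roughly in line with the timing gap in the 25-cell benchmark further down.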

    Solution - Using a few cells of a row

    If you are just picking out a few cells of a row, you can avoid the CSS selector by locating the specific td directly, without the collection call:

    rows[0].td(index: 0).text
    

    This gives similar performance to the CSS-selector. The following performance was seen for getting the text of a single td in a row with 25 td elements:

    rows = browser.trs
    puts Benchmark.measure { 100.times { rows[0][0].text } } 
    #=> 45.781881
    
    puts Benchmark.measure { 100.times { browser.td(:css,"#bin_die_count > tbody > tr:nth-child(1) > td:nth-child(1)").text } }
    #=> 4.832999
    
    rows = browser.trs
    puts Benchmark.measure { 100.times { rows[0].td(index: 0).text } }
    #=> 4.812138
    

    Solution - Using many cells of a row

    If you are using many of the td elements in a row, you are better off getting the collection of elements. However, you should do it once per row instead of once per cell you need. For example:

    row_tds = rows[index].tds
    speed_grade = row_tds[2].text
    die_count = row_tds[3].text
    bin_value = row_tds[0].text
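
    Applied to the whole loop from the question, this might look like the sketch below. collect_bin_details is a hypothetical helper, and FakeCell and FakeRow are stand-ins with placeholder values so the example runs without a browser. It only assumes that each row responds to tds and each cell responds to text:

```ruby
# Hypothetical helper: builds the bin hash, fetching each row's cells once.
# `rows` is any enumerable whose items respond to `tds`; each cell
# responds to `text`.
def collect_bin_details(rows)
  rows.each_with_object({}) do |row, all_bin_details|
    row_tds = row.tds # one collection lookup per row
    all_bin_details[row_tds[0].text] = {
      "speed_grade" => row_tds[2].text,
      "die_count"   => row_tds[3].text
    }
  end
end

# Stand-ins for Watir rows and cells; the values are placeholders.
FakeCell = Struct.new(:text)
FakeRow = Struct.new(:tds)

fake_rows = [
  FakeRow.new(%w[bin1 x fast 120].map { |t| FakeCell.new(t) }),
  FakeRow.new(%w[bin2 y slow 45].map { |t| FakeCell.new(t) })
]

p collect_bin_details(fake_rows)
```

    With a real browser you would pass in the table's rows instead of fake_rows. A header row made of th cells has no td elements, so it would need to be skipped first.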
    

    As you can see in the performance results below, getting the entire collection at once is faster than accessing each cell individually:

    rows = browser.trs
    puts Benchmark.measure { 20.times { (1..24).map { |i| rows[0].td(index: i).text } } }
    #=> 18.776798
    
    rows = browser.trs
    puts Benchmark.measure { 20.times { tds = rows[0].tds; tds.map(&:text) } }
    #=> 13.478259