Search code examples
powershelldatatable

Why does the result of DataTable.Load with just 1 row behave as if it is a DataRow instead of a DataTable?


I want to query two databases on different servers for the same kind of information, and merge the results into a single list.

I created this example, which works, but only if the first server returns at least two rows.

The example uses a list of integers (1 .. 5), but my original code joining two lists of strings has the same problem.

# 'quick and dirty' sample
# to be considered insecure,
# do not use as a basis for production code.
function GetData($dbserver, $start, $end)
{
    $qry = "WITH q AS (SELECT  $start AS num UNION ALL SELECT num + 1 FROM q WHERE num < $end) SELECT * FROM q"
    $con = New-Object System.Data.SqlClient.SqlConnection
    $con.ConnectionString = "server=$dbserver;database=A2SDataCentral;Integrated Security=SSPI"
    $con.Open()
    $com = $con.CreateCommand()
    $com.CommandText = $qry
    $res = $com.ExecuteReader()
    $table = New-Object System.Data.DataTable
    $table.Load($res)
    return $table
}

$all = $null # erase any result remaining from a previous run
$t1 = GetData "(local)" 1 2
$t2 = GetData "(local)" 3 5
$all = $t1 + $t2
$all

The SQL query I use here is just a dummy that generates sequential numbers. It will work only on SQL server. In this example I run it at the same server (localhost) twice, in reality it has to query two different servers to get two lists of names and join them.

Used as above, this example does what I was trying to make it do, the end result is a list of integers from 1 through 5.

num
---
  1
  2
  3
  4
  5

But when I change the GetData calls into this, making the first call return only a single row,

$t1 = GetData "(local)" 1 1
$t2 = GetData "(local)" 2 5

it crashes with

Method invocation failed because [System.Data.DataRow] does not contain a method named 'op_Addition'.

So it looks like the GetData function returns not a DataTable, but a DataRow if there is only one row in the result.

I tried this, but that's not the solution either:

$all = $t1
$all.Rows.Add($t2)

This throws either "Collection was of a fixed size" ($t1 has 2 rows) or "You cannot call a method on a null-valued expression" ($t1 contains 1 row).

Can someone shine a light on why a result with only 1 row causes a different behavior from multiple rows?

(Edit: draw attention to example/insecure nature of script)


Solution

  • You're a victim of PowerShell trying to be a little too friendly

    When PowerShell sees a DataTable being passed from a function on the pipeline it attempts to unravel it and send the individual rows down-stream one by one.

    This can be extremely useful when filtering a data set from a db query for example, but obviously extremely frustrating if you're relying on some intrinsic DataTable behavior.

    To avoid having PowerShell unravel the rows, use Write-Output $table -NoEnumerator or - if you add a CmdletBinding attribute to your function - you can call $PSCmdlet.WriteObject($table,$false) directly:

    function GetData($dbserver, $start, $end)
    {
        $qry = "WITH q AS (SELECT  $start AS num UNION ALL SELECT num + 1 FROM q WHERE num < $end) SELECT * FROM q"
        $con = New-Object System.Data.SqlClient.SqlConnection
        $con.ConnectionString = "server=$dbserver;database=A2SDataCentral;Integrated Security=SSPI"
        $con.Open()
        $com = $con.CreateCommand()
        $com.CommandText = $qry
        $res = $com.ExecuteReader()
        $table = New-Object System.Data.DataTable
        $table.Load($res)
    
        Write-Output $table -NoEnumerate
    }
    

    or

    function GetData
    {
        [CmdletBinding()]
        param($dbserver, $start, $end)
    
        $qry = "WITH q AS (SELECT  $start AS num UNION ALL SELECT num + 1 FROM q WHERE num < $end) SELECT * FROM q"
        $con = New-Object System.Data.SqlClient.SqlConnection
        $con.ConnectionString = "server=$dbserver;database=A2SDataCentral;Integrated Security=SSPI"
        $con.Open()
        $com = $con.CreateCommand()
        $com.CommandText = $qry
        $res = $com.ExecuteReader()
        $table = New-Object System.Data.DataTable
        $table.Load($res)
    
        $PSCmdlet.WriteObject($table,$false)
    }
    

    Bonus security review!

    I cannot in good conscience answer this question without pointing out that your function is currently susceptible to the most basic form of SQL injection.

    What if a user was to pass a non-numeric value as the $start or $end parameter argument value? Someone could do:

    $untrustedUserInput = '0 OR 1 = 1; xp_cmdshell "cmd /c calc.exe" --'
    

    Congratulations, I now own your database server :)

    In this specific instance you can easily mitigate the risk by strongly typing your parameters:

    function GetData([string]$dbServer, [int]$start, [int]$end)
    {
    ...
    

    or

    param([string]$dbServer, [int]$start, [int]$end)
    

    The better solution is to parameterize your SQL query:

    $qry = "WITH q AS (SELECT @start AS num UNION ALL SELECT num + 1 FROM q WHERE num < @end) SELECT * FROM q"
    # ...
    $com.Parameters.Add('@start','Int').Value = $start
    $com.Parameters.Add('@end','Int').Value = $end
    

    This way no evil string values will sneak into the query without being sanitized