Can PowerShell Receive-Job return a DataSet?


Background Info:

I have an application that makes several SQL connections to multiple databases, and it currently takes a very long time to execute.

PowerShell (.NET) waits for each preceding "SQL-GET" function to finish before it can fire off the next. I am under the impression I can speed this app up dramatically by firing each "SQL-GET" function in its own background job simultaneously! I will then retrieve the data from each job as it finishes, ideally as a System.Data.DataSet object.

The Issues:

When retrieving the data from the background job, I can ONLY manage to get a System.Array object back. What I am actually after is a System.Data.DataSet object. This is necessary because all the logic within the app is dependent on a DataSet object.

The Code:

Here is a very simple slice of code that creates a SQL connection and fills a newly created DataSet object with the results returned. Works a treat. $results is a DataSet object and I can manipulate it nicely.

$query = "SELECT * FROM [database]..[table] WHERE column = '123456'"

$Connection = New-Object System.Data.SqlClient.SQLConnection      
$ConnectionString = "Server='SERVER';Database='DATABASE';User ID='SQL_USER';Password='SQL_PASSWORD'"  
$Connection.ConnectionString = $ConnectionString     
$Connection.Open() 
$Command = New-Object system.Data.SqlClient.SqlCommand($Query,$Connection) 
$Adapter = New-Object system.Data.SqlClient.SqlDataAdapter
$Adapter.SelectCommand = $Command
$Connection.Close() 
[System.Data.SqlClient.SqlConnection]::ClearAllPools()

$results = New-Object system.Data.DataSet 

[void]$Adapter.fill($results)

$results.Tables[0]

And here is that VERY SAME CODE wrapped into the ScriptBlock parameter of a new background job. Only, upon calling Receive-Job, I get an array back, not a DataSet.

$test_job = Start-Job -ScriptBlock {

$query = "SELECT * FROM [database]..[table] WHERE column = '123456'"

$Connection = New-Object System.Data.SqlClient.SQLConnection      
$ConnectionString = "Server='SERVER';Database='DATABASE';User ID='SQL_USER';Password='SQL_PASSWORD'"  
$Connection.ConnectionString = $ConnectionString     
$Connection.Open() 
$Command = New-Object system.Data.SqlClient.SqlCommand($Query,$Connection) 
$Adapter = New-Object system.Data.SqlClient.SqlDataAdapter
$Adapter.SelectCommand = $Command
$Connection.Close() 
[System.Data.SqlClient.SqlConnection]::ClearAllPools()

$results = New-Object system.Data.DataSet 

[void]$Adapter.fill($results) 
return $results.Tables[0]

}

Wait-Job $test_job
$ret_results = Receive-Job $test_job

Any help would be greatly appreciated!!!

Research Thus Far:

I have done the old Google, but all of the posts, blogs and articles I stumble across seem to go into EXTREME depth about managing jobs and all the bells and whistles around this. Is it the underlying nature of PowerShell to ONLY return an array through the Receive-Job cmdlet?

I have read a stack post about the return expression. Thought I was on to something. Attempted:

  • return $results.Tables[0]
  • return ,$results.Tables[0]
  • return ,$results

All still return an array.

I have seen people, rather cumbersomely, manually transform the array back into a dataset object - though this seems very 'dirty' - I am pedantic and live in hope there must be a way for this magical dataset object to traverse through the background job and into my current session! :)

To reiterate:

Basically, all I would like is to have the $ret_results object retrieved from the Receive-Job cmdlet to be a DataSet...or even a DataTable. I'll take either...JUST NOT AN ARRAY :)


Solution

  • In PowerShell, it is common for a set of more than one object of an arbitrary type to be returned as a collection. Consider this altered example, where I build my own table:

    PS C:\> $job = Start-Job -ScriptBlock {
    >>
    >> $table = New-Object system.Data.DataTable "MyTable"
    >>
    >> $col1 = New-Object system.Data.DataColumn MyFirstCol,([string])
    >> $col2 = New-Object system.Data.DataColumn MyIntCol,([int])
    >>
    >> $table.columns.add($col1)
    >> $table.columns.add($col2)
    >>
    >> $row1 = $table.NewRow()
    >> $row1.MyFirstCol = "FirstRow"
    >> $row1.MyIntCol = 1
    >> $row2 = $table.NewRow()
    >> $row2.MyFirstCol = "SecondRow"
    >> $row2.MyIntCol = 2
    >>
    >> $table.Rows.Add($row1)
    >> $table.Rows.Add($row2)
    >>
    >> $dataSet = New-Object system.Data.DataSet
    >> $dataSet.Tables.Add($table)
    >>
    >> $dataSet.Tables[0]
    >>
    >> }
    >>
    PS C:\> $output = Receive-Job -Job $job
    

    Output received. So what did we get?

    PS C:\> $output.GetType()
    
    IsPublic IsSerial Name                                     BaseType
    -------- -------- ----                                     --------
    True     True     Object[]                                 System.Array
    

    An array, as you've described. But that's the whole object. What if we analyze its members individually, by piping them to Get-Member?

    PS C:\> $output | gm
    
       TypeName: Deserialized.System.Data.DataRow
    
    Name               MemberType   Definition
    ----               ----------   ----------
    ToString           Method       string ToString(), string ToString(string format, System.IFormatProvider formatProvi...
    PSComputerName     NoteProperty System.String PSComputerName=localhost
    PSShowComputerName NoteProperty System.Boolean PSShowComputerName=False
    RunspaceId         NoteProperty System.Guid RunspaceId=186c51c3-d3a5-404c-9a4a-8ff3d3a7f024
    MyFirstCol         Property     System.String {get;set;}
    MyIntCol           Property     System.Int32 {get;set;}
    
    PS C:\> $output
    
    
    RunspaceId : 186c51c3-d3a5-404c-9a4a-8ff3d3a7f024
    MyFirstCol : FirstRow
    MyIntCol   : 1
    
    RunspaceId : 186c51c3-d3a5-404c-9a4a-8ff3d3a7f024
    MyFirstCol : SecondRow
    MyIntCol   : 2
    

    Consider the following:

    • In your job, you have specified that $results.Tables[0] should be returned. By indexing into Tables, you're returning the object that describes that one table (a DataTable, which arrives here as its individual DataRows) instead of the DataSet you seem to be expecting. Note also that the rows come back as Deserialized.System.Data.DataRow objects, as the Get-Member output shows, because job output is serialized on its way back to your session.

    • DataTables have rows. If the DataTable has more than one row, PowerShell returns a collection of DataRows, as I've demonstrated above. You may be surprised to learn that this is not the case when only a single row comes back: you get the single DataRow object itself instead of a collection of DataRow objects.

    • If this really is the output you are expecting, you may want to force it to always return in a collection by specifying the output as @($results.Tables[0]). That way, you always know to expect a collection and can handle the resulting content appropriately (by iterating through the collection to manage individual objects).
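
    One way to act on that last point, as a rough sketch rather than a definitive fix: wrap the Receive-Job call in @() so you always have a collection, then copy the received rows into a fresh DataTable. The function name ConvertTo-DataTable, the 'Results' table name and the stand-in sample objects are all hypothetical; the three excluded property names are the job-added ones visible in the Get-Member output above. Columns are typed as [object] for simplicity, so the original column types are not restored.

```powershell
# Hypothetical helper: rebuilds a DataTable from the objects Receive-Job hands back.
function ConvertTo-DataTable {
    param(
        [Parameter(Mandatory = $true)] [object[]] $InputObject,
        [string] $TableName = 'Results'
    )

    # Properties that the job infrastructure adds to every received object.
    $jobProps = 'PSComputerName', 'PSShowComputerName', 'RunspaceId'

    $table = New-Object System.Data.DataTable $TableName

    # Use the first object to discover the column names.
    $names = @($InputObject[0].PSObject.Properties |
        Where-Object { $jobProps -notcontains $_.Name } |
        ForEach-Object { $_.Name })

    foreach ($name in $names) {
        # Assumption: untyped [object] columns are good enough for this sketch.
        [void]$table.Columns.Add($name, [object])
    }

    foreach ($item in $InputObject) {
        $row = $table.NewRow()
        foreach ($name in $names) {
            $value = $item.$name
            if ($null -eq $value) { $value = [DBNull]::Value }
            $row[$name] = $value
        }
        $table.Rows.Add($row)
    }

    # The unary comma stops PowerShell from unrolling the table into rows again.
    return ,$table
}

# Stand-in for @(Receive-Job $job): two objects shaped like the rows shown above.
$received = @(
    [pscustomobject]@{ RunspaceId = [guid]::NewGuid(); MyFirstCol = 'FirstRow';  MyIntCol = 1 },
    [pscustomobject]@{ RunspaceId = [guid]::NewGuid(); MyFirstCol = 'SecondRow'; MyIntCol = 2 }
)

$rebuilt = ConvertTo-DataTable -InputObject $received
$rebuilt.GetType().FullName
$rebuilt.Rows.Count
```

    With a real job you would pass @(Receive-Job $test_job) as -InputObject instead of the sample objects. If the rest of the app needs a DataSet rather than a DataTable, the rebuilt table can be added to a new one with $dataSet.Tables.Add($rebuilt), as in the job script above.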