Search code examples
c#ssisxslcompiledtransform

XslCompiledTransform: how to read the transformed text, without saving to file


Input: My input files are XML files. They are read by the foreach file enumerator in SSIS.

Process: An SSIS script component (C#) reads the file name from the variable. I created an XSL file for transforming the XML into the format necessary. The script task uses the XSL file, and transforms XML files (to text)

Here is the piece of code I used:

public override void CreateNewOutputRows()
{
    XslCompiledTransform transformer = new XslCompiledTransform();
    transformer.Load(_xsltFile);
    transformer.Transform(_fileName, @"C:\macro3\outputTestFile.txt");
}

Problem:

As expected, this writes the transformed text content to the mentioned output file. I want to read through each line, process it, and load to database.

Now, writing to a file, reading it again is an overhead.

Is there a way I can read the transformed content into any object and iterate over it (without actually writing to a file)? Like a Stream or something?

Alternatively:

Though SSIS "XML Task" has "Operation Type = XSLT" feature, it is not reading the XML if the "SourceType" is variable and I give file name with path in the variable. It is expecting the XML content in the variable. Any work around possible?

Please ask for specific details in comments, so I can update accordingly. Thank you.

I am not able to edit the Expressions of the XML task as shown in the imageenter image description here


Solution

  • Instead of using the Script Task, use a Data Flow. The Data Flow is for transforming streams of data in memory, so sounds like exactly what you're after.

    A couple of options:

    1. If the transformations you need to do aren't too complex, you could set up an XML Source and use an expression so that the source uses the file path variable as its connection string. Once you've done that you can add any further components you need to carry out transformations, and then your database destination.
    2. If the transformations are more complex and you want to use the XSL, you could use a Script Component as a source in the Data Flow, and code picking up the XML and XSD, and carrying out the transform. Here's an example of carrying out the transform and getting the rows of data into memory instead of into a file. MSDN lists all of the overloads available, if that isn't the best direction for you. You would then pass the resultant rows as output into the rest of the Data Flow, and from there you could go directly to a database Destination Component.

    Either way, make sure you set the destination to "fast load" to speed things up.


    While you won't need it if you decide to do this entirely in a Data Flow, as far as the XML Task goes, you need to use File connection as the source instead of Variable. MSDN notes that Variable is only for use with a variable that holds the XML content. You'll need to set up an expression in the same way you would for any file source, and pass the file path variable in.