Search code examples
ssissql-server-data-toolsderived-column

SSIS with Visual Studio Derived Column


I'm trying to take a list from a text file and export it to excel.

Rank,Millions,Name,Country,Source of wealth
1,12700,Lakshmi Mittal and family, India,Steel
2,12315,Alisher Usmanov, Russia,Mining and Investment
3,9500,Roman Abramovich, Russia,Oil and industry
4,8600,Sri and Gopi Hinduja, India,Industry and Finance
5,7580,Leonard Blavatnik, Russia,Industry
6,7400,Ernesto and Kirsty Bertarelli, Switzerland &  United  Kingdom,Pharmaceuticals
7,7350,Gerald Grosvenor, 6th Duke of Westminster, United Kingdom,Property
8,7083,David and Simon Reuben, United Kingdom,Property and Internet
9,6600,John Fredriksen and family, Norway,Shipping and oil services
10,5900,Galen Weston, George Weston and family, Canada,Retailing
11,5490,Charlene de Carvalho-Heineken and Michel Carvalho, Netherlands,Inheritance, banking, brewing (Heineken)
12,4300,Hans Rausing and family, Sweden,Packaging
13,4100,Joseph Lau, China,Property
14,,Paul Chuckle, UK, IT Training
15,4300,Nicky Oppenheimer, South Africa,Mining, Diamonds
16,3900,Kirsten & Jorn Rausing, Sweden,Inheritance, Investment
17,3400,Sir Richard Branson, United Kingdom,Internet, Transport, Finance
18,3300,Sir Philip and Lady Green, United Kingdom,Retailing
19,,Barry Chuckle, UK, Comedy
20,,Roger Chuckle,UK,SSIS consultancy

For rank 14, 19, and 20, the Millions column is an empty string. I've already separated the data from rows that have Millions specified and those that have not by using a conditional split. What I want to do now is to take the 3 rows that aren't specified, and write in "Not Specified" using a Derived Column Transformation. I figure that I would be able to do something like (in the expression tab):

Millions=="Not Specified"

However when I run a Data Viewer on the output, the three columns show "False" in the Millions column.

I did a weird work around but I'm sure there's a better way:

Millions == "" || Millions != "" ? "Not Specified" : "Not Specified"

Can someone tell me why I can't use my first solution? I'm assuming it has something to do with boolean, but the data type is a string.


Solution

  • First of all, is there some special reason why are you separating records with empty value for "Millions" field with conditional split? You can just use Derived Column transformation and under the Derived Column tab select: Replace 'Millions'. In expression bar you can enter:

    `Millions=="" ? "Not specified" : Millions`
    

    This means that if Millions field has empty value, replace this with "Not specified" (true condition) or, on the opposite, just insert existing value (false condition). You are right, data type is string, but expression evaluation always points to the boolean. Let me just briefly explain why you can't use just Millions=="Not specified". Expression expects some arguments to evaluate against, so you must specify what happens if the expression evaluates to true or false. You can think of this operator as simplified version of if/else statement. So, the problem with your first solution is that you didn't specify what happens if Millions=="". Hope that this helps.