Search code examples
sqlperformancessisodbcoledb

SSIS - handle large number of records between different databases


I want to export a flat file and have this tricky problem:

  1. I have a query that selects from OracleDB (about 8 million records)
  2. One of the returned fields (ISO Country Code), is selected by two different columns
  3. I want to make a transformation on this field (e.g. 'GBR' to '1' , 'RUS' to 2 etc.)
  4. If I try to use IN operator in the query it takes too long (because I have 7 cases with about 30 values per case). Also because of #2, I need a subquery.
  5. I have no rights in the specific database. All I can do is select.
  6. I tried to use temporary(with the records) and parametric(with the paragroup) OLEDB tables but moving 8m records from one database to another and then again join with the parametric was also running for hours. (Note that OLEDB is the only Database that I can use)

I have read that trying to do this with a Derived Column when you have so many values isn't the best idea, but still I cannot think of any better way. Any suggestion?

Edit:

The specific statement is like CASE WHEN dt.source_country_code IS NOT NULL THEN dt.source_country_code ELSE dt.trans_country_code END) AS Country. The rest query is large and it doesn't affect this field. The results of the field are ISO Country Codes for example: GRC, GBR, FRA, JPN .... The transformation that should be done is standard would look like (if I choose to do it inside the query):

case when Country in ('GRC','GBR','ESP','FRA','DEU','ROM','DNK','IMN','PRT','ITA','CZE','IRL','POL','AND','FIN','LVA','SHN','ATA','LIE','SMR','AUT','LTU','SVK','BEL','GIB','LUX','SVN','BGR','MLT','GRL','MCO','SWE','HRV','HUN','NLD','CHE','CYP','ISL','NOR','VAT','ROU','EST') then '1' 
  when Country in ('ALB','MDA','ARM','MNE','AZE','RUS','BLR','SRB','BIH','TJK','GEO','TUR','ISR','TKM','KAZ','UKR','XKX','UZB','MKD') then '4'
  when Country in ('USA','CAN') then '5' 
  when Country = 'CHN' then '6' 
  when Country = 'JPN' then '7' 
  when Country in ('AUS','NZL') then '8' 
  else '9' end) as CountryCode

Solution

  • Create a simple text file (CSV) with your conversions, like the following:

    CountryISO,ID
    USA,1
    ARG,2
    ESP,3
    CHN,4
    HKG,4
    

    Create a Cache connection manager to store your conversions; use a flat file connection for this conversion file and in a DataFlow task load it on the cache connection.

    In a separate DataFlow following the previous one (cache needs to be loaded in a separate DataFlow task before), join your Oracle results with a Lookup search, in which the Lookup has "Full Cache" (store all referenced values in memory) and your "Cache connection manager" settings.

    You can follow this post as example.