Search code examples
sqlsql-serveretlpentaho-spoonpentaho-data-integration

How to output a table with values from a table, alongside repeated lookup id's of a reference table (in Pentaho PDI and SQL)


Consider, I have table_A, like:

table_A

+--------+--------+--------+-----+
| code_1 | code_2 | code_3 | qty |
+--------+--------+--------+-----+
|   0.5  |   0.6  |   0.2  |  3  |
+--------+--------+--------+-----+
|   0.8  |   1.2  |   3.2  |  4  |
+--------+--------+--------+-----+
|   3.0  |   2.2  |   0.4  |  7  |
+--------+--------+--------+-----+

and a reference_table, like:

reference_table

+--------+--------+
|   id   | code   |
+--------+--------+
|    1   | code_1 |
+--------+--------+
|    2   | code_2 |
+--------+--------+
|    3   | code_3 |
+--------+--------+

Now, I would like to get the output like:

output_table

+--------+--------+--------+
|   id   |  value |   qty  |
+--------+--------+--------+
|    1   |   0.5  |    3   |
+--------+--------+--------+
|    2   |   0.6  |    3   |
+--------+--------+--------+
|    3   |   0.2  |    3   |
+--------+--------+--------+
|    1   |   0.8  |    4   |
+--------+--------+--------+
|    2   |   1.2  |    4   |
+--------+--------+--------+
|    3   |   3.2  |    4   |
+--------+--------+--------+
|    1   |   3.0  |    7   |
+--------+--------+--------+
|    2   |   2.2  |    7   |
+--------+--------+--------+
|    3   |   0.4  |    7   |
+--------+--------+--------+

NB:- The reference table is in a MySQL database, and the table_A is in a MSSQL database. I use Pentaho PDI to get data from both the tables (in different databases, via the "Table Input" step).

How can the output_table be obtained in Pentaho PDI - Spoon (ETL)?

Also, how is this done in SQL (if both the tables were in the same MSSQL database)?

Thanks in advance :)


Solution

  • In SQL Server you can use a lateral join -- i.e. the keyword apply:

    select r.id, v.value, a.qty
    from table_A a cross apply
         (values ('code_1', a.code_1), ('code_2', a.code_2), ('code_3', a.code_3)
         ) v(code, value) join
         reference r
         on r.code = v.code;