Search code examples
t-sqlself-joincross-join

How best can I produce a cartesian product with the different columns of a same table?


Suppose I do have a table ManjoloHigh that holds the following columns: Firstname, Surname, Age, Gender, Village, and you're required to produce a cartesian product based on the following these two columns: Village, Firstname.

What is the best way to produce a cartesian product in T-SQL?


Solution

  • Assuming that you are trying to retreive a set of unique possible combinations of the values in given columns.

    You could either explode your resultset by creating a huge cartensian product, followed by DISTINCT, like the following:

    SELECT DISTINCT V.Village, F.Firstname 
    FROM MANJOLOHIGH AS V 
    CROSS JOIN MANJOLOHIGH AS F
    

    Or you could create subsets of the unique values in every column and cross join these, like so:

    SELECT V.Village, F.Firstname 
    FROM (SELECT DISTINCT Village FROM MANJOLOHIGH) AS V 
    CROSS JOIN (SELECT DISTINCT Firstname FROM MANJOLOHIGH) AS F
    

    Given the fact that you are talking about five columns, I'd strongly suggest the latter. Given a table of 1000 records, the first resultset would result in 1000 * 1000 * 1000 * 1000 * 1000 records, before DISTINCT is applied.