Search code examples
mysqlsqlcartesian-productcross-join

performance of cross join


I am building a photo content webpage. In order to make all the combination of content, I use cross join. Let's say I have the following simple table:

table photos

id filename
------------
1   a.jpg
2   b.jpg
3   c.jpg
4   d.jpg
5   e.jpg

Number of combination I can make out of the table is 10 and those are

1,2
1,3
1,4
1,5
2,3
2,4
2,5
3,4
3,5
4,5

I get the above result by using the following cross join query

    SELECT cp1.id,
           cp2.id 
      from photos as cp1 
cross join photos as cp2 
     where cp1.id < cp2.id

...and I use the following query to display challenges that a user didn't see.

    SELECT cp1.id,
           cp1.filename,
           cp2.id, 
           cp2.filename 
      from challenge_photos as cp1 
cross join challenge_photos as cp2 
     where cp1.id < cp2.id 
       and (cp1.id,cp2.id) not in ( (x1,x2), (x1,x2) ) 
       and cp1.id = something 
       and cp2.img_id != something

So far, these queries are working just fine. Each query takes about 0.0002 sec (from 2300 rows) from phpmyadmin.

Let's assume I have 2000 rows in my table. The number of possible combinations is huge. Am I going to have a problem when there are a lot of active users on my website?


Solution

  • I would suggest creating a test database of various sizes and testing the query to see its performance. Doing a similar query on a table of 29k rows I was getting 10-30ms queries and similar numbers in a 70k row table which may seem to indicate that the query isn't completely terrible. The queries also appear to be cached which means you'll see significant benefit if your table sees few writes and many of the same queries.

    There is also a significant asymmetry in your example query: if cp1.id is small you're getting 100s/1000s of rows while if it is large you're only getting a few.

    This SO question refers to a cross join query with a WHERE clause acting like an inner join.