As an example: I'm having a database to detect visitor (bots, etc) and since not every visitor have the same amount of 'credential' I made a 'dynamic' table like so: see fiddle: http://sqlfiddle.com/#!9/ca4c8/1 (simplified version).
This returns me the profile ID that I use to gather info about each profile (in another DB). Depending on the profile type I query the table with different name
clause (name='something') (ei: hostname, ipAddr, userAgent, HumanId, etc).
I'm not an expert in SQL but I'm familiar with indexes, constraints, primary, unique, foreign key etc. And from what I saw from these search results:
Most of them have comments about bad performance on self-join but answers tend to go for the missing index cause.
So the final question is: is self joining a table makes it more prone to bad performance assuming that everything is indexed properly?
On a side note, more information about the table: might be irrelevant to the question but is well in context for my particular situation:
This is a big project I have in development so I cannot test it with millions of records for now but I wonder if performance will be an issues as this grows. Any input, links, references, documentation or test procedure (maybe in comments) will be appreciated.
A self-join is no different than joining two different tables. The optimizer will pick one 'table', usually based on the WHERE
, then do a Nested Loop Join into the other. In your case, you have implied, via LEFT
, that it should work only one way. (The Optimizer will ignore that if it sees no need for it.
Your keys are find for that Fiddle.
The real problem is "Entity-Attribute-Value", which is a messy way to lay out data in tables. Your query seems to be saying "find a (limit 1) profile
(entity) that has a certain pair of attributes (name = Googlebot AND
addr = ...).
It would be so much easier, and faster, to have two columns (name and addr) and a "composite" INDEX(name, addr)
.
I recommend doing that for the common "attributes", then put the rest into a single column with a JSON string. See here.