I have a question about SQL indexing in my theory class. It asks me to choose which column should be indexed to optimize these 2 queries: (This is 'paper' question so I am not provided the database to test these indexes with EXPLAIN)
1, First query.
SELECT BranchName, No_Of_Copies
FROM BOOK as B, BOOK_COPIES as BC, LIBRARY_BRANCH as LB
WHERE B.BookId = BC.BookId and BC.BranchId=LB.BranchId and title ="The Lost Tribe";
I have the answer of this one, which is that BOOK.title, BOOK_COPIES.BranchId and LIBRARY_BRANCH.BranchId should be used for indexing. However, I don't really understand why BOOK.BookId and BOOK_COPIES.BookId are not chosen for indexing.
2, Second query
SELECT B.cardNo, Name, Address, COUNT(BookId,BranchId)
FROM BORROWER as B, BOOK_LOANS as BL
WHERE (BL.CardNo=B.CardNo)
GROUP BY B.CardNo, Name, Address
HAVING COUNT(BL.BookId, BranchId)>5;
Would it be optimized if I create index on BOOK_LOANS.CardNo, BORROWER.CardNo, Name and Address ?
That class needs to be updated. Using commas in JOIN
is antiquated; the new style uses JOIN .. ON
The question is ambiguous -- what table contains title
? I'll assume it is B
.
Since the only filtering is on title
, The Optimizer will pick B
as the first table to look at:
B needs INDEX(title)
From B, it can reach for BC:
BC needs INDEX(BookId)
Similarly:
LB needs INDEX(BranchId)
If you are using MySQL, be aware that a PRIMARY KEY
is an index. And every table needs a PRIMARY KEY
. Also a PRIMARY KEY
is necessarily unique. So, when I said "needs", you may find that there is already a PRIMARY KEY
satisfying the need.
More: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
I will quibble with the schema -- why is the "number of books" not simply a column in Books
?
As for query 2, it is even less clear which table each column might be in.
Do be aware that an INDEX
can contain only columns from a single table.