Search code examples
mysqlsearchtagsstorage

MySQL - Making a (simple?) search engine with tags


I'm about to make a (simple?) search engine to our future website and I would like some advice regarding a question.

The contents on the website will have tags connected to them but I don't know how to store them the correct way. To keep it simple, 1 title with multiple tags. I've two solutions and I would like from you which one's the better, or say a another solution ;)

  1. Store titles in one table and tags in another one. Each tag then have a titleId. When a user search for something, the database searches in two tables and returns the result simultaneously. + Less data to store. - Searching in multiple tables

  2. Store titles in one table and tags in another one. Though each title has a tag with the same name as the title. That way, the database just has to search in one table.

    • Just has to search in one table. - More data to store.

I say "performance is everything" but what do you think?


Solution

  • What you want to do is use a Many-to-Many relationship. The way that works is you would use a join table. Here's how I'd set it up

    create table page (
      id int(11) not null auto_increment,
      title varchar(128)
      ...
    ) engine = innodb;
    
    create table tag (
      id int(11) not null auto_increment,
      tag_name varchar(128)
    ) engine = innodb;
    
    create table page_tag (
     id int(11) not null auto_increment,
     tag_id int(11) not null,
     page_id int(11) not null,
     foreign key(tag_id) references tag(id),
     foreign key(page_id) references page(id),
     unique key(tag_id, page_id)
    ) engine=innodb;
    

    Make sure you use innodb tables for these to enforce referential integrity. If you need full text searches, add an additional myISAM table that basically replicates your tag table. The unique key in page_tag prevents duplicate entries in there and makes the join on that table quicker.

    I'd also say you should go for #2. That way you're searching one table every time, and that's probably easier to deal with.