Tags: mysql, database, indexing, foreign-keys, relational

Designing a relational database and having a looming feeling of doom


I'm part of a four-person team at a growing company with over 200 users. It's time for a massive refactoring of our proprietary software, and we're very excited to build an ideal system that we know can withstand growth for at least five years. We're using a relational database, however, and while we're making some pretty great designs, I have a looming feeling that this product may be slower than we'd hope in the future.

My concern is our usage of foreign key relations. They're great for data integrity, which is why we're going with them: if we wanted to change someone's username, it would change in all related spots. That's great. Problem is, we aren't doing that - we're relating by ID - so the only major benefit is the performance gained from the index on each relational key.

All these indices piling up raise a red flag for me. We've got some tables that are simply linking tables, with three relational keys each. They definitely have their place, and we're confident they cut down on the queries we'll be doing. However, I then think: we have 10,000 rows in this table, 10,000 in that one, and 10,000 in the other, and we want to add a new row. Bam! Four indexes to update.
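To make the concern concrete, here is roughly what one of those linking tables looks like (all table and column names are invented for illustration, and I'm sketching it in SQLite via Python's `sqlite3` rather than MySQL, since it ships with Python):

```python
import sqlite3

# Hypothetical three-way linking table; a composite primary key covers
# lookups leading with user_id, and two secondary indexes cover the
# other access paths.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE roles    (id INTEGER PRIMARY KEY, name TEXT);

CREATE TABLE user_project_roles (
    user_id    INTEGER NOT NULL REFERENCES users(id),
    project_id INTEGER NOT NULL REFERENCES projects(id),
    role_id    INTEGER NOT NULL REFERENCES roles(id),
    PRIMARY KEY (user_id, project_id, role_id)
);
CREATE INDEX idx_upr_project ON user_project_roles (project_id);
CREATE INDEX idx_upr_role    ON user_project_roles (role_id);
""")

# Every INSERT into the linking table has to update the primary key
# plus both secondary indexes -- that per-write index maintenance is
# the cost being worried about here.
indexes = {
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type='index' AND tbl_name='user_project_roles'"
    )
}
```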

This is worrying. Are there any pitfalls we're about to fall into? Any advice from seasoned individuals?


Solution

  • How fast is your current system? Designing a good database schema is the foundation of your entire application, and if I were to decide between speed and design, I would opt for design. There are a number of ways you can speed up your application which have nothing to do with the database itself.

    If you conduct a parallel installation (running the old system alongside the new one), you can monitor the slow query log and head off slowness problems in the early stages. You can also identify commonly run queries and optimize them by adding new indices or editing existing ones.
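    As an illustration of that workflow (sketched here with SQLite via Python's `sqlite3` rather than MySQL; the table and index names are made up): once a slow query is identified, checking its plan before and after adding an index shows exactly what you gained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(i % 500, float(i)) for i in range(10_000)],
)

def plan(sql):
    # EXPLAIN QUERY PLAN is SQLite's rough analogue of MySQL's EXPLAIN;
    # column 3 of each row is the human-readable detail.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT total FROM orders WHERE user_id = 42"
before = plan(query)   # full table scan: no index on user_id yet

conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
after = plan(query)    # indexed search via idx_orders_user
```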

    You can also implement a caching layer which will greatly speed up your application. Caching acts as a layer between your application and the database where you can store commonly requested information in a volatile, but quickly accessible state.
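    A minimal sketch of that cache-aside pattern (the answer doesn't name a tool; in production something like memcached or Redis would play this role, and the key/loader names below are invented):

```python
import time

class TTLCache:
    """Tiny in-memory cache: volatile, but quick to read."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        entry = self._store.get(key)
        now = time.monotonic()
        if entry and entry[0] > now:
            return entry[1]          # cache hit: the database is never touched
        value = loader()             # cache miss: fall through to the database...
        self._store[key] = (now + self.ttl, value)  # ...and remember the result
        return value

calls = 0
def expensive_query():
    global calls
    calls += 1                       # stands in for a real database round trip
    return {"username": "alice"}

cache = TTLCache(ttl_seconds=60)
cache.get_or_load("user:1", expensive_query)
cache.get_or_load("user:1", expensive_query)  # served from cache
```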

    Another optimization technique is to scale up (increasing a single machine's physical capacity) or scale out (adding more machines in a cluster with replication). I have seen systems run extremely fast with 10 million+ records on machines with 64 GB of RAM, so factor physical capacity into your design.

    There is a whole host of optimization techniques you can follow to ensure a speedy database: stay away from TEXT columns, avoid OR operators where an index-friendly alternative exists, stay away from ORDER BY RAND(), and limit your use of grouping operators such as GROUP BY. These are just a few examples, so do some research. To make optimization easier you can use tools such as MySQL's EXPLAIN, which will show you how painful a query may be before your application runs it.
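    Taking the ORDER BY RAND() example: a common alternative is to pick a random key in the application and do an indexed lookup instead of sorting the whole table. A rough sketch (again using SQLite via Python's `sqlite3` as a stand-in for MySQL, with invented table names):

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [(f"user{i}",) for i in range(1000)],
)

# ORDER BY RANDOM() (RAND() in MySQL) forces a sort over every row.
# Picking a random id and doing a primary-key lookup touches one row.
(max_id,) = conn.execute("SELECT MAX(id) FROM users").fetchone()
row = None
while row is None:  # retry in case deleted rows left gaps in the id range
    candidate = random.randint(1, max_id)
    row = conn.execute(
        "SELECT username FROM users WHERE id = ?", (candidate,)
    ).fetchone()
```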

    I'd highly recommend using Percona's MySQL build as they are highly optimized and offer custom features.

    It sounds like you and your team are traveling down the right path; don't be too worried about designing a complex system. Some software applications require complex systems to operate. The real trick is to make a complex system easy to use, so you can support and grow it in the future. Good luck.