We are planning to introduce push notifications to our mobile apps (for Android phones and tablets, iPhone, iPad and BlackBerry).
Every 15 minutes we get a new set of data, which is stored in a MySQL database. We would then check whether this data matches the subscriptions of our users (the data is location-based, so a user would subscribe to notifications for one or more locations). All users with matching data should then be notified via the push service of their respective platform.
Server capacity is not a problem. We are mostly using PHP and would prefer to stay with it but are willing to go with other languages if necessary.
My questions are:
Can you give me advice on the technology to use on the server side? It should scale really well (I expect lots of subscriptions across the platforms), ideally work with the common push gateways and be fast enough to handle all notifications before the next batch of data comes in.
I have concerns regarding the delivery speed of those notifications. Let's say we have 500,000 subscriptions and the data matches 50% of them; that would mean we need to push 250,000 notifications in 15 minutes. Do you have any experience with push notifications at high volumes?
Thanks a lot, Mark.
Although PHP is great for generating dynamic web content, I feel it lacks some essential features for high-performance background operations like this. I'd go with a language that supports multi-threading (my personal preference would be C# 4.0, but it depends on your server platform as well).
If you have multi-threading support, you can write threads that load the data from the database and, while it is being loaded, have other threads push out notifications. Make sure you can configure how many threads are used for each part of the job, so you can throttle performance as needed.
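To make the loader/sender split concrete, here is a minimal sketch in Python (chosen only for brevity; the same shape works in C# or Java). Everything named here is hypothetical: `load_matches` stands in for your database query, `send_push` stands in for the call to a platform's push gateway, and the thread counts are placeholder values you would tune.

```python
import queue
import threading

LOADER_THREADS = 2   # hypothetical defaults; tune for your hardware
SENDER_THREADS = 8
_DONE = object()     # sentinel that tells a sender thread to stop


def load_matches(block, out_queue):
    """Stand-in for a database query returning matched subscriptions."""
    for subscription_id in block:
        out_queue.put(subscription_id)


def send_push(subscription_id):
    """Stand-in for a request to the platform's push gateway."""
    return f"sent:{subscription_id}"


def run_pipeline(blocks, loader_threads=LOADER_THREADS,
                 sender_threads=SENDER_THREADS):
    # Bounded queue: loaders block when senders fall behind.
    work = queue.Queue(maxsize=1000)
    block_queue = queue.Queue()
    for block in blocks:
        block_queue.put(block)

    sent = []
    sent_lock = threading.Lock()

    def loader():
        while True:
            try:
                block = block_queue.get_nowait()
            except queue.Empty:
                return
            load_matches(block, work)

    def sender():
        while True:
            item = work.get()
            if item is _DONE:
                return
            result = send_push(item)
            with sent_lock:
                sent.append(result)

    loaders = [threading.Thread(target=loader) for _ in range(loader_threads)]
    senders = [threading.Thread(target=sender) for _ in range(sender_threads)]
    for thread in loaders + senders:
        thread.start()
    for thread in loaders:
        thread.join()
    # All data is queued; send one stop sentinel per sender.
    for _ in senders:
        work.put(_DONE)
    for thread in senders:
        thread.join()
    return sent
```

The two thread counts are the throttle: raise `sender_threads` if the gateways are the bottleneck, or lower it if you are being rate-limited. The bounded queue keeps memory use flat when the senders cannot keep up with the loaders.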
If one server can't do the job, you might want to look at partitioning your data across multiple servers. I'd imagine the quickest way to do this is to assign blocks of records to different servers.
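As a rough illustration of assigning blocks of records, the sketch below splits a range of subscription IDs into one contiguous block per server. It assumes the IDs are numeric and reasonably dense; the function name and parameters are made up for this example.

```python
def partition_blocks(first_id, last_id, server_count):
    """Split the inclusive ID range into one contiguous block per server."""
    if server_count < 1:
        raise ValueError("server_count must be at least 1")
    total = last_id - first_id + 1
    base, extra = divmod(total, server_count)
    blocks = []
    start = first_id
    for index in range(server_count):
        # The first `extra` servers take one additional record each.
        size = base + (1 if index < extra else 0)
        end = start + size - 1
        blocks.append((start, end))
        start = end + 1
    return blocks


print(partition_blocks(1, 500000, 4))
# [(1, 125000), (125001, 250000), (250001, 375000), (375001, 500000)]
```

Each server would then query only its own range, for example with a `WHERE id BETWEEN start AND end` clause. If the IDs have large gaps, the blocks will be uneven, and partitioning by location might balance better.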
One final word of advice: get a test environment where you can simulate your problem and do stress testing. While stress testing, don't stop at your target number of 500,000, but push on to at least ten times that. This is much more effective at finding the weak spots of your software in advance. It would also be very helpful to be able to throttle certain hardware resources, such as memory, disk I/O, network I/O and CPU. By simulating a shortage of any of these, you get a feel for how the software behaves under those conditions. This experience will help you if you run into performance issues in production, and it will help you come up with hardware requirements.
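To put a number on the stress test, here is a small sketch that works out the send rate you need and measures the rate a sender actually achieves. The figures come from the question (250,000 notifications per 15-minute window); `measure_rate` and the dummy sender are hypothetical stand-ins for your real sending code pointed at a test gateway.

```python
import time


def required_rate(notifications, window_seconds):
    """Notifications per second needed to finish within the window."""
    return notifications / window_seconds


def measure_rate(send, count):
    """Call send(i) count times and return the achieved rate per second."""
    start = time.perf_counter()
    for i in range(count):
        send(i)
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


target = required_rate(250_000, 15 * 60)       # about 278 per second
stress_target = required_rate(2_500_000, 15 * 60)  # ten times the load
achieved = measure_rate(lambda i: None, 10_000)    # dummy sender, no network
print(f"need {target:.0f}/s, stress goal {stress_target:.0f}/s, "
      f"measured {achieved:.0f}/s")
```

With a dummy sender the measured rate is meaningless; the point is to swap in the real send call and compare the result against the stress goal rather than the bare target.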