I am working on an application that works like this:
Application components can (and will) be run on different computers on the same network, so the storage has to be reachable from multiple hosts.
I have considered using memcached, but I'm not quite sure whether I should, because a single record is usually no smaller than 200 bytes, and with 1,500,000 records I figure that already amounts to roughly 300 MB of memcached cache... That doesn't seem scalable to me: what if the data were 5x that amount? What if it ended up consuming 1-2 GB of cache just to keep data between iterations (which could easily happen)?
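Back-of-the-envelope, the sizing I'm working from (ignoring whatever per-item overhead memcached itself adds on top):

    1,500,000 records x 200 bytes ≈ 300 MB
    5x the data: 7,500,000 records x 200 bytes ≈ 1.5 GB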
So, the question is: which temporary storage mechanism would be most suitable for this kind of processing? I haven't considered MySQL temporary tables, as I'm not sure whether they can persist between sessions and be used by other hosts on the network... Any other suggestions? Anything else I should consider?
I know this sounds very old-school, but a temp file on your SAN would be easy and cheap.
Loading a 300 MB file at the start of each run is trivial compared to consuming 300 MB of cache all the time.
And if you can recreate it from the database keys, it would be wise to write and test that part as well, and make it automatic: if the temp file is unavailable, the info gets mined from the keys and the file gets recreated.
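A rough sketch of what I mean, in Python; the SAN path and the rebuild step are placeholders for whatever your setup actually looks like:

    import os
    import pickle

    # Hypothetical SAN mount point -- adjust to wherever your shared storage lives.
    CACHE_PATH = "/mnt/san/shared/worker_cache.pkl"

    def rebuild_from_database():
        """Placeholder: re-mine the records from the database keys.

        Replace with your real query; it just needs to return the same
        dict that gets pickled below, e.g. {record_key: record_data}.
        """
        return {}

    def load_records():
        # Fast path: read the temp file a previous run left on the SAN.
        if os.path.exists(CACHE_PATH):
            with open(CACHE_PATH, "rb") as fh:
                return pickle.load(fh)
        # Fallback: the file is missing, so recreate it from the keys
        # and write it back for the next run (or the next host).
        records = rebuild_from_database()
        tmp_path = CACHE_PATH + ".tmp"
        with open(tmp_path, "wb") as fh:
            pickle.dump(records, fh)
        # Atomic rename so other hosts never see a half-written file.
        os.replace(tmp_path, CACHE_PATH)
        return records

    records = load_records()

The write-to-a-temp-name-then-rename step is just a cheap way to keep other hosts from reading a partially written file; if several workers can rebuild at once you'd want a lock file or similar on top of this.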