Tags: php, locking, delay, fopen

fopen file locking in PHP (reader/writer type of situation)


I have a scenario where one PHP process is writing a file about 3 times a second, and then several PHP processes are reading this file.

This file is essentially a cache. Our website has very insistent polling for data that changes constantly, and we don't want every visitor to hit the DB every time they poll, so we have a cron process that reads the DB 3 times per second, processes the data, and dumps it to a file that the polling clients can then read.

The problem I'm having is that, sometimes, opening the file to write to it takes a long time, sometimes even up to 2-3 seconds. I'm assuming that this happens because it's being locked by reads (or by something), but I don't have any conclusive way of proving that, plus, according to what I understand from the documentation, PHP shouldn't be locking anything. This happens every 2-5 minutes, so it's pretty common.

In the code, I'm not doing any kind of locking, and I pretty much don't care if that file's information gets corrupted, if a read fails, or if data changes in the middle of a read. I do care, however, if writing to it takes 2 seconds, essentially because the process that has to happen thrice a second has now skipped several beats.

I'm writing the file with this code:

$handle = fopen(DIR_PUBLIC . 'filename.txt', "w");
fwrite($handle, $data);
fclose($handle);

And I'm reading it directly with:

file_get_contents('filename.txt')

(it's not getting served directly to the clients as a static file, I'm getting a regular PHP request that reads the file and does some basic stuff with it)

The file is about 11 KB, so it doesn't take a lot of time to read or write: well under 1 ms.

This is a typical log entry when the problem happens:

  Open File:    2657.27 ms
  Write:    0.05984 ms
  Close:    0.03886 ms

Not sure if it's relevant, but the reads happen in regular web requests through Apache, while the write is a regular "command line" PHP execution triggered by Linux's cron; it doesn't go through Apache.

Any ideas of what could be causing this big delay in opening the file?
Any pointers on where I could look to help me pinpoint the actual cause?

Alternatively, can you think of something I could do to avoid this? For example, I'd love to be able to set a 50ms timeout to fopen, and if it didn't open the file, it just skips ahead, and lets the next run of the cron take care of it.
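
For illustration, the behaviour I'm after would look something like this (just a sketch using flock() with a non-blocking lock attempt instead of an actual fopen timeout; this is not what I'm currently running, and the 'c' mode and flags are only an example):

// Sketch: skip this beat if the file can't be locked immediately,
// instead of blocking the cron for seconds.
$handle = fopen(DIR_PUBLIC . 'filename.txt', 'c'); // 'c' creates/opens without truncating
if ($handle === false) {
    exit; // let the next run of the cron take care of it
}
if (flock($handle, LOCK_EX | LOCK_NB)) { // non-blocking exclusive lock
    ftruncate($handle, 0);               // truncate only once the lock is held
    fwrite($handle, $data);
    fflush($handle);
    flock($handle, LOCK_UN);
}
// else: the lock wasn't available right away, so this beat is simply skipped
fclose($handle);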

Again, my priority is to keep the cron beating thrice a second; all else is secondary, so any ideas, suggestions, anything at all is extremely welcome.

Thank you!
Daniel


Solution

  • I can think of 3 possible problems:

    • the file gets locked when reading/writing to it by lower-level PHP/system calls without you knowing it. This should block the file for 1/3 s max, but you're getting periods longer than that.
    • the fs cache starts an fsync() and the whole system blocks reads/writes to the disk until that is done. As a fix you may try installing more RAM, upgrading the kernel, or using a faster hard disk.
    • your "caching" solution is not distributed, and it hits the worst-performing piece of hardware in the whole system many times a second... meaning that you cannot scale it further by simply adding more machines, only by increasing the hdd speed. You should take a look at memcache or APC, or maybe even shared memory http://www.php.net/manual/en/function.shm-put-var.php

    Solutions I can think of:

    • put that file in a ramdisk http://www.cyberciti.biz/faq/howto-create-linux-ram-disk-filesystem/ . This should be the simplest way to avoid hitting the disk so often, without other major changes.
    • use memcache. It's very fast when used locally, it scales well, and it's used by "big" players (see the sketch after this list). http://www.php.net/manual/en/book.memcache.php
    • use shared memory. It was designed for what you are trying to do here... having a "shared memory zone"...
    • change the cron scheduler to update less often, or maybe implement some kind of event system, so it would only update the cache when necessary, and not on a time basis
    • make the "writing" script write to 3 different files, and make the "readers" read from one of them, randomly. This may allow a "distribution" of the locking across more files and it may reduce the chances that a certain file is locked when writing to it... but I doubt it would bring any noticeable benefit.
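
    As a rough illustration of the memcache option above (just a sketch; the host, port, key name and 2-second expiry are arbitrary choices for the example, not part of your setup):

      // Cron side (runs thrice a second): store the freshly built data.
      $memcache = new Memcache();
      $memcache->connect('127.0.0.1', 11211);
      $memcache->set('polling_cache', $data, 0, 2); // 2 s expiry as a safety net

      // Web request side: read it back; on a miss, fall back to the DB or serve stale data.
      $cached = $memcache->get('polling_cache');
      if ($cached === false) {
          // cache miss: rebuild from the DB (or keep whatever was served last)
      }

    The same store/fetch pattern maps almost one-to-one onto APC (apc_store()/apc_fetch()) or the shm_put_var()/shm_get_var() shared-memory functions mentioned earlier.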