I have been doing some CPU profiling of my application, and I notice that one of the things taking a significant amount of time is the code that ensures I send no more than one query per second to a web service. The query itself and the handling of its results take little CPU time in comparison; there is an I/O component waiting for results, but the thing I am trying to reduce is CPU usage, since the application sometimes has to run on a single-CPU machine.
Using YourKit Profiler, the call that uses the significant amount of CPU is
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued()
My delay method is below:
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.logging.Logger;

public class SearchServer
{
    private static final Logger logger = Logger.getLogger(SearchServer.class.getName());
    private static final Lock delayLock = new ReentrantLock();
    private static final AtomicInteger queryQueue = new AtomicInteger();
    private static final AtomicLong queryDelay = new AtomicLong();
    private static final long delayInMilliseconds = 1000; // one query per second
    private static Date querySentDate = new Date(0);

    static void doDelayQuery()
    {
        delayLock.lock();
        try
        {
            if (isUserCancelled())
            {
                return;
            }
            // Ensure we only send one query a second
            Date currentDate = new Date();
            long delay = currentDate.getTime() - querySentDate.getTime();
            if (delay < delayInMilliseconds)
            {
                try
                {
                    long delayBy = delayInMilliseconds - delay;
                    queryDelay.addAndGet(delayBy);
                    Thread.sleep(delayBy);
                    logger.info(Thread.currentThread().getName() + ":Delaying for " + delayBy + " ms");
                }
                catch (InterruptedException ie)
                {
                    Thread.currentThread().interrupt();
                    throw new UserCancelException("User Cancelled whilst thread was delay sleeping");
                }
            }
        }
        finally
        {
            // We set before unlocking so that if another thread enters this method before we start the query,
            // we ensure it does not skip the delay just because the query this thread delayed for has started
            querySentDate = new Date();
            delayLock.unlock();
        }
    }
}
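The CPU cost probably comes from threads queueing on the ReentrantLock while the holder sleeps for up to a second: AbstractQueuedSynchronizer.acquireQueued() is where waiting threads spin briefly before parking. One way to sidestep the lock entirely is to claim a time slot with a CAS on an atomic timestamp and have each thread park on its own until its slot opens. Here is a minimal stdlib-only sketch of that idea; the class name, constructor parameter, and interval are illustrative, not part of the original code, and interruption is not handled:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

// Hypothetical lock-free limiter: one permit per intervalMillis.
// Threads race with compareAndSet to claim the next free time slot
// instead of queueing on a lock, so no thread sleeps while holding one.
class CasRateLimiter
{
    private final long intervalNanos;
    private final AtomicLong nextFreeSlot = new AtomicLong(System.nanoTime());

    CasRateLimiter(long intervalMillis)
    {
        this.intervalNanos = intervalMillis * 1_000_000L;
    }

    void acquire()
    {
        long target;
        while (true)
        {
            long slot = nextFreeSlot.get();
            long now = System.nanoTime();
            target = Math.max(slot, now);
            if (nextFreeSlot.compareAndSet(slot, target + intervalNanos))
            {
                break; // this thread owns the slot starting at target
            }
            // CAS lost: another thread claimed the slot, retry
        }
        // Park until our slot opens; loop because parkNanos may return early
        long remaining;
        while ((remaining = target - System.nanoTime()) > 0)
        {
            LockSupport.parkNanos(remaining);
        }
    }
}
```

Each acquire() either succeeds immediately (if the next slot is already in the past) or parks for exactly the time until its claimed slot, so contention never costs more than a few CAS retries.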
Okay, using the Google Guava library it turned out to be surprisingly simple:
import com.google.common.util.concurrent.RateLimiter;

public class SearchServer
{
    private static final RateLimiter rateLimiter = RateLimiter.create(1.0d);

    static void doDelayQuery()
    {
        rateLimiter.acquire();
    }

    public void doQuery()
    ..................
}
A key difference, though, is that previously I measured the delay from the start of the previous call, so I didn't wait a full second between calls; to get similar throughput I changed the RateLimiter to use 2.0d.
Profiling no longer shows a CPU hit in this area.