I have a ConcurrentHashMap that exhibits strange behavior on occasion.
When my app first starts up, I read a directory from the file system and load the contents of each file into the ConcurrentHashMap, using the filename as the key. Some files may be empty, in which case I set the value to "empty".
Once all files have been loaded, a pool of worker threads waits for external requests. When a request comes in, I call the getData() function, where I check whether the ConcurrentHashMap contains the key. If the key exists, I get the value and check whether it is "empty". If value.contains("empty"), I return "file not found". Otherwise, the contents of the file are returned. When the key does not exist, I try to load the file from the file system.
private String getData(String name) {
    String reply = null;
    if (map.containsKey(name)) {
        reply = map.get(name);
    } else {
        reply = getDataFromFileSystem(name);
    }
    if (reply != null && !reply.contains("empty")) {
        return reply;
    }
    return "file not found";
}
On occasion, the ConcurrentHashMap returns the contents of a non-empty file (i.e. value.contains("empty") == false), yet the line:
if (reply != null && !reply.contains("empty"))
evaluates to false. I broke the if statement into two parts: if (reply != null) and if (!reply.contains("empty")). The first part evaluates to true; the second evaluates to false. So I printed the variable reply to determine whether the string does in fact contain "empty". It did not. Furthermore, I added the line
int indexOf = reply.indexOf("empty");
Since reply did not contain the string "empty" when I printed it, I expected indexOf to be -1. Instead, it was a value close to the length of the string, i.e. if reply.length() == 15100, then reply.indexOf("empty") returned 15099.
I run into this issue approximately 2-3 times a week. The process is restarted daily, so the ConcurrentHashMap is regenerated regularly.
Has anyone seen such behavior when using Java's ConcurrentHashMap?
EDIT
private String getDataFromFileSystem(String name) {
    String contents = "empty";
    try {
        File folder = new File(dir);
        File[] fileList = folder.listFiles();
        for (int i = 0; i < fileList.length; i++) {
            if (fileList[i].isFile() && fileList[i].getName().contains(name)) {
                String fileName = fileList[i].getAbsolutePath();
                FileReader fr = null;
                BufferedReader br = null;
                try {
                    fr = new FileReader(fileName);
                    br = new BufferedReader(fr);
                    String sCurrentLine;
                    while ((sCurrentLine = br.readLine()) != null) {
                        contents += sCurrentLine.trim();
                    }
                    if (contents.equals("")) {
                        contents = "empty";
                    }
                    return contents;
                } catch (Exception e) {
                    e.printStackTrace();
                    if (contents.equals("")) {
                        contents = "empty";
                    }
                    return contents;
                } finally {
                    if (fr != null) {
                        try {
                            fr.close();
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                    if (br != null) {
                        try {
                            br.close();
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                    if (map.containsKey(name)) {
                        map.remove(name);
                    }
                    map.put(name, contents);
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
        if (contents.equals("")) {
            contents = "empty";
        }
        return contents;
    }
    return contents;
}
I think your problem is that some of your operations should be atomic and they aren't.
For example, one possible thread interleaving scenario is the following:
Thread 1 reaches this line in the getData method:
if (map.containsKey(name)) // (1)
The result is false, so Thread 1 goes on to
reply = getDataFromFileSystem(name); // (2)
In getDataFromFileSystem, you have the following code:
if (map.containsKey(name)) { // (3)
    map.remove(name); // (4)
}
map.put(name, contents); // (5)
Imagine that another thread (Thread 2) arrives at (1) while Thread 1 is between (4) and (5): name is not in the map, so Thread 2 goes to (2) again.
Now, that does not explain the specific issue you are observing, but it illustrates that when many threads run concurrently through a section of code without synchronization, weird things can and do happen.
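One way to close the check-then-act gap described above is to let the map perform the "look up, and load if absent" step as a single atomic operation. The following is only a minimal sketch under assumptions, not a drop-in fix: the class name AtomicLoadSketch, the loader function (a stand-in for the file-reading code), and the EMPTY constant are all hypothetical, and it assumes Java 8 or later for computeIfAbsent.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: names below are illustrative, not from the original code.
class AtomicLoadSketch {
    // Marker value for missing or empty files, mirroring the "empty" convention.
    static final String EMPTY = "empty";

    private final ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
    // Assumed stand-in for the file-reading logic; returns "" or null when nothing is found.
    private final Function<String, String> loader;

    AtomicLoadSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    String getData(String name) {
        // computeIfAbsent does the containsKey/put sequence atomically per key,
        // so two threads cannot both see "absent" and both load the same name.
        String reply = map.computeIfAbsent(name, key -> {
            String contents = loader.apply(key);
            return (contents == null || contents.isEmpty()) ? EMPTY : contents;
        });
        // Exact comparison: a file that merely mentions the word "empty" is still returned.
        return EMPTY.equals(reply) ? "file not found" : reply;
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        AtomicLoadSketch cache = new AtomicLoadSketch(name -> {
            loads.incrementAndGet();
            return name.equals("a.txt") ? "hello" : "";
        });
        System.out.println(cache.getData("a.txt"));   // prints "hello"
        System.out.println(cache.getData("a.txt"));   // prints "hello" (served from the map)
        System.out.println(cache.getData("b.txt"));   // prints "file not found"
        System.out.println(loads.get());              // prints 2 (one load per distinct name)
    }
}
```

Two design choices here are assumptions of the sketch rather than requirements: the marker is compared with equals instead of contains, so file text that happens to include the word "empty" is not treated as missing, and the load runs inside computeIfAbsent, which can briefly block other updates that hash to the same bin, so a slow file read may call for a different arrangement.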
As it stands, I can't find an explanation for the scenario you describe, unless you call reply = map.get(name) more than once in your tests, in which case it is quite possible that the two calls don't return the same result.
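To rule that out while debugging, it may help to read the value exactly once and derive every printed number from that single reference. Note that for any one string, indexOf("empty") can be at most length() - 5, so if the figures in the question (15099 and 15100) are exact, they cannot both describe the same string. The sketch below is hypothetical (the class ReadOnceSketch and method diagnose are made-up names) and only shows the read-once pattern.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical debugging helper: names are illustrative, not from the original code.
class ReadOnceSketch {
    static String diagnose(Map<String, String> map, String name) {
        // Single read: every value reported below comes from this one reference,
        // so a concurrent remove/put cannot make the numbers disagree.
        String reply = map.get(name);
        if (reply == null) {
            return name + ": absent";
        }
        int index = reply.indexOf("empty");
        return name + ": length=" + reply.length() + ", indexOf=" + index;
    }

    public static void main(String[] args) {
        Map<String, String> map = new ConcurrentHashMap<>();
        map.put("a", "abcempty");
        map.put("b", "hello");
        System.out.println(diagnose(map, "a")); // prints "a: length=8, indexOf=3"
        System.out.println(diagnose(map, "b")); // prints "b: length=5, indexOf=-1"
        System.out.println(diagnose(map, "c")); // prints "c: absent"
    }
}
```

Using a single get with a null check also replaces the containsKey-then-get pair, which is itself two separate reads of the map.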