Tags: hadoop, hbase, file-descriptor

Why "Too many open files" in HBase?


I have configured a two-node cluster with Hadoop and installed HBase on it. It was working properly: I ran some basic MapReduce jobs in Hadoop, and I was able to create and list tables in HBase as well. However, there is little data in HDFS/HBase, and no jobs were running. After a while, I started to get "java.net.Socket: Too many open files" errors in the HBase logs.

I have looked for solutions, but the answers are mainly about increasing the limit. I am curious, however, about why there are so many open files in the first place. This cluster is not used by any other program, and I have not run any jobs other than the simple MapReduce tasks from the tutorials.

Why could this be?

EDIT

Following Andrzej's suggestion, I ran lsof | grep java and observed that there are lots of connections on different ports that are waiting to be closed. Here are just a few lines of the command's output:

java      29872     hadoop  151u     IPv6          158476883      0t0       TCP os231.myIP:44712->os231.myIP:50010 (CLOSE_WAIT)
java      29872     hadoop  152u     IPv6          158476885      0t0       TCP os231.myIP:35214->os233.myIP:50010 (CLOSE_WAIT)
java      29872     hadoop  153u     IPv6          158476886      0t0       TCP os231.myIP:39899->os232.myIP:50010 (CLOSE_WAIT)
java      29872     hadoop  155u     IPv6          158476892      0t0       TCP os231.myIP:44717->os231.myIP:50010 (CLOSE_WAIT)
java      29872     hadoop  156u     IPv6          158476895      0t0       TCP os231.myIP:44718->os231.myIP:50010 (CLOSE_WAIT)
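
To see how many descriptors are in this state, the output can be tallied with standard tools. A quick sketch (29872 is the Java process's PID taken from the output above; substitute your own):

lsof -p 29872 | grep -c CLOSE_WAIT    # sockets stuck in CLOSE_WAIT
lsof -p 29872 | wc -l                 # total descriptors held by the process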

Now the question becomes: why don't these connections close automatically if they are no longer in use? And if they do not get closed automatically, is there any way to close them with a crontab script or something similar?

Thanks


Solution

  • ... I am curious about why there are too many open files?...

    HBase keeps all of its files open all the time. Here is an example: if you have 10 tables with 3 column families each, an average of 3 files per column family, and 100 regions per RegionServer per table, there will be 10 * 3 * 3 * 100 = 9000 open file descriptors. This math doesn't take into account JAR files, temp files, etc. (you can check the actual count as shown below).

    The suggested value for ulimit is 10240, but you might want to set it to a value that better matches your case; see the sketches below for verifying the count and making the new limit persistent.
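
    As a rough sanity check of the estimate above, you can count the descriptors the RegionServer process actually holds. A minimal sketch, assuming one RegionServer per node whose command line contains HRegionServer (adjust the pattern if your setup differs):

        # count every descriptor (store files, sockets, JARs, ...) held by the RegionServer
        lsof -p "$(pgrep -f HRegionServer)" | wc -l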
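
    To raise the limit persistently on Linux, the usual place is /etc/security/limits.conf. A minimal sketch, assuming the daemons run as the hadoop user seen in the lsof output above; the new limit only applies to sessions started after the change, so log in again and restart the daemons:

        # /etc/security/limits.conf
        hadoop  soft  nofile  10240
        hadoop  hard  nofile  10240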