Search code examples
linuxapache-zookeeperzoo

zoo + snapshot files are created very frequently


under folder - /var/hadoop/zookeeper/version-2/ we can see that Zookeeper transaction logs and snapshot files are created very frequently (multiple files in every minute) and that fills up the Filesystem in a very short time.

ROOT CAUSE

One or more application are creating or modifying the znodes too frequently, causing too many transactions in a short duration. This leads to the creation of too many transactional log files and snapshot files since they get rolled over after 100,000 entries by default (as defined by zookeeper property 'snapCount')

-rw-r--r-- 1 zookeeper hadoop  67108880 Jul 28 17:24 log.570021fa92
-rw-r--r-- 1 zookeeper hadoop 490656299 Jul 28 17:24 snapshot.5700232ffa
-rw-r--r-- 1 zookeeper hadoop  67108880 Jul 28 17:29 log.5700232ffc
-rw-r--r-- 1 zookeeper hadoop 490656389 Jul 28 17:29 snapshot.5700249d7f
-rw-r--r-- 1 zookeeper hadoop  67108880 Jul 28 17:33 log.5700249d78
-rw-r--r-- 1 zookeeper hadoop 490656275 Jul 28 17:33 snapshot.570025fdaf
-rw-r--r-- 1 zookeeper hadoop  67108880 Jul 28 17:36 log.570025fdae
-rw-r--r-- 1 zookeeper hadoop 490656275 Jul 28 17:36 snapshot.570026c447
-rw-r--r-- 1 zookeeper hadoop  67108880 Jul 28 17:40 log.570026c449
-rw-r--r-- 1 zookeeper hadoop 490658969 Jul 28 17:40 snapshot.570027caed
-rw-r--r-- 1 zookeeper hadoop  67108880 Jul 28 17:43 log.570027caef
-rw-r--r-- 1 zookeeper hadoop 490658981 Jul 28 17:43 snapshot.570028a0d0
-rw-r--r-- 1 zookeeper hadoop  67108880 Jul 28 17:48 log.570028a0d2
-rw-r--r-- 1 zookeeper hadoop 165081088 Jul 28 17:48 snapshot.57002a0268
-rw-r--r-- 1 zookeeper hadoop  67108880 Jul 28 17:48 log.57002a026b
.
.
.
.

when we opened one of the log as - log.57002a026b we saw encrypted log

any suggestion how to unencrypted the logs above ?

or how to know which is the application thatcreating or modifying the znodes too frequently ?


Solution

  • PROBLEM

    Zookeeper transaction logs and snapshot files are created very frequently (multiple files in every minute) and that fills up the FileSystem in a very short time.

    ROOT CAUSE

    One or more application are creating or modifying the znodes too frequently, causing too many transactions in a short duration. This leads to the creation of too many transactional log files and snapshot files since they get rolled over after 100,000 entries by default (as defined by zookeeper property 'snapCount')

    RESOLUTION

    The resolution for such cases involves reviewing the zookeeper transaction logs to find the znodes that are updated/created most frequently using the following command on one of the zookeeper servers:

    # cd /usr/hdp/current/zookeeper-server
    # java -cp zookeeper.jar:lib/* org.apache.zookeeper.server.LogFormatter /hadoop/zookeeper/version-2/logxxx
    (where 'dataDir' is set to '/hadoop/zookeeper' within zookeeper configuration)
    

    Once the frequently updating znodes are identified using the above command, one should continue with fixing the related application that is creating such a large number of updates on zookeeper.

    An example of such an application that can cause this problem is Hbase, when there are very large number of regions stuck in transition and they repeatedly fail to become online.