Tags: java, scala, apache-zookeeper, apache-curator, znodes

Java ZooKeeper API: unexpected ZNode behavior, unable to delete a ZNode properly


I am trying to create a persistent ZNode and store in it the number of lines of a particular file that I have processed. Creation works as it should, and so does reading data from the node, but deletion doesn't work if it happens in the same run as the creation. I'll explain what I mean.

I have created functions:

  • setOrCreateFileCheckpoint(fileName: String, lineNumber: Int): checks if the ZNode exists, creates it if it doesn't, and sets the stored value to lineNumber
  • getFileCheckpoint(fileName: String): returns the value stored in the ZNode
  • deleteFileCheckpoint(fileName: String): deletes the ZNode

Below is the code for all three:

/*
updates or creates a checkpoint for a file being processed
 */
def setOrCreateFileCheckpoint(fileName: String, lineNumber: Int): Unit =
    {
        val fileCheckpointPath = checkpointPoolPath + "/" +fileName
        val zk = getZookeeper
        val zkCuratorClient = getZookeeperCuratorClient

        if ( zk.exists(fileCheckpointPath, false) == null)
            {
                val node = new PersistentNode(zkCuratorClient, CreateMode.PERSISTENT, false, fileCheckpointPath, lineNumber.toString.getBytes())
                node.start()
            }
        else
            zk.setData(fileCheckpointPath, lineNumber.toString.getBytes(), -1)
    }

/*
gets checkpoint for a file
 */
def getFileCheckpoint(fileName: String): Int =
    {
        val fileCheckpointPath = checkpointPoolPath + "/" +fileName
        val zk = getZookeeper
        val zkCuratorClient = getZookeeperCuratorClient

        if ( zk.exists(fileCheckpointPath, false) != null)
            new String(zk.getData(fileCheckpointPath, false, null)).toInt

        else
            0

    }

/*
deletes the file checkpoint so that we don't keep accumulating zNodes on the zookeeper
 */
def deleteFileCheckpoint(fileName: String): Unit =
    {
        val fileCheckpointPath = checkpointPoolPath + "/" +fileName

        val zk = getZookeeper

        if ( zk.exists(fileCheckpointPath, false) == null)
        {
            throw new RuntimeException("Trying to delete checkpoint that doesn't exist for file: " + fileName)
        }
        else
            {
                /*println(zk.exists(fileCheckpointPath, false).getVersion)
                zk.delete(fileCheckpointPath, zk.exists(fileCheckpointPath, false).getVersion)*/
                deleteChildren(zk, fileCheckpointPath, true)

            }
    }

Following is the code I am testing and am perplexed by:

        ZookeeperUtility.setOrCreateFileCheckpoint("file1", 2000) //let's call it cre1

        println(ZookeeperUtility.getFileCheckpoint("file1")) //let's call it get1

        ZookeeperUtility.deleteFileCheckpoint("file1") //let's call it del1
        println("del1")

        ZookeeperUtility.deleteFileCheckpoint("file1") //let's call it del2
        println("del2")

Run 1:

Step 1: Run the code shown above

Result: Error encountered on del2

Step 2: Comment out cre1 and run the code again

Result: The node is fetched and get1 returns the correct value, then an error is encountered on del2. This is mind boggling. I can't understand why. The node is supposed to be deleted.

Step 3: With cre1 still commented out, as in the previous step, run the code again

Result: get1 returns 0, which means the node doesn't exist. An error is encountered at del1, which is what should have happened in step 2 itself.

Run 2:

Step 1: Comment out del2, run the code

Result: Creates the node, fetches the correct data, exits normally

Step 2: Comment out cre1, run the code

Result: Fetches the value 2000 from a node that was supposed to be deleted, then exits normally

Step 3: Run the same code as step 2 again

Result: Fetches 0, error encountered on del1

If I run the code one step at a time (only create in one run, only fetch in the next run, and only delete in the run after that), everything works as it should. I am on the brink of pulling my hair out.

P.S. The code is written in Scala, but I am using the Java API. Scala can work seamlessly with Java classes.

If you look at the deleteFileCheckpoint function, you'll see I have commented out a part. I have tried that approach as well, and it has the exact same behavior.


Solution

  • This is mind boggling. I can't understand why. The node is supposed to be deleted.

    I'm not sure why you're surprised. You are creating a PersistentNode, which exists to automatically recreate the node should it get deleted. In fact, all the surrounding code is very puzzling: it duplicates what PersistentNode does internally. You don't need to do all that other stuff. Just use PersistentNode.

    Further, code like checkExists() followed by an action based on the result will almost never work in production, because another client can create or delete the node between the check and the action. ZooKeeper is highly concurrent and eventually consistent. This is why you should always use Curator's recipes instead of hand-coding solutions.
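
To make the two points above concrete, here is a minimal sketch of the three checkpoint operations written against Curator alone, with no PersistentNode and no exists() check before each action. It is not taken from the question: the object name CheckpointStore, the /checkpoints path, and the localhost:2181 connect string are all assumptions for illustration, and orSetData() assumes a reasonably recent Curator version. Each operation simply attempts the action and treats NoNodeException as the "node is not there" case.

```scala
import java.nio.charset.StandardCharsets.UTF_8

import org.apache.curator.framework.{CuratorFramework, CuratorFrameworkFactory}
import org.apache.curator.retry.ExponentialBackoffRetry
import org.apache.zookeeper.KeeperException.NoNodeException

// Hypothetical helper object; every name here is an assumption, not the asker's code.
object CheckpointStore {
  // Pure helpers for building the znode path and (de)serializing the line number.
  def checkpointPath(poolPath: String, fileName: String): String = s"$poolPath/$fileName"
  def encode(lineNumber: Int): Array[Byte] = lineNumber.toString.getBytes(UTF_8)
  def decode(data: Array[Byte]): Int = new String(data, UTF_8).toInt

  // Create the znode if it is absent, otherwise overwrite its data.
  // Curator handles the "node already exists" case inside this one builder call.
  def setOrCreate(client: CuratorFramework, path: String, lineNumber: Int): Unit =
    client.create().orSetData().creatingParentsIfNeeded().forPath(path, encode(lineNumber))

  // Read directly; a missing node is treated as "no checkpoint yet" (0).
  def get(client: CuratorFramework, path: String): Int =
    try decode(client.getData().forPath(path))
    catch { case _: NoNodeException => 0 }

  // Delete directly; returns false instead of throwing if the node was already gone.
  def delete(client: CuratorFramework, path: String): Boolean =
    try { client.delete().forPath(path); true }
    catch { case _: NoNodeException => false }

  // Usage sketch; assumes a ZooKeeper server is reachable at localhost:2181.
  def demo(): Unit = {
    val client = CuratorFrameworkFactory.newClient("localhost:2181", new ExponentialBackoffRetry(1000, 3))
    client.start()
    try {
      val path = checkpointPath("/checkpoints", "file1")
      setOrCreate(client, path, 2000)
      println(get(client, path))
      println(delete(client, path)) // first delete removes the node
      println(delete(client, path)) // nothing recreates it, so the node is already gone here
    } finally client.close()
  }
}
```

Because nothing in this sketch keeps a PersistentNode running in the background, a deleted checkpoint stays deleted. If a PersistentNode is kept instead, my understanding is that the node is removed by holding on to the PersistentNode instance and calling close() on it, not by deleting the path underneath it.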