Search code examples
distributed-computingcomputation

Identifying colluding nodes in a distributed computing environment


I have a distributed environment in which the computations are carried out in parallel in the same form as that of Map Reduce paradigm. I want to know if there is a result verification scheme for such computations. I have read that the basic method is to replicate the computation in multiple worker nodes. In such a method, the hash value of the computation is returned from the worker nodes. The hash value is compared to identify if the result is correct. But this method does not prevent collusion in which malicious nodes might communicate with each other and return the same result. Is there any method available to thwart collusion. There are several collusion detection algorithms, but I've not been able to find an algorithm that identifies a colluding group of nodes as malicious. Kindly give me some insight on the solution.


Solution

  • Sounds like you're trying to solve the Byzantine Generals Problem.

    The essence of the solution to the problem problem (and the linked paper) is:

    The method of having one general simulate m others can now be used to prove that no solution with fewer than 3m + 1 generals can cope with m traitors. The proof is similar to the one for the original Byzantine Generals Problem and is left to the reader.

    So when you expect that up to m of your nodes can be hacked and provide you with wrong answers, you'll need to have at least 3*m + 1 nodes to detect those faulty ones and find the correct solution.