Search code examples
algorithmexpert-system

Expert System (?) algorithm


I have an algorithmic problem which can be reduced to this task:

Suppose we have a list of n diseases and m symptoms.
For each disease d and symptom s, we have one of three options:

  • the symptom is positively correlated with the disease: s => d
  • the symptom is negatively correlated with the disease: s => ~d
  • the symptom is uncorrelated with the disease

The goal of the algorithm is to create a list of yes/no questions regarding symptoms (or even better - a binary tree of questions), which can deduce the exact disease according to the symptoms.

Any references to specific algorithms, relevant software tools and even domain-specific jargon would be very appreciated.


Solution

  • You case use a decision tree : http://en.wikipedia.org/wiki/Decision_tree_learning

    Basically finding the optimum tree (ie which will minimize the average number of questions asked before you can identify the desease) is NP-hard.

    You can can use a greedy algorithm and then try to optimise it (if you need it).

    At each step you want to reduce at much as possible the number of deceases that are still "possible".

    You are at the top of your tree and so you have three possible options for a given s, count the number of diseases in each option : pc nc uc.

    On one side you have pc on the other nc and the uc you can't say anything (you can look at two level of your tree to have some info but for now we don't do that).

    So worst case scenario, you have pc / nc + uc or pc + uc / nc, choose the s that minimize worst case (ie : lot of one side, only a few on the other).

    You need to minimize abs( pc - (nc + uc)) + abs ( (pc+uc) - nc).

    You now have your s for your first question and you can iteratively build your tree.