apache-spark, uima

UIMA with Spark


As noted here, there is some overlap between UIMA and Spark in their distribution infrastructure. I was planning to use UIMA with Spark (I am now moving to uimaFIT). Can anyone tell me what problems we really face when developing UIMA applications on Spark, and what pitfalls we might encounter? (Sorry, I haven't done any research on this.)


Solution

  • The main problem is object access: UIMA tries to re-instantiate objects when it runs its analysis engines. If those objects hold local references, accessing them from a remote Spark cluster will fail, and some RDD functions might not work within a UIMA context. If you don't use a separate remote cluster, there won't be a problem. (I am talking about uimaFIT 2.2.)
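To make the local-reference problem concrete, here is a minimal plain-JDK sketch. It does not use UIMA or Spark at all, and every name in it is made up for illustration: `HypotheticalEngine` stands in for an analysis engine that holds local state, and `roundTrip` imitates shipping a task to a remote executor with Java serialization, which is what Spark uses by default for the functions it sends to workers. The assumption is that the engine itself is not serializable; the workaround shown is to ship only configuration and build the engine lazily on the receiving side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class RemoteEngineSketch {

    // Hypothetical stand-in for an analysis engine: local state, NOT Serializable.
    static class HypotheticalEngine {
        String process(String text) {
            return text.toUpperCase();
        }
    }

    // Problem case: the task captures an engine instance created on the driver.
    static class EagerTask implements Serializable {
        private final HypotheticalEngine engine = new HypotheticalEngine();

        String apply(String text) {
            return engine.process(text);
        }
    }

    // Workaround: ship only configuration; the engine is transient and is
    // re-created on first use after the task arrives on the remote side.
    static class LazyTask implements Serializable {
        private final String engineName;
        private transient HypotheticalEngine engine;

        LazyTask(String engineName) {
            this.engineName = engineName;
        }

        String apply(String text) {
            if (engine == null) {
                engine = new HypotheticalEngine();
            }
            return engineName + ":" + engine.process(text);
        }
    }

    // Imitates sending a task to a remote JVM: serialize, then deserialize.
    static <T extends Serializable> T roundTrip(T task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(task);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(
                "cannot ship " + task.getClass().getSimpleName(), e);
        }
    }

    // True if the task survives the simulated trip to a remote executor.
    static boolean canShip(Serializable task) {
        try {
            roundTrip(task);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canShip(new EagerTask()));                       // false
        System.out.println(roundTrip(new LazyTask("tokenizer")).apply("hello")); // tokenizer:HELLO
    }
}
```

In a real job the same idea would usually mean creating the analysis engine inside the function that runs on the executor (for example once per partition) rather than on the driver, but the exact calls depend on your Spark and uimaFIT versions, so treat this only as an illustration of the mechanism.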