I'm trying to run the CRFClassifier on a string to extract entities from it. I'm using the Ruby bindings for the Stanford NLP entity recognizer from here: https://github.com/tiendung/ruby-nlp
It works perfectly well in a standalone class file (nlp.rb): running ruby nlp.rb gives the expected result. However, when I create an object of this class inside one of the controllers in my Rails app, I get the following error:
java.lang.NoClassDefFoundError: edu/stanford/nlp/ie/crf/CRFClassifier
Here is the code that works fine on its own but not inside a controller.
def initialize
  Rjb::load('stanford-postagger.jar:stanford-ner.jar', ['-Xmx200m'])
  crfclassifier = Rjb::import('edu.stanford.nlp.ie.crf.CRFClassifier')
  maxentTagger = Rjb::import('edu.stanford.nlp.tagger.maxent.MaxentTagger')
  maxentTagger.init("left3words-wsj-0-18.tagger")
  sentence = Rjb::import('edu.stanford.nlp.ling.Sentence')
  @classifier = crfclassifier.getClassifierNoExceptions("ner-eng-ie.crf-4-conll.ser.gz")
end

def get_entities(sentence)
  sent = sentence
  @classifier.testStringInlineXML( sent )
end
It's exactly the same code in both cases. Does anyone have any idea what's happening here?
Thanks in advance!
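I think you need to give Rjb::load the full paths to the jars. Relative paths are most likely resolved against the current working directory, which under Rails is not the directory that contains nlp.rb: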
Rjb::load('/path/to/jar/stanford-postagger.jar:/path/to/jar/stanford-ner.jar', ['-Xmx200m'])
I just tried this and it works. Create a directory called nlp inside lib, put the jars there, and then create a class that loads the jars using their full paths.
You end up with the layout and class (lib/nlp.rb) below:
├── lib
│ ├── nlp
│ │ ├── stanford-ner.jar
│ │ └── stanford-postagger.jar
│ └── nlp.rb
require 'rjb'

class NLP
  def initialize
    # Build absolute paths to the jars relative to this file, so that loading
    # does not depend on the directory the process was started from.
    pos_tagger = File.expand_path('../nlp/stanford-postagger.jar', __FILE__)
    ner = File.expand_path('../nlp/stanford-ner.jar', __FILE__)
    Rjb::load("#{pos_tagger}:#{ner}", ['-Xmx200m'])
    crfclassifier = Rjb::import('edu.stanford.nlp.ie.crf.CRFClassifier')
    maxentTagger = Rjb::import('edu.stanford.nlp.tagger.maxent.MaxentTagger')
    # Note: the model files below are still given as relative paths.
    maxentTagger.init("left3words-wsj-0-18.tagger")
    sentence = Rjb::import('edu.stanford.nlp.ling.Sentence')
    @classifier = crfclassifier.getClassifierNoExceptions("ner-eng-ie.crf-4-conll.ser.gz")
  end

  # Returns the input string with recognized entities wrapped in inline XML tags.
  def get_entities(sentence)
    @classifier.testStringInlineXML(sentence)
  end
end
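As a possible variation on the loading step, here is a sketch only. The helper name `jar_classpath` and the `File.exist?` check are my own additions, not part of ruby-nlp or Rjb. It builds the classpath from absolute paths, joins them with `File::PATH_SEPARATOR` instead of a hard-coded `:` so the same code should also work on Windows, and raises early if a jar is missing rather than leaving you with a NoClassDefFoundError later on.

```ruby
# Hypothetical helper: build an absolute classpath string for Rjb::load.
# `dir` is the directory holding the jars, `jar_names` the jar file names.
def jar_classpath(dir, jar_names)
  paths = jar_names.map { |name| File.expand_path(name, dir) }
  missing = paths.reject { |path| File.exist?(path) }
  raise ArgumentError, "missing jar(s): #{missing.join(', ')}" unless missing.empty?
  paths.join(File::PATH_SEPARATOR)
end

# Intended use inside NLP#initialize (assumes the lib/nlp layout shown above):
#   jar_dir = File.expand_path('../nlp', __FILE__)
#   Rjb::load(jar_classpath(jar_dir, %w[stanford-postagger.jar stanford-ner.jar]), ['-Xmx200m'])
```

The existence check is there because a wrong path is otherwise only noticed once a class is imported.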
A small test script (t.rb):
require_relative 'lib/nlp'
n = NLP.new
n.get_entities("Good afternoon Rajat Raina, how are you today?")
Output:
ruby t.rb
Loading classifier from /Users/brendan/code/ruby/ruby-nlp/ner-eng-ie.crf-4-conll.ser.gz ... done [1.2 sec].
Getting data from Good afternoon Rajat Raina, how are you today? (default encoding)
Good afternoon <PERSON>Rajat Raina</PERSON>, how are you today?
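Note that the "Loading classifier from" line above shows the model being picked up from the directory the script was run in, so the model file names appear to be resolved against the working directory as well. Inside a Rails app that directory will usually be different, so it may be worth passing absolute paths for the .tagger and .ser.gz files too. A minimal sketch, assuming (hypothetically) that the model files are kept in a models directory next to the jars; `model_path` and that directory name are illustrative, not something ruby-nlp prescribes:

```ruby
# Hypothetical helper: turn a model file name into an absolute path.
# Assumption: models live in lib/nlp/models next to this file; pass a
# different base_dir if your layout differs.
def model_path(name, base_dir = File.expand_path('../nlp/models', __FILE__))
  File.join(base_dir, name)
end

# Intended use inside NLP#initialize:
#   maxentTagger.init(model_path('left3words-wsj-0-18.tagger'))
#   @classifier = crfclassifier.getClassifierNoExceptions(model_path('ner-eng-ie.crf-4-conll.ser.gz'))
```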