
Using custom feature generators with parameters in OpenNLP

I am trying to set up the OpenNLP NameFinder in a project with an XML feature generator descriptor and some non-standard features. The XML descriptor supports custom feature generators:

      <custom class="com.example.MyFeatureGenerator"/>

However, the documentation does not mention any way to pass parameters to the feature generator. Creating a new class for every slightly different configuration of the feature generator is undesirable. On the other hand, creating the feature generators programmatically would likely mean duplicating much of the OpenNLP code that handles feature generator setup. What is the recommended way to use custom feature generators in OpenNLP?


  • No proper solution yet, but I worked around the issue by registering a new feature factory in OpenNLP. Unfortunately, this needs access to private parts of the OpenNLP class GeneratorFactory via reflection. Here's a working solution.

    First, define a new class, named XmlDescriptorUtil:

    import java.lang.reflect.Field;
    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Method;
    import java.lang.reflect.Proxy;
    import java.util.Map;
    import opennlp.tools.util.InvalidFormatException;
    import opennlp.tools.util.featuregen.AdaptiveFeatureGenerator;
    import opennlp.tools.util.featuregen.FeatureGeneratorResourceProvider;
    import opennlp.tools.util.featuregen.GeneratorFactory;
    import org.w3c.dom.Element;

    public final class XmlDescriptorUtil {
      private XmlDescriptorUtil() {}

      // Invocation handler behind the proxy; forwards the factory call to create().
      public static abstract class XmlDescriptorFactory implements InvocationHandler {
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
          return create((Element) args[0], (FeatureGeneratorResourceProvider) args[1]);
        }

        public abstract AdaptiveFeatureGenerator create(Element generatorElement, FeatureGeneratorResourceProvider resourceManager)
          throws InvalidFormatException;
      }

      public static void register(String name, XmlDescriptorFactory factory) throws Exception {
        // Look up the nested factory interface by name and implement it through a dynamic proxy.
        Class<?> factoryInterface = Class.forName(GeneratorFactory.class.getName() + "$XmlFeatureGeneratorFactory");
        Object proxy = Proxy.newProxyInstance(GeneratorFactory.class.getClassLoader(), new Class<?>[]{factoryInterface}, factory);
        registerByProxy(name, proxy);
      }

      @SuppressWarnings("unchecked")
      private static void registerByProxy(String name, Object proxy) throws Exception {
        Field f = GeneratorFactory.class.getDeclaredField("factories");
        f.setAccessible(true); // the field is private, so reading it needs this
        Map<String, Object> factories = (Map<String, Object>) f.get(null);
        factories.put(name, proxy);
      }
    }

    Then, create a feature generator factory by extending the abstract class XmlDescriptorUtil.XmlDescriptorFactory and register it under a name:

    public static void main(String[] args) throws Exception {
      XmlDescriptorUtil.register("myCustom", new XmlDescriptorUtil.XmlDescriptorFactory() {
        public AdaptiveFeatureGenerator create(Element generatorElement, FeatureGeneratorResourceProvider resourceManager) throws InvalidFormatException {
          return new MyFeatureGenerator();
        }
      });
    }

    Now the feature generator is ready and can be used in the XML descriptor as an element with the registered name, myCustom:

    <myCustom/>

    If the feature generator needs parameters, they can be extracted from generatorElement in the factory class.
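
To illustrate the last point of the answer, here is a minimal sketch of reading parameters from the element's attributes using only the standard DOM API. The class name GeneratorParams, the helper methods, and the attribute names windowSize, prefix and mode are made up for this example and are not part of OpenNLP. The element is parsed from a string here only so the snippet runs on its own; inside a factory you would apply the same helpers to the generatorElement argument of create().

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of OpenNLP.
public class GeneratorParams {

  // Returns the attribute as an int, or the fallback when the attribute is absent.
  public static int intAttribute(Element element, String name, int fallback) {
    // Element.getAttribute returns "" (not null) for a missing attribute.
    String value = element.getAttribute(name);
    return value.isEmpty() ? fallback : Integer.parseInt(value);
  }

  // Returns the attribute as a string, or the fallback when the attribute is absent.
  public static String stringAttribute(Element element, String name, String fallback) {
    return element.hasAttribute(name) ? element.getAttribute(name) : fallback;
  }

  // Parses a one-element document; stands in for the Element passed to the factory.
  public static Element parse(String xml) throws Exception {
    return DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
        .getDocumentElement();
  }

  public static void main(String[] args) throws Exception {
    // Assumed descriptor entry; the attribute names are illustrative only.
    Element generatorElement = parse("<myCustom windowSize=\"3\" prefix=\"cap\"/>");

    int windowSize = intAttribute(generatorElement, "windowSize", 2);
    String prefix = stringAttribute(generatorElement, "prefix", "feat");
    String mode = stringAttribute(generatorElement, "mode", "default");

    System.out.println(windowSize + " " + prefix + " " + mode); // prints "3 cap default"
  }
}
```

With helpers like these, one generator class can serve several configurations: the factory reads the values in create() and passes them to the generator's constructor, assuming the generator has a constructor that accepts them.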