When I run a Jackrabbit (version 2.2.9) search and call row.getValue("rep:excerpt()"), the returned string is just all the properties (excluding jcr: properties) concatenated. How do I control this? E.g., if I have a property called "description" containing "bla foo bla", then when I search for "foo" I would like rep:excerpt() to return an excerpt of just the description.
I tried creating an index config (and I deleted my repository between tests) in an attempt to control what properties were indexed, to no avail.
workspace.xml:
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <param name="path" value="${wsp.home}/index"/>
  <param name="supportHighlighting" value="true"/>
  <param name="excerptProviderClass" value="org.apache.jackrabbit.core.query.lucene.DefaultHTMLExcerpt"/>
  <param name="indexingConfiguration" value="${wsp.home}/indexing_configuration.xml"/>
</SearchIndex>
indexing_configuration.xml:
<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://jackrabbit.apache.org/dtd/indexing-configuration-1.0.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:teneoNode">
    <property>description</property>
    <property>input</property>
    <property>key</property>
    <property>comment</property>
  </index-rule>
</configuration>
Thanks.
Ted.
The rep:excerpt() functionality is handled by an ExcerptProvider implementation (see its Javadoc), which you can configure in the SearchIndex element of your workspace.xml file:
<param name="excerptProviderClass" value="org.apache.jackrabbit.core.query.lucene.DefaultHTMLExcerpt"/>
You might need to plug in your own implementation for your specific needs.
There is also some - unfortunately rather old - information on the Jackrabbit Wiki.
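Before writing a custom ExcerptProvider, a configuration-only approach may be worth a try. As far as I recall, newer revisions of the indexing configuration DTD (1.2) accept a useInExcerpt attribute on <property>, which keeps a property searchable but leaves its text out of the excerpt. The sketch below assumes that attribute and that DTD revision are available in your Jackrabbit version, so please check against the DTD shipped with 2.2.9. The node type and property names are taken from the question.

```xml
<?xml version="1.0"?>
<!-- Assumption: the 1.2 DTD (with useInExcerpt) is supported by your Jackrabbit version -->
<!DOCTYPE configuration SYSTEM "http://jackrabbit.apache.org/dtd/indexing-configuration-1.2.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:teneoNode">
    <!-- Only description contributes text to rep:excerpt() -->
    <property>description</property>
    <!-- Still indexed and searchable, but excluded from the excerpt -->
    <property useInExcerpt="false">input</property>
    <property useInExcerpt="false">key</property>
    <property useInExcerpt="false">comment</property>
  </index-rule>
</configuration>
```

If you change the indexing configuration, the index has to be rebuilt (for example by deleting the workspace's index directory and restarting) before the new rules take effect.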