Using Lucene to query related document fields


Say I have Movie vertices connected to Person vertices by DirectedBy and Starring edges.

Consider this OrientDB query which works as expected:

select *, out("DirectedBy").Name, out("Starring").Name from Movie where out("DirectedBy") CONTAINS (Name = 'John Ford')

That correctly returns all movies directed by Persons with the name "John Ford". However, I want to perform the query using Lucene full-text search to get a little more flexibility.

I think I have my indexes set up correctly, as a query directly on the Person class succeeds and produces results:

select * from person where Name lucene 'John Ford'

However, trying to use the Lucene operator in my query of the Movie vertices produces no results:

select *, out("DirectedBy").Name, out("Starring").Name from Movie where out("DirectedBy") CONTAINS (Name LUCENE 'John Ford')

Am I doing something wrong? Or am I trying to do something that is not possible?


Solution

  • To use LUCENE, put it in the WHERE clause of its own SELECT, not inside the CONTAINS. Try this, which should also be very fast:

    select *, out("DirectedBy").Name, out("Starring").Name
    from (
      select expand( in("DirectedBy") ) from person where Name lucene 'John Ford'
    )
    

    The inner SELECT uses LUCENE to find "John Ford", then traverses to the connected Movies via in("DirectedBy"). I used expand() because the outer SELECT needs those Movie records as its input in order to display the information you want.
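
    Both queries rely on a Lucene full-text index existing on the Name property of Person. The question suggests one is already in place; if it were missing, a minimal sketch of creating it could look like the following. The index name Person.Name is only an assumption for illustration, so substitute whatever naming your schema uses:

    ```sql
    -- Assumed index name "Person.Name"; the LUCENE operator needs a
    -- FULLTEXT index backed by the Lucene engine on the queried property.
    CREATE INDEX Person.Name ON Person (Name) FULLTEXT ENGINE LUCENE
    ```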