Search code examples
neo4jcyphercollect

Neo4j cypher query : using ORDER BY with COLLECT(S)


I'm having a hard time collecting data from two distinct sources and merge the collections so that the final one is a set of objects ordered by 'dateCreated'.

Context

Users can ask questions in groups. A question can be either general or related to a specific video-game. If the question asked in a group is video-game related, this question also appears in the video-game's questions page.

Currently, I have two general questions and one specific to one video-game. Hence, when fetching the questions, I should have 3 questions.

Query

Here's the query :

START group = node(627)
MATCH generalQuestions-[?:GENERAL_QUESTION]->group
WITH group, generalQuestions
MATCH gamesQuestions-[?:GAME_QUESTION]->games<-[:GAMES]-group
WITH (collect(generalQuestions) + collect(gamesQuestions)) as questions
RETURN questions
ORDER BY questions.dateCreated

First issue : using ORDER BY

Cached(questions of type Collection) expected to be of type Map but it is of type Collection - maybe aggregation removed it?

What's the proper way of achieving what I'm trying to do?

Second issue : wrong results

If I remove the ORDER BY clause, instead of having 3 results, I get 14 ... :

[
Node[641]{dateCreated:1380892636,dateUpdated:1380892636,title:"GENERAL TITLE 1",type:1,content:"GENERAL CONTENT 1"},
Node[641]{dateCreated:1380892636,dateUpdated:1380892636,title:"GENERAL TITLE 1",type:1,content:"GENERAL CONTENT 1"},
Node[641]{dateCreated:1380892636,dateUpdated:1380892636,title:"GENERAL TITLE 1",type:1,content:"GENERAL CONTENT 1"},
Node[641]{dateCreated:1380892636,dateUpdated:1380892636,title:"GENERAL TITLE 1",type:1,content:"GENERAL CONTENT 1"},
Node[641]{dateCreated:1380892636,dateUpdated:1380892636,title:"GENERAL TITLE 1",type:1,content:"GENERAL CONTENT 1"},
Node[641]{dateCreated:1380892636,dateUpdated:1380892636,title:"GENERAL TITLE 1",type:1,content:"GENERAL CONTENT 1"},
Node[642]{dateCreated:1380892642,dateUpdated:1380892642,title:"GENERAL TITLE 2",type:1,content:"GENERAL CONTENT 2"},
Node[642]{dateCreated:1380892642,dateUpdated:1380892642,title:"GENERAL TITLE 2",type:1,content:"GENERAL CONTENT 2"},
Node[642]{dateCreated:1380892642,dateUpdated:1380892642,title:"GENERAL TITLE 2",type:1,content:"GENERAL CONTENT 2"},
Node[642]{dateCreated:1380892642,dateUpdated:1380892642,title:"GENERAL TITLE 2",type:1,content:"GENERAL CONTENT 2"},
Node[642]{dateCreated:1380892642,dateUpdated:1380892642,title:"GENERAL TITLE 2",type:1,content:"GENERAL CONTENT 2"},
Node[642]{dateCreated:1380892642,dateUpdated:1380892642,title:"GENERAL TITLE 2",type:1,content:"GENERAL CONTENT 2"},
Node[632]{dateCreated:1380889484,dateUpdated:1380889484,title:"GTA5 TITLE",type:2,content:"GTA5 CONTENT"},
Node[632]{dateCreated:1380889484,dateUpdated:1380889484,title:"GTA5 TITLE",type:2,content:"GTA5 CONTENT"}
]

Is there something wrong with the way I collect the results ?

EDIT

Expanded query to get the gamesQuestion :

gamesQuestions-[:GAME_QUESTION]->()<-[:QUESTIONS]-games-[:INTERESTS]->()<-[:HAS_‌​INTEREST_FOR]-interests<-[:INTERESTS]-group

Thanks for your help,


Solution

  • The "Order by" expects a property on a node or a relationship. The "questions" in your query is a collection of nodes instead of a node/relationship, you can't sort a collection using "Order by", you can only sort nodes or relationships on their properties.

    In order to use "Order by", you need to return the questions as a column of rows rather than a collection. In terms of the relationships indicated in your original query, the following query should return the general and the specific game questions as a column of rows and sort them on the property "dateCreated",

    START group = node(627) 
    Match question-[?:GENERAL_QUESTION|GAME_QUESTION]->()<-[:GAMES*0..1]-(group)
    Return distinct question
    Order by question.dateCreated
    

    For the expanded case where the game questions are related to the group via the sequence of relationships "gamesQuestions-[?:GAME_QUESTION]->games<-[:GAMES]-group, I have gamesQuestions-[:GAME_QUESTION]->()<-[:QUESTIONS]-games-[:INTERESTS]->()<-[:HAS_‌​INTEREST_FOR]-interests<-[:INTERESTS]-group", you can simply extend the pattern in the previous query as follows,

    START group = node(627) 
    Match question-[:GENERAL_QUESTION|GAME_QUESTION]->()-[*0..4]-(group)
    Return distinct question
    Order by question.dateCreated
    

    The idea is to match the questions that can reach the group node with either one step, or 4 more steps.

    Another option is to specify the two patterns in the where clause,

    START group = node(627) 
    MATCH question-[*]-group
    Where question-[:GENERAL_QUESTION]->group or (question-[:GAME_QUESTION]->()<-[:QUESTIONS]-()-[:INTERESTS]->()<-[:HAS_INTERESTS_FOR]-()<-[:INTERESTS]-group)
    Return distinct q
    Order by q.dateCreated