Fixing a Cypher deprecation warning gives inconsistent results

When I run this query:

MATCH (n:test) with n limit 100
WITH DISTINCT n, keys(n) AS allKeys
UNWIND allKeys AS key
with n, 
    CASE
        WHEN key STARTS WITH 'prop.title' THEN {column: 'title', value: collect(n[key])}
        WHEN key STARTS WITH 'prop.keywords' THEN {column: 'keywords', value: collect(n[key])}
    END AS data
with n, data, collect(data.value) as values
RETURN n.id, apoc.map.fromPairs(COLLECT([data.column, values]))

The purpose of this query is to find all properties with a given prefix and group their values into an object with APOC. An example result would look like this:

My issue started when Cypher gave me this warning: "This feature is deprecated and will be removed in future versions." with status code Neo.ClientNotification.Statement.FeatureDeprecationWarning.

I then tried to fix the query by adding key to the WITH clause, like so: WITH n, key, CASE ... But now the value in my object (collect(n[key])) no longer contains all of the values, only the last one. I guess the rest have been overwritten.

Does anyone know how to fix this while still removing the warning?

EDIT: I found out that changing n to * in the first query, without adding key, gives the same wrong result. I also added pictures.

Solution

I believe you could rewrite your query to something like this:

MATCH (n:test) with n limit 100 
WITH DISTINCT n, keys(n) AS allKeys
UNWIND allKeys AS key
WITH n, 
    // Note: in the original query you used the aggregation expression `collect` which made `key` an implicit grouping key.
    CASE
      WHEN key STARTS WITH 'prop.title' THEN {column: 'title', value:n[key]}
      WHEN key STARTS WITH 'prop.keywords' THEN {column: 'keywords', value:n[key]}
    END AS data
WITH n, data.column AS column, collect(data.value) as values
RETURN n.id, apoc.map.fromPairs(COLLECT([column, values]))

The problem with your original query was that you used a variable that is not a grouping key outside of an aggregation expression, a so-called "implicit grouping key". Prior to Neo4j 5.0 we allowed implicit grouping keys, but because they can get very confusing, support for them was removed in 5.0. You can read more about the confusion with implicit grouping keys here: https://opencypher.org/articles/2017/07/27/ocig1-aggregations-article/
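To make the idea of an implicit grouping key concrete, here is a minimal sketch that does not touch any stored data. The list literal and the names occurrences and result are made up for illustration. The commented-out form mixes key with count(*) in a single expression without projecting key on its own, which is the pattern the deprecation warning is about. The form below it makes key an explicit grouping key first and only then combines it with the aggregated value.

```cypher
// Pattern that triggers the warning: `key` appears next to an aggregation
// in the same expression, but is never projected as a grouping key itself.
//
//   UNWIND ['a', 'a', 'b'] AS key
//   RETURN key + ':' + toString(count(*)) AS result

// Explicit form: aggregate per `key` first, then build the string.
UNWIND ['a', 'a', 'b'] AS key
WITH key, count(*) AS occurrences
RETURN key + ':' + toString(occurrences) AS result
```

The second query returns one row per distinct key, because key is now the only grouping key of the WITH clause.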

So, let's take this step-by-step:

Assume that you have a single node:

CREATE (:test{`prop.title1`:"title1", `prop.title2`:"title2", `prop.keywords`:"hey!"})

After this part of the query:

MATCH (n:test) with n limit 100
WITH DISTINCT n, keys(n) AS allKeys
UNWIND allKeys AS key

You have:

n	key	allKeys
n	"prop.title1"	["prop.title1", "prop.title2", "prop.keywords"]
n	"prop.title2"	["prop.title1", "prop.title2", "prop.keywords"]
n	"prop.keywords"	["prop.title1", "prop.title2", "prop.keywords"]

Let's have a look at the next part of the original query:

with n, 
CASE
    WHEN key STARTS WITH 'prop.title' THEN {column: 'title', value: collect(n[key])}
    WHEN key STARTS WITH 'prop.keywords' THEN {column: 'keywords', value: collect(n[key])}
END AS data

The explicit grouping keys are the projected expressions that do not contain any aggregation, which in this case is only n. All other variables that are used outside of an aggregation expression are "implicit" grouping keys. In your case the implicit grouping key is key. Implicit grouping keys are no longer supported from 5.0. But instead of adding key as an explicit grouping key, you can remove the aggregation expression collect:

WITH n, 
     CASE
       WHEN key STARTS WITH 'prop.title' THEN {column: 'title', value:n[key]}
       WHEN key STARTS WITH 'prop.keywords' THEN {column: 'keywords', value:n[key]}
     END AS data

This means that we now have:

n	data
n	{column: "title", value: "title1"}
n	{column: "title", value: "title2"}
n	{column: "keywords", value: "hey!"}

If you now look at the next part of the original query:

with n, data, collect(data.value) as values

Here you again have the aggregation expression collect, this time with the grouping keys n and data. But you don't want to group on the full data object. Instead, you want to collect all data.value entries grouped by n and data.column:

WITH n, data.column AS column, collect(data.value) as values

Which gives us:

n	column	values
n	"title"	["title1", "title2"]
n	"keywords"	["hey!"]
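One caveat, offered as a hedged sketch rather than as part of the fix itself: the sample node only has properties that match one of the two prefixes. For a key that matches neither WHEN branch (for instance the id property that the query returns), a CASE without ELSE evaluates to null, so data.column is null and collect skips the null value, which leaves an extra row with a null column and an empty list. Assuming you want to ignore such keys, you could filter them out before aggregating. The aliases id and grouped are my own additions.

```cypher
MATCH (n:test)
WITH n LIMIT 100
UNWIND keys(n) AS key
WITH n,
     CASE
       WHEN key STARTS WITH 'prop.title' THEN {column: 'title', value: n[key]}
       WHEN key STARTS WITH 'prop.keywords' THEN {column: 'keywords', value: n[key]}
     END AS data
// Drop keys that matched neither prefix (CASE returned null for them).
WHERE data IS NOT NULL
// Group by node and column name, collecting every matching value.
WITH n, data.column AS column, collect(data.value) AS values
RETURN n.id AS id, apoc.map.fromPairs(collect([column, values])) AS grouped
```

Note that with this filter a node that has no matching properties at all drops out of the result. If you need such nodes to show up with an empty map, the filtering would have to be done differently.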

I hope this makes it a bit clearer.