I want to calculate percentage of how much each type of skill is present for each occupation. I made a have different nodes occupation and Soft_skills they have an relationship [r:MAPS_TO] the relationship has two property type('essentieel', 'optioneel')and b (this kind of weight essentieel =1.0 and optioneel =0.5)
see below een example of the file that i'm using
occupation | code_occupation | skill_code | name | type | b |
---|---|---|---|---|---|
dokter | sk_565 | sk_687 | leren | optioneel | 0.5 |
dokter | sk_565 | sk_687 | leren | optioneel | 0.5 |
dokter | sk_565 | sk_687 | leren | essentieel | 1.0 |
dokter | sk_565 | sk_687 | leren | essentieel | 1.0 |
i made a query where i can calculate what the sum is for each occupation see below my query:
match (b:BOC_beroep)-[r:MAPS_TO]-(s:Soft_Skill)
return b.beroepstitel as beroep,
sum(toFloat(r.b)) as sum order by sum desc
limit 10
this is my output
beroep | sum |
---|---|
Doker | 5 |
Kapper | 7 |
know i want to achieve what the percentage is of the type of skill my calculation will be
Dokter have in total 4 skills en the percentage of each skill is
2/4 * 100 = 50% essentieel
1/4 * 100 = 25% optioneel
i was trying to make a new calculation with this query:
MATCH (b:BOC_beroep)-[r:MAPS_TO]->(s:Soft_skill)
WITH SUM(toFloat(r.b)) As total
MATCH (b:BOC_beroep)-[r:MAPS_TO]->(s:Soft_Skill)
RETURN b.beroepstitel AS beroep,
(toFloat(total/(r.b)))*100 AS percent
order by percent desc
My output is Cannot divide 'Long' by 'string' I don't understand the erorr
i was inspired by this question and answer Here's an inline link to stackoverflow.
Your MAPS_TO
relationships apparently store the b
property as a string, so any query must always convert it to a float to do math with it. Also, the math in your query is wrong - r.b
should be in the numerator and total
should be in the denominator.
This query may work for you:
MATCH (b:BOC_beroep)-[r:MAPS_TO]->(s:Soft_Skill)
WITH b.beroepstitel AS beorep, r.type AS type, s.name AS skill, SUM(TOFLOAT(r.b)) AS weight
WITH SUM(weight) As total, COLLECT([beorep, type, skill, weight]) AS data
UNWIND data AS datum
RETURN
datum[0] AS beroep,
datum[1] AS type,
datum[2] AS skill,
datum[3]/total*100 AS percent
ORDER BY percent desc
The first WITH
clause sums all the weights for a given beorep
, type
, and skill
. The second WITH
clause calculates a total of all the weights and collects all the other data needed. The rest of the query is straightforward.
Here is a sample result:
╒════════╤════════════╤═══════╤═════════════════╕
│beroep │type │skill │percent │
╞════════╪════════════╪═══════╪═════════════════╡
│"dokter"│"essentieel"│"leren"│66.66666666666666│
├────────┼────────────┼───────┼─────────────────┤
│"dokter"│"optioneel" │"leren"│33.33333333333333│
└────────┴────────────┴───────┴─────────────────┘
If you stored the b
property as a float instead of a string, you can avoid the cost of doing TOFLOAT
conversions all the time. You can easily convert all the stored weights this way:
MATCH ()-[r:MAPS_TO]->()
SET r.b = TOFLOAT(r.b)
If type
only has 2 possible values, it would require less storage space to store that property as a boolean (named, say, isRequired
) instead. This can also simplify queries using that property. You can easily convert that property this way:
MATCH ()-[r:MAPS_TO]->()
SET r.isRequired = CASE r.type WHEN 'essentieel' THEN true ELSE false END
REMOVE r.type