I want to add persons as vertices in a graph. The following code works for that:
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.traversal import Column
persons = [{"id":1,"name":"bob","age":25}, {"id":2,"name":"joe","age":25,"occupation":"lawyer"}]
g.inject(persons).unfold().as_('entity').\
  addV('entity').as_('v').\
    sideEffect(__.select('entity').unfold().as_('kv').select('v').\
      property(__.select('kv').by(Column.keys),
               __.select('kv').by(Column.values)
      )
    ).iterate()
Question 1: What if one of the properties is a list or a dict? Example:
persons = [{"id":1,"name":"bob","age":25, "house":{"a":1,"b":4}}, {"id":2,"name":"joe","age":25,"occupation":"lawyer","house":{"a":1,"b":4}}]
How do I skip that one property (house) but still add the rest to the person vertex? And how do I then take house and create another vertex from it (with properties a and b) that has an edge to the person?
Question 2: What if I want to modify an attribute before I add it as a property to the graph? For example, convert id into a string and then add it as a property.
I could be wrong, but I sense that your question will end up being more complex than what you've posted. With that in mind, I will offer an answer that works under the assumption that each house is unique, which I've made explicit by adding a "hid" (house id) to the data.
gremlin> persons = [["pid":1,"name":"bob","age":25, "house":["hid":10,"a":1,"b":4]],
......1> ["pid":2,"name":"joe","age":25,"occupation":"lawyer","house":["hid":20,"a":1,"b":4]]]
==>[pid:1,name:bob,age:25,house:[hid:10,a:1,b:4]]
==>[pid:2,name:joe,age:25,occupation:lawyer,house:[hid:20,a:1,b:4]]
gremlin> g.inject(persons).unfold().as('entity').
......1> addV('entity').as('v').
......2> sideEffect(select('entity').unfold().as('kv').select('v').
......3> choose(select('kv').by(keys).is('house'),
......4> addV('house').as('h').
......5> addE('owns').from('v').
......6> select('kv').by(values).unfold().as('hkv').select('h').
......7> property(select('hkv').by(keys),
......8> select('hkv').by(values)),
......9> property(select('kv').by(keys),
.....10> select('kv').by(values))))
==>v[0]
==>v[9]
gremlin> g.V().elementMap()
==>[id:0,label:entity,name:bob,pid:1,age:25]
==>[id:4,label:house,a:1,hid:10,b:4]
==>[id:9,label:entity,occupation:lawyer,name:joe,pid:2,age:25]
==>[id:14,label:house,a:1,hid:20,b:4]
gremlin> g.E().elementMap()
==>[id:5,label:owns,IN:[id:4,label:house],OUT:[id:0,label:entity]]
==>[id:15,label:owns,IN:[id:14,label:house],OUT:[id:9,label:entity]]
I've not really done anything new here, in the sense that I've largely just embedded the traversal pattern you were already using within itself. Note that at line 6 of the traversal I'm just redoing what was done on line 2 in the sideEffect().
Now, if my assumption about having unique houses in your data was wrong, things get more complicated, because you can't easily inline upsert traversal patterns in this context. Upserts typically involve a fold()/coalesce()/unfold() pattern, and that conflicts with the "insert only" pattern you are using: you can't backtrack in a traversal (i.e. refer to a previous step) that is behind a reducing barrier (i.e. fold()). In that case I think I would try to restructure the source data to make it more amenable to pure inserts rather than upsert operations.
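As a rough illustration of what restructuring the source data could look like, here is a minimal client-side sketch in plain Python. It assumes the "pid"/"hid" keys from the sample data above. The function name split_persons and its parameters are hypothetical and not part of gremlin_python. It splits the nested input into flat person rows, de-duplicated house rows, and (pid, hid) pairs, so that each list could then be loaded with a pure insert traversal.

```python
def split_persons(persons, nested_key="house", person_id="pid", nested_id="hid"):
    """Split nested person dicts into flat person rows, unique house rows,
    and (person id, house id) pairs describing the 'owns' edges."""
    person_rows = []
    houses_by_id = {}
    owns = []
    for person in persons:
        # Keep only scalar properties on the person; dicts and lists are skipped here.
        flat = {k: v for k, v in person.items() if not isinstance(v, (dict, list))}
        person_rows.append(flat)
        house = person.get(nested_key)
        if isinstance(house, dict):
            # The first occurrence of a house id wins; repeats are not added again.
            houses_by_id.setdefault(house[nested_id], dict(house))
            owns.append((person[person_id], house[nested_id]))
    return person_rows, list(houses_by_id.values()), owns


# Sample data in which both persons share the same house (hid 10).
persons = [
    {"pid": 1, "name": "bob", "age": 25, "house": {"hid": 10, "a": 1, "b": 4}},
    {"pid": 2, "name": "joe", "age": 25, "occupation": "lawyer",
     "house": {"hid": 10, "a": 1, "b": 4}},
]
person_rows, house_rows, owns = split_persons(persons)
```

With the data split this way, the shared house appears only once in house_rows, so no upsert is needed for it. The same pre-processing pass would also be a natural place to adjust a value before loading (for example, turning an id into a string), though that is only a suggestion and not something the traversal above does.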