Hi, I will try to keep on track, but I've done a lot of research and now I am just lost. I could really use some expertise here. Below is the situation:
This is a follow-up to my question here. The issue there was that my Cypher queries were taking 1 second at the minimum to return a response. Even queries like RETURN 123 also took 1 second, which led to the conclusion that the Neo4j Bolt Driver for Python is slower than an actual http call to neo4j.
I can back this up with research from GitHub Issues and this from stackoverflow.
Each time my code runs, it generates up to 10 Cypher queries; all of them have to be fired, and then operations need to be performed based on the results.
The issue is that using Bolt the queries take 1 second to execute, and with HTTP I am stuck, since I want to use Query Parameters to make the queries faster now that I am not on Bolt. Each http call now takes 30ms; multiply that by 10 (since I have 10 queries) and you have a very poorly performing Python API to fetch user relations.
driver is slow and that I am not doing anything wrong. Since all the posts I've seen are dated a year back
OR and AND conditions, how can I write those using parameters in neo4j
database I should look towards? 200ms
is the most popular graph database. How is it possible with such drivers?
BOLT drivers
and they still haven't fixed these issues.
curl -X POST \
  http://localhost:7474/db/data/cypher \
  -H 'Authorization: Basic bmVvNGo6Y29kZQ==' \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json' \
  -d '{
    "query" : "MATCH (ct:city)-[:CHILD_OF]->(st:state) WHERE (st.name_wr = {st}) AND (ct.name_wr= {ct}) RETURN st, ct",
    "params" : {
      "st" : "california",
      "ct" : "san francisco"
    }
  }'
But what if I want to add a clause saying that st should be either California OR Alaska, AND ct must be san francisco? How do I do that with the parameters in REST?
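One possible way to express the OR condition above with parameters is to pass the acceptable state names as a list and test membership with Cypher's `IN` operator. The following is only a sketch: it builds the JSON body for the same legacy `/db/data/cypher` endpoint without sending it, the helper name `build_city_query` is hypothetical, and it assumes the endpoint reads parameter values from a `params` key.

```python
import json


def build_city_query(states, city):
    """Build a request body for the legacy /db/data/cypher endpoint.

    `states` is a list of acceptable state names (the OR part);
    `city` is the single required city name (the AND part).
    Hypothetical helper, not part of any Neo4j library.
    """
    query = (
        "MATCH (ct:city)-[:CHILD_OF]->(st:state) "
        "WHERE st.name_wr IN {states} AND ct.name_wr = {ct} "
        "RETURN st, ct"
    )
    # Assumption: parameter values are nested under "params".
    return {"query": query, "params": {"states": list(states), "ct": city}}


payload = build_city_query(["california", "alaska"], "san francisco")
print(json.dumps(payload, indent=2))
```

The printed JSON could be used as the `-d` body of the curl request. Because `x IN [a, b]` behaves like `x = a OR x = b`, the query text stays the same no matter how many states are passed, and only the parameter values change.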
I replicated the script and below is the verdict:
58 transactions, tps 0.97 maxdelay 1.08
The curl sample request is the one that I fire from Postman. The code that I am using can be found in the linked question (in the preface).
Well, to be honest, the issue was with the IP. I was using localhost, and resolving localhost was taking time. As soon as I switched to an explicit IP address, it started working perfectly fine.
Marking this as the answer, as it helped me actually benchmark the two approaches, which led to the discovery of the issue in host resolution.
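A quick way to check whether host resolution is the slow part is to time the lookup on its own, outside of any Neo4j code. The sketch below uses only the standard library; the function name `time_resolution` is hypothetical, port 7687 is taken from the Bolt URL in the benchmark script, and `127.0.0.1` is an assumed loopback address used for comparison (the address actually switched to is not shown above).

```python
import socket
import time


def time_resolution(host, port=7687):
    """Return the seconds spent resolving `host`, or None if it fails."""
    start = time.time()
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        return None
    return time.time() - start


# A hostname goes through the system resolver; a literal IP address
# needs no name lookup, so it serves as the baseline.
for host in ("localhost", "127.0.0.1"):
    elapsed = time_resolution(host)
    if elapsed is None:
        print("%-10s could not be resolved" % host)
    else:
        print("%-10s resolved in %.4f s" % (host, elapsed))
```

If the first line reports a delay close to the 1 second seen per query while the second is near zero, the time is going into name resolution rather than into the driver or the database.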
I think there must be something wrong with your setup. I've been using the python bolt driver for a while now, and for simple queries, I don't think I've ever seen a 1 second delay. I don't know what your code looks like, or your network delay, but I wrote a quick example to look at the delays I see in my local network (which has very low latency). (Using Neo4j 3.2.9 and python driver 1.5.3.)
from __future__ import print_function
import sys
import time
from neo4j.v1 import GraphDatabase, basic_auth

ip = ''                  # address of the Neo4j server
runtime = 60.0           # how long to run the benchmark, in seconds
querystr = 'RETURN 123'  # trivial query, so mostly the round trip is measured
runstart = time.time()
maxdelay = 0
cnt = 0

#driver = GraphDatabase.driver("bolt+routing://%s:7687" % ip,
driver = GraphDatabase.driver("bolt://%s:7687" % ip,
                              auth=basic_auth("neo4j", "password"))

while time.time() - runstart < runtime:
    start = time.time()
    session = driver.session(access_mode='READ')
    ret = session.run(querystr)
    result = ret.data()  # consume the result so the full round trip is timed
    cnt += 1
    delay = time.time() - start
    session.close()      # release the session instead of leaking one per loop
    if delay > maxdelay:
        maxdelay = delay
    if delay > 0.1:
        print('Large delay seen cnt %s delay %0.2f' % (cnt, delay))

# Summary: total queries, queries per second, and the slowest single query.
print('%d transactions, tps %0.2f maxdelay %0.2f' % (cnt, cnt/runtime, maxdelay))
I get the output:
117360 transactions, tps 1956.00 maxdelay 0.06
This means the average read took about half a millisecond, and the max was 60ms.
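The average above is simply the run time divided by the number of transactions. As a small worked check, the sketch below applies the same arithmetic to this run and to the 58-transaction run reported in the question; the helper name `average_latency_ms` is hypothetical.

```python
runtime = 60.0  # seconds, same value as in the benchmark script


def average_latency_ms(transactions, runtime_s=runtime):
    """Average time per transaction, in milliseconds."""
    return runtime_s / transactions * 1000.0


print("%.2f ms" % average_latency_ms(117360))  # this run: 0.51 ms
print("%.2f ms" % average_latency_ms(58))      # run from the question: 1034.48 ms
```

The two runs differ by roughly a factor of 2000, which is far more than a driver inefficiency would usually explain and points towards something in the environment.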
I would look at network latency and issues with resources on both your client and server side.