I've done a search on google to figure out where to go to ask about query analysis, but everything brought me back to here. I apologize if I am wrong and this should go somewhere else.
Here are the current stats of my database (running Neo4j server v3.4):
My Neo4jClient query:
var query = graphClient
.Cypher
.Match("(cp:CurrencyPair) - [:Currency_Pair] -> (mkt:MarketInfo)")
.With("mkt, cp")
.Where((CurrencyPair cp) => cp.Name == Pair.Name)
.AndWhere((MarketInfo mkt) => mkt.StartTicksUTC >= startTicks)
.AndWhere((MarketInfo mkt) => mkt.EndTicksUTC <= endTicks)
.With("mkt, cp")
.OrderBy("mkt.EndTicksUTC DESC")
.Match("(mkt) -[Ask_Input_Data] - (a:Ask)")
.With("mkt, cp, a")
.Match("(mkt) -[Bid_Input_Data] - (b:Bid)")
.With("mkt, cp, a, b")
.Return((mkt, cp, a, b) => new
{
MarketInfo = mkt.As<MarketInfo>(),
CurrencyPair = cp.As<CurrencyPair>(),
Ask = a.As<Ask>(),
Bid = b.As<Bid>()
});
Resulting Neo4j query:
profile
MATCH (cp:CurrencyPair) - [Currency_Pair] -> (mkt:MarketInfo)
WITH mkt, cp
WHERE (cp.Name = "AUD/CAD") AND (mkt.StartUTC >= "2012-01-10T15:50:00+00:00" ) AND (mkt.StartUTC <= "2012-01-10T16:00:00+00:00" )
WITH mkt, cp
ORDER BY mkt.EndTicksUTC DESC
MATCH (mkt:MarketInfo) <-[Ask_Input_Data] - (a:Ask)
WITH mkt, cp, a
MATCH (mkt:MarketInfo) <-[Bid_Input_Data] - (b:Bid)
WITH mkt, cp, a, b
RETURN mkt AS MarketInfo, cp AS CurrencyPair, a AS Ask, b AS Bid
As you can see, I am trying to query the database to find all "MarketInfo" nodes that are related to a given currency node and then return nodes that are only between two dates.
Running this query, it takes 7 seconds to run:
and returns 10 nodes, 15 relationships
I do have indexes in the database:
Indexes
ON :NewsEvent(Date) ONLINE
ON :MarketInfo(EndTicksUTC) ONLINE
ON :MarketInfo(EndUTC) ONLINE
ON :MarketInfo(EndUTCTicks) ONLINE
ON :NewsEvent(EventID) ONLINE
ON :Ask(Id) ONLINE
ON :Component(Id) ONLINE
ON :NewsEvent(Id) ONLINE
ON :MarketInfo(Id) ONLINE
ON :Bid(Id) ONLINE
ON :NewsEvent(Id, Date) ONLINE
ON :MarketInfo(Interval) ONLINE
ON :MarketInfo(Name) ONLINE
ON :MarketInfo(StartTicksUTC) ONLINE
ON :MarketInfo(StartUTC) ONLINE
ON :MarketInfo(StartUTCTicks) ONLINE
Any thoughts on how I could speed this query up? Maybe I'm just going about this wrong? It's getting harder to sell my boss on Neo4j instead of using the "tried and true" MS SQL Server...
You made several major mistakes.
WHERE
clause from the first MATCH
clause with a superfluous WITH
clause. That caused the WHERE
clause to not filter the MATCH
clause at all -- so it was finding all instances of the (cp:CurrencyPair) - [:Currency_Pair] -> (mkt:MarketInfo)
pattern.:CurrencyPair(Name)
. (And do you really need every one of the other indexes? They introduce overhead when creating/deleting the relevant nodes.)Here is a fixed version of your query that should run faster (especially after you add the index for :CurrencyPair(Name)
):
MATCH (cp:CurrencyPair) - [:Currency_Pair] -> (mkt:MarketInfo)
WHERE cp.Name = "AUD/CAD" AND
"2012-01-10T15:50:00+00:00" <= mkt.StartUTC <= "2012-01-10T16:00:00+00:00"
MATCH (mkt) <-[:Ask_Input_Data] - (a:Ask)
MATCH (mkt) <-[:Bid_Input_Data] - (b:Bid)
RETURN mkt AS MarketInfo, cp AS CurrencyPair, a AS Ask, b AS Bid
ORDER BY MarketInfo.EndTicksUTC DESC