Search code examples
graphneo4jcypher

Neo4J query to get top N results


I have a node called Airports containing data about airports, and a node called Routes that has has tons of flights from and to airports (SourceID, DestinationID).

The data on that node looks like the following:

Airline AirlineID Source SourceID Destination DestinationID Codeshare Stops Equipment
2B 410 AER 2965 KZN 2990 0 CR2
2B 410 ASF 2966 KZN 2990 0 CR2
2B 410 ASF 2966 MRV 2962 0 CR2
2B 410 CEK 2968 KZN 2990 0 CR2

I have created relationship types (1) SOURCE & (2) DESTINATION that show direction to an other airport

  1. SourceID -> DestinationID
  2. DestinationID -> SourceID

I need to write a query to return the top 5 airports with the most flights in total (inbound / outbound).

When running the following code results are correct:

MATCH (f)-[:SOURCE]->(a:Airport) 
WITH a, COUNT(*) as Count 
RETURN a.name AS AirportName, Count AS TotalFlights 
ORDER BY Count DESC LIMIT 5;

If I try to run the following to get total flights, result are not.

MATCH (f)-[:SOURCE]->(a:Airport)<-[:DESTINATION]-(f)
WITH a, COUNT(*) as Count 
RETURN a.name AS AirportName, Count AS TotalFlights 
ORDER BY Count DESC LIMIT 5;

Can anyone help me? I am new to NEO4J.


Solution

  • Your query returns no result because you assgined the same variable f twice for a single route node only. Therefore this query will count only Routes that have SOURCE and DESTINATION relationships to the same airport (which do not make much sense usually).

    MATCH (f)-[:SOURCE]->(a:Airport)<-[:DESTINATION]-(f)
    WITH a, COUNT(*) as Count 
    RETURN a.name AS AirportName, Count AS TotalFlights 
    ORDER BY Count DESC LIMIT 5;
    

    As suggested by others, it should be enough to simply count the relationships (getting a node degree):

    MATCH (a:Airport)
    WITH a, SIZE((a)<-[:SOURCE|DESTINATION]-()) as numOfFlights 
    RETURN a.name as airportName, numOfFlights
    ORDER BY numOfFlights DESC
    LIMIT 5;