I am working with GTFS and trying to figure out a way to speed up my inner join. The tables "Trips" and "StopTimes" have a common column "TripID" and the tables "StopTimes" and "Stops" have a common column "StopID". The inner join retrieves all Stops that correspond to a certain Trip, however it takes about 20 seconds. Also the "StopTimes" table consists of 9 million rows while the others consists of a couple thousand. The following code is my query.
String joinQuery = "SELECT s.stop_id, s.stop_code, s.stop_name "
+ "FROM Trips as t INNER JOIN StopTimes as st ON st.trip_id = t.trip_id "
+ "INNER JOIN Stops as s ON s.stop_id = st.stop_id "
+ "WHERE t.trip_id = " + TripID + " AND t.shape_id = " + ShapeID
+ " ORDER BY st.stop_sequence";
This is the code executed upon database creation.
String CREATE_TRIPS_TABLE = "CREATE VIRTUAL TABLE " + TABLE_TRIPS + " USING FTS3("
+ KEY_T_ROUTE_ID + " INTEGER,"
+ KEY_T_NAME + " TEXT,"
+ KEY_TRIP_ID + " INTEGER,"
+ KEY_DIRECTION_ID + " INTEGER,"
+ KEY_SHAPE_ID + " INTEGER);";
String CREATE_STOPS_TABLE = "CREATE VIRTUAL TABLE " + TABLE_STOPS + " USING FTS3("
+ KEY_STOP_ID + " INTEGER,"
+ KEY_STOP_CODE + " TEXT,"
+ KEY_STOP_NAME + " TEXT,"
+ KEY_STOP_LAT + " DOUBLE,"
+ KEY_STOP_LON + " DOUBLE,"
+ KEY_STOP_WHEELCHAIR + " INTEGER);";
String CREATE_STOPTIMES_TABLE = "CREATE VIRTUAL TABLE " + TABLE_STOP_TIMES + " USING FTS3("
+ KEY_S_TRIP_ID + " INTEGER,"
+ KEY_S_STOP_ID + " INTEGER,"
+ KEY_ARRIVAL_TIME + " TEXT,"
+ KEY_STOP_SEQUENCE + " INTEGER,"
+ KEY_SHAPE_DIST_TRAVELLED + " DOUBLE);";
I followed a few suggestion I found on here and am still not able to speed up my query, any feedback is appreciated.
UPDATE I changed how my tables were created and added indexes for two of the indexes involved in the INNER JOIN query and the JOINS are now instant.
String CREATE_TRIPS_TABLE = "CREATE TABLE " + TABLE_TRIPS + "("
+ KEY_T_ID + " integer PRIMARY KEY AUTOINCREMENT,"
+ KEY_T_ROUTE_ID + " INTEGER,"
+ KEY_T_NAME + " TEXT,"
+ KEY_TRIP_ID + " INTEGER,"
+ KEY_DIRECTION_ID + " INTEGER,"
+ KEY_SHAPE_ID + " INTEGER);";
String CREATE_STOPS_TABLE = "CREATE TABLE " + TABLE_STOPS + "("
+ KEY_S_ID + " integer PRIMARY KEY AUTOINCREMENT,"
+ KEY_STOP_ID + " INTEGER,"
+ KEY_STOP_CODE + " TEXT,"
+ KEY_STOP_NAME + " TEXT,"
+ KEY_STOP_LAT + " DOUBLE,"
+ KEY_STOP_LON + " DOUBLE,"
+ KEY_STOP_WHEELCHAIR + " INTEGER);";
String CREATE_STOPTIMES_TABLE = "CREATE TABLE " + TABLE_STOP_TIMES + "("
+ KEY_ST_ID + " integer PRIMARY KEY AUTOINCREMENT,"
+ KEY_S_TRIP_ID + " INTEGER,"
+ KEY_S_STOP_ID + " INTEGER,"
+ KEY_ARRIVAL_TIME + " TEXT,"
+ KEY_STOP_SEQUENCE + " INTEGER,"
+ KEY_SHAPE_DIST_TRAVELLED + " DOUBLE);";
db.execSQL(CREATE_ROUTES_TABLE);
db.execSQL(CREATE_TRIPS_TABLE);
db.execSQL(CREATE_STOPS_TABLE);
db.execSQL(CREATE_STOPTIMES_TABLE);
db.execSQL("CREATE INDEX T_id ON Trips(trip_id)");
db.execSQL("CREATE INDEX S_t_id ON StopTimes(trip_id)");
I see you're using SQLite's FTS3 extension, which is designed to improve the performance of full-text searches. I suspect that's actually working against you for the kind of query you're issuing (and for the kinds of queries normally run against GTFS data).
I recommend you start by creating a conventional relational database, with indices on the appropriate columns, and testing the performance of your query against that. SQLite is certainly capable of good performance when used in this type of application so I think you'll be pleasantly surprised. Once you've demonstrated success with conventional techniques you can look into other ways to make the query run faster, if necessary.
Finally, note SQLite does not allow indices on virtual tables, which I suspect is the specific reason your query is taking so long to complete right now.