Search code examples
rrsqlite

Syntax error when selecting specific rows for SQL (in R, using RSQLITE)


I'm trying to select some rows and columns from bigquery data on help requests in NYC. I want to select five columns - date request created, city where the request was made, the agency that received the request, etc.

First, I managed to select the columns I want:

conn <- dbConnect(SQLite(),'nyc311.db')
dbListFields(conn, "requests")

df<-dbGetQuery(conn, 'SELECT "Agency", "Created Date", "Complaint Type", "City", Descriptor FROM requests)

  Agency           Created Date           Complaint Type     City                  Descriptor
1  DOHMH 01/25/2016 02:11:12 AM       Indoor Air Quality    BRONX Chemical Vapors/Gases/Odors
2   NYPD 01/25/2016 02:08:08 AM          Noise - Vehicle NEW YORK              Car/Truck Horn
3   NYPD 01/25/2016 02:07:24 AM  Noise - Street/Sidewalk NEW YORK                Loud Talking
4  CHALL 01/25/2016 02:05:00 AM    Opinion for the Mayor                              HOUSING
5    HRA 01/25/2016 02:01:46 AM Benefit Card Replacement                             Medicaid
6   NYPD 01/25/2016 01:54:56 AM         Blocked Driveway   CORONA                   No Access

How can I select from the .db file so that I get agency=NYPD, City=Bronx and Queens; and Created Date=year 2015? I tried the following but I am getting syntax errors.

df<-dbGetQuery(conn, 'SELECT "Agency", "Created Date", "Complaint Type", "City", Descriptor 
FROM requests WHERE City IN ("BRONX", "QUEENS") AND Agency="NYPD"
                   AND YEAR(Created Date)=2015')

I'm a beginner so I'm not clear about how to subset the year, since Created Date shows date and time in character format, not integer. I also noticed that the code runs except for the part YEAR(Created Date)=2015


Solution

  • There is no YEAR() function in SQLite (although MySQL has one, hence your confusion). First note that you are storing your dates as text, and also in a non ANSI format of mm/dd/yyyy. In order to compare the year of each record, you will have to extract this information using SQLite's string functions. The following should work:

    SUBSTR("Created Date", 7, 4)
    

    Note that you also need to place the Created Date column name in double quotes to escape whitespace.

    Here is the actual query I would use:

    SELECT "Agency",
           "Created Date",
           "Complaint Type",
           "City",
           "Descriptor"
    FROM requests
    WHERE City IN ('BRONX', 'QUEENS') AND
          Agency = 'DOHMH' AND
          SUBSTR("Created Date", 7, 4) = '2015'      -- compare against the string '2015'
    

    Some notes: It is convention in SQL to use single quotes for string data. You may put all your column names in double quotes, but it is only necessary if you have whitespace, keywords, etc.