Search code examples
sqljsonapache-drill

Apache Drill to query JSON having whitespace in column name


My dataset is of the following format:

    {  
   "business_id":"5UmKMjUEUNdYWqANhGckJw",
   "full_address":"4734 Lebanon Church Rd\nDravosburg, PA 15034",
   "hours":{  
      "Friday":{  
         "close":"21:00",
         "open":"11:00"
      },
      "Tuesday":{  
         "close":"21:00",
         "open":"11:00"
      },
      "Thursday":{  
         "close":"21:00",
         "open":"11:00"
      },
      "Wednesday":{  
         "close":"21:00",
         "open":"11:00"
      },
      "Monday":{  
         "close":"21:00",
         "open":"11:00"
      }
   },
   "open":true,
   "categories":[  
      "Fast Food",
      "Restaurants"
   ],
   "city":"Dravosburg",
   "review_count":4,
   "name":"Mr Hoagie",
   "neighborhoods":[  

   ],
   "longitude":-79.9007057,
   "state":"PA",
   "stars":4.5,
   "latitude":40.3543266,
   "attributes":{  
      "Take-out":true,
      "Drive-Thru":false,
      "Good For":{  
         "dessert":false,
         "latenight":false,
         "lunch":false,
         "dinner":false,
         "brunch":false,
         "breakfast":false
      },
      "Caters":false,
      "Noise Level":"average",
      "Takes Reservations":false,
      "Delivery":false,
      "Ambience":{  
         "romantic":false,
         "intimate":false,
         "classy":false,
         "hipster":false,
         "divey":false,
         "touristy":false,
         "trendy":false,
         "upscale":false,
         "casual":false
      },
      "Parking":{  
         "garage":false,
         "street":false,
         "validated":false,
         "lot":false,
         "valet":false
      },
      "Has TV":false,
      "Outdoor Seating":false,
      "Attire":"casual",
      "Alcohol":"none",
      "Waiter Service":false,
      "Accepts Credit Cards":true,
      "Good for Kids":true,
      "Good For Groups":true,
      "Price Range":1
   },
   "type":"business"
}

I want to query the attributes of the dataset. However many of the attributee names have spaces in it. How do you reference an attribute name with space? Example - I want to find the average 'Price Range' for restaurants in Vegas. I tried referencing it using:

select avg(`t.attributes.Price Range`) from `mongo.274_BI`.`yelp_dataset`t where t.city = 'Las Vegas';

It returns null. There is a problem due to the space in between Price and Range. I queried the 'Parking' field with no issues. How can I solve this problem?


Solution

  • Use below query:

    select avg(t.attributes.`Price Range`) from `mongo.274_BI`.`yelp_dataset`t where t.city = 'Las Vegas';
    

    Additionally, you may require to CAST attributes.Price Range as int or double.

    Example:

    select avg(cast(t.attributes.`Price Range` as double)) from `mongo.274_BI`.`yelp_dataset`t where t.city = 'Las Vegas';