Terraform AWS Athena to use Glue catalog as db

I'm unsure how to use Terraform to connect Athena to my Glue Catalog database.

I use

resource "aws_glue_catalog_database" "catalog_database" {
    name = "${var.glue_db_name}"
}

resource "aws_glue_crawler" "datalake_crawler" {
    database_name = "${var.glue_db_name}"
    name          = "${var.crawler_name}"
    role          = "${}"
    description   = "${var.crawler_description}"
    table_prefix  = "${var.table_prefix}"
    schedule      = "${var.schedule}"

    s3_target {
      path = "s3://${var.data_bucket_name[0]}"
    }
    s3_target {
      path = "s3://${var.data_bucket_name[1]}"
    }
}

to create a Glue DB and a crawler that crawls S3 buckets (here only two), but I don't know how to link the Athena query service to the Glue DB. The Terraform documentation for Athena doesn't appear to offer a way to connect Athena to a Glue catalog, only to an S3 bucket. Clearly, however, Athena can be integrated with Glue.

How can I terraform an Athena database to use my Glue catalog as its data source rather than an S3 bucket?


  • Our current basic setup for having Glue crawl one S3 bucket and create/update a table in a Glue DB, which can then be queried in Athena, looks like this:

    Crawler role and role policy:

    • The assume_role_policy of the IAM role needs only Glue as principal
    • The IAM role policy allows actions for Glue, S3, and logs
    • The Glue actions and resources can probably be narrowed down to the ones really needed
    • The S3 actions are limited to those needed by the crawler
    resource "aws_iam_role" "glue_crawler_role" {
      name = "analytics_glue_crawler_role"
      assume_role_policy = <<EOF
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Action": "sts:AssumeRole",
          "Principal": {
            "Service": ""
          },
          "Effect": "Allow",
          "Sid": ""
        }
      ]
    }
    EOF
    }

    resource "aws_iam_role_policy" "glue_crawler_role_policy" {
      name = "analytics_glue_crawler_role_policy"
      role = "${}"
      policy = <<EOF
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
          ],
          "Resource": [
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
          ],
          "Resource": [
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
          ],
          "Resource": [
          ]
        }
      ]
    }
    EOF
    }

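    The bullets above can be sketched as a complete role and policy. This is only an illustration under stated assumptions, not our exact policy: it assumes Terraform 0.12 or later (for `jsonencode`), the resource names ending in `_sketch` are hypothetical, the Glue principal is taken to be `glue.amazonaws.com`, and the action lists are a guess at what a crawler typically needs for the `analytics-product-data` bucket.

    ```hcl
    # Hypothetical sketch; names ending in "_sketch" are not part of the setup above.
    resource "aws_iam_role" "crawler_role_sketch" {
      name = "analytics_glue_crawler_role_sketch"

      # Only the Glue service may assume this role.
      assume_role_policy = jsonencode({
        Version = "2012-10-17"
        Statement = [{
          Effect    = "Allow"
          Action    = "sts:AssumeRole"
          Principal = { Service = "glue.amazonaws.com" }
        }]
      })
    }

    resource "aws_iam_role_policy" "crawler_role_policy_sketch" {
      name = "analytics_glue_crawler_role_policy_sketch"
      role = aws_iam_role.crawler_role_sketch.id

      policy = jsonencode({
        Version = "2012-10-17"
        Statement = [
          {
            # Assumed: broad Glue access; narrow this to the actions really needed.
            Effect   = "Allow"
            Action   = ["glue:*"]
            Resource = ["*"]
          },
          {
            # Assumed: read-only access to the crawled bucket and its objects.
            Effect = "Allow"
            Action = ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"]
            Resource = [
              "arn:aws:s3:::analytics-product-data",
              "arn:aws:s3:::analytics-product-data/*",
            ]
          },
          {
            # Assumed: allow the crawler to write its logs under /aws-glue/.
            Effect   = "Allow"
            Action   = ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"]
            Resource = ["arn:aws:logs:*:*:log-group:/aws-glue/*"]
          },
        ]
      })
    }
    ```
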
    S3 Bucket, Glue Database and Crawler:

    resource "aws_s3_bucket" "product_bucket" {
      bucket = "analytics-product-data"
      acl = "private"
    }

    resource "aws_glue_catalog_database" "analytics_db" {
      name = "inventory-analytics-db"
    }

    resource "aws_glue_crawler" "product_crawler" {
      database_name = "${}"
      name = "analytics-product-crawler"
      role = "${aws_iam_role.glue_crawler_role.arn}"
      schedule = "cron(0 0 * * ? *)"
      configuration = "{\"Version\": 1.0, \"CrawlerOutput\": { \"Partitions\": { \"AddOrUpdateBehavior\": \"InheritFromTable\" }, \"Tables\": {\"AddOrUpdateBehavior\": \"MergeNewColumns\" } } }"

      schema_change_policy {
        delete_behavior = "DELETE_FROM_DATABASE"
      }

      s3_target {
        path = "s3://${aws_s3_bucket.product_bucket.bucket}/products"
      }
    }
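
    No separate resource is needed to link the two: Athena uses the Glue Data Catalog as its metastore, so the Glue database and the tables the crawler creates should show up in Athena on their own. The only thing Athena needs in addition is an S3 location for query results. A minimal sketch of that part, assuming Terraform 0.12 or later; the results bucket, the workgroup, the named query, and the table name `products` (guessed from the crawled path) are all hypothetical:

    ```hcl
    # Hypothetical bucket for Athena query results; the name is an assumption.
    resource "aws_s3_bucket" "athena_results_sketch" {
      bucket = "analytics-athena-results-example"
    }

    # Workgroup that tells Athena where to write query output.
    resource "aws_athena_workgroup" "analytics_sketch" {
      name = "analytics-sketch"

      configuration {
        result_configuration {
          output_location = "s3://${aws_s3_bucket.athena_results_sketch.bucket}/output/"
        }
      }
    }

    # Saved query that runs against the Glue database defined above.
    # "products" is an assumed table name; check what the crawler actually created.
    resource "aws_athena_named_query" "sample_products_sketch" {
      name      = "sample-products-sketch"
      workgroup = aws_athena_workgroup.analytics_sketch.id
      database  = "inventory-analytics-db"
      query     = "SELECT * FROM products LIMIT 10;"
    }
    ```

    The named query is optional; once the crawler has run, the same statement can be run by hand in the Athena console against the Glue database.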