python aws-glue apache-iceberg

Empty Iceberg table created with PyIceberg in AWS Glue is missing location and schema


I want to create an empty Iceberg table with PyIceberg in AWS Glue. The snippet below creates the table, but AWS Glue does not show its location or schema. What am I missing?

[Screenshot: missing S3 location]

[Screenshot: missing schema]

from pyiceberg.catalog import load_catalog
from pyiceberg.schema import Schema
from pyiceberg.types import TimestampType, StringType, NestedField

schema = Schema(
    NestedField(field_id=1, name="datetime", field_type=TimestampType()),
    NestedField(field_id=2, name="job_id", field_type=StringType()),
    NestedField(field_id=3, name="result", field_type=StringType()),
    NestedField(field_id=4, name="description", field_type=StringType()),
)

if __name__ == "__main__":
    catalog = load_catalog(
        "default",
        **{
            "type": "glue",
        },
    )

    catalog.create_table(
        identifier="manually_created.my_test_table",
        schema=schema,
        location="s3://my_test_bucket/my_test_table",
    )

I am using pyiceberg==0.5.1.


Solution

  • This is fixed in pyiceberg versions newer than 0.5.1; see this PR, as mentioned in this issue.

    After running pip install pyiceberg==0.6.0rc4 and creating the table again, the location and schema appear in AWS Glue.

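
To confirm the result without opening the console, here is a minimal sketch that checks whether the installed pyiceberg is newer than 0.5.1 and reads back what Glue has stored for the table. The helper names (release_tuple, is_newer_than_0_5_1, summarize_glue_table, check_table) are made up for illustration, the version check ignores pre-release suffixes such as rc4, and check_table assumes boto3 is installed with AWS credentials and a region configured.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Return the leading numeric release part, e.g. '0.6.0rc4' -> (0, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_newer_than_0_5_1(version_string):
    # Simplification: pre-release suffixes such as 'rc4' are ignored.
    return release_tuple(version_string) > (0, 5, 1)


def installed_pyiceberg_version():
    """Return the installed pyiceberg version string, or None if absent."""
    try:
        return version("pyiceberg")
    except PackageNotFoundError:
        return None


def summarize_glue_table(response):
    """Extract (location, column names) from a Glue get_table response."""
    descriptor = response["Table"].get("StorageDescriptor", {})
    columns = [column["Name"] for column in descriptor.get("Columns", [])]
    return descriptor.get("Location"), columns


def check_table(database, table):
    # boto3 is imported here so the helpers above work without it installed.
    import boto3

    response = boto3.client("glue").get_table(DatabaseName=database, Name=table)
    return summarize_glue_table(response)
```

Calling check_table("manually_created", "my_test_table") returns the location and the list of column names that Glue has recorded, so a missing location or an empty column list points to the behaviour described in the question.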