Search code examples
pythonazureazure-table-storageazure-tablequery

Table Storage SDK


I am trying to load some data from a CSV file to Azure table storage row by row using Python. String columns are getting inserted directly but the date column mentioned in the source in the format 2018-02-18T11:29:12.000Z is still loaded as string. This means I am unable to query the records using date column.

Can someone tell me if there is a way to create an entity definition (datatype for columns) for the table and use it to load the records in order to avoid dates loaded with string type?


Solution

  • I tried to reproduce your issue but failed. I loaded my csv file to Azure Table Storage and the data column loaded as DataTime Type.

    You could refer to my code as below:

    my csv file:

    'tasksSeattle','001','jay1',100,2018-02-18T11:29:12.000Z
    'tasksSeattle','002','jay2',100,2018-02-18T11:29:12.000Z
    'tasksSeattle','003','jay3',100,2018-02-18T11:29:12.000Z
    'tasksSeattle','004','jay4',100,2018-02-18T11:29:12.000Z
    'tasksSeattle','005','jay5',100,2018-02-18T11:29:12.000Z
    

    my python code:

    from azure.cosmosdb.table.tableservice import TableService
    from azure.cosmosdb.table.models import Entity
    import csv
    import sys
    import codecs
    
    table_service = TableService(connection_string='***')
    
    reload(sys)
    sys.setdefaultencoding('utf-8')
    filename = "E:/jay.csv"
    
    with codecs.open(filename, 'rb', encoding="utf-8") as f_input:
        csv_reader = csv.reader(f_input)
        for row in csv_reader:
            task = Entity()
            task.PartitionKey = row[0]
            task.RowKey = row[1]
            task.description = row[2]
            task.priority = row[3]
            task.logtime = row[4]
            table_service.insert_entity('tasktable', task)
    

    load result:

    enter image description here

    Hope it helps you.


    Update Answer:

    If you observe the Data type options box in the screenshot above, it's not hard to see that only those 8 types are supported by the Table Service Data Model:

    • Edm.Binary
    • Edm.Boolean
    • Edm.DateTime
    • Edm.Double
    • Edm.Guid
    • Edm.Int32
    • Edm.Int64
    • Edm.String

    You could use entity.x = EntityProperty(EdmType.STRING, 'y') function which is mentioned here to define data types as you want.

    Please refer to my sample code as below:

    with codecs.open(filename, 'rb', encoding="utf-8") as f_input:
        csv_reader = csv.reader(f_input)
        for row in csv_reader:
            task = Entity()
            task.PartitionKey = row[0]
            task.RowKey = row[1]
            task.description = row[2]
            task.priority = EntityProperty(EdmType.INT32, row[3])
            task.logtime = EntityProperty(EdmType.DATETIME, row[4])
    
            table_service.insert_entity('tasktable', task)
    

    Just for summary:

    We could convert the string to datetime and get the date fragments as below:

    task.startDateTime = datetime(startDateFrag.year,startDateFrag.month,startDateFrag.day,startDateFrag.hour, startDateFrag.minute,startDateFrag.second)