Search code examples
sqlgoogle-bigquerygoogle-cloud-vertex-ai

Issue with timestamps in BigQuery when importing to Vertex AI Retail


I am trying to import my product data into the catalogue of Vertex AI. I have tried to follow the schema given here. It says the publishTime should be in the Timestamp format (e.g. "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z").

This is my schema:

[
  {
    "name": "id",
    "type": "STRING",
    "mode": "REQUIRED"
  },
  {
    "name": "type",
    "type": "STRING",
    "mode": "REQUIRED"
  },
  {
    "name": "categories",
    "type": "STRING",
    "mode": "REPEATED"
  },
  {
    "name": "title",
    "type": "STRING",
    "mode": "REQUIRED"
  },
  {
    "name": "brands",
    "type": "STRING",
    "mode": "REQUIRED"
  },
  {
    "name": "priceInfo",
    "type": "RECORD",
    "mode": "REQUIRED",
    "fields": [
      {
        "name": "currencyCode",
        "type": "STRING",
        "mode": "REQUIRED"
      },
      {
        "name": "price",
        "type": "FLOAT",
        "mode": "REQUIRED"
      }
    ]
  },
  {
    "name": "rating",
    "type": "INTEGER",
    "mode": "NULLABLE"
  },
  {
    "name": "publishTime",
    "type": "TIMESTAMP",
    "mode": "NULLABLE"
  },
  {
    "name": "promotions",
    "type": "BOOLEAN",
    "mode": "NULLABLE"
  }
]

And this is my query:

INSERT INTO `project.products.test` (
    id,
    type,
    categories,
    title,
    brands,
    priceInfo,
    rating,
    publishTime,
    promotions
) VALUES (
    'IXbgQctuTx9',
    'PRIMARY',
    ARRAY["Shopping", "Fashion", "Women"],
    'GSM W',
    'onitsukatiger',
    STRUCT("GBP" AS currencyCode, 90 AS price),  -- Price info with currency and price
    NULL,  -- Rating
    TIMESTAMP('2023-10-09T00:00:00Z'),  -- Publish time with proper format
    FALSE  -- Promotions
);

Which is then inserted into the table successfully. product table

However, when I try to import the table to Vertex, I get the following error:

{
  "code": 3,
  "message": "Invalid value at 'publish_time' (type.googleapis.com/google.protobuf.Timestamp): Field 'publishTime', Illegal timestamp format; timestamps must end with 'Z' or have a valid timezone offset.",
  "details": [
    {
      "@type": "type.googleapis.com/google.rpc.ResourceInfo",
      "resourceName": "gs://530498506955_eu_import_product/staging11436364033615815385/retail_products_000000000004.json:1"
    }
  ]
}

I would greatly appreciate if anyone could help me fix this issue.


Solution

  • As it turns out, Vertex wants the publishTime to be in STRING format, meaning the schema should be:

    [
      {
        "name": "id",
        "type": "STRING",
        "mode": "REQUIRED"
      },
      {
        "name": "type",
        "type": "STRING",
        "mode": "REQUIRED"
      },
      {
        "name": "categories",
        "type": "STRING",
        "mode": "REPEATED"
      },
      {
        "name": "title",
        "type": "STRING",
        "mode": "REQUIRED"
      },
      {
        "name": "brands",
        "type": "STRING",
        "mode": "REQUIRED"
      },
      {
        "name": "priceInfo",
        "type": "RECORD",
        "mode": "REQUIRED",
        "fields": [
          {
            "name": "currencyCode",
            "type": "STRING",
            "mode": "REQUIRED"
          },
          {
            "name": "price",
            "type": "FLOAT",
            "mode": "REQUIRED"
          }
        ]
      },
      {
        "name": "rating",
        "type": "INTEGER",
        "mode": "NULLABLE"
      },
      {
        "name": "publishTime",
        "type": "STRING",
        "mode": "NULLABLE"
      },
      {
        "name": "promotions",
        "type": "BOOLEAN",
        "mode": "NULLABLE"
      }
    ]
    

    And the query should be:

    INSERT INTO `project.products.test` (
        id,
        type,
        categories,
        title,
        brands,
        priceInfo,
        rating,
        publishTime,
        promotions
    ) VALUES (
        'IXbgQctuTx9',
        'PRIMARY',
        ARRAY["Shopping", "Fashion", "Women"],
        'GSM W',
        'onitsukatiger',
        STRUCT("GBP" AS currencyCode, 90 AS price),  -- Price info with currency and price
        NULL,  -- Rating
        '2023-10-09T00:00:00Z',  -- Publish time with proper format
        FALSE  -- Promotions
    );