
What do the Azure Stream Analytics blob output schemaHashcode, Guid, and Number variables refer to?


In the Blob Output Configuration documentation for Azure Stream Analytics under "Path pattern" it is stated:

File naming uses the following convention:

{Path Prefix Pattern}/schemaHashcode_Guid_Number.extension

Example output files:

  • Myoutput/20170901/00/45434_gguid_1.csv
  • Myoutput/20170901/01/45434_gguid_1.csv

However, the following referenced variables do not appear to be explained in the documentation:

  • schemaHashcode
  • Guid
  • Number

What do these variables refer to, and when can they change?


Solution

  • schemaHashcode changes whenever a new schema is observed in the incoming stream, which is why new files appear when the schema changes.
  • Guid is the unique ID of the internal writer. Each writer created to write to a blob file gets its own ID. New writers are created per partition, and also when an existing writer crashes because of an exception.
  • Number is the index of the blob block counter.
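
To check which of these components changed between two output files, the file name can be split into its parts. The following is a minimal sketch in Python, assuming the naming convention quoted in the question and assuming that none of the three components contains an underscore. The `BlobName` type, the `parse_blob_name` function, and the regular expression are illustrative names, not part of any Azure SDK.

```python
import re
from typing import NamedTuple


class BlobName(NamedTuple):
    """Hypothetical container for the parts of a Stream Analytics blob path."""
    prefix: str           # {Path Prefix Pattern}, e.g. "Myoutput/20170901/00"
    schema_hashcode: str  # changes when a new schema is observed
    guid: str             # ID of the writer that produced the file
    number: int           # blob block counter index
    extension: str        # e.g. "csv"


# Assumption: components are separated by underscores and do not contain
# underscores themselves, as in the documented examples.
_PATTERN = re.compile(
    r"^(?:(?P<prefix>.*)/)?"
    r"(?P<hash>[^_/]+)_(?P<guid>[^_/]+)_(?P<number>\d+)\.(?P<ext>[^./]+)$"
)


def parse_blob_name(path: str) -> BlobName:
    """Split a path of the form prefix/schemaHashcode_Guid_Number.extension."""
    match = _PATTERN.match(path)
    if match is None:
        raise ValueError(f"not a recognised output file name: {path!r}")
    return BlobName(
        prefix=match["prefix"] or "",
        schema_hashcode=match["hash"],
        guid=match["guid"],
        number=int(match["number"]),
        extension=match["ext"],
    )


if __name__ == "__main__":
    # Example path taken from the documentation quoted in the question.
    print(parse_blob_name("Myoutput/20170901/00/45434_gguid_1.csv"))
```

Comparing the parsed values of two files then shows the likely reason a new file was started: a different `schema_hashcode` points to a schema change, while a different `guid` with the same `schema_hashcode` points to a new writer.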