azure, azure-table-storage, azure-machine-learning-service

Azure Machine Learning Reader + Table Storage


Cross-posted from: https://social.msdn.microsoft.com/Forums/azure/en-US/6560c2d6-9836-41a1-8076-caf0d514222a/azure-machine-learning-reader-table-storage?forum=MachineLearning

I currently have a Table Storage table that is constantly receiving insertions. There are approximately 260 million rows in the table.

I have set up two machine learning experiments to use a 'Reader' to read the data from the 'Azure Table'.

Experiment 1 is set to read all the rows to train the model.

Experiment 2 is set to read only the top 1,000 rows to train the model.

Experiment 1 has been running for over 5 hours with no results.

Experiment 2 has been running for over 1 hour with no results.

It is stuck on the 'Reader' process.

I do not understand why Experiment 2 is taking so long. I believe the setup is correct, because I tested the same 'Reader' configuration against another table. Thanks in advance for any help or suggestions.


Solution

  • Much of this will probably depend on the design of your tables. Table Storage is a key/value store (think of it as a dictionary). It can scan within a partition and across partitions, but the latencies differ greatly. A query that does not filter on PartitionKey has to scan across partitions, which is far slower on a table of this size. Ideally, if you want to query 1,000 rows, they should all be located within a single partition. See the Table Design Guide and the Perf and Scalability Checklist for full details.
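
To make the "localized within a partition" advice concrete, here is a minimal Python sketch of the query options a partition-scoped read would use. It only builds the OData `$filter` and `$top` values that the Table service understands; it does not call Azure. The function names and the example partition key `sensor-42` and row-key range are hypothetical, and whether the 'Reader' module lets you supply such a filter is not confirmed by the question.

```python
def odata_string(value):
    # OData string literals are single-quoted; an embedded quote is doubled.
    return "'" + value.replace("'", "''") + "'"


def partition_filter(partition_key, row_key_from=None, row_key_to=None):
    """Build a $filter expression that stays inside one partition.

    Pinning PartitionKey lets the service go to a single partition
    instead of scanning across all of them. The optional RowKey bounds
    narrow the read to a key range (from inclusive, to exclusive).
    """
    clauses = ["PartitionKey eq " + odata_string(partition_key)]
    if row_key_from is not None:
        clauses.append("RowKey ge " + odata_string(row_key_from))
    if row_key_to is not None:
        clauses.append("RowKey lt " + odata_string(row_key_to))
    return " and ".join(clauses)


def query_options(partition_key, top=1000, row_key_from=None, row_key_to=None):
    """Return the query options for a bounded, partition-scoped read."""
    return {
        "$filter": partition_filter(partition_key, row_key_from, row_key_to),
        "$top": top,  # cap on the number of entities returned
    }


if __name__ == "__main__":
    # Hypothetical keys, for illustration only.
    print(query_options("sensor-42", row_key_from="2016-01-01", row_key_to="2016-02-01"))
```

A "top 1,000" read with no `PartitionKey` clause gives the service nothing to narrow the search with, which is one plausible reason a small read against a 260-million-row table could still be slow.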