I currently have a Table Storage setup that is constantly performing insertions. There are approximately 260 million rows in the table.
I have set up two machine learning experiments to use a 'Reader' to read the data from the 'Azure Table'.
Experiment 1 is set to read all the rows to train the model.
Experiment 2 is set to read only the top 1,000 rows to train the model.
Experiment 1 has been running for over 5 hours with no results.
Experiment 2 has been running for over 1 hour with no results.
It is stuck on the 'Reader' process.
I do not understand why experiment 2 is taking so long. I know I have set this up correctly, as I tested the 'Reader' modules against another Table Storage account. Thanks in advance for any help/suggestions.
A lot of this will probably depend on the design of your tables. Table Storage is a key/value store (think of it as a dictionary). It has some capability for scanning within a partition and across partitions, but the latencies differ greatly. Ideally, if you want to query 1,000 rows, they should be localized within a single partition. See the Table Design Guide and the Performance and Scalability Checklist for full details.
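To make the latency difference concrete, here is a minimal Python sketch (plain dictionaries, not Azure's SDK; the partition names and row counts are hypothetical) that models Table Storage as a dictionary of partitions. A query scoped to one PartitionKey only touches that partition's rows, while a query filtering on a non-key property has to scan every row in every partition, which is why reading "just 1,000 rows" can still be slow over 260 million rows:

```python
# Toy model of Table Storage: PartitionKey -> (RowKey -> entity).
# Data is hypothetical, for illustration only.
table = {
    "device-01": {f"row-{i:04d}": {"value": i} for i in range(1000)},
    "device-02": {f"row-{i:04d}": {"value": i} for i in range(1000)},
    "device-03": {f"row-{i:04d}": {"value": i} for i in range(1000)},
}

def query_partition(table, partition_key, limit):
    """Partition-scoped query: only rows in one partition are touched,
    so cost is bounded by that partition's size, not the whole table."""
    rows = table[partition_key]
    return [rows[rk] for rk in sorted(rows)[:limit]]

def query_by_property(table, predicate, limit):
    """Query on a non-key property: no index helps, so every row in
    every partition must be scanned (a full table scan in the worst
    case). Returns the matches plus how many rows were examined."""
    scanned, results = 0, []
    for pk in sorted(table):
        for rk in sorted(table[pk]):
            scanned += 1
            entity = table[pk][rk]
            if predicate(entity) and len(results) < limit:
                results.append(entity)
    return results, scanned

# Cheap: confined to one partition (touches at most 1,000 rows here).
local = query_partition(table, "device-01", 10)

# Expensive: filter on a property forces a scan of all 3,000 rows.
top, scanned = query_by_property(table, lambda e: e["value"] >= 990, 10)
print(len(local), len(top), scanned)
```

The real service behaves analogously: a filter of the form `PartitionKey eq '...'` is served efficiently, while filters on other properties degrade toward a full scan as the table grows.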