I need to migrate some data from Azure Table Storage to a SQL database.
I have the following code:
class AzureDataAccessManager : IAzureDataAccessManager
{
    private readonly CloudTable tableClient;
    private readonly CloudStorageAccount storageAccount;

    public string TableName { get; }

    public AzureDataAccessManager(string connectionString, string tableName)
    {
        TableName = tableName ?? throw new ArgumentNullException(nameof(tableName));
        if (connectionString == null) throw new ArgumentNullException(nameof(connectionString));
        storageAccount = CloudStorageAccount.Parse(connectionString);
        tableClient = storageAccount.CreateCloudTableClient().GetTableReference(TableName);
    }

    public List<T> QueryAllRecords<T>() where T : class, ITableEntity, new()
    {
        TableContinuationToken token = null;
        var entities = new List<T>();
        do
        {
            var queryResult = tableClient.ExecuteQuerySegmented(new TableQuery<T>(), token);
            entities.AddRange(queryResult.Results);
            token = queryResult.ContinuationToken;
        } while (token != null);
        return entities;
    }
}
And I am retrieving all the records like this:
var result = azureTableManager.QueryAllRecords<AzureCpaDataEntity>();
The problem is that I don't know how many rows will be there. What if the result is too large? Maybe I should fetch it in ranges (10 thousand rows or so), but as far as I can see there is no such method on List.
Help me with some solutions or ideas, please!
Thanks!
The question's code already retrieves the results in batches. Instead of waiting for all of them to arrive, the method can be turned into an iterator that yields each batch immediately:
public IEnumerable<List<T>> QueryRecords<T>() where T : class, ITableEntity, new()
{
    TableContinuationToken token = null;
    do
    {
        var queryResult = tableClient.ExecuteQuerySegmented(new TableQuery<T>(), token);
        token = queryResult.ContinuationToken;
        yield return queryResult.Results;
    } while (token != null);
}
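The Table service returns at most 1,000 entities per segment, and if even that is too many to hold in memory at once, the per-request size can be capped with `Take` (which sets the query's `TakeCount`). A sketch, assuming a 100-row cap:

```csharp
// Cap each segment at roughly 100 entities; the continuation token
// still walks the whole table, just in smaller chunks.
var query = new TableQuery<T>().Take(100);
var queryResult = tableClient.ExecuteQuerySegmented(query, token);
```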
The results should be processed in batches as well:
foreach (var batch in QueryRecords<AzureCpaDataEntity>())
{
    ProcessTheBatch(batch);
}
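Since the target is SQL, a batch-at-a-time writer pairs naturally with this. A sketch of what `ProcessTheBatch` could look like using `SqlBulkCopy` — the destination table `dbo.CpaData`, the column names, and `sqlConnectionString` are assumptions here, so adjust them to the real schema:

```csharp
using System.Data;
using System.Data.SqlClient;

static void ProcessTheBatch(List<AzureCpaDataEntity> batch)
{
    // Build an in-memory table matching the destination schema.
    // (Column names are placeholders -- map your real entity properties.)
    var table = new DataTable();
    table.Columns.Add("PartitionKey", typeof(string));
    table.Columns.Add("RowKey", typeof(string));

    foreach (var entity in batch)
        table.Rows.Add(entity.PartitionKey, entity.RowKey);

    // sqlConnectionString is assumed to be defined elsewhere.
    using (var bulkCopy = new SqlBulkCopy(sqlConnectionString))
    {
        bulkCopy.DestinationTableName = "dbo.CpaData";
        bulkCopy.WriteToServer(table);
    }
}
```

Only one batch is ever materialized at a time, so the migration's memory footprint stays bounded regardless of how many rows the table turns out to contain.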