Below is the code that reads blobs from my blob storage and then copies their contents into table storage. Everything works fine now, but I know that if a file is too big, this reading and copying will fail. How is this ideally handled? Should I write the file temporarily to disk instead of holding it all in memory? If so, can someone give me an example or show me how to do it in my existing code below? (I've put a rough sketch of what I think that might look like after my code.)
public async Task<Stream> ReadStream(string containerName, string digestFileName, string fileName, string connectionString)
{
    string data = string.Empty;
    string fileExtension = Path.GetExtension(fileName);
    var contents = await DownloadBlob(containerName, digestFileName, connectionString);
    return contents;
}

public async Task<Stream> DownloadBlob(string containerName, string fileName, string connectionString)
{
    Microsoft.Azure.Storage.CloudStorageAccount storageAccount = Microsoft.Azure.Storage.CloudStorageAccount.Parse(connectionString);
    CloudBlobClient serviceClient = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer container = serviceClient.GetContainerReference(containerName);
    CloudBlockBlob blob = container.GetBlockBlobReference(fileName);

    if (!blob.Exists())
    {
        throw new Exception($"Unable to upload data in table store for document");
    }

    return await blob.OpenReadAsync();
}
private IEnumerable<Dictionary<string, EntityProperty>> ReadCSV(Stream source, IEnumerable<TableField> cols)
{
    using (TextReader reader = new StreamReader(source, Encoding.UTF8))
    {
        var cache = new TypeConverterCache();
        cache.AddConverter<float>(new CSVSingleConverter());
        cache.AddConverter<double>(new CSVDoubleConverter());

        var csv = new CsvReader(reader,
            new CsvHelper.Configuration.CsvConfiguration(global::System.Globalization.CultureInfo.InvariantCulture)
            {
                Delimiter = ";",
                HasHeaderRecord = true,
                CultureInfo = global::System.Globalization.CultureInfo.InvariantCulture,
                TypeConverterCache = cache
            });

        csv.Read();
        csv.ReadHeader();

        var map = (
            from col in cols
            from src in col.Sources()
            let index = csv.GetFieldIndex(src, isTryGet: true)
            where index != -1
            select new { col.Name, Index = index, Type = col.DataType }).ToList();

        while (csv.Read())
        {
            yield return map.ToDictionary(
                col => col.Name,
                col => EntityProperty.CreateEntityPropertyFromObject(csv.GetField(col.Type, col.Index)));
        }
    }
}
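To clarify what I mean by "write the file temporarily": something along these lines is what I had in mind, but I'm not sure it's the right approach. This is just a sketch, untested; the method name DownloadBlobToTempFile and the temp-file handling are mine, using the same CloudBlockBlob API as above.

public async Task<Stream> DownloadBlobToTempFile(string containerName, string fileName, string connectionString)
{
    var storageAccount = Microsoft.Azure.Storage.CloudStorageAccount.Parse(connectionString);
    var serviceClient = storageAccount.CreateCloudBlobClient();
    var container = serviceClient.GetContainerReference(containerName);
    var blob = container.GetBlockBlobReference(fileName);

    // Download the blob to a temp file on disk instead of holding it all in memory.
    string tempPath = Path.GetTempFileName();
    await blob.DownloadToFileAsync(tempPath, FileMode.Create);

    // DeleteOnClose removes the temp file once the returned stream is disposed.
    return new FileStream(tempPath, FileMode.Open, FileAccess.Read, FileShare.Read,
        4096, FileOptions.DeleteOnClose);
}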
At your insistence that CsvHelper is incapable of reading from a stream connected to a blob, I threw something together:
A CSV from my disk, uploaded to my blob storage (screenshots omitted). In my debugger, the record with Sid "CAf255" reads back OK via Read/GetRecord, and also via EnumerateRecords.
Using this code:
private async void button1_Click(object sender, EventArgs e)
{
    var cstr = "MY CONNECTION STRING HERE";
    var bbc = new BlockBlobClient(cstr, "temp", "call.csv");
    var s = await bbc.OpenReadAsync(new BlobOpenReadOptions(true) { BufferSize = 16384 });
    var sr = new StreamReader(s);
    var csv = new CsvHelper.CsvReader(sr, new CsvConfiguration(CultureInfo.CurrentCulture) { HasHeaderRecord = true });
    var x = new X();

    // try by read/getrecord (breakpoint and skip over it if you want to try the other way)
    while (await csv.ReadAsync())
    {
        var rec = csv.GetRecord<X>();
        Console.WriteLine(rec.Sid);
    }

    // try by await foreach
    await foreach (var r in csv.EnumerateRecordsAsync(x))
    {
        Console.WriteLine(r.Sid);
    }
}
Oh, and the class that represents a CSV record in my app (I only modeled one property, Sid, to prove the concept):
class X
{
    public string Sid { get; set; }
}
Maybe dial things back a bit and start simple: one string property in your CSV, no yielding etc., just get the file reading in OK. I didn't bother with all the header faffing either; it seems to just work by saying "file has headers" in the options. You can see my debugger has an instance of X with a correctly populated Sid property showing the first value, and further loop iterations populated OK too.
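Once the simple version reads OK, the same streaming approach should slot back into your existing pipeline without ever holding the whole file in memory. Again just a sketch, not tested, reusing your ReadCSV and cols from the question and the newer Azure.Storage.Blobs BlockBlobClient from my snippet above:

// Sketch: stream the blob and feed it straight into the existing ReadCSV.
var blob = new BlockBlobClient(connectionString, containerName, fileName);
using (var stream = await blob.OpenReadAsync(new BlobOpenReadOptions(false) { BufferSize = 16384 }))
{
    foreach (var row in ReadCSV(stream, cols))
    {
        // write each row (or batch of rows) to table storage here
    }
}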