Search code examples
c#algorithmcsvdata-extraction

C#: Reading a special formatted CSV


My problem is that my csv file has data stored in form of a json file. I now need to extract that data in the most efficient way to create objects that store the data.

My csv-file looks like this:

1. 2022-09-19,"{
2.   "timestamp": 41202503,
3.   "machineId": 3567,
4.   "status": 16,
5.   "isActive": false,
6.   "scanWidth": 5.0,
7. }"
8. 2022-09-19,"{
9.   "timestamp": 41202505,
10.  "machineId": 3568,
11.   "status": 5,
12.   "isActive": true,
13.   "scanWidth": 1.4,
14. }"
15. 2022-09-19,"{
16.   "timestamp": 41202507,
17.   "machineId": 3569,
18.   "status": 12,
19.   "isActive": false,
20.   "scanWidth": 6.2,
21. }"

In my project I would have class called "MachineData" with all the relevant properties. My question is now, how can I extract the data stored in this csv file?

Thanks again for helping!


Solution

  • Create a type to represent this data:

    class ResultItem
    {
        public string Date { get; set; }
        public string Timestamp { get; set; }
        public string MachineId { get; set; }
        public string Status { get; set; }
        public string IsActive { get; set; }
        public string ScanWidth { get; set; }
    }
    

    Use Regex and Newtonsoft.Json to extract the data:

    //Remove the line number
    csvText = Regex.Replace(csvText, @"\d+\. ", "");
    
    //Match items with separating the date and the json in different groups
    var matches = Regex.Matches(csvText, @"(?<date>\d{4}-\d{2}-\d{2}),""(?<json>(.+\n){6}})", RegexOptions.CultureInvariant | RegexOptions.Multiline);
    
    var results = new List<ResultItem>();
    foreach (Match match in matches)
    {
        //Getting values from json group
        var item = JsonConvert.DeserializeObject<ResultItem>(match.Groups["json"].Value);
        //Getting value from date group
        item.Date = match.Groups["date"].Value;
        results.Add(item);
    }