var [][] array remove specific words


I have a small problem. I have a .csv file containing "NaN" values and doubles (0.6034, for example), and I am trying to read just the doubles from the CSV into an array[y][x].

Currently, I read the whole .csv, but I cannot manage to remove the "NaN" values afterward. (The code should parse the CSV, add only the numbers to an array[y][x], and leave every "NaN" out.)

My current Code:

        // Reads the WHOLE .csv into a jagged array (string[][]), "NaN" cells included
        var rows = File.ReadAllLines(filepath).Select(l => l.Split(';')).ToArray();

        int max_Rows = 0;
        int max_Col = 0;
        foreach (string[] row in rows)
        {
            // Each row is a one-dimensional string[], so its Rank is always 1
            max_Col = row.Length; // number of columns in the current row
            max_Rows++;           // running count of rows
        }

I tried the search but couldn't really find anything that helped me. I know this is probably really easy but I am new to C#.

The .CSV and the desired outcome:

NaN;NaN;NaN;NaN
NaN;1;5;NaN
NaN;2;6;NaN
NaN;3;7;NaN
NaN;4;8;NaN
NaN;NaN;NaN;NaN

This is a sample of the .csv I have. I should have been clearer, sorry! There is a NaN in every line, and I want the result to look like this:

1;5
2;6
3;7
4;8

This is just a sample; the real .csv has around 60,000 values. I need to access the input as [y][x]: for example, [0][0] should display "1" and [2][1] should display "7", and so on.

Thanks again for all your help!


Solution

  • If you want to remove all the lines that contain NAN (a typical CSV task: clearing out all incomplete lines), e.g.

      123.0; 456; 789
        2.1; NAN;  35     <- this line should be removed (has NaN value)
         -5;   3;  18
    

    You can implement it like this:

      // Requires: using System; using System.Globalization; using System.IO; using System.Linq;
      double[][] data = File
        .ReadLines(filepath)
        .Select(line => line.Split(new char[] {';', '\t'},
                                   StringSplitOptions.RemoveEmptyEntries))
        .Where(items => items  // Filter first...
           // Trim: padded cells such as " NAN" must still be recognized
           .All(item => !string.Equals("NAN", item.Trim(), StringComparison.OrdinalIgnoreCase)))
        .Select(items => items
           .Select(item => double.Parse(item, CultureInfo.InvariantCulture))
           .ToArray()) // ... materialize at the very end
        .ToArray();
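
    As a variation, here is a minimal sketch of the same row filter that relies on double.TryParse instead of comparing strings, so any cell that is not a number (not only "NaN") marks the row as incomplete. The class and method names (CompleteRowsSketch, ParseOrNull, KeepCompleteRows) are hypothetical, and the input is an in-memory array rather than a file, so the sample can be run as is:

    ```csharp
    using System;
    using System.Globalization;
    using System.Linq;

    public static class CompleteRowsSketch
    {
        // Hypothetical helper: returns the number, or null for "NaN" and unparseable text.
        public static double? ParseOrNull(string cell)
        {
            double value;
            bool ok = double.TryParse(cell.Trim(), NumberStyles.Float,
                                      CultureInfo.InvariantCulture, out value);
            // Depending on the runtime, "NAN" either fails to parse or parses to double.NaN;
            // both cases are treated as a missing value here.
            return ok && !double.IsNaN(value) ? value : (double?)null;
        }

        // Hypothetical helper: keeps only the rows in which every cell is a number.
        public static double[][] KeepCompleteRows(string[] lines)
        {
            return lines
                .Select(line => line
                    .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
                    .Select(cell => ParseOrNull(cell))
                    .ToArray())
                .Where(cells => cells.Length > 0 && cells.All(cell => cell.HasValue))
                .Select(cells => cells.Select(cell => cell.Value).ToArray())
                .ToArray();
        }

        public static void Main()
        {
            string[] lines = { "123.0; 456; 789", "2.1; NAN;  35", "-5;   3;  18" };

            foreach (double[] row in KeepCompleteRows(lines))
            {
                Console.WriteLine(string.Join(";", row
                    .Select(v => v.ToString(CultureInfo.InvariantCulture))));
            }
            // Prints:
            // 123;456;789
            // -5;3;18
        }
    }
    ```

    The trade-off is that malformed cells are dropped silently instead of raising a FormatException, which may or may not be what you want for your data.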
    

    Use string.Join to display rows:

     string report = string.Join(Environment.NewLine, data
       .Select(line => string.Join(";", line)));
    
     Console.Write(report);
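
    One detail worth knowing: string.Join applied to a double[] formats each value with the current culture, so on a machine with, say, German regional settings 0.6034 is printed as 0,6034. If the report should always use '.' as the decimal separator, a minimal sketch could look like this (ReportSketch and Format are hypothetical names, and the sample data is made up for illustration):

    ```csharp
    using System;
    using System.Globalization;
    using System.Linq;

    public static class ReportSketch
    {
        // Hypothetical helper: joins each row with ';' and always formats
        // the numbers with the invariant culture ('.' as decimal separator).
        public static string Format(double[][] data)
        {
            return string.Join(Environment.NewLine, data
                .Select(row => string.Join(";", row
                    .Select(value => value.ToString(CultureInfo.InvariantCulture)))));
        }

        public static void Main()
        {
            // Made-up sample data, not taken from the question's file
            double[][] data =
            {
                new[] { 0.6034, 1.0 },
                new[] { 2.5, 6.0 }
            };

            Console.WriteLine(Format(data));
            // Prints:
            // 0.6034;1
            // 2.5;6
        }
    }
    ```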
    

    Edit: the actual problem is to take only the 2nd and 3rd columns from the CSV, keeping the rows where both values are present:

    NaN;NaN;NaN;NaN
    NaN;1;5;NaN
    NaN;2;6;NaN
    NaN;3;7;NaN
    NaN;4;8;NaN
    NaN;NaN;NaN;NaN
    

    The desired outcome is:

    [[1, 5], [2, 6], [3, 7], [4, 8]]
    

    Implementation:

    double[][] data = File
      .ReadLines(filepath)
      .Select(line => line
         .Split(new char[] {';'},
                StringSplitOptions.RemoveEmptyEntries)
         .Skip(1) 
         .Take(2)
         .Where(item => !string.Equals("NAN", item, StringComparison.OrdinalIgnoreCase))
         .ToArray())
      .Where(items => items.Length == 2)
      .Select(items => items
        .Select(item => double.Parse(item, CultureInfo.InvariantCulture))
        .ToArray())
      .ToArray();
    

    Tests:

    // 1
    Console.Write(data[0][0]);
    // 5
    Console.Write(data[0][1]);
    // 2
    Console.Write(data[1][0]);
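
    To check the [y][x] indexing without a file on disk, the same query can be wrapped in a method that takes the lines from memory. This is only a sketch: ColumnSketch, ExtractColumns and its firstColumn / columnCount parameters are hypothetical names, introduced so that the column range is not hard-coded.

    ```csharp
    using System;
    using System.Globalization;
    using System.Linq;

    public static class ColumnSketch
    {
        // Hypothetical helper: takes columnCount columns starting at the zero-based
        // index firstColumn and keeps only the rows where all of them are numbers.
        public static double[][] ExtractColumns(string[] lines, int firstColumn, int columnCount)
        {
            return lines
                .Select(line => line
                    .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
                    .Skip(firstColumn)
                    .Take(columnCount)
                    .Where(item => !string.Equals("NaN", item.Trim(), StringComparison.OrdinalIgnoreCase))
                    .ToArray())
                .Where(items => items.Length == columnCount)
                .Select(items => items
                    .Select(item => double.Parse(item, CultureInfo.InvariantCulture))
                    .ToArray())
                .ToArray();
        }

        public static void Main()
        {
            // The sample .csv from the question, one string per line
            string[] lines =
            {
                "NaN;NaN;NaN;NaN",
                "NaN;1;5;NaN",
                "NaN;2;6;NaN",
                "NaN;3;7;NaN",
                "NaN;4;8;NaN",
                "NaN;NaN;NaN;NaN"
            };

            double[][] data = ExtractColumns(lines, 1, 2);

            Console.WriteLine(data.Length); // 4
            Console.WriteLine(data[0][0]);  // 1
            Console.WriteLine(data[2][1]);  // 7
        }
    }
    ```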
    

    All values in one go:

    string report = string.Join(Environment.NewLine, data
       .Select(line => string.Join(";", line)));
    
    Console.Write(report);
    

    Outcome:

    1;5
    2;6
    3;7
    4;8 
    

    Edit 2: if you want to extract only the non-NaN values (please note that the initial CSV structure will be lost):

    1;2;3              1;2;3
    NAN;4;5            4;5   <- note that the structure is lost
    6;NAN;7        ->  6;7
    8;9;NAN;           8;9
    NAN;10;NAN         10
    NAN;NAN;11         11 
    

    then:

    double[][] data = File
      .ReadLines(filepath)
      .Select(line => line
         .Split(new char[] {';'},
                StringSplitOptions.RemoveEmptyEntries)
         .Where(item => !string.Equals("NAN", item, StringComparison.OrdinalIgnoreCase)))
      .Where(items => items.Any()) 
      .Select(items => items
        .Select(item => double.Parse(item, CultureInfo.InvariantCulture))
        .ToArray())
      .ToArray();
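
    Because the rows of the result can have different lengths, code that consumes it should rely on each row's own Length rather than on a fixed column count. A minimal sketch on the sample above, read from memory instead of a file (RaggedSketch and KeepNumbers are hypothetical names):

    ```csharp
    using System;
    using System.Globalization;
    using System.Linq;

    public static class RaggedSketch
    {
        // Hypothetical helper: drops every "NaN" cell and every row that ends up empty.
        public static double[][] KeepNumbers(string[] lines)
        {
            return lines
                .Select(line => line
                    .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
                    .Where(item => !string.Equals("NaN", item.Trim(), StringComparison.OrdinalIgnoreCase))
                    .Select(item => double.Parse(item, CultureInfo.InvariantCulture))
                    .ToArray())
                .Where(row => row.Length > 0)
                .ToArray();
        }

        public static void Main()
        {
            string[] lines = { "1;2;3", "NAN;4;5", "6;NAN;7", "8;9;NAN;", "NAN;10;NAN", "NAN;NAN;11" };

            foreach (double[] row in KeepNumbers(lines))
            {
                // row.Length differs from row to row, so it is read per row
                Console.WriteLine(row.Length + " value(s): " + string.Join(";", row
                    .Select(v => v.ToString(CultureInfo.InvariantCulture))));
            }
            // Prints:
            // 3 value(s): 1;2;3
            // 2 value(s): 4;5
            // 2 value(s): 6;7
            // 2 value(s): 8;9
            // 1 value(s): 10
            // 1 value(s): 11
        }
    }
    ```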