java, csv, data-cleaning

How to remove rows that contain a blank cell from a CSV file in Java


I'm trying to clean a dataset. By data cleaning I mean removing every row that contains NaN, an empty cell, or that duplicates another row. A sample of the data and my code are below.

The dataset looks like this:

Sno Country     noofDeaths
1                32432
2    Pakistan     NaN
3    USA          3332
3    USA          3332


import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;

public class data_reader {
    String filePath = "src\\abc.csv";

    public void readData() {
        BufferedReader br = null;
        String line = "";
        HashSet<String> lines = new HashSet<>();
        try {
            br = new BufferedReader(new FileReader(filePath));
            while ((line = br.readLine()) != null) {
                if (!line.contains("NaN") || !line.contains("")) {
                    if (lines.add(line)) {
                        System.out.println(line);
                    }
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (br != null) {
                try {
                    br.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}

It works fine for NaN values and duplicate rows, but not for empty cells. Please help me handle those.

!line.contains("")

This check is not working.


Solution

  • The condition !line.contains("") doesn't make sense, because every string contains the empty string, so the check is always false.
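
To see why the check never fires, here is a minimal sketch. The class name `EmptyContainsDemo`, the helper `originalCondition`, and the comma-separated sample rows are made up for illustration; the condition itself is copied from the question.

```java
public class EmptyContainsDemo {
    // The condition from the question, unchanged.
    static boolean originalCondition(String line) {
        return !line.contains("NaN") || !line.contains("");
    }

    public static void main(String[] args) {
        String row = "1,,32432"; // hypothetical row with an empty Country cell

        // contains("") is true for every string, even the empty one,
        // so !line.contains("") is always false.
        System.out.println(row.contains(""));  // true
        System.out.println("".contains(""));   // true

        // The whole condition therefore reduces to !line.contains("NaN").
        System.out.println(originalCondition(row));              // true: the row is kept
        System.out.println(originalCondition("2,Pakistan,NaN")); // false: the row is dropped
    }
}
```

An empty cell has to be detected per cell, after splitting the row, rather than with a substring search on the whole line.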

    General suggestions:

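    • don't hard-code the file path; pass it in as a parameter so the code is reusable;
    • use try-with-resources so the reader is closed automatically;
    • use camel-case names (DataReader rather than data_reader).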
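    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashSet;
    import java.util.Set;

    public class DataReader {
        public static void main(String[] args) {
            new DataReader().readData("src\\abc.csv");
        }

        public void readData(String filePath) {
            try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
                Set<String> lines = new HashSet<>(); // rows already printed
                String line;
                while ((line = br.readLine()) != null) {
                    // skip rows with NaN or a blank cell; lines.add() is false for duplicates
                    if (!line.contains("NaN") && !hasBlankCell(line) && lines.add(line)) {
                        System.out.println(line);
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        // limit -1 keeps trailing empty cells, which split(",") alone would drop
        private static boolean hasBlankCell(String line) {
            for (String cell : line.split(",", -1)) {
                if (cell.isBlank()) {
                    return true;
                }
            }
            return false;
        }
    }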
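
To remove the rows from the data itself rather than only printing, one possible sketch writes the kept rows to a second file. The class name `CsvCleaner`, the helper `isComplete`, and the two command-line paths are assumptions for illustration, not part of the original answer. It also assumes a plain comma-separated file with no quoted fields, and it treats a cell as NaN only when the whole trimmed cell equals "NaN".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class CsvCleaner {
    // True when no cell is empty, whitespace-only, or equal to "NaN".
    // limit -1 keeps trailing empty cells, which plain split(",") would discard.
    static boolean isComplete(String line) {
        return Arrays.stream(line.split(",", -1))
                .map(String::trim)
                .noneMatch(cell -> cell.isEmpty() || cell.equals("NaN"));
    }

    // Reads `in`, keeps each complete row once (first occurrence wins),
    // and writes the result to `out`.
    static void clean(Path in, Path out) throws IOException {
        Set<String> kept = new LinkedHashSet<>(); // preserves the original row order
        for (String line : Files.readAllLines(in)) {
            if (isComplete(line)) {
                kept.add(line);
            }
        }
        Files.write(out, kept);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CsvCleaner <input.csv> <output.csv>");
            return;
        }
        clean(Path.of(args[0]), Path.of(args[1]));
    }
}
```

A LinkedHashSet is used instead of a HashSet so the output keeps the input order. The header row passes through unchanged as long as none of its cells is empty.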