java csv filewriter opencsv file-writing

What are the advantages of writing data values to a CSV as an integer type rather than a string?


I'm collaborating on a project where I have built a program that writes data to a CSV file in string format. My partner on the project thinks the product would be more usable if the data were written in an integer format, while I've been arguing that our visualization features could simply run a parseInt when they read the CSV data.

I wanted to ask this on here to get some information on what can be gained by writing to a file using a primitive data type rather than a string. Java really seems to be built to write to CSV as a string, but he claims it would be more efficient to write the data as an int. Thoughts?

This is really more of a conceptual question, but I'll include the code I'm using to generate the data table in case context matters.

  //Snippet only
  private void elementLocator() {
        //Declare ArrayList to hold values
        data = new ArrayList<ArrayList<String>>();
        
        //Build data table
        try {
            //Unique xpath string
            String prefix = "//*[@id=\"main_table_countries_today\"]/tbody[1]/tr[";
            int j = 2;
            System.out.println("Retrieving data...");
            for(int i = 1; i <= 222; i ++) {
                try {
                    //Initialize array to fill in to data row by row
                    ArrayList<String> parser = new ArrayList<String>();
                    for(j = 2; j <= 13; j ++) {
                        parser.add(driver.findElement(By.xpath(prefix + i + "]/td[" + j + "]")).getText());
                    }
                    //Use a boolean indicator to skip any row that has a blank 1st column
                    String w = parser.get(0);
                    boolean v = w.isEmpty();
                    if(v) {
                        continue;
                    }
                    else {
                        data.add(parser);
                    }
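                //Catch errors for a single row and move on to the next row
                } catch (Exception e) {
                    e.printStackTrace();
                    continue;
                }
            }
        //Catch errors from building the table as a whole
        } catch (Exception e) {
            e.printStackTrace();
        }
  }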

  public void makeCSV() throws IOException {
        //Create output file only if it does not already exist
        EST est = new EST();
        //Pull year, month, day for file name
        String dt = est.getDate();
        f = new File(home + "\\Climate Dev Pegasus\\Data\\Worldometer\\" + dt + ".csv");
        if(!f.exists()) {
            try { 
                //Create FileWriter object with file as parameter 
                CSVWriter writer = new CSVWriter(new FileWriter(f, true));
                //Write headers
                //Split on ", " so the header names do not start with a space
                String[] headers = "Country/Region, Total Cases, New Cases, Total Deaths, New Deaths, Total Recovered, Active Cases, Serious Cases, Tot Cases/1M pop, Deaths/1M pop, Total Tests, Tests/1M pop".split(", ");
                writer.writeNext(headers);
                writer.flush();
                writer.close();
                //Give full modification permissions to file
                SetPermissions sp = new SetPermissions();
                sp.ChangePermissions(f);
            }catch (Exception ex) {
                ex.printStackTrace();
            }
        }
        path = Paths.get(home + "\\Climate Dev Pegasus\\Data\\Worldometer\\" + dt + ".csv");
        
        //Write data to file, allowing commas
        FileWriter csvWriter = new FileWriter(f,true);
        for(ArrayList<String> x : data) {
            for(String y : x) {
                String z = appendDQ(y);
                //int value = Integer.parseInt(z);
                csvWriter.append(z);
                csvWriter.append(",");
            }
            csvWriter.append("\n");
        }
        System.out.println("Data successfully written to file.");
        csvWriter.close();
  }
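
A note on the commented-out `Integer.parseInt(z)` line above: if `appendDQ` wraps the value in double quotes, as its name suggests (an assumption, since the helper is not shown), `parseInt` would throw a `NumberFormatException` on the quoted text, and the same goes for counts that contain thousands separators. Below is a minimal sketch of a more tolerant parser for the reading side. The class and method names (`CountParser`, `parseCount`) are hypothetical, and the sample cell formats are assumed rather than taken from the real data.

```java
import java.util.OptionalInt;

class CountParser {
    // Hypothetical helper: converts a CSV cell such as "\"1,234\"" or "+56" to an int.
    // Returns an empty OptionalInt for blank or non-numeric cells instead of throwing.
    static OptionalInt parseCount(String cell) {
        if (cell == null) {
            return OptionalInt.empty();
        }
        // Remove surrounding double quotes and thousands separators, then trim whitespace
        String cleaned = cell.replace("\"", "").replace(",", "").trim();
        if (cleaned.isEmpty()) {
            return OptionalInt.empty();
        }
        try {
            // Integer.parseInt accepts an optional leading '+' or '-' sign
            return OptionalInt.of(Integer.parseInt(cleaned));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseCount("\"1,234\"").getAsInt()); // prints 1234
        System.out.println(parseCount("+56").getAsInt());       // prints 56
        System.out.println(parseCount("N/A").isPresent());      // prints false
    }
}
```

Returning an `OptionalInt` leaves it to the visualization code to decide what a blank cell should mean, rather than silently treating it as zero.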

Solution

  • Here is an answer, which came to my mind while thinking about this problem:

    Well, I think that's a very fundamental question.

    First, the most important point: the program should be easy for other developers to understand, and at the same time fast enough that the end user is not bothered by, for example, long loading times. To answer your question, though, you have to go a step further. Should the program run on a PC or on an embedded system? Java ships with the String class, so string handling is well supported out of the box. An int is a primitive type and is generally cheaper to handle in memory than a String, but a CSV file is plain text, so every value ends up as characters in the file regardless of the type the program holds it in. I assume the program is meant to run on a PC or a server rather than on an embedded system, where a language like C would be a much better fit.

    In that case I think it actually makes more sense to use strings, because Java's convenient string handling saves development time and keeps the code readable for other developers. Admittedly, using strings could make additional methods necessary to convert the values into a format the later program can read. But the same kind of conversion is needed to write integers into a text file, which cancels out any performance advantage.

    Last but not least, there is an interesting real-world example: when you export an Excel table to CSV, the resulting file also consists of long strings, and even there the loading times do not bother the end user (in my opinion).
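
To make the point about conversion concrete, here is a minimal sketch showing that a row built from an `int` and a row built from a `String` produce exactly the same CSV text. It assumes a `StringWriter` as a stand-in for the `FileWriter` in the question so that the result can be inspected. The class and method names (`CsvTextDemo`, `rowFromString`, `rowFromInt`) and the sample values are hypothetical.

```java
import java.io.StringWriter;

class CsvTextDemo {
    // Builds one CSV row where the count is held as a String
    static String rowFromString(String country, String cases) {
        StringWriter w = new StringWriter();
        w.append(country).append(',').append(cases).append('\n');
        return w.toString();
    }

    // Builds the same row where the count is held as an int.
    // The int has to be turned into decimal characters before it can be written.
    static String rowFromInt(String country, int cases) {
        StringWriter w = new StringWriter();
        w.append(country).append(',').append(Integer.toString(cases)).append('\n');
        return w.toString();
    }

    public static void main(String[] args) {
        String fromString = rowFromString("Spain", "1234");
        String fromInt = rowFromInt("Spain", 1234);
        // Both rows contain the same characters, so the file on disk is identical
        System.out.println(fromString.equals(fromInt)); // prints true
    }
}
```

One pitfall if you do switch to `int`: `Writer.write(int)` writes the single character with that code, not the decimal digits, so the value still has to go through `Integer.toString` or `String.valueOf` first.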