Search code examples
javaspring-mvcsupercsv

Out of Memory Error in CSV Export methods with Spring MVC


We've been having problems with our app where it runs out of memory when producing a CSV file. Specifically on big CSV files where there are more than 10k rows. We are using Spring Boot 2.0.8, and SuperCSV 2.4.0.

What would be the correct approach to handle these cases, so that our Spring MVC API does not crash due to OutOfMemoryException.

Would SuperCSV be the cause of this problem? I'd imagine it's not but just in case.

I have been reading about @Async, would it be a good idea to use it on this method as to open a separate thread?

Suppose I have the following method in a controller:

@RequestMapping(value = "/export", method = RequestMethod.GET)
public void downloadData(HttpServletRequest request,HttpServletResponse response) throws SQLException, ManualException, IOException, NoSuchMethodException, InvocationTargetException, IllegalAccessException {

    List<?> data = null;
    data = dataFetchService.getData();

    ICsvBeanWriter csvWriter = new CsvBeanWriter(response.getWriter(), CsvPreference.STANDARD_PREFERENCE);

    //these next lines handle the header
    String[] header = getHeaders(data.get(0).getClass());
    String[] headerLocale = new String[header.length];
    for (int i = 0; i < header.length; i++)
        {
            headerLocale[i] = localeService.getLabel(this.language,header[i]);
        }

        //fix for excel not opening CSV files with ID in the first cell
        if(headerLocale[0].equals("ID")) {
            //adding a space before ID as ' ID' also helps
            headerLocale[0] = headerLocale[0].toLowerCase();
        }

    csvWriter.writeHeader(headerLocale);

    //the next lines handle the content
    for (Object line : data) {
        csvWriter.write(line, header);
    }

    csvWriter.close();
    response.getWriter().flush();
    response.getWriter().close();
}

Solution

  • The code:

    data = dataFetchService.getData();
    

    looks like it may consume a lot of memory. This list may be millions of records in size. Or if many users export at the same time,this will result in memory issues.

    Since the dataFetchService is backed by a Spring data repository you should get the amount of records it will return and then get data one Pagable at a time.

    Example : If there are 20,000 rows in the table you should get 1000 rows of data at a time 20 times and slowly build up your CSV.

    You should also request your data in some order or your CSV may end up in random order.

    Look at implementing PagingAndSortingRepository on your repository

    Example Application

    Product.java

    import javax.persistence.Entity;
    import javax.persistence.Id;
    import lombok.AllArgsConstructor;
    import lombok.Data;
    import lombok.NoArgsConstructor;
    
    @Entity
    @Data
    @NoArgsConstructor
    @AllArgsConstructor
    public class Product {
    
        @Id
        private long id;
        private String name;
    }
    

    ProductRepository.java

    import org.springframework.data.repository.PagingAndSortingRepository;
    
    public interface ProductRepository extends PagingAndSortingRepository<Product, Integer> {
    }
    

    MyRest.java

    import java.io.IOException;
    import java.util.List;
    import javax.servlet.http.HttpServletResponse;
    import lombok.RequiredArgsConstructor;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.domain.Page;
    import org.springframework.data.domain.PageRequest;
    import org.springframework.data.domain.Pageable;
    import org.springframework.data.domain.Sort;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RestController;
    import org.supercsv.io.CsvBeanWriter;
    import org.supercsv.io.ICsvBeanWriter;
    import org.supercsv.prefs.CsvPreference;
    
    @RestController
    @RequiredArgsConstructor
    public class MyRest {
    
        @Autowired
        private ProductRepository repo;
    
        private final int PAGESIZE = 1000;
    
        @RequestMapping("/")
        public String loadData() {
            for (int record = 0; record < 10_000; record += 1) {
                repo.save(new Product(record, "Product " + record));
            }
            return "Loaded Data";
        }
    
        @RequestMapping("/csv")
        public void downloadData(HttpServletResponse response) throws IOException {
            response.setContentType("text/csv");
            String[] header = {"id", "name"};
            ICsvBeanWriter csvWriter = new CsvBeanWriter(response.getWriter(), CsvPreference.STANDARD_PREFERENCE);
    
            csvWriter.writeHeader(header);
    
            long numberRecords = repo.count();
            for (int fromRecord = 0; fromRecord < numberRecords; fromRecord += PAGESIZE) {
                Pageable sortedByName = PageRequest.of(fromRecord, PAGESIZE, Sort.by("name"));
                Page<Product> pageData = repo.findAll(sortedByName);
                writeToCsv(header, csvWriter, pageData.getContent());
            }
            csvWriter.close();
            response.getWriter().flush();
            response.getWriter().close();
        }
    
        private void writeToCsv(String[] header, ICsvBeanWriter csvWriter, List<Product> pageData) throws IOException {
            for (Object line : pageData) {
                csvWriter.write(line, header);
            }
        }
    
    }
    

    1) Load data by calling

    curl http://localhost:8080
    

    2) Download CSV

    curl http://localhost:8080/csv