Tags: java, spring-boot, memory, cpu-usage

Downloading a file as a byte array vs. Resource: which uses less memory?


One of our download endpoints uses byte[] as the return type:

@PostMapping("/downloadReport")
public ResponseEntity<byte[]> downloadReport(@RequestBody Request request) {
    // someByteContent and fileName are placeholders for values produced elsewhere
    byte[] fileContents = someByteContent;
    return ResponseEntity.ok()
            .header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"" + fileName + "\"")
            .contentType(MediaType.parseMediaType("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"))
            .body(fileContents);
}

We are seeing more than 100% CPU usage while files are being downloaded. Most of the examples I found online use Resource as the return type. Is the whole file loaded into memory when we return a byte[], and is that what consumes memory during the download? Would memory usage improve if I returned a Resource instead, using InputStreamResource resource = new InputStreamResource(new FileInputStream(file));?

Any suggestions are appreciated.


Solution

  • When you create a byte[], you are telling Java to "create an array of bytes in memory and give me a reference to it." Specifically, the array object is allocated on the heap, and the reference to it is held in a local variable on the stack (or in a field of another object).

    So when you use a byte[] to store a file, the whole contents of the file must be loaded into memory.

    When you return a byte[], the response is written from that in-memory copy, so the whole file stays on the heap until the response has been sent. I would say that in most use cases this is not a problem, but it largely depends on the purpose and size of those files.
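
    To make the difference concrete, here is a minimal sketch using only the JDK (no Spring). The class and method names are hypothetical, and the OutputStream stands in for the HTTP response body. The first method holds the whole file on the heap; the second only ever holds one 8 KiB buffer, whatever the file size.

    ```java
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;

    // Hypothetical helper, for illustration only.
    class BufferVsStream {

        // Loads the whole file onto the heap first: peak memory grows with file size.
        static long sendAsByteArray(Path file, OutputStream out) throws IOException {
            byte[] contents = Files.readAllBytes(file);
            out.write(contents);
            return contents.length;
        }

        // Copies through a fixed 8 KiB buffer: this method's own allocation stays constant.
        static long sendAsStream(Path file, OutputStream out) throws IOException {
            byte[] buffer = new byte[8192];
            long total = 0;
            try (InputStream in = Files.newInputStream(file)) {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                    total += n;
                }
            }
            return total;
        }
    }
    ```

    Both methods write the same bytes; the only difference is how much of the file is resident in memory at any one time.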

    If the file is something that is going to be regularly large, I would try something like this:

    InputStreamResource inputStreamResource = new InputStreamResource(inputStream);
    httpHeaders.setContentLength(contentLengthOfStream);
    return new ResponseEntity<>(inputStreamResource, httpHeaders, HttpStatus.OK);

    See also: "Return a stream with Spring MVC's ResponseEntity" (Stack Overflow).

    But I think some testing is needed to determine at what point returning a byte[] is preferable and at what point returning the stream is preferable. I'll do some testing and get back to you.

    Another option you might want to explore is to avoid storing files in a database: generate them, store them on the filesystem, and then return a forward:/ view name that points the user at the file. This way, you offload the responsibility of serving the file to the web server, which is often better suited to serving static resources efficiently anyway.
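
    The filesystem half of that idea could look roughly like the sketch below. It uses only the JDK; ReportFileStore, store and serve are hypothetical names, and how the web server is configured to serve the directory is outside its scope. The report is written to disk once, and any later read is streamed from the file rather than held in a byte[].

    ```java
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    // Hypothetical helper, for illustration only.
    class ReportFileStore {

        // Writes a generated report under baseDir and returns the stored path.
        static Path store(Path baseDir, String fileName, InputStream generated) throws IOException {
            Path base = baseDir.toAbsolutePath().normalize();
            Path target = base.resolve(fileName).normalize();
            // Reject names such as "../x" that would land outside the base directory.
            if (!target.startsWith(base)) {
                throw new IllegalArgumentException("File name escapes the base directory: " + fileName);
            }
            Files.createDirectories(base);
            Files.copy(generated, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        }

        // Streams the stored file to out and returns the number of bytes copied.
        static long serve(Path stored, OutputStream out) throws IOException {
            return Files.copy(stored, out);
        }
    }
    ```

    If the web server serves baseDir directly, serve is not needed at all; it is included only to show that reading the file back does not require a full-size array.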

    Quick Edit:

    Wanted to add some source code for you to play with:

    package com.alvonellos.interview.controller;
    
    import lombok.extern.java.Log;
    import org.springframework.core.io.InputStreamResource;
    import org.springframework.http.HttpHeaders;
    import org.springframework.http.MediaType;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;
    
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Random;
    
    @Log
    @RestController
    public class FileDownloadControllerBytesVsStream {
        public static class RandomByteArrayInputStream extends InputStream {
            private final long length;
            private final Random random = new Random();
            private long bytesRead = 0;
    
            public RandomByteArrayInputStream(long length) {
                this.length = length;
            }
    
            @Override
            public int read() throws IOException {
                if (bytesRead < length) {
                    bytesRead++;
                    return random.nextInt(256); // Generating a random byte (0 to 255)
                } else {
                    return -1; // Signal the end of the stream
                }
            }
        }
    
        @GetMapping("/downloadBytes")
        public ResponseEntity<byte[]> downloadBytes(@RequestParam long length) {
            byte[] fileContents = generateRandomBytes(length);
    
            HttpHeaders headers = new HttpHeaders();
            headers.add(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=randomFile.txt");
            headers.setContentType(MediaType.APPLICATION_OCTET_STREAM);
    
            return ResponseEntity.ok()
                    .headers(headers)
                    .body(fileContents);
        }
    
        @GetMapping("/downloadStream")
        public ResponseEntity<InputStreamResource> downloadStream(@RequestParam long length) {
            HttpHeaders headers = new HttpHeaders();
            headers.add(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=randomFile.txt");
            headers.setContentType(MediaType.APPLICATION_OCTET_STREAM);
    
            InputStreamResource inputStreamResource = generateRandomStream(length);
    
            return ResponseEntity.ok()
                    .headers(headers)
                    .contentLength(length)
                    .contentType(MediaType.APPLICATION_OCTET_STREAM)
                    .body(inputStreamResource);
        }
    
        private byte[] generateRandomBytes(long length) {
            try {
                final byte[] fileContents;
    
                if (length < 0 || length > Integer.MAX_VALUE) {
                    fileContents = new byte[Integer.MAX_VALUE];
                } else {
                    fileContents = new byte[(int) length];
                }
    
                new Random().nextBytes(fileContents);
                return fileContents;
            } catch (Exception e) {
                log.info("Error generating random bytes");
                return new byte[] {0};
            }
        }
    
        private InputStreamResource generateRandomStream(long length) {
            RandomByteArrayInputStream inputStream = new RandomByteArrayInputStream(length);
            return new InputStreamResource(inputStream);
        }
    
    }
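
    One thing to keep in mind when reading CPU numbers from this controller: RandomByteArrayInputStream only overrides the single-byte read(), and InputStream's default read(byte[], int, int) simply calls read() once per byte. A variant that also overrides the bulk read might look like the sketch below (BulkRandomInputStream is a hypothetical name, not part of the code above).

    ```java
    import java.io.InputStream;
    import java.util.Random;

    // Hypothetical variant of RandomByteArrayInputStream with a bulk read.
    class BulkRandomInputStream extends InputStream {
        private final long length;
        private final Random random = new Random();
        private long bytesRead = 0;

        BulkRandomInputStream(long length) {
            this.length = length;
        }

        @Override
        public int read() {
            if (bytesRead >= length) {
                return -1; // End of stream
            }
            bytesRead++;
            return random.nextInt(256);
        }

        // Fills up to len bytes in one call instead of one read() call per byte.
        @Override
        public int read(byte[] b, int off, int len) {
            if (off < 0 || len < 0 || len > b.length - off) {
                throw new IndexOutOfBoundsException();
            }
            if (len == 0) {
                return 0;
            }
            if (bytesRead >= length) {
                return -1; // End of stream
            }
            int n = (int) Math.min(len, length - bytesRead);
            byte[] chunk = new byte[n];
            random.nextBytes(chunk);
            System.arraycopy(chunk, 0, b, off, n);
            bytesRead += n;
            return n;
        }
    }
    ```

    Streams such as FileInputStream already provide their own bulk read, so this mainly matters for hand-written streams like the one in this test harness.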
    

    And the unit tests:

    package com.alvonellos.interview.controller;
    
    import lombok.val;
    import org.junit.Before;
    import org.junit.Test;
    import org.springframework.boot.test.context.SpringBootTest;
    import org.springframework.test.web.servlet.MockMvc;
    import org.springframework.test.web.servlet.setup.MockMvcBuilders;
    
    import java.util.TreeMap;
    import java.util.logging.Logger;
    
    import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.get;
    import static org.springframework.test.web.servlet.result.MockMvcResultHandlers.print;
    import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.content;
    import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;
    
    @SpringBootTest
    public class FileDownloadControllerBytesVsStreamTest {
        private final MockMvc mockMvc = MockMvcBuilders.standaloneSetup(new FileDownloadControllerBytesVsStream()).build();
    
        private final Logger log = Logger.getLogger(this.getClass().getName());
    
        static final long MAX_RESPONSE_TIME_MSEC = 2000;
    
        @Before
        public void setUp() throws Exception {
            long start = System.currentTimeMillis();
            mockMvc.perform(get("/downloadBytes").param("length", String.valueOf(1000)));
            long end = System.currentTimeMillis();
            log.info("Without warmup downloadbytes took " + (end - start) + " ms");
    
    
            long start2 = System.currentTimeMillis();
            mockMvc.perform(get("/downloadStream").param("length", String.valueOf(1000)));
            long end2 = System.currentTimeMillis();
            log.info("Without warmup downloadstream took " + (end2 - start2) + " ms");
    
            log.info("Warming up to 100 bytes");
            for (int i = 0; i < 100; i++) {
                mockMvc.perform(
                                get("/downloadBytes")
                                        .param("length", String.valueOf(i)))
                        .andExpect(status().isOk())
                        .andExpect(content().contentType("application/octet-stream"))
                        .andReturn();
    
                mockMvc.perform(
                                get("/downloadStream")
                                        .param("length", String.valueOf(i)))
                        .andExpect(status().isOk())
                        .andExpect(content().contentType("application/octet-stream"))
                        .andReturn();
            }
    
            start = System.currentTimeMillis();
            mockMvc.perform(get("/downloadBytes").param("length", String.valueOf(1000)));
            end = System.currentTimeMillis();
            log.info("After warmup downloadbytes took " + (end - start) + " ms");
    
    
            start2 = System.currentTimeMillis();
            mockMvc.perform(get("/downloadStream").param("length", String.valueOf(1000)));
            end2 = System.currentTimeMillis();
            log.info("After warmup downloadstream took " + (end2 - start2) + " ms");
        }
    
        @Test
        public void downloadBytes_andShouldReturnSingleElement() throws Exception {
            log.info("Testing downloadBytes");
            val result = mockMvc.perform(
                            get("/downloadBytes")
                                    .param("length", String.valueOf(1)))
                    .andExpect(status().isOk())
                    .andExpect(content().contentType("application/octet-stream"))
                    .andDo(print())
                    .andReturn();
    
            assert (result.getResponse().getContentAsByteArray().length == 1);
        }
    
        @Test
        public void downloadStream_andShouldReturnSingleElement() throws Exception {
            log.info("Testing downloadStream");
    
            val result = mockMvc.perform(
                            get("/downloadStream")
                                    .param("length", String.valueOf(1)))
                    .andExpect(status().isOk())
                    .andExpect(content().contentType("application/octet-stream"))
                    .andDo(print())
                    .andReturn();
    
            assert (result.getResponse().getContentAsByteArray().length == 1);
        }
    
        @Test
        public void testDownloadBytesAndFindMaxLength() {
            long initialLength = 2;
            long maxLength = Long.MAX_VALUE;  // Adjust as needed
            int maxIterations = 30;  // Adjust as needed
            int iter = 0;
            long length = initialLength;
            long previousResponseTimeBytes = Long.MAX_VALUE;
            TreeMap<Long, Double> dpdtMap = new TreeMap<>();
            TreeMap<Long, Double> ddpdtMap = new TreeMap<>();
    
            outOfRam:
            {
                do {
                    try {
                        long startTimeBytes = System.currentTimeMillis();
    
                        mockMvc.perform(
                                        get("/downloadBytes")
                                                .param("length", String.valueOf(length)))
                                .andExpect(status().isOk())
                                .andExpect(content().contentType("application/octet-stream"))
                                .andReturn();
    
                        long endTimeBytes = System.currentTimeMillis();
                        long responseTimeBytes = endTimeBytes - startTimeBytes;
    
                        System.out.println(String.format("Bytes - length: %d, responseTime: %d", length, responseTimeBytes));
    
                        if (responseTimeBytes > previousResponseTimeBytes) {
                            System.out.println("Bytes - Performance starting to degrade. Keeping the length.");
                            // maxLength is set to length first, so for an even length
                            // (length / 2) + (maxLength / 2) leaves the length unchanged.
                            maxLength = length;
                            length = (length / 2) + (maxLength / 2);
                        } else {
                            System.out.println("Bytes - Performance is good. Doubling the length.");
                            // With maxLength == length this evaluates to 2 * length for even lengths.
                            maxLength = length;
                            length = (length * length) / (maxLength / 2);
                        }
    
                        previousResponseTimeBytes = responseTimeBytes;
    
                        iter++;
                        // Both maps are keyed by the next length to try, and
                        // previousResponseTimeBytes already equals responseTimeBytes here.
                        dpdtMap.put(length, (double) responseTimeBytes / length);
                        ddpdtMap.put(length, ((double) responseTimeBytes / length) - ((double) previousResponseTimeBytes / length) / (length - (length / 2)));
                    } catch (Exception e) {
                        System.out.println("Bytes - Caught exception: " + e.getMessage());
                        break outOfRam;
                    }
                } while ((iter < maxIterations && length > 0));
                System.out.println(String.format("Bytes - Final time: %d, Length: %d", previousResponseTimeBytes, length));
                System.err.println("Best length: " + dpdtMap.lastEntry().getKey());
                System.err.println("Derivatives");
                System.err.println(String.format("%8s, %9s, %2s", "length", "ddpdt", "=0"));
                ddpdtMap.forEach((key, value) -> {
                    System.err.println(String.format("%8d, %1.8f %2b", key, value, value.compareTo(0.00) == 0));
                });
            }
        }
    
        @Test
        public void testDownloadStreamAndFindMaxLength() {
    
            long initialLength = 2;
            long maxLength = Long.MAX_VALUE;  // Adjust as needed
            int maxIterations = 30;  // Adjust as needed
            int iter = 0;
            long length = initialLength;
            long previousResponseTimeStream = Long.MAX_VALUE;
            double dpdt = Double.MAX_VALUE;
            double ddpdt = Double.MAX_VALUE;
            TreeMap<Long, Double> dpdtMap = new TreeMap<>();
            TreeMap<Long, Double> ddpdtMap = new TreeMap<>();
    
            outOfRam:
            {
                do {
                    try {
                        long startTimeStream = System.currentTimeMillis();
    
                        mockMvc.perform(
                                        get("/downloadStream")
                                                .param("length", String.valueOf(length)))
                                .andExpect(status().isOk())
                                .andExpect(content().contentType("application/octet-stream"))
                                .andReturn();
    
                        long endTimeStream = System.currentTimeMillis();
                        long responseTimeStream = endTimeStream - startTimeStream;
    
                        System.err.println(String.format("Stream - length: %d, responseTime: %d", length, responseTimeStream));
    
                        if (responseTimeStream > (previousResponseTimeStream + 1)) {
                            System.out.println("Stream - Performance starting to degrade. Keeping the length.");
                            // maxLength is set to length first, so for an even length
                            // (length / 2) + (maxLength / 2) leaves the length unchanged.
                            maxLength = length;
                            length = (length / 2) + (maxLength / 2);
                        } else {
                            System.out.println("Stream - Performance is good. Doubling the length.");
                            // With maxLength == length this evaluates to 2 * length for even lengths.
                            maxLength = length;
                            length = (length * length) / (maxLength / 2);
                        }
    
                        previousResponseTimeStream = responseTimeStream;
    
                        iter++;
                        // Both maps are keyed by the next length to try, and
                        // previousResponseTimeStream already equals responseTimeStream here.
                        dpdtMap.put(length, (double) responseTimeStream / length);
                        ddpdtMap.put(length, ((double) responseTimeStream / length) - ((double) previousResponseTimeStream / length) / (length - (length / 2)));
                    } catch (Exception e) {
                        System.err.println("Stream - Caught exception: " + e.getMessage());
                        break outOfRam;
                    }
                } while ((iter < maxIterations && length > 0));
                System.err.println(String.format("Stream - Final time: %d, Length: %d", previousResponseTimeStream, length));
                System.err.println("Best length: " + dpdtMap.lastEntry().getKey());
                System.err.println("Derivatives");
                System.err.println(String.format("%8s, %9s, %2s", "length", "ddpdt", "=0"));
                ddpdtMap.forEach((key, value) -> {
                    System.err.println(String.format("%8d, %1.8f %2b", key, value, value.compareTo(0.00) == 0));
                });
            }
        }
    }
    

    From my testing, I found that the second derivative is monotonically increasing on the interval [0, 32), zero on [32, 64], and increasing again on (64, ∞), for both the stream download and the byte[] download.

    However, I did notice a big difference in the maximum lengths reached by the two. In a search of 30 iterations, streaming reached a length of 8388608 bytes at 188 ms on my local machine, whereas byte[] only reached 1048576 bytes at around 174 ms with the same number of iterations.

    Stream and byte[] both seem reasonably quick at returning results, but I did notice several out-of-memory errors at larger lengths.
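
    If you keep a byte[] path for small files, one way to avoid those out-of-memory errors is to check the size against the remaining heap before allocating. This is a rough sketch under my own assumptions: HeapGuard and canBuffer are hypothetical names, the safety fraction is arbitrary, the 8-byte margin below Integer.MAX_VALUE is a common rule of thumb rather than a documented limit, and the estimate ignores other threads allocating at the same time.

    ```java
    // Hypothetical helper, for illustration only.
    class HeapGuard {

        // Rough estimate of heap this JVM could still hand out: max minus currently used.
        static long availableHeap() {
            Runtime rt = Runtime.getRuntime();
            long used = rt.totalMemory() - rt.freeMemory();
            return rt.maxMemory() - used;
        }

        // True if a byte[] of the given length is likely to fit within the given
        // fraction of the remaining heap; false means "stream it instead".
        static boolean canBuffer(long length, double safetyFraction) {
            if (length < 0 || length > Integer.MAX_VALUE - 8) {
                return false; // Negative, or too large for a single array (assumed margin)
            }
            return length <= (long) (availableHeap() * safetyFraction);
        }
    }
    ```

    A controller could call canBuffer(fileSize, 0.25) and fall back to the streaming response when it returns false.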

    I hope this helps.