java, out-of-memory, fileoutputstream

Java OutOfMemoryError when splitting a file with FileOutputStream?


Thanks, everyone ^_^ The problem is solved: a single line in the file was far too big (over 400 MB; I had downloaded a damaged file without realizing it), and reading it threw the OutOfMemoryError.

I want to split a file using Java, but it always throws OutOfMemoryError: Java heap space. I have searched all over the Internet, but nothing I found helped :(

PS. The file is 600 MB and has over 30,000,000 lines; every line is no longer than 100 characters. (You can generate a similar "level file" like this: { id:0000000001,level:1 id:0000000002,level:2 .... (over 30 million entries) })

PPS. Increasing the JVM memory size does not help :(

PPPS. I switched to another PC and the problem remains /(ㄒoㄒ)/~~

No matter how large I set -Xms or -Xmx, the output file's size is always the same (and the value of Runtime.getRuntime().totalMemory() really does change).

Here's the stack trace:

 Heap Size = 2058027008
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2882)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
        at java.lang.StringBuffer.append(StringBuffer.java:306)
        at java.io.BufferedReader.readLine(BufferedReader.java:345)
        at java.io.BufferedReader.readLine(BufferedReader.java:362)
        at com.xiaomi.vip.tools.ptupdate.updator.Spilt.main(Spilt.java:39)
    ...

Here's my code:

package com.updator;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;

public class Spilt {
    public static void main(String[] args) throws Exception {
        long heapSize = Runtime.getRuntime().totalMemory();

        // Print the jvm heap size.
        System.out.println("Heap Size = " + heapSize);

        String mainPath = "/home/work/bingo/";
        File mainFilePath = new File(mainPath);
        FileInputStream inputStream = null;
        FileOutputStream outputStream = null;
        try {
            if (!mainFilePath.exists())
                mainFilePath.mkdir();

            String sourcePath = "/home/work/bingo/level.txt";
            inputStream = new FileInputStream(sourcePath);
            BufferedReader bufferedReader = new BufferedReader(new FileReader(
                    new File(sourcePath)));

            String savePath = mainPath + "tmp/";
            Integer i = 0;
            File file = new File(savePath + "part"
                    + String.format("%0" + 5 + "d", i) + ".txt");
            if (!file.getParentFile().exists())
                file.getParentFile().mkdir();
            file.createNewFile();
            outputStream = new FileOutputStream(file);
            int count = 0, total = 0;
            String line = null;
            while ((line = bufferedReader.readLine()) != null) {
                line += '\n';
                outputStream.write(line.getBytes("UTF-8"));
                count++;
                total++;
                if (count > 4000000) {
                    outputStream.flush();
                    outputStream.close();
                    System.gc();
                    count = 0;
                    i++;
                    file = new File(savePath + "part"
                            + String.format("%0" + 5 + "d", i) + ".txt");
                    file.createNewFile();
                    outputStream = new FileOutputStream(file);
                }
            }

            outputStream.close();
            file = new File(mainFilePath + "_SUCCESS");
            file.createNewFile();
            outputStream = new FileOutputStream(file);
            outputStream.write(i.toString().getBytes("UTF-8"));
        } finally {
            if (inputStream != null)
                inputStream.close();
            if (outputStream != null)
                outputStream.close();
        }
    }
}

My guess: maybe the memory is not released when outputStream.close() is called?


Solution

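  • Open the original file, create a BufferedReader, and keep a counter for the lines. Read through a fixed-size char buffer instead of calling readLine(), which has to hold an entire line in memory.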

    char[] buffer = new char[5120];
    BufferedReader reader = Files.newBufferedReader(Paths.get(sourcePath), StandardCharsets.UTF_8);
    int lineCount = 0;
    

    Now read into the buffer and write the characters out as they come in, switching to a new output file each time the line count reaches the limit.

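    // fileName, newName and maxLineCount are placeholders supplied by the caller.
    int read;

    BufferedWriter writer = Files.newBufferedWriter(Paths.get(fileName), StandardCharsets.UTF_8);
    while ((read = reader.read(buffer, 0, buffer.length)) > 0) {
        int offset = 0;
        for (int i = 0; i < read; i++) {
            if (buffer[i] == '\n') {
                lineCount++;
                if (lineCount == maxLineCount) {
                    // Write everything up to and including this newline to the current file.
                    writer.write(buffer, offset, i + 1 - offset);
                    writer.close();
                    offset = i + 1;
                    lineCount = 0;
                    // Start the next part file; newName must change for every part.
                    writer = Files.newBufferedWriter(Paths.get(newName), StandardCharsets.UTF_8);
                }
            }
        }
        // Write whatever is left in the buffer once the whole chunk has been scanned.
        writer.write(buffer, offset, read - offset);
    }
    // Close the last writer and the reader only after the whole file has been read.
    writer.close();
    reader.close();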
    

    That should keep memory usage low and prevent you from reading too large a line at once. You could drop the BufferedWriters and control memory even more tightly, but I don't think that is necessary.
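Putting the pieces above together, here is a minimal, self-contained sketch of the chunk-based split. The class name ChunkedSplitter, the helper partPath, the 8192-char buffer and the part00000.txt naming (borrowed from the question) are assumptions for illustration, not a definitive implementation. It opens each part file lazily, so no empty trailing file is created when the input ends exactly on a split boundary.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ChunkedSplitter {

    // Hypothetical helper: builds names like part00000.txt inside outDir.
    static Path partPath(Path outDir, int index) {
        return outDir.resolve(String.format("part%05d.txt", index));
    }

    // Splits source into files of at most maxLines lines each and returns the
    // number of part files written. Only one fixed-size chunk is held in memory,
    // regardless of how long an individual line is.
    static int split(Path source, Path outDir, int maxLines) throws IOException {
        Files.createDirectories(outDir);
        char[] buffer = new char[8192];
        int part = 0;        // number of part files opened so far
        int lineCount = 0;   // lines written to the current part
        BufferedWriter writer = null;
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            int read;
            while ((read = reader.read(buffer, 0, buffer.length)) > 0) {
                int offset = 0;
                for (int i = 0; i < read; i++) {
                    if (buffer[i] == '\n' && ++lineCount == maxLines) {
                        if (writer == null) {
                            writer = Files.newBufferedWriter(partPath(outDir, part++), StandardCharsets.UTF_8);
                        }
                        // Flush the chunk up to and including this newline, then finish the part.
                        writer.write(buffer, offset, i + 1 - offset);
                        writer.close();
                        writer = null;
                        offset = i + 1;
                        lineCount = 0;
                    }
                }
                if (offset < read) {
                    // The rest of the chunk belongs to the current (possibly new) part.
                    if (writer == null) {
                        writer = Files.newBufferedWriter(partPath(outDir, part++), StandardCharsets.UTF_8);
                    }
                    writer.write(buffer, offset, read - offset);
                }
            }
        } finally {
            if (writer != null) {
                writer.close();
            }
        }
        return part;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: ChunkedSplitter <source> <outDir> <maxLines>");
            return;
        }
        int parts = split(Paths.get(args[0]), Paths.get(args[1]), Integer.parseInt(args[2]));
        System.out.println("Wrote " + parts + " part file(s)");
    }
}
```

With the paths from the question, it could be run as `java ChunkedSplitter /home/work/bingo/level.txt /home/work/bingo/tmp 4000000`. A 400 MB line would simply be streamed into a single part file instead of being accumulated in memory.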