
Splitting a text file into equal size files without breaking words in Java


I'm trying to split a text file into several files of the same size. I managed to do that with this function:

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public static int fileSplitting(String fichier, String dossSortie, int nbMachines) throws IOException {
    int i = 1;

    File f = new File(fichier);
    // Integer division: if the length is not a multiple of nbMachines,
    // more than nbMachines files can be written.
    int sizeOfFiles = (int) (f.length() / nbMachines);

    System.out.print(sizeOfFiles);
    byte[] buffer = new byte[sizeOfFiles];

    try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(f))) {
        int tmp;
        while ((tmp = bis.read(buffer)) > 0) {
            // Write each chunk of data into a separate file with a different number in its name.
            File newFile = new File(dossSortie + "S" + i);
            try (FileOutputStream out = new FileOutputStream(newFile)) {
                out.write(buffer, 0, tmp); // tmp is the number of bytes read for this chunk
            }
            i++;
        }
    }

    // Note: i ends up one greater than the number of files written.
    return i;
}

The problem is that this function cuts words in half, while I need to keep every word intact. For example, given a text file containing "I live in Amsterdam", the function splits it into "I live in Ams" and "terdam". I would like something like "I live in" and "Amsterdam" instead.


Solution

  • I couldn't get the byte-based splitting to respect word boundaries, so instead I split the file's contents into an array of words and wrote them out to several files with an equal number of words each. It's not exactly what I wanted and it's not a "beautiful" way to do it, but it's not that bad.
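
The word-count approach described above could look roughly like the sketch below. It is not the original poster's code, which was not shown. The class name WordSplitter and the helper partition are hypothetical names chosen for illustration. The sketch assumes the file is UTF-8, fits in memory, and that nbMachines is greater than zero. It also collapses any run of whitespace (including line breaks) into a single space in the output files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WordSplitter {

    // Hypothetical helper: splits the words into at most nbParts groups
    // whose sizes differ by at most one word. Assumes nbParts > 0.
    public static List<List<String>> partition(List<String> words, int nbParts) {
        List<List<String>> parts = new ArrayList<>();
        int base = words.size() / nbParts;   // minimum number of words per part
        int extra = words.size() % nbParts;  // the first 'extra' parts get one more word
        int start = 0;
        for (int p = 0; p < nbParts && start < words.size(); p++) {
            int end = start + base + (p < extra ? 1 : 0);
            parts.add(new ArrayList<>(words.subList(start, end)));
            start = end;
        }
        return parts;
    }

    // Reads the whole file, splits it on whitespace, and writes the groups to
    // dossSortie + "S1", dossSortie + "S2", ... Returns the number of files written.
    public static int fileSplitting(String fichier, String dossSortie, int nbMachines)
            throws IOException {
        String content = new String(Files.readAllBytes(Paths.get(fichier)), StandardCharsets.UTF_8);
        String trimmed = content.trim();
        List<String> words = trimmed.isEmpty()
                ? new ArrayList<>()
                : Arrays.asList(trimmed.split("\\s+"));

        List<List<String>> parts = partition(words, nbMachines);
        int i = 1;
        for (List<String> part : parts) {
            // Words are rejoined with single spaces; the original spacing is not preserved.
            byte[] data = String.join(" ", part).getBytes(StandardCharsets.UTF_8);
            Files.write(Paths.get(dossSortie + "S" + i), data);
            i++;
        }
        return parts.size();
    }
}
```

With the example from the question, "I live in Amsterdam" split into 2 parts gives "I live" and "in Amsterdam": the files hold the same number of words, but not necessarily the same number of bytes, which is the trade-off mentioned above.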