
How to split a byte array that contains multiple "lines" in Java?


Say we have a file like so:

one 
two 
three

(but this file got encrypted)

My decryption method returns the whole file in memory as a byte[].
I know byte arrays have no concept of "lines"; that is something a Scanner (for example) provides.

I would like to traverse each line, convert it to string and perform my operation on it but I don't know how to:

  1. Find lines in a byte array
  2. Slice the original byte array to "lines" (I would convert those slices to String, to send to my other methods)
  3. Correctly traverse a byte array, where each iteration is a new "line"

Also: do I need to consider which operating system the file was created on? I know that line endings differ between Windows and Linux, and I don't want my method to work with only one format.

Edit: Following some tips from answers here, I was able to write some code that gets the job done. I still wonder if this code is worthy of keeping or I am doing something that can fail in the future:

byte[] decryptedBytes = doMyCrypto(fileName, accessKey);
ByteArrayInputStream byteArrInStrm = new ByteArrayInputStream(decryptedBytes);
InputStreamReader inStrmReader = new InputStreamReader(byteArrInStrm);
BufferedReader buffReader = new BufferedReader(inStrmReader);

String delimRegex = ",";
String line;
String[] values = null;

while ((line = buffReader.readLine()) != null) {
    values = line.split(delimRegex);
    if (Objects.equals(values[0], tableKey)) {
        return values;
    }
}
System.out.println(String.format("No entry with key %s in %s", tableKey, fileName));
return values;

In particular, I was advised to set the encoding explicitly, but I could not see exactly where to do that.


Solution

  • If you want to stream this, I'd suggest:

    • Create a ByteArrayInputStream to wrap your array
    • Wrap that in an InputStreamReader to convert binary data to text - I suggest you explicitly specify the text encoding as the second constructor argument, since the one-argument constructor falls back to the default charset
    • Create a BufferedReader around that to read a line at a time

    Then you can just use:

    String line;
    while ((line = bufferedReader.readLine()) != null)
    {
        // Do something with the line
    }
    

    BufferedReader.readLine() treats \n, \r, and \r\n all as line terminators, so it handles line breaks from all operating systems.

    So something like this:

    byte[] data = ...;
    ByteArrayInputStream stream = new ByteArrayInputStream(data);
    InputStreamReader streamReader = new InputStreamReader(stream, StandardCharsets.UTF_8);
    BufferedReader bufferedReader = new BufferedReader(streamReader);
    
    String line;
    while ((line = bufferedReader.readLine()) != null)
    {
        System.out.println(line);
    }
    

    Note that in general you'd want to use try-with-resources blocks for the streams and readers - but it doesn't matter in this case, because everything is in memory and closing a ByteArrayInputStream has no effect.
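    To tie this back to the code in the question, here is a minimal sketch of the same approach wrapped in two helpers. The class name ByteLines and the methods readLines and findRow are hypothetical names chosen for illustration, and UTF-8 is an assumption about how the file was encoded before encryption. Unlike the loop in the question, findRow returns null when no row matches, rather than the values from the last line read.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ByteLines {

    // Hypothetical helper: decodes the bytes as UTF-8 (an assumption about
    // the file's encoding) and returns each line without its terminator.
    static List<String> readLines(byte[] data) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream, but readLine() declares it.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Hypothetical lookup mirroring the question: returns the first
    // comma-separated row whose first field equals key, or null if none does.
    static String[] findRow(byte[] data, String key) {
        for (String line : readLines(data)) {
            String[] values = line.split(",");
            if (values[0].equals(key)) {
                return values;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Mixed Windows (\r\n), Unix (\n) and bare \r line endings.
        byte[] data = "one\r\ntwo\nthree\rfour".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(data)); // prints [one, two, three, four]
    }
}
```

    Collecting the lines into a list keeps the example short. For a very large array you may prefer to process each line inside the loop instead, as in the snippets above.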