Tags: java, serialization, seek, randomaccessfile

Search for a String (as a byte[]) in a binary stream


Hi Team, I am trying to find the String "Henry" in a binary file and replace it with a different String. FYI, the file is the output of serialising an object. Original question here.

I am new to searching bytes and imagined this code would search for my byte[] and swap it out. But it doesn't come close to working: it doesn't even find a match.

{
    byte[] bytesHenry = new String("Henry").getBytes();
    byte[] bytesSwap = new String("Zsswd").getBytes();

    byte[] seekHenry = new byte[bytesHenry.length];

    RandomAccessFile file = new RandomAccessFile(fileString,"rw");

    long filePointer;
    while (seekHenry != null) {
       filePointer = file.getFilePointer();
       file.readFully(seekHenry);
       if (bytesHenry == seekHenry) {
           file.seek(filePointer);
           file.write(bytesSwap);
           break;
       }
     }
}

Okay, I see the bytesHenry == seekHenry problem (that compares references, not contents) and will swap to Arrays.equals(bytesHenry, seekHenry).
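
For anyone hitting the same thing, here is a minimal sketch of the difference between the two comparisons. The class name ArrayCompareDemo and the helper sameContents are made up for illustration and are not part of the question's code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ArrayCompareDemo {
    // Hypothetical helper: true when both arrays hold the same bytes in the same order.
    static boolean sameContents(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        // Two separate array objects with identical contents.
        byte[] first = "Henry".getBytes(StandardCharsets.US_ASCII);
        byte[] second = "Henry".getBytes(StandardCharsets.US_ASCII);

        // == on arrays compares references, so this prints false.
        System.out.println(first == second);
        // Arrays.equals compares length and contents, so this prints true.
        System.out.println(sameContents(first, second));
    }
}
```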

I think I need to step back 4 byte positions after each 5-byte read, so that the search window advances by one byte at a time.


Bingo, it finds it now:

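    while (seekHenry != null) {
        // remember where this 5-byte window starts
        filePointer = file.getFilePointer();
        // throws EOFException at end of file if no match was found
        file.readFully(seekHenry);
        if (Arrays.equals(bytesHenry, seekHenry)) {
            // match: go back to the window start and overwrite it
            file.seek(filePointer);
            file.write(bytesSwap);
            break;
        }
        // no match: advance the window by a single byte
        file.seek(filePointer);
        file.read();
    }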
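
The same idea can be packaged as a self-contained method. This is only a sketch under some assumptions: the class name FileByteReplace and the method replaceFirst are hypothetical, the replacement is required to be the same length as the target (the bytes are overwritten in place, and the serialized form also records the string's length), and the loop is bounded by the file length so that a missing match returns -1 instead of ending in an EOFException.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

class FileByteReplace {
    /**
     * Overwrites the first occurrence of target with replacement.
     * Returns the file offset of the match, or -1 if target does not occur.
     */
    static long replaceFirst(String path, byte[] target, byte[] replacement) throws IOException {
        if (target.length == 0 || target.length != replacement.length) {
            throw new IllegalArgumentException(
                    "target and replacement must be non-empty and equal in length");
        }
        byte[] window = new byte[target.length];
        try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
            // Last offset at which a full window still fits in the file.
            long lastStart = file.length() - target.length;
            for (long pos = 0; pos <= lastStart; pos++) {
                file.seek(pos);
                file.readFully(window);
                if (Arrays.equals(window, target)) {
                    // Match: rewind to the window start and overwrite in place.
                    file.seek(pos);
                    file.write(replacement);
                    return pos;
                }
            }
        }
        return -1;
    }
}
```

Seeking and re-reading for every offset is simple but slow on large files. For anything big it would probably be better to read the file in larger chunks and search in memory.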

Solution

  • The following could work for you: see the method search(byte[] input, byte[] searchedFor), which returns the index of the first match, or -1 if there is none.

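    import java.io.UnsupportedEncodingException;
    import java.util.ArrayDeque;
    import java.util.Arrays;
    import java.util.Deque;
    
    public class SearchBuffer {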
    
        public static void main(String[] args) throws UnsupportedEncodingException {
            String charset= "US-ASCII";
            byte[] searchedFor = "ciao".getBytes(charset);
            byte[] input = "aaaciaaaciaojjcia".getBytes(charset);
    
            int idx = search(input, searchedFor);
            System.out.println("index: "+idx); //should be 8
        }
    
        public static int search(byte[] input, byte[] searchedFor) {
            //an empty pattern matches at the start
            if(searchedFor.length == 0){
                return 0;
            }
    
            //convert byte[] to Byte[] so it can be compared with the deque contents
            Byte[] searchedForB = new Byte[searchedFor.length];
            for(int x = 0; x<searchedFor.length; x++){
                searchedForB[x] = searchedFor[x];
            }
    
            //search: q is a sliding window over the last searchedFor.length bytes
            Deque<Byte> q = new ArrayDeque<Byte>(searchedFor.length + 1);
            for(int i=0; i<input.length; i++){
                q.addLast(input[i]);
                if(q.size() > searchedForB.length){
                    //drop the oldest byte to keep the window size fixed
                    q.removeFirst();
                }
                //check after adding, so a match at the very end of input is found too
                if(q.size() == searchedForB.length
                        && Arrays.equals(q.toArray(new Byte[0]), searchedForB)){
                    //found! the window ends at i
                    return i - searchedForB.length + 1;
                }
            }
    
            //not found
            return -1;
        }
    }
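
If the boxing to Byte[] and the deque feel heavy, a plain nested loop gives the same result. This is a rough alternative sketch, not the answer's original approach. The names ByteSearch and indexOf are made up here, and like the code above it assumes the whole input fits in memory.

```java
import java.nio.charset.StandardCharsets;

class ByteSearch {
    // Returns the index of the first occurrence of searchedFor in input, or -1.
    static int indexOf(byte[] input, byte[] searchedFor) {
        if (searchedFor.length == 0) {
            return 0; // an empty pattern matches at the start
        }
        // Only try start positions where the whole pattern still fits.
        for (int start = 0; start <= input.length - searchedFor.length; start++) {
            int k = 0;
            while (k < searchedFor.length && input[start + k] == searchedFor[k]) {
                k++;
            }
            if (k == searchedFor.length) {
                return start;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] searchedFor = "ciao".getBytes(StandardCharsets.US_ASCII);
        byte[] input = "aaaciaaaciaojjcia".getBytes(StandardCharsets.US_ASCII);
        System.out.println("index: " + indexOf(input, searchedFor)); // index: 8
    }
}
```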