
Write bits to a file and read them back into a string of "0101.." in Java?


I am working on a compression algorithm, and for that I need to write strings of bits to a binary file and read them back into exactly the same String. Say I have a string "10100100100....."; I want to write it to a file as bits (not as the chars '0' and '1'), then read it back as bits and convert it to a string again. This is for a large amount of data (>100 megabytes). Is there a neat and fast way of doing this?

So far I have tried (and failed) to pack the bits into bytes by splitting the string into 8-bit substrings, turning each one into an ASCII character appended to a string, and finally writing that string to a .txt file.

{
    String Bits="10001010100000000000"; // a lot larger in actual program 

    String nCoded="";
    char nextChar;
    int i = 0;
    for(i=0; i < Bits.length()-8; i += 8){

        nextChar = (char)Integer.parseInt( Bits.substring(i, i+8), 2 );
        nCoded += nextChar;
    }

    // for the remaining bits, padding
    if(Bits.length()%8 != 0){
        nCoded+=(char)Integer.parseInt(Bits.substring(i), 2);
    }
    nCoded+=(char)Bits.length()%8; //to track the remainder of Bits that was padded 

    writeToTextFile( nCoded, "file.txt"); //write the nCoded string to file
}            

But this seems to corrupt the information and is inefficient. To clarify again: I don't want the String itself written to the file; it is just a representation of the actual data. So I want to convert each 0 and 1 from the string representation to its binary form and write that to the file.


Solution

  • Here is a method you can use to convert the String to a series of bits, ready for output to file:

    private byte[] toByteArray(String input){
        //to charArray
        char[] preBitChars = input.toCharArray();
        // number of '0's needed to reach a multiple of 8 (0 when already aligned)
        int bitShortage = (8 - (preBitChars.length%8)) % 8;
        char[] bitChars = new char[preBitChars.length + bitShortage];
        System.arraycopy(preBitChars, 0, bitChars, 0, preBitChars.length);
    
        // pad the tail with '0' so the last byte is complete
        for (int  i= 0;  i < bitShortage;  i++) {
            bitChars[preBitChars.length + i]='0';
        }
    
        //to bytearray: the first character becomes the most significant bit of byte 0
        byte[] byteArray = new byte[bitChars.length/8];
        for(int i=0; i<bitChars.length; i++) {
            if (bitChars[i]=='1'){
                byteArray[i/8] |= 1<<(7 - (i%8));
            }
        }
        return byteArray;
    }
    

    Passing the String "01010101" will return the result [85] as a byte[].

    It turns out there is an easier way for a single byte. There is a static Byte.parseByte(String, int) that takes a radix and returns a primitive byte. Calling:

     byte aByte = Byte.parseByte("01010101", 2);
     System.out.println(aByte);
    

    Displays the same value: 85.

    So you may ask a couple of questions here.

    1. Why are we passing a String that is 8 characters in length? Because one byte holds 8 bits. You can also prefix the String with a 9th character that represents a sign. I don't think you have this case, but if you needed to, the documentation for Byte.parseByte() states it should be:

    An ASCII minus sign '-' ('\u002D') to indicate a negative value or an ASCII plus sign '+' ('\u002B') to indicate a positive value.

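    So from this information, you would need to break up your String manually into 8-bit Strings and convert each one to a byte. Note that Byte.parseByte() only accepts values from -128 to 127, so with radix 2 an 8-bit chunk whose first bit is 1 (for example "10000000") throws a NumberFormatException; (byte) Integer.parseInt(chunk, 2) handles all 256 bit patterns.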

    2. What about writing bits to a file? Files are written in bytes, not individual bits. To write the file, read it back in, and convert it back to a String, you reverse the process: read the file in as a byte[], then convert that to its String representation. Because the last byte may be padded, also store the original number of bits so the padding can be dropped when reading.
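
    A minimal sketch of that round trip, assuming the number of bits fits in an int and is stored as a 4-byte header in front of the packed data. The class name BitFileCodec, the methods writeBits and readBits, and the header layout are illustrative choices, not part of the original answer.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BitFileCodec {

    // Packs a string of '0'/'1' characters into bytes, most significant bit first.
    // The bit count is written first so the reader can discard the padding.
    static void writeBits(String bits, Path file) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeInt(bits.length());   // header: number of valid bits
            int current = 0;               // byte being assembled
            int filled = 0;                // bits placed in 'current' so far
            for (int i = 0; i < bits.length(); i++) {
                char c = bits.charAt(i);
                if (c != '0' && c != '1') {
                    throw new IllegalArgumentException("Not a bit at index " + i + ": " + c);
                }
                current = (current << 1) | (c - '0');
                if (++filled == 8) {
                    out.write(current);
                    current = 0;
                    filled = 0;
                }
            }
            if (filled > 0) {
                // pad the last byte with zeros on the right
                out.write(current << (8 - filled));
            }
        }
    }

    // Reads the header, then unpacks exactly that many bits back into a string.
    static String readBits(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            int bitCount = in.readInt();
            StringBuilder sb = new StringBuilder(bitCount);
            int current = 0;
            for (int i = 0; i < bitCount; i++) {
                if (i % 8 == 0) {
                    current = in.readUnsignedByte();   // fetch the next packed byte
                }
                sb.append((current & 0x80) != 0 ? '1' : '0');
                current <<= 1;                         // move the next bit into position 7
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("bits", ".bin");
        String original = "10001010100000000000";
        writeBits(original, file);
        System.out.println(readBits(file).equals(original)); // prints true
        Files.delete(file);
    }
}
```

    For the 20-bit example string this produces a 7-byte file: 4 header bytes plus 3 data bytes. Streaming through buffered streams avoids building an intermediate byte[] or concatenating Strings, which matters at the sizes mentioned in the question.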

    A hint on how to convert a byte to a nice String format can be found here:

    Convert byte (java data type) value to bits (a string containing only 8 bits)
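
    A common idiom for that conversion is sketched here; the class and method names are hypothetical. Masking with 0xFF keeps a negative byte from being sign-extended to 32 bits, and the formatting step restores the leading zeros that Integer.toBinaryString drops.

```java
class ByteToBits {

    // Returns exactly 8 characters of '0'/'1' for any byte value.
    static String toBitString(byte b) {
        String bits = Integer.toBinaryString(b & 0xFF);        // 1 to 8 digits, no leading zeros
        return String.format("%8s", bits).replace(' ', '0');   // left-pad to 8 digits
    }

    public static void main(String[] args) {
        System.out.println(toBitString((byte) 85));   // prints 01010101
        System.out.println(toBitString((byte) -86));  // prints 10101010
    }
}
```

    For large inputs, appending the bits to a StringBuilder in a loop is likely to be faster than formatting each byte separately.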