java inputstream entropy datainputstream

Stream of short[]


Hi, I need to calculate the order-m entropy of a file, where m is the number of bits per symbol (m <= 16).

So:

H_m(X) = -\sum_{i=0}^{2^m-1} p_{i,m} \log_2 p_{i,m}

So I thought I would create an input stream to read the file and then calculate the probability of each m-bit sequence.
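A minimal sketch of the formula above, assuming the occurrences of each m-bit value have already been tallied into an array. The class name OrderMEntropy and the counts parameter are illustrative and not part of the original post.

```java
import java.util.Arrays;

public class OrderMEntropy {
    // counts[i] = number of times the m-bit value i was observed
    // (hypothetical input; counts.length is expected to be 2^m).
    static double entropy(long[] counts) {
        long total = Arrays.stream(counts).sum();
        if (total == 0) {
            return 0.0; // empty input: define the entropy as 0
        }
        double h = 0.0;
        for (long c : counts) {
            if (c == 0) {
                continue; // p * log2(p) tends to 0 as p tends to 0
            }
            double p = (double) c / total;
            h -= p * (Math.log(p) / Math.log(2)); // log base 2 via natural log
        }
        return h;
    }

    public static void main(String[] args) {
        // m = 2: four symbols seen equally often, so the result should be close to 2 bits.
        System.out.println(entropy(new long[] {5, 5, 5, 5}));
    }
}
```

The result is in bits per m-bit symbol; values that never occur are skipped so that log(0) is never evaluated.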

For m = 8 it's easy because each symbol is one byte. Since m <= 16, I thought of using short as the primitive type: save each short of the file in a short[] array and then use bitwise operators to extract all the m-bit sequences in the file. Is this a good idea?

Anyway, I can't manage to create a stream of shorts. This is what I've done:

public static void main(String[] args) {
    readFile(FILE_NAME_INPUT);
}

public static void readFile(String filename) {
    short[] buffer = null;
    File a_file = new File(filename);
    try {
        File file = new File(filename);

        FileInputStream fis = new FileInputStream(filename);
        DataInputStream dis = new DataInputStream(fis);

        int length = (int)file.length() / 2;
        buffer = new short[length];

        int count = 0;
        while(dis.available() > 0 && count < length) {
            buffer[count] = dis.readShort(); 
            count++;
        }
        System.out.println("length=" + length);
        System.out.println("count=" + count);


        for(int i = 0; i < buffer.length; i++) {
            System.out.println("buffer[" + i + "]: " + buffer[i]);
        }

        fis.close();
    }
    catch(EOFException eof) {
        System.out.println("EOFException: " + eof);
    }
    catch(FileNotFoundException fe) {
        System.out.println("FileNotFoundException: " + fe);
    }
    catch(IOException ioe) {
        System.out.println("IOException: " + ioe);
    }
}

But I lose a byte, and I don't think this is the best way to proceed.


This is what I plan to do using bitwise operators:

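// Needs java.util.List and java.util.ArrayList.
List<Integer> list = new ArrayList<>();
int mask = (1 << m) - 1; // 2^m - 1 (in Java, ^ is XOR, not a power)
for (short n : buffer) {
    // Walk from the most significant group down; i >= 0 so the lowest group is included.
    for (int i = 16 - m; i >= 0; i -= m) {
        list.add((n >> i) & mask);
    }
}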

I'm assuming shorts here. If I use bytes, how can I write a loop like that for m > 8? That loop doesn't work because I have to concatenate multiple bytes, and the number of bits to join varies each time.

Any ideas? Thanks


Solution

  • I think you just need to have a byte array:

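    // Needs java.io.ByteArrayOutputStream, java.io.FileInputStream, java.io.IOException.
    public static byte[] readFile(String filename) {
      ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
      byte[] byteData = new byte[0];
      // try-with-resources closes the stream even if reading fails
      try (FileInputStream fis = new FileInputStream(filename)) {
        int b; // read() returns an int so that -1 can signal end of stream
        while ((b = fis.read()) != -1) {
          outputStream.write(b);
        }
        byteData = outputStream.toByteArray();
      }
      catch (IOException ioe) {
        System.out.println("IOException: " + ioe);
      }
      return byteData;
    }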
    

    Then you can manipulate byteData as per your bitwise operations.

    --

    If you want to work with shorts, you can combine the bytes you read like this:

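    // One short per byte pair, plus one extra slot only when the byte count is odd.
    short[] buffer = new short[(byteData.length + 1) / 2];
    int j = 0;
    for (int i = 0; i < byteData.length - 1; i += 2) {
      // Mask with 0xFF: bytes are signed, so a negative low byte would
      // otherwise sign-extend and overwrite the high byte.
      buffer[j] = (short) (((byteData[i] & 0xFF) << 8) | (byteData[i + 1] & 0xFF));
      j++;
    }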
    

    To handle the leftover byte when the byte count is odd, do this:

    short last = 0; // leftover byte goes in the low 8 bits, high 8 bits stay zero
    if ((byteData.length % 2) == 1) last = (short) (byteData[byteData.length - 1] & 0xFF);
    

    last is a short, so it can be stored in buffer[buffer.length-1]. When the byte count is odd, the loop exits with j equal to buffer.length-1, which means that last slot is still free. Check j after the loop to confirm this before writing to it; if j has any other value, the slot is not the right place for last.

    Then manipulate buffer.

    The other approach, working directly on bytes for m > 8, is more involved and deserves a question of its own, so try the approach above first.
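
For reference, here is a rough sketch of how the byte-based variant could look, using an int as a bit accumulator so that groups may span byte boundaries. It assumes non-overlapping m-bit groups read most significant bit first, and it silently drops trailing bits that do not fill a whole group; both are assumptions, not something stated in the question. The names BitGroups and countGroups are made up for illustration.

```java
public class BitGroups {
    // Tallies every consecutive, non-overlapping m-bit group in data (MSB first).
    // Assumption: leftover bits at the end that do not fill a group are ignored.
    static long[] countGroups(byte[] data, int m) {
        if (m < 1 || m > 16) {
            throw new IllegalArgumentException("m must be in 1..16");
        }
        long[] counts = new long[1 << m];
        int mask = (1 << m) - 1;
        int acc = 0;  // bits read but not yet consumed, right-aligned
        int bits = 0; // how many low bits of acc are valid (always < m between bytes)
        for (byte b : data) {
            acc = (acc << 8) | (b & 0xFF); // append 8 new bits; at most 23 bits are live
            bits += 8;
            while (bits >= m) {
                counts[(acc >>> (bits - m)) & mask]++; // take the top m valid bits
                bits -= m;
            }
            acc &= (1 << bits) - 1; // keep only the unconsumed bits
        }
        return counts;
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xAB, (byte) 0xCD, (byte) 0xEF};
        long[] counts = countGroups(data, 12); // groups are 0xABC and 0xDEF
        System.out.println(counts[0xABC] + " " + counts[0xDEF]);
    }
}
```

The returned array holds one counter per possible m-bit value, so dividing each entry by the total number of groups gives the probabilities p_{i,m} used in the entropy formula.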