I have written a program about Huffman compression and at the end of the day I can manipulate the string read and change it specific to Huffman. For example, I could get the "1001010000010101111011" string for the "abcdeffg" string entered. My problem is this: I have to write this string I got to a file and save it. I also know file operations, but I don't know how to write this string in bits. The string "abcdeffg" that I read from the file now is 8 bytes, while the string "1001010000010101111011" I created is 22 bytes. So if I write this to the file, I will not be compressing. How can I write this string of 0's and 1's to the file for Huffman Compression to work? The code I have written so far is following;
Class HuffmanEncodedResult:
class HuffmanEncodedResult {
final Node root;
final String encodedData;
public HuffmanEncodedResult(final String encodedData, final Node root) {
this.encodedData = encodedData;
this.root = root;
}
public Node getRoot() {
return this.root;
}
public String getEncodedData() {
return this.encodedData;
}
}
Class HuffmanEncoder:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
public class HuffmanEncoder {
private static final int LETTER_SIZE = 256;
public HuffmanEncodedResult compress(final String data){
final int[] freqTable = buildFreqTable(data);
final Node root = buildHuffmanTree(freqTable);
final Map<Character,String> lookupTable = buildLookupTable(root);
return new HuffmanEncodedResult(generateEncodedData(data, lookupTable), root);
}
private static String generateEncodedData(String data, Map<Character, String> lookupTable) {
final StringBuilder builder = new StringBuilder();
for(final char character : data.toCharArray()){
builder.append(lookupTable.get(character));
}
return builder.toString();
}
private static Map<Character,String> buildLookupTable(final Node root){
final Map<Character,String> lookupTable = new HashMap<>();
buildLookupTableImpl(root,"",lookupTable);
return lookupTable;
}
private static void buildLookupTableImpl(Node node, String s, Map<Character, String> lookupTable) {
if(!node.isLeaf()){
buildLookupTableImpl(node.leftChild,s + '0', lookupTable);
buildLookupTableImpl(node.rightChild,s + '1', lookupTable);
}else{
lookupTable.put(node.character,s);
}
}
private static Node buildHuffmanTree(int[] freq){
final PriorityQueue<Node> pq = new PriorityQueue<>();
for(char i = 0; i<LETTER_SIZE; i++){
if(freq[i] > 0){
pq.add(new Node(i,freq[i],null,null));
}
}
if(pq.size() == 1){
pq.add(new Node('\0',0, null, null));
}
while (pq.size() > 1){
final Node left = pq.poll();
final Node right = pq.poll();
assert right != null;
assert left != null;
final Node parent = new Node('\0',left.frequency + right.frequency, left, right);
pq.add(parent);
}
return pq.poll();
}
private static int[] buildFreqTable(final String data){
final int[] freq = new int[LETTER_SIZE];
for(final char character :data.toCharArray()){
freq[character]++;
}
return freq;
}
public String decompress(HuffmanEncodedResult result){
final StringBuilder resultBuilder = new StringBuilder();
Node current = result.getRoot();
int i = 0;
while(i<result.getEncodedData().length()){
while(!current.isLeaf()){
char bit = result.getEncodedData().charAt(i);
if(bit == '1'){
current = current.rightChild;
}else if(bit == '0'){
current = current.leftChild;
}else{
throw new IllegalArgumentException("Invalid Bit in Message! " + bit);
}
i++;
}
resultBuilder.append(current.character);
current = result.getRoot();
}
return resultBuilder.toString();
}
private byte[] toByteArray(String input){
//to charArray
char[] preBitChars = input.toCharArray();
int bitShortage = (8 - (preBitChars.length%8));
char[] bitChars = new char[preBitChars.length + bitShortage];
System.arraycopy(preBitChars, 0, bitChars, 0, preBitChars.length);
for (int i= 0; i < bitShortage; i++) {
bitChars[preBitChars.length + i]='0';
}
//to bytearray
byte[] byteArray = new byte[bitChars.length/8];
for(int i=0; i<bitChars.length; i++) {
if (bitChars[i]=='1'){
byteArray[byteArray.length - (i/8) - 1] |= 1<<(i%8);
}
}
return byteArray;
}
public static void main(String[] args) {
final String test = "abcdeffg";
final HuffmanEncoder encoder = new HuffmanEncoder();
final HuffmanEncodedResult result = encoder.compress(test);
System.out.println("Encoded Message: " + result.encodedData);
System.out.println("Unencoded Message: " + encoder.decompress(result));
}
}
Class Node:
public class Node implements Comparable<Node>{
protected final char character;
protected final int frequency;
protected final Node rightChild;
protected final Node leftChild;
public Node(char character, int frequency, Node leftChild, Node rightChild) {
this.character = character;
this.frequency = frequency;
this.rightChild = rightChild;
this.leftChild = leftChild;
}
public boolean isLeaf(){
return this.leftChild == null && this.rightChild == null;
}
@Override
public int compareTo(Node o) {
final int frequencyComparison = Integer.compare(this.frequency,o.frequency);
if(frequencyComparison != 0){
return frequencyComparison;
}
return Integer.compare(this.character,o.character);
}
}
Console Output:
"C:\Program Files\Java\jdk-15.0.1\bin\java.exe" "-javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2020.3.1\lib\idea_rt.jar=62965:C:\Program Files\JetBrains\IntelliJ IDEA 2020.3.1\bin" -Dfile.encoding=UTF-8 -classpath C:\Users\ayber\Desktop\untitled\out\production\untitled HuffmanEncoder
Encoded Message: 1001010000010101111011
Unencoded Message: abcdeffg
Process finished with exit code 0
You need to write the data as a byte array using FileOutputStream.write
and with the byte array as a binary representation of your string of ones and zeros.
One way of converting your string to bits is to use BitSet
.
To create a BitSet of the correct size use:
BitSet b = new BitSet(string.length());
Then you can loop over the string and set the bits:
for (int i = 0; i < string.length(); i++) {
if (string.charAt(i) == '1') {
b.set(i);
}
}
Then you can convert the bitset to a byte array and write it to a file:
byte[] bytes = b.toByteArray();
FileOutputStream f = new FileOutputStream(filename);
f.write(bytes);
However, I would note that creating a string and then converting it to a bitset and then converting it to a byte array is not very efficient. It might be better to work with a bitset from the beginning.