I am trying to get this code work work properly, but when I try to encode things it doesn't seem to work as it should. I have a text file thats 60bytes. I encode it and the outputted file is 100 bytes. When I decode that file it goes to like 65bytes. It decodes properly but the file size is larger than the original. I tried encode a jpg and the file size did go down, however I couldn't open the file afters. I tried to decode the jpg file and it didn't work, seemed like cmd had frozen. This is the code I was trying to use.
import java.util.*;
import java.io.*;
public class LZW {
// Dictionary
public static short DSIZE = 256;
public static int DSIZEINT = 256;
/** Compress a string to a list of output symbols. */
public static List<Short> compress(String uncompressed) {
// Build the dictionary.
short dictSize = DSIZE;
Map<String,Short> dictionary = new HashMap<String,Short>();
for (short i = 0; i < DSIZE; i++)
dictionary.put("" + (char)i, i);
String w = "";
List<Short> result = new ArrayList<Short>();
for (char c : uncompressed.toCharArray()) {
String wc = w + c;
if (dictionary.containsKey(wc))
w = wc;
else {
result.add(dictionary.get(w));
// Add wc to the dictionary.
dictionary.put(wc, dictSize++);
w = "" + c;
}
}
// Output the code for w.
if (!w.equals(""))
result.add(dictionary.get(w));
return result;
}
/** Compress a string to a list of output symbols, supporting larger filesizes. */
public static List<Integer> compressInt(String uncompressed) {
// Build the dictionary.
int dictSize = DSIZEINT;
Map<String,Integer> dictionary = new HashMap<String,Integer>();
for (int i = 0; i < DSIZEINT; i++)
dictionary.put("" + (char)i, i);
String w = "";
List<Integer> result = new ArrayList<Integer>();
for (char c : uncompressed.toCharArray()) {
String wc = w + c;
if (dictionary.containsKey(wc))
w = wc;
else {
result.add(dictionary.get(w));
// Add wc to the dictionary.
dictionary.put(wc, dictSize++);
w = "" + c;
}
}
// Output the code for w.
if (!w.equals(""))
result.add(dictionary.get(w));
return result;
}
/** Decompress a list of output ks to a string. */
public static String decompress(List<Short> compressed) {
// Build the dictionary.
short dictSize = DSIZE;
Map<Short,String> dictionary = new HashMap<Short,String>();
for (short i = 0; i < DSIZE; i++)
dictionary.put(i, "" + (char)i);
String w = "" + (char)(short)compressed.remove(0);
String result = w;
for (short k : compressed) {
String entry;
if (dictionary.containsKey(k))
entry = dictionary.get(k);
else if (k == dictSize)
entry = w + w.charAt(0);
else
throw new IllegalArgumentException("Bad compressed k: " + k);
result += entry;
// Add w+entry[0] to the dictionary.
dictionary.put(dictSize++, w + entry.charAt(0));
w = entry;
}
return result;
}
/** Decompress a list of output ks to a string, supporting larger filesizes. */
public static String decompressInt(List<Integer> compressed) {
// Build the dictionary.
int dictSize = DSIZE;
Map<Integer,String> dictionary = new HashMap<Integer,String>();
for (int i = 0; i < DSIZE; i++)
dictionary.put(i, "" + (char)i);
String w = "" + (char)(int)compressed.remove(0);
String result = w;
for (int k : compressed) {
String entry;
if (dictionary.containsKey(k))
entry = dictionary.get(k);
else if (k == dictSize)
entry = w + w.charAt(0);
else
throw new IllegalArgumentException("Bad compressed k: " + k);
result += entry;
// Add w+entry[0] to the dictionary.
dictionary.put(dictSize++, w + entry.charAt(0));
w = entry;
}
return result;
}
public static void main(String[] args) {
String example = "";
String s = "";
int command = 0;
//Check for correct argument
if(args.length != 1) {
System.out.println("Please enter 1 argument.\nArg1: Command ('encode', 'decode', 'encodeInt', 'decodeInt')\nAnd ensure that you are feeding in an input file and output file using '<' and '>'");
System.exit(1);
}
if(args[0].equals("encode")){
command = 1;
}
else if(args[0].equals("decode")){
command = 2;
}
else if(args[0].equals("encodeInt")){
command = 3;
}
else if(args[0].equals("decodeInt")){
command = 4;
}
else {
System.out.println("Please use either 'encode', 'decode', 'encodeInt', 'decodeInt' as the argument.");
System.exit(1);
}
long start;
long elapsedTime;
//Compress
if(command == 1){
//Read input file
s = BinaryStdIn.readString();
//The actual compression
start = System.nanoTime();
List<Short> compressed = compress(s);
elapsedTime = System.nanoTime() - start;
//System.err.println(compressed);
//first writes the number of ints to write
BinaryStdOut.write(compressed.size());
//writes compression (to file)
Iterator<Short> compressIterator = compressed.iterator();
while (compressIterator.hasNext()){
BinaryStdOut.write(compressIterator.next());
}
System.err.println("LZW Encode time: " + elapsedTime + " ns");
}
//Decompress
else if(command == 2){
//Build Integer List with input
List<Short> compressed = new ArrayList<Short>();
int size = BinaryStdIn.readInt();
while(size > 0){
try{
compressed.add(BinaryStdIn.readShort());
}
catch(RuntimeException e){
System.err.print("*");
}
size--;
}
//System.err.println(compressed);
//The actual decompression
start = System.nanoTime();
String decompressed = decompress(compressed);
elapsedTime = System.nanoTime() - start;
//Print out decompressed data (to file)
System.out.println(decompressed);
System.err.println("LZW Decode time: " + elapsedTime + " ns");
}
//Compress using Integer size
else if(command == 3){
//Read input file
s = BinaryStdIn.readString();
//The actual compression
start = System.nanoTime();
List<Integer> compressed = compressInt(s);
elapsedTime = System.nanoTime() - start;
//System.err.println(compressed);
//first writes the number of ints to write
BinaryStdOut.write(compressed.size());
//writes compression (to file)
Iterator<Integer> compressIterator = compressed.iterator();
while (compressIterator.hasNext()){
BinaryStdOut.write(compressIterator.next());
}
System.err.println("LZW Encode time: " + elapsedTime + " ns");
}
//Decompress using Integer size
else if(command == 4){
//Build Integer List with input
List<Integer> compressed = new ArrayList<Integer>();
int size = BinaryStdIn.readInt();
while(size > 0){
try{
compressed.add(BinaryStdIn.readInt());
}
catch(RuntimeException e){
System.err.print("*");
}
size--;
}
//System.err.println(compressed);
//The actual decompression
start = System.nanoTime();
String decompressed = decompressInt(compressed);
elapsedTime = System.nanoTime() - start;
//Print out decompressed data (to file)
System.out.println(decompressed);
System.err.println("LZW Decode time: " + elapsedTime + " ns");
}
BinaryStdOut.close();
}
}
Appreciate any help. Thanks.
Even the best compression algorithm will occasionally create an output that's larger than the input. In fact it makes a good test case to find such input. LZW compresses by finding repeated sequences, so an input without any repeating sequences will by necessity get larger.
I once had to create a test input like this. I think it went something like "ABCD...ACBDEG...".
Edit: now that I look more closely at the code, I see that you're writing a list of Shorts to the output. That's almost certainly wrong; one of the necessary steps is to pack each output token into the smallest number of bits, and you're missing that step entirely.
Judging by your description the code has other problems too, but one's enough for now.