We have loyalty cards (like credit/debit cards, but processed by our bespoke code, as opposed to ones processed by interfacing with the banks). We need to store transaction data on the cards, as many transactions will be made using offline devices, and only uploaded when the card is next tapped on an online terminal.
Card storage space is limited (typically max 8Kb unless you pay silly prices for very smart cards), so I need to compress the data as much as possible.
Our transaction data is made up of three parts, all of which involve digits only (i.e. no alphabetic or special characters)...
yyMMddhhmmssfff
Representing this as a string of digits gives 37 digits per transaction.
I tried using the algorithms in System.IO.Compression (following the code in this blog post and the accompanying GitHub repo, not included here as it's bog-standard usage of the classes).
This gave some quite impressive results, with around 72% reduction using the optimal Gzip algorithm.
However, I was wondering if it would be possible to improve on this, given that we know something about the shape of the transaction data. For example, the date/time part of the data breaks down as follows...
Does anyone have any comment on whether or not these restrictions would help me improve on this compression? Thanks
We can compress the data into 118 bits (or 15 bytes). We have the following ranges:
Date/time: 1 Jan 2000 0:0:0.000 up to 1 Jan 2100 0:0:0.000, which is 3_155_760_000_000 milliseconds
Serial number: 1_000_000_000_000_000_000 possible numbers
Amount: 1_000_00 in pennies
So we have in total:
double dt = (new DateTime(2100, 1, 1) - new DateTime(2000, 1, 1)).TotalMilliseconds;
double sn = 1_000_000_000_000_000_000L;
double amount = 1_000_00;
Console.Write(Math.Log2(dt * sn * amount));
The result is 117.925470... bits, which rounds up to 118 bits since we can't use a partial bit.
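As a cross-check on the arithmetic above, here is a minimal sketch that repeats the calculation with exact integers instead of `double`, and compares it with giving each field its own whole number of bits. The helper name `BitsFor` is made up for this illustration, and `BigInteger.GetBitLength` assumes .NET 5 or later.

```csharp
using System;
using System.Numerics;

// Hypothetical helper: bits needed to represent `count` distinct values (0 .. count - 1).
static int BitsFor(BigInteger count) => (int)(count - 1).GetBitLength();

// Ranges taken from the text above.
TimeSpan span = new DateTime(2100, 1, 1) - new DateTime(2000, 1, 1);
BigInteger dateRange = (long)span.TotalMilliseconds;          // 3_155_760_000_000
BigInteger serialRange = 1_000_000_000_000_000_000L;
BigInteger amountRange = 1_000_00;

// Each field rounded up to whole bits on its own.
int separate = BitsFor(dateRange) + BitsFor(serialRange) + BitsFor(amountRange);

// All three fields combined into one number before rounding up.
int combined = BitsFor(dateRange * serialRange * amountRange);

Console.WriteLine($"separate fields: {separate} bits");   // 42 + 60 + 17 = 119
Console.WriteLine($"combined value:  {combined} bits");   // 118
Console.WriteLine($"bytes needed:    {(combined + 7) / 8}"); // 15
```

Combining the fields into one number saves a bit here, although both layouts still round up to 15 bytes.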
Edit: Compress and decompress routine:
private static byte[] MyCompress(DateTime date, long serial, decimal amount) {
BigInteger ms = (long)(date - new DateTime(2000, 1, 1)).TotalMilliseconds;
BigInteger value =
ms * 1_000_000_000_000_000_000L * 1_000_00 +
(BigInteger)serial * 1_000_00 +
(BigInteger)(amount * 100);
byte[] result = new byte[15];
// Write the value big-endian: least significant byte goes into the last slot
for (int i = result.Length - 1; i >= 0; --i, value /= 256)
result[i] = (byte)(value % 256);
return result;
}
private static (DateTime date, long serial, decimal amount) MyDecomress(byte[] data) {
BigInteger value = data.Aggregate(BigInteger.Zero, (s, a) => s * 256 + a);
BigInteger amount = value % 1_000_00;
BigInteger serial = (value / 1_000_00) % 1_000_000_000_000_000_000L;
BigInteger dt = value / 1_000_00 / 1_000_000_000_000_000_000L;
return (
new DateTime(2000, 1, 1).AddMilliseconds((double)dt),
(long)serial,
(decimal)amount / 100M
);
}
Demo:
var data = MyCompress(new DateTime(2023, 1, 25, 21, 06, 45), 12345, 345.87m);
Console.WriteLine(string.Join(" ", data.Select(b => b.ToString("X2"))));
var back = MyDecomress(data);
Console.Write(back);
Output:
0E 05 4C 23 D7 34 A8 BD E8 F7 CC 3D 95 80 BB
(25.01.2023 21:06:45, 12345, 345.87)
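The routines above assume every value is inside the ranges used to size the encoding; a value outside them would either spill into a neighbouring field or lose its high bits when written into the fixed-size array. A minimal sketch of a guard that could run before compressing follows. `CheckRanges` is a hypothetical helper, not part of the code above, and it assumes amounts are non-negative with at most two decimal places.

```csharp
using System;

// Hypothetical helper: rejects values outside the ranges the encoding was sized for.
static void CheckRanges(DateTime date, long serial, decimal amount)
{
    if (date < new DateTime(2000, 1, 1) || date >= new DateTime(2100, 1, 1))
        throw new ArgumentOutOfRangeException(nameof(date));
    if (serial < 0 || serial >= 1_000_000_000_000_000_000L)
        throw new ArgumentOutOfRangeException(nameof(serial));
    // Assumption: amounts are 0.00 .. 999.99 with no fractions of a penny.
    if (amount < 0m || amount >= 1000m || amount * 100m != decimal.Truncate(amount * 100m))
        throw new ArgumentOutOfRangeException(nameof(amount));
}

// The demo values pass without an exception.
CheckRanges(new DateTime(2023, 1, 25, 21, 6, 45), 12345, 345.87m);

try
{
    // 1000.00 is one penny past the largest encodable amount.
    CheckRanges(new DateTime(2023, 1, 25), 12345, 1000.00m);
}
catch (ArgumentOutOfRangeException e)
{
    Console.WriteLine($"rejected: {e.ParamName}");
}
```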
Edit: If we can store the date and time to the nearest 1/10 second (rather than to the millisecond), we need only 14 bytes:
private static byte[] MyCompress(DateTime date, long serial, decimal amount) {
// Despite its name, ms now holds tenths of a second since 1 Jan 2000
BigInteger ms = (long)(date - new DateTime(2000, 1, 1)).TotalMilliseconds / 100;
BigInteger value =
ms * 1_000_000_000_000_000_000L * 1_000_00 +
(BigInteger)serial * 1_000_00 +
(BigInteger)(amount * 100);
byte[] result = new byte[14];
for (int i = result.Length - 1; i >= 0; --i, value /= 256)
result[i] = (byte)(value % 256);
return result;
}
private static (DateTime date, long serial, decimal amount) MyDecomress(byte[] data) {
BigInteger value = data.Aggregate(BigInteger.Zero, (s, a) => s * 256 + a);
BigInteger amount = value % 1_000_00;
BigInteger serial = (value / 1_000_00) % 1_000_000_000_000_000_000L;
BigInteger dt = value / 1_000_00 / 1_000_000_000_000_000_000L;
return (
new DateTime(2000, 1, 1).AddMilliseconds((double)dt * 100),
(long)serial,
(decimal)amount / 100M
);
}
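To see how the record size depends on the time resolution, the same range calculation can be repeated for a few units. This is only a sketch: `BytesFor` is a hypothetical helper, the serial and amount ranges are the ones used above, and `BigInteger.GetBitLength` assumes .NET 5 or later.

```csharp
using System;
using System.Numerics;

// Hypothetical helper: bytes per record when the 100-year date range
// is stored in units of `unitMs` milliseconds.
static int BytesFor(long unitMs)
{
    TimeSpan span = new DateTime(2100, 1, 1) - new DateTime(2000, 1, 1);
    BigInteger totalMs = (long)span.TotalMilliseconds;

    // Number of distinct (date, serial, amount) combinations.
    BigInteger combos = (totalMs / unitMs) * 1_000_000_000_000_000_000L * 1_000_00;

    long bits = (combos - 1).GetBitLength();
    return (int)((bits + 7) / 8);
}

// 1 ms, 1/10 s, 1 s and 1 min resolution.
foreach (long unitMs in new long[] { 1, 100, 1_000, 60_000 })
    Console.WriteLine($"{unitMs} ms resolution: {BytesFor(unitMs)} bytes");
```

Under these assumptions, millisecond resolution needs 15 bytes and 1/10 second needs 14; rounding to whole seconds still needs 14, and the next byte is only saved at roughly one-minute resolution (13 bytes).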