PDF417 decode and generate the same barcode using Swift


I have the following example of a PDF417 barcode:

[image: example barcode]

which can be decoded with an online tool such as zxing

as the following result: 5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e�¼ô���������¬‚C`Ìe%�æ‹�ÀsõbÿG)=‡x‚�qÀ1ß–[FzùŽûVû�É�üæ±RNI�Y[.H»Eàó¼åñüì²�tØ¿ªWp…Ã�{�Õ*

or online-qrcode-generator

as 5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e~|~~~~~~~~~~d~C`~e%~~~~;To~B~{~dj9v~~Z[Xm~~"HP3~~LH~~~O~"S~~,~~~~~~~k1~~~u~Iw}SQ~fqX4~mbc_ (I don't know which encoding is used to encode this)

The first part of the key encoded in the barcode is always known: 5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e

The second part can be decoded from a Base64 string and is always 88 bytes long. In my case it is:

Frz0DAAAAAAAAAAArIJDYMxlJQDmiwHAc/Vi/0cpPYd4ghlxwDHflltGevmO+1b7GckT/OZ/sVJOSRpZWy5Iu0Xg87zl8fzssg502L+qV3CFwxZ/ewjVKg==

I'm using Swift on an iOS device to generate this PDF417 barcode by decoding the provided Base64 string like this:

import UIKit

let base64Str = "Frz0DAAAAAAAAAAArIJDYMxlJQDmiwHAc/Vi/0cpPYd4ghlxwDHflltGevmO+1b7GckT/OZ/sVJOSRpZWy5Iu0Xg87zl8fzssg502L+qV3CFwxZ/ewjVKg=="
let knownKey = "5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e"
// Map URL-safe Base64 characters back to the standard alphabet before decoding.
// Data(base64Encoded:) returns an optional, so it is unwrapped here.
let decodedData = Data(base64Encoded: base64Str.replacingOccurrences(of: "-", with: "+")
                                        .replacingOccurrences(of: "_", with: "/"))!

// The key is plain ASCII, so its UTF-8 bytes are the same as its ASCII bytes.
var codeData = Data(knownKey.utf8)

codeData.append(decodedData)
let image = generatePDF417Barcode(from: codeData)
let imageView = UIImageView(image: image!)

// Generates a PDF417 UIImage from the given Data, or nil if the filter fails.
func generatePDF417Barcode(from codeData: Data) -> UIImage? {
    if let filter = CIFilter(name: "CIPDF417BarcodeGenerator") {
        filter.setValue(codeData, forKey: "inputMessage")
        // Scale the output up 3x in both directions.
        let transform = CGAffineTransform(scaleX: 3, y: 3)

        if let output = filter.outputImage?.transformed(by: transform) {
            return UIImage(ciImage: output)
        }
    }

    return nil
}

But the barcode I generate is always visibly different from the original.

Please help me correct the code to get the same result as the first barcode image.

I also have another example barcode:

[image: second example barcode]

The first part of the key is the same, but its second part is known as an Int8 byte array, and I also have no idea how to correctly generate the PDF417 barcode from it (with the key prepended).

Here's what I tried:

let knownKey = "5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e"
let secretArray: [Int8] = [22, 124, 24, 12, 0, 0, 0, 0, 0, 0, 0, 0, 100, 127, 67, 96, -52, 101, 37, 0, -85, -123, 1, -64, 111, -28, 66, -27, 123, -25, 100, 106, 57, 118, -4, 16, 90, 91, 88, 109, -105, 126, 34, 72, 80, 51, -116, 28, 76, 72, -37, -24, -93, 79, -115, 34, 83, 18, -61, 44, -12, -13, -8, -59, -107, -9, -128, 107, 49, -50, 126, 13, -59, 50, -24, -43, 127, 81, -85, 102, 113, 88, 52, -60, 109, 98, 99, 95] 
let secretUInt8 = secretArray.map { UInt8(bitPattern: $0) }
let secretData = Data(secretUInt8)


let keyArray: [UInt8] = Array(knownKey.utf8)
var keyData = Data(keyArray)

keyData.append(secretData)

let image = generatePDF417Barcode(from: keyData)
let imageView = UIImageView(image: image!)

Solution

  • There are a lot of things going on here. Gereon is correct that there are a lot of parameters. Choosing different parameters can lead to very different barcodes that decode identically. Your current barcode is "correct" (though a bit messy due to an Apple bug). It's just different.

    I'll start with the short answer of how to make your data match the barcode you have. Then I'll walk through what you should probably actually do, and finally I'll get to the details of why.

    First, here's the code you're looking for (but probably not the code you want, unless you have to match this barcode):

    filter.setValue(codeData, forKey: "inputMessage")
    filter.setValue(3, forKey: "inputCompactionMode")  // This is good (and the big difference)
    filter.setValue(5, forKey: "inputDataColumns")     // This is fine, but probably unneeded
    filter.setValue(0, forKey: "inputCorrectionLevel") // This is bad
    

    PDF417 defines several "compaction modes" to let it pack a truly impressive amount of information into a very small space while still offering excellent error detection and correction, and handling a lot of real-world scanning concerns. The default compaction mode only supports Latin text and basic punctuation. (It compacts even more if you only use uppercase Latin letters and space.) The first part of your string can be stored with text compaction, but the rest can't, so the encoder has to switch to byte compaction.
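
    To see where that switch has to happen, here is a minimal sketch that finds the longest prefix of the message that text compaction could hold. The helper `isTextCompactable` is hypothetical, and the byte set it accepts (printable ASCII plus tab, LF, and CR) is my assumption about text compaction, not something Core Image exposes. The four payload bytes are the first four bytes of the Base64 data in the question.

```swift
/// Hypothetical helper: true if a byte could be carried by text compaction.
/// Assumption: text compaction covers printable ASCII plus tab, LF, and CR.
func isTextCompactable(_ byte: UInt8) -> Bool {
    return (32...126).contains(byte) || byte == 9 || byte == 10 || byte == 13
}

let knownKey = Array("5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e".utf8)
// First four bytes of the decoded Base64 payload from the question.
let payloadStart: [UInt8] = [0x16, 0xBC, 0xF4, 0x0C]
let message = knownKey + payloadStart

// Everything up to the first non-text byte could use text compaction.
let textPrefix = message.prefix(while: isTextCompactable)
let binaryRest = message.dropFirst(textPrefix.count)

print("text-compactable bytes: \(textPrefix.count), remaining bytes: \(binaryRest.count)")
```

    The text-compactable prefix is exactly the known key; the very first payload byte (0x16) already falls outside the assumed text set.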

    Core Image actually does this switch shockingly badly by default (I opened FB9032718 to track). Rather than encoding in text and then switching to bytes, or just doing it all in bytes, it switches to bytes over and over again unnecessarily.

    There's no way for you to configure multiple compaction methods, but you can just set it to byte, which is what value 3 is. And that's also how your source is doing it.

    The second difference is the number of data columns, which drives how wide the output is. Your source is using 5, but Core Image is choosing 6 based on its default rules (which aren't fully documented).

    Finally, your source has set the error correction level to 0, which is not recommended. For a message of this size, the minimum recommended error correction level is 3, which is what Core Image chooses by default.

    If you just want a good barcode, and don't have to match this input, my recommendation would be to set inputCompactionMode to 3, and leave the rest as defaults. If you want a different aspect ratio, I'd use inputPreferredAspectRatio rather than modifying the number of data columns directly.
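
    Put together, that recommendation might look like the sketch below. The function name `makePDF417Barcode` and its optional `aspectRatio` parameter are mine, chosen for illustration; the filter name and input keys are the ones discussed above. It returns a `CIImage` so it does not depend on UIKit.

```swift
import CoreImage
import Foundation

/// Sketch: PDF417 generator that forces byte compaction and leaves the
/// column count and error correction level at Core Image's defaults.
func makePDF417Barcode(from message: Data, aspectRatio: Double? = nil) -> CIImage? {
    guard let filter = CIFilter(name: "CIPDF417BarcodeGenerator") else { return nil }
    filter.setValue(message, forKey: "inputMessage")
    filter.setValue(3, forKey: "inputCompactionMode") // 3 = byte compaction
    if let aspectRatio = aspectRatio {
        // Only override the shape when the caller asks for it.
        filter.setValue(aspectRatio, forKey: "inputPreferredAspectRatio")
    }
    // Scale up 3x, as in the question's code.
    return filter.outputImage?.transformed(by: CGAffineTransform(scaleX: 3, y: 3))
}
```

    On iOS, the result can be wrapped with `UIImage(ciImage:)` exactly as in the question's `generatePDF417Barcode`.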


    You may want to stop reading now. This was a very enjoyable puzzle to spend the morning on, so I'm going to dump a lot of details here.

    If you want a deep dive into how this format works, I don't know of anything currently available other than the ISO 15438 spec, which will cost you around US$200. But there used to be some pages at GeoCities that explained a lot of this, and they're still available through the Wayback Machine.

    There also aren't a lot of tools for decoding this stuff on the command line, but pdf417decode does a reasonable job. I'll use output from it to explain how I knew all the values.

    The last tool you need is a way to turn jpeg output into black-and-white pbm files so that pdf417decode can read them. For that, I use the following (after installing netpbm):

    cat /tmp/barcode.jpeg | jpegtopnm | ppmtopgm | pamthreshold | pamtopnm > new.pbm && ./pdf417decode -c -e new.pbm
    

    With that, let's decode the first three rows of your existing barcode (with my commentary to the side). Everywhere you see "function output," that means this value is the output of some function that takes the other thing as the input:

    0 7f54 0x02030000 (0)    // Left marker
    0 6a38 0x00000007 (7)    // Number of rows function output
    0 218c 0x00000076 (118)  // Total number of non-error correcting codewords
    0 0211 0x00000385 (901)  // Latch to Byte Compaction mode
    0 68cf 0x00000059 (89)   // Data
    0 18ec 0x0000021c (540)
    0 02e7 0x00000330 (816)
    0 753c 0x00000004 (4)    // Number of columns function output
    0 7e8a 0x00030001 (1)    // Right marker
    
    1 7f54 0x02030000 (0)    // Left marker
    1 7520 0x00010002 (2)    // Security Level function output
    1 704a 0x00010334 (820)  // Data
    1 31f2 0x000101a7 (423)
    1 507b 0x000100c9 (201)
    1 5e5f 0x00010319 (793)
    1 6cf3 0x00010176 (374)
    1 7d47 0x00010007 (7)    // Number of rows function output
    1 7e8a 0x00030001 (1)    // Right marker
    
    2 7f54 0x02030000 (0)    // Left marker
    2 6a7e 0x00020004 (4)    // Number of columns function output
    2 0fb2 0x0002037a (890)  // Data
    2 6dfa 0x000200d9 (217)
    2 5b3e 0x000200bc (188)
    2 3bbc 0x00020180 (384)
    2 5e0b 0x00020268 (616)
    2 29e0 0x00020002 (2)    // Security Level function output 
    2 7e8a 0x00030001 (1)    // Right marker
    

    The next 3 rows continue this pattern of function outputs. Note that the same information is encoded on the left and right, but in a different order. The system has a lot of redundancy, and can detect that it's seeing a mirror image of the barcode.

    We don't care about the number of rows for this purpose, but given a current row of n and a total number of rows of N, the function is:

    30 * (n/3) + ((N-1)/3)
    

    Where / always means "integer, truncating division." Given there are 24 rows, on row 0, this is 0 + (24-1)/3 = 7.

    The security level function's output is 2 (read from row 1, so 30 * (n/3) is still 0). Given a security level of e, the function is:

    30 * (n/3) + 3*e + (N-1) % 3
    => 0 + 3*e + (23%3) = 2
    => 3*e + 2 = 2
    => 3*e = 0
    => e = 0
    

    Finally, the number of columns can just be counted off in the output. For completeness, given a number of columns c (here read from row 2), the function is:

    30 * (n/3) + (c - 1)
    => 0 + c - 1 = 4
    => c = 5
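
    The three indicator functions, and their inverse, can be sketched in a few lines of Swift. The type and method names (`PDF417Layout`, `rowCountValue`, and so on) are hypothetical, made up for this illustration; the arithmetic is just the formulas above, and the sample values are the ones from the decoded rows.

```swift
/// Layout parameters that the left and right row indicators carry.
struct PDF417Layout: Equatable {
    let rows: Int          // N, total number of rows
    let securityLevel: Int // e, error correction level
    let columns: Int       // c, number of data columns

    // The three indicator functions. Swift's `/` on Int already truncates.
    func rowCountValue(row n: Int) -> Int { 30 * (n / 3) + (rows - 1) / 3 }
    func securityValue(row n: Int) -> Int { 30 * (n / 3) + 3 * securityLevel + (rows - 1) % 3 }
    func columnValue(row n: Int) -> Int { 30 * (n / 3) + (columns - 1) }

    /// Inverts the functions, given one indicator value of each kind.
    static func decode(rowCountValue: Int, securityValue: Int, columnValue: Int) -> PDF417Layout {
        // `% 30` strips the 30 * (n/3) row offset from each value.
        let rowPart = rowCountValue % 30
        let securityPart = securityValue % 30
        return PDF417Layout(rows: 3 * rowPart + securityPart % 3 + 1,
                            securityLevel: securityPart / 3,
                            columns: columnValue % 30 + 1)
    }
}

// Forward direction: the values seen on rows 0, 1 and 2 of the source barcode.
let source = PDF417Layout(rows: 24, securityLevel: 0, columns: 5)
print(source.rowCountValue(row: 0), source.securityValue(row: 1), source.columnValue(row: 2))
// prints "7 2 4"

// Reverse direction: recover the parameters from those three values.
let decoded = PDF417Layout.decode(rowCountValue: 7, securityValue: 2, columnValue: 4)
print(decoded.rows, decoded.securityLevel, decoded.columns)
// prints "24 0 5"
```

    Note that the row count needs two indicators to recover: one carries (N-1)/3 and the other carries (N-1) % 3.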
    

    If you look at the Data lines, you'll notice that they don't match your input data at all. That's because they have a complex encoding that I won't detail here. But for Byte compaction, you can think of it as similar to Base64 encoding, but instead of 64, it's Base900. Where Base64 encodes 3 bytes of data into 4 characters, Base900 encodes 6 bytes of data into 5 codewords.
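
    As a rough illustration of that Base900 step, the sketch below packs one 6-byte group into 5 codewords. The function name `base900Codewords` is hypothetical, and this only covers full 6-byte groups, not the handling of leftover bytes at the end of a message. Feeding it the first six bytes of your key reproduces the first five Data codewords in the dump above.

```swift
/// Packs exactly 6 bytes into 5 base-900 codewords, most significant first.
func base900Codewords(for sixBytes: [UInt8]) -> [Int] {
    precondition(sixBytes.count == 6, "this sketch only handles full 6-byte groups")

    // Treat the 6 bytes as one 48-bit big-endian integer.
    var value: UInt64 = 0
    for byte in sixBytes {
        value = (value << 8) | UInt64(byte)
    }

    // Peel off base-900 digits, least significant first, filling from the end.
    var codewords = [Int](repeating: 0, count: 5)
    for index in stride(from: 4, through: 0, by: -1) {
        codewords[index] = Int(value % 900)
        value /= 900
    }
    return codewords
}

// "5wwwww" is the first 6-byte group of the known key.
let firstGroup = Array("5wwwww".utf8)
print(base900Codewords(for: firstGroup))
// prints "[89, 540, 816, 820, 423]"
```

    Five codewords are enough because 900^5 is larger than 2^48, so every 6-byte value fits.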

    In the end, all these codewords get converted to symbols (actual lines and spaces). Which symbol is used depends on the row. Rows whose index is divisible by 3 use one symbol set, the rows after those use a second, and the rows after that use a third. So the same codewords will look completely different on row 7 than on row 8.
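
    The choice of symbol set is a simple function of the row index. In the sketch below, `clusterNumber` is a hypothetical helper name; the labels 0, 3 and 6 are the cluster numbers the PDF417 standard uses for the three symbol sets, which is terminology I haven't introduced above.

```swift
/// Returns which of the three symbol sets ("clusters") a row uses.
/// PDF417 labels them 0, 3 and 6; they repeat every three rows.
func clusterNumber(forRow row: Int) -> Int {
    return (row % 3) * 3
}

for row in [6, 7, 8] {
    print("row \(row): cluster \(clusterNumber(forRow: row))")
}
// prints:
// row 6: cluster 0
// row 7: cluster 3
// row 8: cluster 6
```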

    Taken together, all these things make it very difficult to look at a barcode and decide how "different" it is from another barcode in terms of content. You just have to decode them and see what's going on.