
Why is MD5 hashing so hard in Swift 3?


Ok, so every now and then you come across problems that you've solved before using various frameworks, libraries, and whatnot found on the internet. Your problem gets solved relatively quickly and easily, and you also learn why it was a problem in the first place.

However, sometimes you come across problems that make absolutely zero sense, and it's even worse when the solutions make negative sense.

My problem is that I want to take Data and make an MD5 hash out of it.

I find all kinds of solutions but none of them work.

What's really bugging me is how unnecessarily complicated the solutions seem to be for as trivial a task as getting an MD5 hash out of anything.

I am trying to use the Crypto and CommonCrypto frameworks by Soffes, and they seem fairly easy, right? Right?

Yes!

But why am I still getting the error fatal error: unexpectedly found nil while unwrapping an Optional value?

From what I understand, the data served by myData.md5 in Soffes' Crypto extension seems to be "optional". But why?

The code I am trying to execute is:

print(" md5 result: " + String(data: myData.md5, encoding: .utf8)!)

where myData definitely contains data, because right after the above line of code I send that data to a server, and the data exists.

On top of that, printing the count via print(String(myData.md5.count)) works perfectly.

So my question is basically: How do I MD5 hash a Data and print it as a string?

Edit:

What I have tried

That works

MD5-hashing the string test in a PHP script gives me 098f6bcd4621d373cade4e832627b4f6, and the Swift code "test".md5() also gives me 098f6bcd4621d373cade4e832627b4f6.
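A minimal way to reproduce the Swift side of that check, assuming CryptoSwift's String.md5() extension (which hashes the string's UTF-8 bytes):

import CryptoSwift

// Both PHP and CryptoSwift hash the same UTF-8 bytes here, so the digests match.
print("test".md5())   // reported above: 098f6bcd4621d373cade4e832627b4f6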

That doesn't work

Converting the UInt8 byte array from Data.md5() to a string that represents the correct MD5 value.

The different tests I've done are the following:

var hash = ""
for byte in myData.data.md5() {
    hash +=  String(format: "%02x", byte)
}
print("loop = " + hash) //test 1

print("myData.md5().toHexString() = " +  myData.md5().toHexString()) //test 2

print("CryptoSwift.Digest.md5([UInt8](myData)) = " + CryptoSwift.Digest.md5([UInt8](myData)).toHexString()) //test 3

All three tests with the 500-byte test data give me the MD5 value 56f6955d148ad6b6abbc9088b4ae334d, while my PHP script gives me 6081d190b3ec6de47a74d34f6316ac6b.

Test Sample (64 bytes): Raw data:

FFD8FFE0 00104A46 49460001 01010048 00480000 FFE13572 45786966 00004D4D
002A0000 0008000B 01060003 00000001 00020000 010F0002 00000012 00000092

Test 1, 2 and 3 MD5: 7f0a012239d9fde5a46071640d2d8c83

PHP MD5: 06eb0c71d8839a4ac91ee42c129b8ba3

PHP Code: echo md5($_FILES["file"]["tmp_name"])
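For reference, here is a self-contained sketch that rebuilds the 64-byte test sample above and prints its digest. It assumes the CryptoSwift Data extensions already used in test 2 (md5() and toHexString()); the hex-parsing helper is only for this illustration:

import Foundation
import CryptoSwift

// The 64-byte test sample from above, hex-encoded (spaces stripped before parsing).
let sampleHex = ("FFD8FFE0 00104A46 49460001 01010048 00480000 FFE13572 45786966 00004D4D "
    + "002A0000 0008000B 01060003 00000001 00020000 010F0002 00000012 00000092")
    .replacingOccurrences(of: " ", with: "")

// Parse pairs of hex digits into raw bytes.
var bytes = [UInt8]()
var index = sampleHex.startIndex
while index < sampleHex.endIndex {
    let next = sampleHex.index(index, offsetBy: 2)
    bytes.append(UInt8(String(sampleHex[index..<next]), radix: 16)!)
    index = next
}

let sample = Data(bytes)
// Same call as test 2; the tests above report 7f0a012239d9fde5a46071640d2d8c83 for this sample.
print(sample.md5().toHexString())

Feeding these exact bytes to any other implementation makes it clear whether both sides are actually hashing the same input.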


Solution

  • The simple answer to your question is:

    String(data: someData, encoding: .utf8)
    

    returns nil if someData is not valid UTF-8-encoded data. If you try to force-unwrap that nil like this:

    String(data: someData, encoding: .utf8)!
    

    you get:

    fatal error: unexpectedly found nil while unwrapping an Optional value

    So at its core, this has nothing to do with hashing or crypto.

    Both the input and the output of MD5 (or any hash algorithm, for that matter) are binary data, not text or strings. The output of MD5 is therefore not UTF-8-encoded data, which is why the above String initializer always fails.

    If you want to display binary data in your console, you need to convert it to a readable representation. The most common ones are hexadecimal digits or Base64 encoding (see the sketch at the end of this answer).

    Note: Some crypto libraries allow you to feed strings into their hash functions. They silently convert the string to a binary representation using some character encoding. If the encodings do not match, the hash values will not match across systems and programming languages. So make sure you understand what they really do in the background.
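    Putting both points together, here is a minimal sketch. It assumes the CryptoSwift Data extensions already used in the question's tests (md5() and toHexString()); the sample bytes are just a stand-in for myData, and everything else is standard Foundation:

    import Foundation
    import CryptoSwift

    let myData = Data([0xFF, 0xD8, 0xFF, 0xE0])   // stand-in for the question's myData

    // Unwrap the failable String initializer instead of forcing it with `!`.
    if let text = String(data: myData, encoding: .utf8) {
        print("valid UTF-8 text: " + text)
    } else {
        print("not UTF-8 text; use a readable representation of the bytes instead")
    }

    // Render the 16-byte MD5 digest in a readable form.
    let digest = myData.md5()                      // raw binary digest, not text
    print("md5 (hex):    " + digest.map { String(format: "%02x", $0) }.joined())
    print("md5 (hex):    " + digest.toHexString()) // CryptoSwift's own helper
    print("md5 (base64): " + digest.base64EncodedString())

    Once the digest is rendered as hex (or Base64), the resulting string is plain ASCII, so printing or concatenating it can never hit the nil-unwrap crash from the question.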