Tags: python, haskell, utf-8, hmac, bytestring

Different UTF-8 encoding outputs between Python and Haskell


I'm trying to encode a hash computed with HMAC-SHA256 in Haskell, and sometimes my function produces different output from its Python counterpart.

This is the Python function:

import requests
import json
import hmac
import hashlib
import base64
import time

# Placeholder value: the real secret is not shown in the question.
api_secret = 'my-api-secret'

def strToSign(time, method, endpoint, body):
    return time + method + endpoint + body

str_to_sign2 = strToSign('1','GET','/api/v1/position?symbol=XBTUSDM','')

signature2 = base64.b64encode(
    hmac.new(api_secret.encode('utf-8'), str_to_sign2.encode('utf-8'), hashlib.sha256).digest())

And this is the Haskell function:

import qualified Data.ByteString.Char8      as BC
import qualified Data.Text                  as T
import qualified Data.ByteString.Base64.URL as U
import           Data.Text.Encoding         (encodeUtf8)
import qualified Crypto.Hash.SHA256         as H

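-- C.apiSecretFuture comes from the asker's own config module (import not shown).
apiSignTest :: BC.ByteString -> BC.ByteString -> BC.ByteString -> BC.ByteString -> IO BC.ByteString
apiSignTest time method endpoint body = do
  -- The string to sign: timestamp, method, endpoint and body concatenated.
  let strToSign = mconcat [time, method, endpoint, body]
  -- HMAC-SHA256 with the API secret as the key.
  let hash = H.hmac (BC.pack C.apiSecretFuture) strToSign
  -- Base64url-encode the digest (yields Text), then convert back to a ByteString.
  return $ (encodeUtf8 . U.encodeBase64) hash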

Some examples where the encoded outputs differ:

Haskell : "KbCFw8OYGeGB433L93vQvbsnzSXxG88r_-HR5AGDJmo="
Python : b'KbCFw8OYGeGB433L93vQvbsnzSXxG88r/+HR5AGDJmo='

Python : b'dwSmCd75wZToIDt6I0Ik/sX8Vxk4W+RA0Sv1TO+x4WI='
Haskell : "dwSmCd75wZToIDt6I0Ik_sX8Vxk4W-RA0Sv1TO-x4WI="

Python : b'X8SE3ohju6VAu2Dt5nGIQP40+KU9RrhXORAUOdL7rJg='
Haskell : "X8SE3ohju6VAu2Dt5nGIQP40-KU9RrhXORAUOdL7rJg="

Solution

  • Data.ByteString.Base64.URL.encodeBase64 uses the base64url encoding rather than standard base64, so the difference has nothing to do with UTF-8. The two alphabets differ in only two characters: base64url emits - where standard base64 emits +, and _ where it emits /. As the name suggests, this lets the output be embedded directly in URLs, where + and / have special meanings.

    To get the same behaviour as Python's b64encode, use Data.ByteString.Base64.encodeBase64 instead. Or, the other way around, use base64.urlsafe_b64encode in Python.
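
As a rough illustration of the point above, the following Python sketch signs a message with either alphabet and checks that the first pair of outputs quoted in the question decodes to the same bytes. The helper names `sign` and `to_standard` are hypothetical and the secret used is a placeholder; only the `base64`, `hmac` and `hashlib` calls come from the standard library.

```python
import base64
import hashlib
import hmac


def sign(secret: str, message: str, urlsafe: bool = False) -> bytes:
    """HMAC-SHA256 the message, then encode the digest with the chosen alphabet."""
    digest = hmac.new(secret.encode("utf-8"), message.encode("utf-8"), hashlib.sha256).digest()
    encode = base64.urlsafe_b64encode if urlsafe else base64.b64encode
    return encode(digest)


def to_standard(urlsafe_b64: bytes) -> bytes:
    """Re-encode a base64url string with the standard alphabet."""
    return base64.b64encode(base64.urlsafe_b64decode(urlsafe_b64))


# First pair of outputs quoted in the question.
haskell_out = b"KbCFw8OYGeGB433L93vQvbsnzSXxG88r_-HR5AGDJmo="
python_out = b"KbCFw8OYGeGB433L93vQvbsnzSXxG88r/+HR5AGDJmo="

# Both strings carry the same 32-byte digest; only the alphabet differs.
same_bytes = base64.urlsafe_b64decode(haskell_out) == base64.b64decode(python_out)

# Placeholder secret and the example string to sign from the question.
standard_sig = sign("my-api-secret", "1GET/api/v1/position?symbol=XBTUSDM")
urlsafe_sig = sign("my-api-secret", "1GET/api/v1/position?symbol=XBTUSDM", urlsafe=True)
```

Whichever alphabet is chosen, both sides have to agree on it; an API that expects a standard base64 signature will most likely reject the base64url form whenever the digest happens to produce a + or /.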