Search code examples
haskellboilerplateduplication

Dealing with boilerplate in Haskell


This is kind of a soft question, but in the following code, there's a lot of duplication in the section marked "caesar ciphers." What's the "Haskell" way to deal with this? Should I make a higher order function? I thought about that, but I don't know what makes sense. Is there a "cipher" type that I could define for making ciphers?

Also, I know it may appear a little bit overengineered, in the sense that I'm doing the same error checking in two places, but I think it makes sense from the perspective of what each of the functions "means." Suggestions?

import Data.Char
import Control.Applicative
import Control.Monad
import Math.NumberTheory.Powers

--Helpers

extendedGcd::Integer->Integer->(Integer, Integer)
extendedGcd a b | r == 0 = (0, 1)
                | otherwise = (y, x - (y * d))
                where
                    (d, r) = a `divMod` b
                    (x, y) = extendedGcd b r

modularInverse::Integer->Integer->Maybe Integer
modularInverse n b | relativelyPrime n b = Just . fst $ extGcd n b
                   | otherwise = Nothing
                   where
                        extGcd = extendedGcd

relativelyPrime::Integer->Integer->Bool
relativelyPrime m n = gcd m n == 1 

textToDigits::String->[Integer]
textToDigits = map (\x->toInteger (ord x - 97)) 

digitsToText::[Integer]->String
digitsToText = map (\x->chr (fromIntegral x + 97)) 

--Caesar Ciphers

caesarEncipher::Integer->Integer->Integer->Maybe Integer
caesarEncipher r s p | relativelyPrime r 26 = Just $ mod (r * p + s) 26
                     | otherwise = Nothing

caesarDecipher::Integer->Integer->Integer->Maybe Integer
caesarDecipher r s c | relativelyPrime r 26 = mod <$> ((*) <$> q <*> pure (c - s)) <*> pure 26
                     | otherwise = Nothing
    where
        q = modularInverse r 26

caesarEncipherString::Integer->Integer->String->Maybe String
caesarEncipherString r s p | relativelyPrime r 26 = fmap digitsToText $ mapM (caesarEncipher r s) plaintext
                           | otherwise = Nothing
    where
        plaintext = textToDigits p

caesarDecipherString::Integer->Integer->String->Maybe String
caesarDecipherString r s c | relativelyPrime r 26 = fmap digitsToText $ mapM (caesarDecipher r s) ciphertext
                           | otherwise = Nothing
    where
        ciphertext = textToDigits c

bruteForceCaesarDecipher::String->[Maybe String]
bruteForceCaesarDecipher c = caesarDecipherString <$> [0..25] <*> [0..25] <*> pure c

Solution

  • Make a Key type, and use smart constructors

    The main source of boilerplate seems to be the repeated checks that r is invertible, and calculation of its inverse. It makes sense to split your operations (eg encipher) into two steps: first check, then actually encipher. This way, you can write the checking part just once.

    One way to achieve this is by defining a new type CaesarKey which is guaranteed to contain only valid keys. We can guarantee this invariant using smart constructors, as follows:

    {-# LANGUAGE RecordWildCards #-} -- for the Key{..} syntax below
    
    -- invariant: r and q are inverses mod 26. 
    -- To ensure this invariant, we only export the 'caesarKey' smart constructor,
    -- and not the underlying 'Key' constructor
    data CaesarKey = Key { r :: Integer, s :: Integer, q :: Integer }
    
    caesarKey :: Integer -> Integer -> Maybe CaesarKey
    caesarKey r s = Key r s <$> invertMod r 26
    
    -- ciphers
    encipher :: CaesarKey -> Integer -> Integer
    encipher Key{..} p = mod (r * p + s) 26
    
    decipher :: CaesarKey -> Integer -> Integer
    decipher Key{..} c = mod (q * (c - s)) 26
    
    encipherString :: CaesarKey -> String -> String
    encipherString key = digitsToText . map (encipher key) . textToDigits
    
    decipherString :: CaesarKey -> String -> String
    decipherString key = digitsToText . map (decipher key) . textToDigits
    

    Define invert on keys

    Now we may take advantage of Daniel's observation that decipher is just encipher, but defined on a different key (namely the "inverse key"). So let's define an operation for inverting keys:

    -- turns a key suitable for encoding into one suitable for decoding, and vice versa.
    --   @invert (invert key) = key@
    invert :: CaesarKey -> CaesarKey
    invert (Key r s q) = Key q ((26-q)*s) r
    

    and now we could throw out the decipher and decipherString functions as they are unnecessary (i.e. it's preferable to use invert instead).

    Make an allKeys function

    Conceptually, we can split up bruteForceCaesarDecipher into two tasks: first, generate all possible keys; second, decode the text with each key. Let's implement this in code:

    allKeys :: [CaesarKey]
    allKeys = catMaybes $ caesarKey <$> [1,3..25] <*> [1,3..25]
    
    bruteForceCaesar :: String -> [String]
    bruteForceCaesar str = [encipherString key str | key <- allKeys]
    

    Besides giving easier-to-understand code (in my opinion), splitting the code up in this way has the advantage that we only build the list of keys once, rather than having to rebuild the keys for every string we want to decode.

    Note also a few other small changes:

    • I used catMaybes :: [Maybe a] -> [a] to throw out the Nothing keys

    • I followed Daniel's suggestions for how to make bruteForceCaesar more efficient.

    The complete code is here.