Search code examples
pythonhaskelloptimizationlanguage-comparisons

Why is this Haskell code so slow?


I'm kind of new to Haskell and tried making a scrabble solver. It takes in the letters you currently have, finds all permutations of them and filters out those that are dictionary words. The code's pretty simple:

import Data.List

main = do
    dict    <- readFile "words"
    letters <- getLine
    let dictWords = words dict
    let perms = permutations letters
    print [x | x <- perms, x `elem` dictWords]

However it's incredibly slow, compared to a very similar implementation I have with Python. Is there something fundamental I'm doing wrong?

*edit: Here's my Python code:

from itertools import permutations

letters = raw_input("please enter your letters (without spaces): ")

d = open('words')
dictionary = [line.rstrip('\n') for line in d.readlines()]
d.close()

perms = ["".join(p) for p in permutations(letters)]

validWords = []

for p in perms:
    if p in dictionary: validWords.append(p)


for validWord in validWords:
    print validWord

I didn't time them precisely, but roughly it feels like the Python implementation is about 2x as fast as the Haskell one. Perhaps I should't have said the Haskell code was "incredibly slow" in comparison, but since Haskell is statically typed I guess I just thought that it should've been much faster, and not slower than Python at all.


Solution

  • I'm kind of new to Haskell and tried making a scrabble solver.

    You can substantially improve things by using a better algorithm.

    Instead of testing every permutation of the input letters, if you sort them first you can make only one dictionary lookup and get all of the possible words (anagrams) which may be formed from them (using all of them).

    Here is code which creates that dictionary as a Data.Map. There is a start-up cost to creating the Map, but after the first query subsequent lookups are very fast.

    import Data.List
    import qualified Data.Map.Strict as Map
    import Control.Monad
    import System.IO
    
    main = do
      contents <- readFile "words"
      let pairs = [ (sort w, [w]) | w <- words contents ]
          dict = foldl' (\m (k,v) -> Map.insertWith (++) k v m) Map.empty pairs
          -- dict = foldr (\(k,v) m -> Map.insertWith (++) k v m) Map.empty pairs
      forever $ do
        putStr "Enter letters: " >> hFlush stdout
        letters <- getLine
        case Map.lookup (sort letters) dict of
          Nothing -> putStrLn "No words."
          Just ws -> putStrLn $ "Words: " ++ show ws
    

    Map creation time for a word file of 236K words (2.5 MB) is about 4-5 seconds. Better performance is likely possible by using ByteStrings or Text instead of Strings.

    Some good letter combinations to try:

    steer rat tuna lapse groan neat
    

    Note: Using GHC 7.10.2 I found this code performed the best without compiling with -O2.