I have a Haskell program which has 2 lines of code in the main:
putStrLn $ "Day11: part1: " ++ show (sum $ bigManhattan 1 galaxies <$> pairs)
putStrLn $ "Day11: part2: " ++ show (sum $ bigManhattan 999999 galaxies <$> pairs)
If I comment out either one of them the program runs in 0.01s. With both of them extant the program takes 90s.
I wonder if anyone has any ideas? Could they be competing to look at the data and getting in each others way?
Compiler options - I don't know what most of these do...
ghc-options:
- -Wall
- -Wcompat
- -Widentities
- -Wincomplete-record-updates
- -Wincomplete-uni-patterns
- -Wmissing-export-lists
- -Wmissing-home-modules
- -Wpartial-fields
- -Wredundant-constraints
- -O2
The code:
module Day11(day11) where
import Data.List ((\\))
import Data.Maybe (catMaybes)
type Coord = (Int, Int)
manhattan :: Coord -> Coord -> Int
manhattan (x1, y1) (x2, y2) = abs (x1 - x2) + abs (y1 - y2)
getF :: (String -> a) -> Int -> IO a
getF f n = do
s <- readFile $ "./Data/Day" ++ show n ++ ".in"
return $ f s
getLines :: Int -> IO [String]
getLines = getF lines
parse :: [String] -> [Coord]
parse css = concatMap (catMaybes . (\(y, cs) -> (\(x, c) -> if c=='#' then Just (x,y) else Nothing) <$> zip [0..] cs)) (zip [0..] css)
nSize :: Int
nSize = 140
bigManhattan :: Int -> [Coord] -> (Coord, Coord) -> Int
bigManhattan k galaxies ((c1, r1), (c2, r2)) = manhattan (c1+newc1, r1+newr1) (c2+newc2, r2+newr2)
where
baseC, baseR :: [Int]
baseC = [0..(nSize-1)] \\ (fst <$> galaxies)
baseR = [0..(nSize-1)] \\ (snd <$> galaxies)
newc1, newc2, newr1, newr2 :: Int
newc1 = k * length (filter (c1>) baseC)
newc2 = k * length (filter (c2>) baseC)
newr1 = k * length (filter (r1>) baseR)
newr2 = k * length (filter (r2>) baseR)
day11 :: IO ()
day11 = do
ls <- getLines 11
let galaxies = parse ls
pairs = [(x,y) | x <- galaxies, y <- galaxies, x<y ]
putStrLn $ "Day11: part1: " ++ show (sum $ bigManhattan 1 galaxies <$> pairs)
--putStrLn $ "Day11: part2: " ++ show (sum $ bigManhattan 999999 galaxies <$> pairs)
return ()
Using GHC 9.4.7.
Note that baseC
and baseR
depend on galaxies
but not the pairs
. When you print only one of the two results, GHC inlines all of bigManhattan
and is able to lift baseC
and baseR
out of the sum
. But, when you print both results, the inlining is suppressed, and baseC
and baseR
get recalculated for every bigManhattan
call on every element of pairs
. The simplest way to work around this is to move the sum
and map
into bigManhattan
to ensure baseC
and baseR
only get calculated once per sum. You could do even better, since they don't depend on k
either, but this seems to work fast enough.
bigManhattan :: Int -> [Coord] -> [(Coord, Coord)] -> Int
bigManhattan k galaxies = sum . map go
where
baseC, baseR :: [Int]
baseC = [0..(nSize-1)] \\ (fst <$> galaxies)
baseR = [0..(nSize-1)] \\ (snd <$> galaxies)
go ((c1,r1),(c2,r2)) = manhattan (c1+newc1, r1+newr1) (c2+newc2, r2+newr2)
where newc1, newc2, newr1, newr2 :: Int
newc1 = k * length (filter (c1>) baseC)
newc2 = k * length (filter (c2>) baseC)
newr1 = k * length (filter (r1>) baseR)
newr2 = k * length (filter (r2>) baseR)
day11 :: IO ()
day11 = do
ls <- getLines 11
let galaxies = parse ls
pairs = [(x,y) | x <- galaxies, y <- galaxies, x<y ]
putStrLn $ "Day11: part1: " ++ show (bigManhattan 1 galaxies pairs)
putStrLn $ "Day11: part2: " ++ show (bigManhattan 999999 galaxies pairs)
return ()