haskell, functional-programming, list-processing

Given a list, how can I apply a transformation only to the contiguous sub-lists whose adjacent elements satisfy a binary predicate?


(In my actual use case I have a list of type [SomeType], SomeType having a finite number of constructors, all nullary; in the following I'll use String instead of [SomeType] and use only 4 Chars, to simplify a bit.)

I have a list like this "aaassddddfaaaffddsssadddssdffsdf" where each element can be one of 'a', 's', 'd', 'f', and I want to do some further processing on each contiguous sequence of non-'a's, let's say turning them upper case and reversing the sequence, thus obtaining "aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD". (I've added the reversing requirement to make it clear that the processing involves all the contiguous non-'a's at the same time.)

To turn a sub-String upper case and reverse it, I can use this:

import qualified Data.Char

func :: String -> String
func = reverse . map Data.Char.toUpper

But how do I run that func only on the sub-Strings of non-'a's?

My first thought is that Data.List.groupBy can be useful, and the overall solution could be:

concat $ map (\x -> if head x == 'a' then x else func x)
       $ Data.List.groupBy ((==) `on` (== 'a')) "aaassddddfaaaffddsssadddssdffsdf"

This solution, however, does not convince me, as I'm using == 'a' both when grouping (which to me seems good and unavoidable) and when deciding whether I should turn a group upper case.
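
To make the attempt above concrete, here is a self-contained sketch of it, split so that the intermediate groups are visible. The names `input`, `groups` and `attempt` are introduced only for illustration, and `func` is inlined; otherwise the logic is the same as in the snippet above.

```haskell
import Data.Char (toUpper)
import Data.Function (on)
import Data.List (groupBy)

-- Illustrative name for the sample input from the question.
input :: String
input = "aaassddddfaaaffddsssadddssdffsdf"

-- Grouping step only: adjacent characters end up in the same group
-- when they agree on whether they are 'a'.
groups :: [String]
groups = groupBy ((==) `on` (== 'a')) input
-- ["aaa","ssddddf","aaa","ffddsss","a","dddssdffsdf"]

-- Full attempt: the (== 'a') test is repeated on the head of each group.
-- 'head' is safe here because groupBy never produces an empty group.
attempt :: String
attempt = concatMap (\x -> if head x == 'a' then x else reverse (map toUpper x)) groups
-- "aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
```

The second test on `head x` is exactly the duplication in question: the grouping already knew which side of the predicate each group was on, but that information is thrown away.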

I'm looking for advice on how to accomplish this small task in the best way.


Solution

  • You could classify the list elements by the predicate before grouping. Note that I’ve reversed the sense of the predicate to indicate which elements are subject to the transformation, rather than which elements are preserved.

    {-# LANGUAGE ScopedTypeVariables #-}
    
    import Control.Arrow ((&&&))
    import Data.Function (on)
    import Data.List (groupBy)
    import Data.Monoid (First(..))
    
    mapSegmentsWhere
      :: forall a. (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
    mapSegmentsWhere p f
      = concatMap (applyMatching . sequenceA)  -- [a]
      . groupBy ((==) `on` fst)                -- [[(First Bool, a)]]
      . map (First . Just . p &&& id)          -- [(First Bool, a)]
      where
        applyMatching :: (First Bool, [a]) -> [a]
        applyMatching (First (Just matching), xs)
          = applyIf matching f xs
    
        applyIf :: forall a. Bool -> (a -> a) -> a -> a
        applyIf condition f
          | condition = f
          | otherwise = id
    

    Example use (with toUpper from Data.Char in scope):

    > mapSegmentsWhere (/= 'a') (reverse . map toUpper) "aaassddddfaaaffddsssadddssdffsdf"
    "aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
    

    Here I use the First monoid with sequenceA to merge each group of adjacent elements from [(First Bool, a)] into (First Bool, [a]); every element of a group carries the same flag, so keeping the first one loses nothing. You could just as well use plain Bool flags and something like map (fst . head &&& map snd). You can also skip ScopedTypeVariables if you don't want to write the local type signatures; I just included them for clarity.
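
The `map (fst . head &&& map snd)` alternative mentioned above could look roughly like the following. This is a minimal sketch rather than the answer's own code: the name `mapSegmentsWhere'` and the `example` binding are made up for illustration, and plain `Bool` flags replace the `First` wrapper.

```haskell
import Control.Arrow ((&&&))
import Data.Char (toUpper)
import Data.Function (on)
import Data.List (groupBy)

-- Hypothetical variant of mapSegmentsWhere without the First monoid.
mapSegmentsWhere' :: (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
mapSegmentsWhere' p f
  = concatMap applyMatching        -- [a]
  . map (fst . head &&& map snd)   -- [(Bool, [a])]
  . groupBy ((==) `on` fst)        -- [[(Bool, a)]]
  . map (p &&& id)                 -- [(Bool, a)]
  where
    -- Apply f only to the segments whose elements satisfied p.
    applyMatching (matching, xs)
      | matching  = f xs
      | otherwise = xs

-- Illustrative binding: the question's sample input run through the variant.
example :: String
example =
  mapSegmentsWhere' (/= 'a') (reverse . map toUpper)
    "aaassddddfaaaffddsssadddssdffsdf"
-- "aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
```

Using `head` is safe here because `groupBy` never yields an empty group. The trade-off against the `First`/`sequenceA` version is mostly one of taste: this form needs no extension and no wrapper, at the cost of a partial function.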