algorithm artificial-intelligence grid-layout maze

Grid with obstacles coverage algorithm

I have to find an algorithm for a robot Agent to do the following (I'm sorry, I don't really know how to call it):

The robot is on a 10x10 grid with obstacles (each square is either a obstacle or traversable)
The robot has a bump sensor : it activates when the robot hits an obstacle.
On the grid there are carrots that are continously growing. There are fast-growing squares and slow growing squares.
Each step, the robot can : advance or turn 90° right or left or stay in place
The locations of the carrots and obstacles are not know before hand
The carrots continue growing while the robot is moving (even after harvest)
Carrots grow in most squares that are not obstacles
The robot does not know if the squares are fast or slow growing
In each square there can be between 0 and 20 carrots. At each time instance, there is a probability p = 0.01 (or p = 0.02 for fast-growing squares) for the amount of carrots of a square to increment
You can measure the amount of carrots you harvest.

The goal is to get the maximum amount of carrots in 2000 steps.

Would there be a lazy/easy way to do it?

So far, I am a bit lost, as it is not a maze-solving problem. Would it be a sort a flood-filling algorithm ? Is there anything simpler ?

I'm not necessarily searching to "solve" the problem, but rather for an easy approximation if possible

Solution

It is indeed a bit of work to find a robot implementation which has the perfect strategy, given that it does not know the location and the number of the food sources.

Any given strategy of a bot might not yield the maximum possible harvest in each run. So the question is rather, which strategy is most successful over a number of simulation runs.

To find a decent strategy for a given statistical distribution of square types (P(fastFood),P(slowFood),P(obstacle)), one might come up with the following idea:

Let Bot(npatch) be a bot which looks for npatch food spots. With the strategy to eat up what it finds in the first food patch before it searches the second and so on. When it visited npatch food sources (or found no more food patches), it returns to the first one found and re-harvests.

This class of bots (Bot(npatch)) can now compete against each other in a statistically relevant number of simulation runs. Best bot is winner of the competition.

This approach can be considered inspired by genetic algorithms, yet without mixing any genes but simply iterating all of them (1..npatch). Maybe someone has an idea how to turn this idea to a fully genetic algorithm. This could involve turning to a Bot(npatch,searchStrategy) and then, having multiple genes to apply a genetic algorithm.

Whenever the parameters of the simulation change, the competition has to be repeated, obviously as depending on the number of food patches in the world, it might or might not pay off to go find yet another food patch if some food patches are known already.

The code below is written in F# and is the simulator for that question (if I got all requirements right, that is...). Writing a new bot is as simple as writing a function, which is then passed to the simulator.

Consider this my easter egg for those of you who would like to try their own bots.

The 2 bots I wrote are called "marvinRobot" which does what Marvin would do and "lazyRobot" a bot which camps on the first food source it finds.

type Square =
    | Empty
    | Obstacle
    | Food of float * (float -> float) // available * growth
    | Unknown

let rnd = new System.Random()
let grow p a =
    let r = rnd.NextDouble()
    if r < p then a + 1.0
    else a

let slowGrowth a = grow 0.01 a
let fastGrowth a = grow 0.02 a

let eatPerTick = 1.0 
let maxFoodPerSquare = 20.0

let randomPick values =
    let count = List.length values
    let r = rnd.Next(0,count-1)
    values.Item(r)

type World = Square[,]

let randomSquare pobstacle pfood =
    let r = rnd.NextDouble()
    match r with
    | x1 when x1 < pobstacle -> Obstacle
    | x2 when x2 < (pobstacle + pfood) && x2 >= pobstacle -> 
        Food(rnd.NextDouble() * maxFoodPerSquare, randomPick [slowGrowth; fastGrowth])
    | _ -> Empty

let createRandomWorld n pobstacle pfood = 
    Array2D.init n n (fun col row -> randomSquare pobstacle pfood)

let createUnknownWorld n =
    Array2D.create n n Unknown

type Position = { Column : int; Row : int }

type RoboState = { Memory : Square[,]; Pos : Position; Heading : Position }
type RoboAction = 
    | TurnRight
    | TurnLeft
    | MoveOne
    | Eat
    | Idle

type RoboActor = World -> RoboState -> RoboAction

let right heading : Position =
    match heading with
    | { Column = 0; Row = 1 } -> { Column = -1; Row = 0 }
    | { Column = -1; Row = 0 } -> { Column = 0; Row = -1 }
    | { Column = 0; Row = -1 } -> { Column = 1; Row = 0 }
    | { Column = 1; Row = 0 } -> { Column = 0; Row = 1 }
    | _ -> failwith "Invalid heading!"

let left heading : Position =
    match heading with
    | { Column = -1; Row = 0 } -> { Column = 0; Row = 1 }
    | { Column = 0; Row = -1 } -> { Column = -1; Row = 0 }
    | { Column = 1; Row = 0 } -> { Column = 0; Row = -1 }
    | { Column = 0; Row = 1 } -> { Column = 1; Row = 0 } 
    | _ -> failwith "Invalid heading!"

let checkAccess n position =
    let inRange v = v >= 0 && v < n
    (inRange position.Column) && (inRange position.Row)

let tickWorld world =
    world 
    |> Array2D.map 
        (fun sq -> 
            match sq with 
            | Empty -> Empty 
            | Obstacle -> Obstacle 
            | Food(a,r) -> Food(min (r a) maxFoodPerSquare, r)
            | Unknown -> Unknown
        )

let rec step robot world roboState i imax acc = 
    if i < imax then
        let action = robot world roboState
        match action with
        | TurnRight ->
            let rs1 = { roboState with Heading = right roboState.Heading }
            let wrld1 = tickWorld world
            step robot wrld1 rs1 (i+1) imax acc
        | TurnLeft ->
            let rs1 = { roboState with Heading = left roboState.Heading }
            let wrld1 = tickWorld world
            step robot wrld1 rs1 (i+1) imax acc
        | MoveOne ->
            let rs1 =
                let c = 
                    { Column = roboState.Pos.Column + roboState.Heading.Column 
                      Row = roboState.Pos.Row + roboState.Heading.Row
                    }
                if checkAccess (Array2D.length1 world) c 
                then 
                    match world.[c.Column,c.Row] with
                    | Obstacle -> 
                        roboState.Memory.[c.Column,c.Row] <- Obstacle
                        roboState
                    | _ -> { roboState with Pos = c }
                else
                    roboState
            let wrld1 = tickWorld world
            step robot wrld1 rs1 (i+1) imax acc
        | Eat -> 
            let eat,acc1 = 
                match world.[roboState.Pos.Column,roboState.Pos.Row] with
                | Empty -> Empty,acc
                | Obstacle -> Obstacle,acc
                | Food(a,r) -> 
                    let eaten = if a >= eatPerTick then eatPerTick else 0.0
                    printfn "eating %f carrots" eaten
                    Food(a - eaten, r),eaten + acc
                | Unknown -> Unknown,acc
            world.[roboState.Pos.Column,roboState.Pos.Row] <- eat
            let wrld1 = tickWorld world
            step robot wrld1 roboState (i+1) imax acc1
        | Idle ->
            step robot (tickWorld world) roboState (i+1) imax acc
    else
        acc

let initRoboState n = 
    {   Memory = createUnknownWorld n; 
        Pos = { Column = 0; Row = 0;}; 
        Heading = {Column = 1; Row = 0}
    }

let simulate n pobstacle pfood imax robot =
    let w0 = createRandomWorld n pobstacle pfood
    let r0 = initRoboState n
    printfn "World: %A" w0
    printfn "Initial Robo State: %A" r0
    let result = step robot w0 r0 0 imax 0.0
    printfn "Final Robo State: %A" r0
    result

// Not that Marvin would care, but the rule for this simulator is that the 
// bot may only inspect the square in the world at the current position.
// This means, IT CANNOT SEE the neighboring squares.
// This means, that if there is a obstacle next to current square, 
// it costs a simulation tick to find out, trying to bump against it.
// Any access to other squares in world is considered cheating!
// world is passed in spite of all said above to allow for alternate rules.
let marvinRobot world roboState =
    Idle

// Tries to find a square with food, then stays there, eating when there is something to eat.
let lazyRobot (world : World) (roboState : RoboState) =
    let search() =
        let status action : RoboAction =
            match action with
            | TurnLeft -> printfn "%A TurnLeft at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row] roboState.Pos roboState.Heading
            | TurnRight -> printfn "%ATurnRight at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row]  roboState.Pos roboState.Heading
            | MoveOne -> printfn "%A MoveOne at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row] roboState.Pos roboState.Heading
            | Idle -> printfn "%A Idle at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row] roboState.Pos roboState.Heading
            | Eat -> printfn "%A Eat at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row] roboState.Pos roboState.Heading
            action
        let neighbors = 
            [ roboState.Heading, MoveOne;
              (roboState.Heading |> right),TurnRight;
              (roboState.Heading |> left),TurnLeft;
              (roboState.Heading |> right |> right),TurnRight
            ]
            |> List.map (fun (p,a) -> (p.Column,p.Row),a)
            |> List.map (fun ((c,r),a) -> (roboState.Pos.Column + c,roboState.Pos.Row + r),a)
            |> List.filter (fun ((c,r),a) -> checkAccess (Array2D.length1 world){Position.Column = c; Row = r})
            |> List.sortBy (fun ((c,r),a) -> match roboState.Memory.[c,r] with | Food(_,_) -> 0 | Unknown -> 1 | Empty -> 2 | Obstacle -> 3)
            |> List.map (fun ((c,r),a) -> { Column = c; Row = r},a)
        if neighbors.IsEmpty then failwith "It's a trap!" // can happen if bot is surrounded by obstacles, e.g.  in a corner
        else
            let p,a = neighbors.Head
            status a
    roboState.Memory.[roboState.Pos.Column, roboState.Pos.Row] <- 
            world.[roboState.Pos.Column,roboState.Pos.Row]
    match world.[roboState.Pos.Column,roboState.Pos.Row] with
    | Food(a,_) -> 
        printfn "Found food at %A" roboState.Pos
        Eat
    | _ -> 
        search()

//simulate 10 0.1 0.05 2000 marvinRobot
simulate 10 0.1 0.1 2000 lazyRobot

Last not least a tip: if you simulate with 0.0 food patches, your bot should have visited all squares on the map. If it fails to do that, it is for sure not a good bot ;)