How to fix cellular automata/spatial prisoners dilemma that is not replicating properly

I am trying to replicate the results of a paper (if you are interested, its Nowak & May, 1992: Evolutionary Games and Spatial Chaos) that create a set of fractals by running a prisoners dilemma on an n x n grid (for example, https://www.researchgate.net/figure/Spatial-version-of-the-Prisoners-Dilemma-for-symmetrical-initial-conditions-Nowak_fig3_277476479), but my results are not what they should be. The idea is that the grid is populated entirely by Cooperators, except for a single Defector object that is placed in the center of the grid. Different interactions yield different payoffs: mutual defectors yield a payoff of 0, mutual cooperators a payoff of 1 each, and a defector against a cooperator yields a payoff of b for the defector and 0 for the cooperator, where b > 1. All objects in the grid play against each other and receive a score according to the above payoff structure. After each generation, each object on a node is replaced by the neighbor with the highest score. Since the defector strategy is the superior strategy, it should invade the Cooperator population and produce said fractal images, as a cellular automata would.

The main way I have tried doing this (also the main area I have had trouble with) is through the replace_pop function shown below. After each round, the program loops through the grid and replaces any object on a node with a neighbour object that has a higher score. I thought that this would have been sufficient but as one can see after even a few generations, there is some form of replication but just not in the way it should happen, making it difficult to pinpoint what exactly is going wrong. At N = 1 (N is the number of generations) the result seems correct, as the neighbouring (neighbours are left, right, above and below) Cooperators become Defectors, but as N grows larger the image just goes astray.

I also reinitialized each objects score to 0 after each generation to ensure that proper replication can take place. When this is not done however, the population evolves in the same fashion as the N = 1 case above but for all subsequent generations, which is peculiar because there should be defectors that have higher scores than surrounding Cooperators. I am not sure where I am going wrong? My code is below (sorry for including all of it but I do not know where exactly is the problem). I am pretty new to Python and Stack so any help would be appreciated.

import random
import matplotlib.pyplot as plt

row = 99
col = 99

class Cooperator:
    def __init__(self):
        self.score = 0
        self.id = 'C'

class Defector:
    def __init__(self):
        self.score = 0
        self.id = 'D'

class Grid:
    def __init__(self, rowsize, colsize):
        self.rowsize = rowsize
        self.colsize = colsize

    def make_grid(self):
        n = self.rowsize
        m = self.colsize
        arr = [[0 for j in range(m)] for i in range(n)]
        return arr

    def populate_grid(self):
        empty_grid = self.make_grid()
        for i in range(self.rowsize):
            for j in range(self.colsize):
                empty_grid[i][j] = Cooperator()
        empty_grid[i//2][j//2] = Defector()
        return empty_grid

    def shuffle_population(self):
        populated_grid = self.populate_grid()
        for i in range(self.rowsize):
            random.shuffle(populated_grid[i])
        return populated_grid

def von_neumann_neighbourhood(array, row, col, wrapped=True):
    """gets von neumann neighbours for a specfic point on grid with or without wrapping"""
    neighbours = []
    #conditions for in bound points
    if row + 1 <= len(array) - 1:
        neighbours.append(array[row + 1][col])

    if row - 1 >= 0:
        neighbours.append(array[row - 1][col])

    if col + 1 <= len(array[0]) - 1:
        neighbours.append(array[row][col + 1])

    if col - 1 >= 0:    
        neighbours.append(array[row][col - 1])
    #if wrapped is on, conditions for out of bound points
    if row - 1 < 0 and wrapped == True:
        neighbours.append(array[-1][col])

    if col - 1 < 0 and wrapped == True:
        neighbours.append(array[row][-1])

    if row + 1 > len(array) - 1 and wrapped == True:
        neighbours.append(array[0][col])

    if col + 1 > len(array[0]) - 1 and wrapped == True:
        neighbours.append(array[row][0])

    return neighbours

def play_round(array, row, col):
    b = 1.70
    player = array[row][col]
    neighbours = von_neumann_neighbourhood(array, row, col)
    for neighbour in neighbours:
        if player.id == 'C' and neighbour.id == 'C':
            player.score += 1
            neighbour.score += 1
        if player.id == 'D' and neighbour.id == 'D':
            player.score += 0
            neighbour.score += 0
        if player.id == 'D' and neighbour.id == 'C':
            player.score += b
            neighbour.score += 0
        if player.id == 'C' and neighbour.id == 'D':
            player.score += 0
            neighbour.score += b

def replace_pop(array, row, col):   
    neighbour_score = 0
    type_neighbour = ""
    neighbours = von_neumann_neighbourhood(array, row, col)
    player_score = array[row][col].score

    for neighbour in neighbours:
        if neighbour.score > neighbour_score:
            neighbour_score = neighbour.score
            type_neighbour = neighbour.id
    if player_score < neighbour_score:
        if type_neighbour == "C":
            array[row][col] = Cooperator()
        if type_neighbour == "D":
            array[row][col] = Defector()

N = 1
last_gen = []

def generations(N, row, col, array):
    for gen in range(N):    
        for z in range(row):
            for x in range(col):
                play_round(array, z, x)

        for r in range(row):
            last_gen.append([])
            for c in range(col):
                last_gen[r].append(lattice[r][c].id)
                replace_pop(array, r, c)

        for obj in lattice:
            for ob in obj:
                ob.score = 0

lattice = Grid(row, col).populate_grid()
generations(N, row, col, lattice)

heatmap_stuff = []
for z in range(row):
    heatmap_stuff.append([])
    for v in range(col):
        if lattice[z][v].id == 'C' and last_gen[z][v] == 'C':
            heatmap_stuff[z].append(1)
        if lattice[z][v].id == 'D' and last_gen[z][v] == 'D':
            heatmap_stuff[z].append(0)
        if lattice[z][v].id == 'C' and last_gen[z][v] == 'D':
            heatmap_stuff[z].append(3)
        if lattice[z][v].id == 'D' and last_gen[z][v] == 'C':
            heatmap_stuff[z].append(4)

plt.imshow(heatmap_stuff, interpolation='nearest')
plt.colorbar()
plt.show()

Edit: I have updated the code in line with Ilmari's suggestions. Although the results look better, as well as returning an actual fractal in real-time, the results are still not optimal, leading me to think there might be a bug elsewhere since the cells seem to be updating correctly. Below is the updated code I have added/replaced to the previous code.

def get_moore_neighbours(grid, row, col):
    neighbours = []
    for x, y in (
            (row - 1, col), (row + 1, col), (row, col - 1),
            (row, col + 1), (row - 1, col - 1), (row - 1, col + 1),
            (row + 1, col - 1), (row + 1, col + 1)):
        if not (0 <= x < len(grid) and 0 <= y < len(grid[x])):
            # out of bounds
            continue
        else:
            neighbours.append(grid[x][y])
    return neighbours

def calculate_score(grid, row, col):
    b = 1.85
    player = grid[row][col]
    neighbours = get_moore_neighbours(grid, row, col)
    for neighbour in neighbours:
        if player.id == 'C' and neighbour.id == 'C':
            player.score += 1
            neighbour.score += 1
        if player.id == 'D' and neighbour.id == 'D':
            player.score += 0
            neighbour.score += 0
        if player.id == 'D' and neighbour.id == 'C':
            player.score += b
            neighbour.score += 0
        if player.id == 'C' and neighbour.id == 'D':
            player.score += 0
            neighbour.score += b
    return player.score

def best_neighbor_type(grid, row, col): 
    neighbour_score = 0
    type_neighbour = ""
    neighbours = get_moore_neighbours(grid, row, col)
    player_score = grid[row][col].score

    for neighbour in neighbours:
        if neighbour.score > neighbour_score:
            neighbour_score = neighbour.score
            type_neighbour = neighbour.id
    if player_score < neighbour_score:
        if type_neighbour == "C":
            return 'C'
        if type_neighbour == "D":
            return 'D'
    if player_score >= neighbour_score:
        return grid[row][col].id

N = 15
heatmap_data = Grid(row, col).make_grid()
lattice = Grid(row, col).populate_grid()
dbl_buf = Grid(row, col).populate_grid()

for gen in range(N):    
    for r in range(row):
        for c in range(col):
            lattice[r][c].score = calculate_score(lattice, r, c)

    for r in range(row):
        for c in range(col):
            dbl_buf[r][c].id = best_neighbor_type(lattice, r, c)

    for r in range(row):
        for c in range(col):
            if lattice[r][c].id == 'C' and dbl_buf[r][c].id == 'C':
                heatmap_data[r][c] = 1
            if lattice[r][c].id == 'D' and dbl_buf[r][c].id == 'D':
                heatmap_data[r][c] = 2
            if lattice[r][c].id == 'C' and dbl_buf[r][c].id == 'D':
                heatmap_data[r][c] = 3
            if lattice[r][c].id == 'D' and dbl_buf[r][c].id == 'C':
                heatmap_data[r][c] = 4

    plt.imshow(heatmap_data, interpolation='nearest')
    plt.pause(0.01)

    (lattice, dbl_buf) = (dbl_buf, lattice)

plt.show()

Solution

Looking at your code, a few issues jump out:

You never reset the last_gen array between generations, so you're constantly appending new (empty) rows to it and making the first row rows longer and longer. This is almost certainly a bug.
You also never use the last_gen array for anything except generating the heat map. In particular, your replace_pop() function is modifying the same array (creatively named array) that it reads the neighbor states from.

The second issue means that the behavior of your code will depend on the order in which you loop over the cells to call replace_pop() in each generation, since replacing one cell with a different neighbor will affect the neighborhood of all of its neighbors that haven't yet been updated in this generation.

In a cellular automaton like described in the paper you cite, all the cells are supposed to update their state effectively simultaneously, such that changes to each cell's state won't become visible to its neighbors until the next generation.

In practice, the simplest way to implement this kind of "simultaneous" updating is to use double buffering, where you first copy the state of all the cells into a second array, and then update the first array based on the copy you just made. Or, more efficiently, just swap the (references to) the arrays instead of copying one into the other. The code would look something like this:

lattice = Grid(row, col).populate_grid()
dbl_buf = Grid(row, col)

for gen in range(N):    
    for r in range(row):
        for c in range(col):
            lattice[r][c].score = calculate_score(lattice, r, c)

    # This is probably the best spot for generating output, since both
    # buffers contain consistent and up-to-date IDs and scores here.

    for r in range(row):
        for c in range(col):
            dbl_buf[r][c].id = best_neighbor_type(lattice, r, c)

    (lattice, dbl_buf) = (dbl_buf, lattice)

where the calculate_score() function returns the score of the given cell on the lattice based on the types of its neighbors, and the best_neighbor_id() function returns the type ID of the highest-scoring neighbor of the cell on the lattice.

Addendum: Your implementation of calculate_score() in your updated code has some bugs:

you start the calculations from the previous score value (which is actually from two generations back due to the double buffering),
you're redundantly writing the score directly to the grid inside the function, rather than just returning the score to the caller, and
you're also redundantly updating the scores of the cell's neighbors, leading to some interactions being effective counted twice.

However, the real reason why you're getting different results than in the Nowak & May paper is because of a conceptual difference: the paper assumes that cells also play the game with themselves, effectively giving cooperators a one point score boost. Your implementation doesn't include that, leading to different dynamics for the same parameter values.

Anyway, here's how I'd rewrite the function:

def calculate_score(grid, row, col):
    neighbours = get_moore_neighbours(grid, row, col)
    player = grid[row][col]
    score = 0
    if player.id == 'C': score += 1  # self-interaction!
    for neighbour in neighbours:
        if player.id == 'C' and neighbour.id == 'C':
            score += 1
        if player.id == 'D' and neighbour.id == 'C':
            score += b
    return score

With that change, your code produces very similar patterns as in the Nowak & May paper:

BTW, I'm not sure how Nowak & May handle the edges of the lattice, which might cause the patterns to diverge once they hit the edge. Your implementation effectively excludes any neighbors outside the edges from the score calculation, as if the lattice was surrounded by non-spreading defectors.