python memory data-structures turing-machines

What is the most efficient way to represent the infinite tape of a Turing machine in Python?

A Turing machine manipulates symbols on an infinite strip of tape according to a table of rules.

In my case entries on the tape can be either 0's and 1's (binary).

The tape starts off 'empty' with an infinity of 0's in both positive and negative directions. This tape gets written on based on the rules that the machine follows, potentially adding 0's or 1's to the unwritten edges.

Since the tape values are binary is there a most efficient way to represent them in memory as the machine writes to the tape, adding new values to the left or right? Obviously I don't need to represent the infinite zeros (unwritten tape), but I do need to expand the data structure as new values are added to the ends of the tape.

A python list could do the trick, shown here inserting a 1 on the left:

start_time = time.time()
tape = [0]*10000000
print tape[:10]
insertion_time = time.time()
tape = [1] + tape 
print tape[:10]
print "total execution time: ", time.time() - start_time
print "time to add entry: ", time.time() - insertion_time

this outputs:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
total execution time:  0.257364034653
time to add entry:  0.109524011612

Using list.insert does make it 23x faster:

start_time = time.time()
tape = [0]*100000
print tape[:10]
insertion_time = time.time()
tape.insert(0, [1])
print tape[:10]
print "execution time: ", time.time() - start_time
print "time to add entry: ", time.time() - insertion_time

giving:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[[1], 0, 0, 0, 0, 0, 0, 0, 0, 0]
execution time:  0.0270080566406
time to add entry:  0.00468802452087

Is there a more efficient way to represent the binary tape than using a list object?

Solution

Given that the tape is infinite at both ends, you will have to consider the (-inf, -1] range of indices too. The best representation, IMO, would be a pair of lists - one for the non-negative range and the other for the negative range (with indexing reversed). As you move beyond the allocated range of the tape, you will simply have to append to the respective list (which, unlike prepending to a list, is a fast operation).

A draft implementation (using an enhanced version of GrowingList from this answer):

class GrowingList(list):
    def __init__(self, default_value):
        self.default_value=default_value

    def __getitem__(self, i):
        return list.__getitem__(i) if i < len(self) else self.default_value

    def __setitem__(self, i, v):
        if i >= len(self):
            self.extend([self.default_value]*(i + 1 - len(self)))
        list.__setitem__(self, i, v)

class Tape:
    def __init__(self):
        self.pos_range=GrowingList(0)
        self.neg_range=GrowingList(0)

    def __getitem__(self, i):
        if i >= 0:
            return self.pos_range[i]
        else:
            return self.neg_range[-i-1]

    def __setitem__(self, i, v):
        if i >= 0:
            self.pos_range[i]=v
        else:
            self.neg_range[-i-1]=v

    def __repr__(self):
        start = -len(self.neg_range)
        end = len(self.pos_range)
        data = list(reversed(self.neg_range)) + self.pos_range
        return "Tape(range=[{}, {}), data={})".format(start, end, data)


t = Tape()
print(t)
t[4]=1
print(t)
t[-2]=1
print(t)

Output:

Tape(range=[0, 0), data=[])
Tape(range=[0, 5), data=[0, 0, 0, 0, 1])
Tape(range=[-2, 5), data=[1, 0, 0, 0, 0, 0, 1])