python, performance, random-access

Random-access container for strings in Python?


I work with indexed objects (say, music tracks) and need to look up an object's name by its index (int -> string). Dicts are too slow (I have about 10M objects). Memory is not a problem, so the most convenient solution would be to build a random-access array of strings from a CSV file of names.

However, I could not get this to work in Python: I got an error saying that 0-dim arrays (strings) can't be indexed. What is the native Python way to create a random-access container for strings?


Solution

  • As far as I remember, dictionary lookups in Python take O(1) time on average, but indexing a list should still be faster because it skips hashing the key. If your indices are not very sparse, you can try something like this:

    # Written for Python 3.
    reader = [(1, 'a'), (2, 'b')]  # Replace this with your CSV reader.

    # First, fill a dictionary:
    text_dict = {}
    for index, text in reader:
        text_dict[index] = text

    # Then create a sufficiently large list:
    max_index = max(text_dict)  # largest key in the dictionary
    texts = [None] * (max_index + 1)

    # And fill it; indices with no entry stay None:
    for index, text in text_dict.items():
        texts[index] = text

    print(texts)
    # prints: [None, 'a', 'b']
    print(texts[1])
    # prints: a
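
If the intermediate dictionary is a concern at 10M rows, the same list can probably be built in one pass straight from the CSV reader. The following is only a sketch under assumptions: the file is assumed to hold one `index,name` pair per line with no header row and non-negative integer indices, `build_name_list` is a hypothetical helper name, and the in-memory `sample_csv` stands in for your real file.

```python
import csv
import io


def build_name_list(rows):
    """Return a list where names[i] is the name for index i (None for gaps)."""
    names = []
    for raw_index, name in rows:
        index = int(raw_index)  # csv.reader yields strings, so convert
        if index >= len(names):
            # Pad with None up to and including the new index.
            names.extend([None] * (index + 1 - len(names)))
        names[index] = name
    return names


# Hypothetical sample data; replace with open("names.csv", newline="").
sample_csv = io.StringIO("1,Track A\n2,Track B\n5,Track C\n")
names = build_name_list(csv.reader(sample_csv))

print(names[2])
print(len(names))
```

Lookups are then plain list indexing (`names[i]`), and any index missing from the file maps to `None`. With very sparse indices the padding wastes memory, in which case a dictionary is likely still the better fit.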