
How can I use a loop to create similarly-named strings for a number of similar columns imported from a csv?


I want to work with data from a CSV in Python. I'm looking to make each column a separate string, and I'm wondering whether there is a way to loop through this process so that I don't have to specify the name of each string individually (the naming conventions are very similar).

For a number of the csv columns, I am using the following code:

    dot_title=str(row[0]).lower()
    onet_title=str(row[1]).lower()

For row[2] through row[11], I would like each string to have the same name but a different number. That is, row[2] would become a string called onet_reported_1, row[3] would be onet_reported_2, row[4] would be onet_reported_3, and so on, all the way through to row[11].

Is there a way of doing this with a loop, instead of simply defining onet_reported_1, _2, _3, _4 etc. individually?

Thanks in advance!


Solution

  • So, first some clarity.

    A string is a data type. In Python, you create a string by surrounding some text with either single or double quotes.

    "This is a string"
    'So is this. It can have number characters: 123. Or any characters: !@#$'
    

    Strings are values that can be assigned to a variable. So you use a string by giving it a name:

    my_string = "This is a string"
    another_string = "One more of these"
    

    You can perform different kinds of operations on strings, such as joining them with the + operator:

    new_string = my_string + another_string
    

    And you can create lists of strings:

    list_of_strings = [new_string, my_string, another_string]
    

    which looks like ["This is a stringOne more of these", "This is a string", "One more of these"].

    To create multiple strings in a loop, you'll need a place to store them. A list is a good candidate:

    list_of_strings = []
    for i in range(1, 11):
        # str(i) is needed because a str and an int cannot be joined with +
        list_of_strings.append("onet_reported_" + str(i))
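
If only the values are needed, and not the numbered names, a slice of the row can stand in for the counting loop. This is a minimal sketch, assuming (as in the question) that row[2] through row[11] hold the ten reported values; the contents of `row` here are made up for illustration.

```python
# Hypothetical row, standing in for one line read by csv.reader.
row = ["Dot Title", "Onet Title"] + [f"Reported {n}" for n in range(1, 11)]

# row[2:12] selects indexes 2 through 11, i.e. the ten reported values.
onet_reported = [str(value).lower() for value in row[2:12]]

print(onet_reported[0])    # the value that came from row[2]
print(len(onet_reported))  # number of values collected
```

With a list, the value the question calls onet_reported_1 is `onet_reported[0]`, since list indexes start at zero.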
    

    But I think what you want is to name the variables "onet_reported_x" so that you end up with something equivalent to:

    onet_reported_1 = row[2]
    onet_reported_2 = row[3]
    

    and so forth, without having to type out all that redundant code. That's a good instinct. One nice way to do this kind of thing is to create a dictionary where the keys are the names you want and the values are the matching row entries. Since onet_reported_1 comes from row[2], the row index is always one more than the number in the name. You can do this in a loop:

    onet_dict = {}
    for i in range(1, 11):
        # str(i) converts the counter so it can be joined to the prefix
        onet_dict["onet_reported_" + str(i)] = row[i + 1]
    

    or with a dictionary comprehension:

    onet_dict = {"onet_reported_" + str(i): row[i + 1] for i in range(1, 11)}
    

    Both will give you the same result. Now you have a collection of strings, with the names you want as the keys of the dict, mapped to the row values you want them associated with. To use them, instead of referring directly to the name onet_reported_x, you access the value from the dict like this:

    # Adding some other value to onet_reported_5. I'm assuming the values are numbers.
    
    onet_dict["onet_reported_5"] += 20457
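
Putting the pieces together, the sketch below shows one way the whole thing might look when reading from a CSV. The sample data, the `records` list, and the use of `io.StringIO` in place of a real file are all assumptions made for illustration; with a real file you would pass the object returned by `open(...)` to `csv.reader` instead.

```python
import csv
import io

# Hypothetical sample data: two title columns followed by ten reported titles.
sample = io.StringIO(
    "Baker,Bakers,Pastry Chef,Bread Maker,T3,T4,T5,T6,T7,T8,T9,T10\n"
)

records = []
for row in csv.reader(sample):
    record = {
        "dot_title": str(row[0]).lower(),
        "onet_title": str(row[1]).lower(),
    }
    # Number row[2]..row[11] as onet_reported_1..onet_reported_10.
    for n, value in enumerate(row[2:12], start=1):
        record[f"onet_reported_{n}"] = str(value).lower()
    records.append(record)

print(records[0]["onet_reported_2"])  # bread maker
```

Using `enumerate(..., start=1)` avoids the index arithmetic. Note that `csv.reader` returns every field as a string, so numeric columns need `int()` or `float()` before any arithmetic.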