I am practicing Python, and in one exercise I am supposed to open a .csv file and find how many times the name "Max" occurs in California ("CA") between 1950 and 2000. This is what I have done:
    import csv
    counter = 0
    for line in file:
        counter = counter + 1
        line_splitted = line.strip().split(",")
        if line_splitted[1] == "Max":
            print(line_splitted)
An extract of the output (the entries are many more) is:
['17261', 'Max', '1965', 'M', 'AK', '6']
['20094', 'Max', '1983', 'M', 'AK', '5']
['20291', 'Max', '1984', 'M', 'AK', '5']
['20604', 'Max', '1986', 'M', 'AK', '10']
['20786', 'Max', '1987', 'M', 'AK', '10']
Then I wrote:
    if line_splitted[1] == "Max" and line_splitted[2] >= 1950 and line_splitted[2] <= 2000 and line_splitted[3] == "M" and line_splitted[4] == "CA":
        print(line_splitted)
    else:
        continue
And this is the result:
TypeError Traceback (most recent call last)
<ipython-input-53-d4b5d798cf33> in <module>
8 line_splitted = line.strip().split(",")
9 if line_splitted[1] == "Max":
---> 10 if line_splitted[1] == "Max" and line_splitted[2] >= 1950 and line_splitted[2] <= 2000 and line_splitted[3] == "M" and line_splitted[4]== "CA":
11 print(line_splitted)
12
TypeError: '>=' not supported between instances of 'str' and 'int'
I know that I should tell Python to convert the entry at index 2 to an integer, but I don't know how to do it. Moreover, I suspect that my solution is far too long for extracting the information I need. Thank you very much in advance for any suggestions.
The easiest way (for your example) is probably to compare to a string:
    and line_splitted[2] >= "1950"
This way you don't have to convert to integer first.
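This only works if all the year strings are exactly 4 characters long, because strings are compared character by character rather than by numeric value. The upper bound needs the same change: line_splitted[2] <= "2000".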
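If you would rather compare numbers, the conversion asked about in the question is just int(line_splitted[2]). Below is a minimal sketch of that approach, not a definitive solution. It assumes the column order seen in the printed output (id, name, year, gender, state, count). The SAMPLE rows and the function name count_max_rows are made up for illustration, and the sketch counts matching rows; if the exercise wants the total number of babies, you would sum the last column instead.

```python
import csv
import io

# Hypothetical sample data in the column layout seen in the question's output:
# id, name, year, gender, state, count. The real file is not reproduced here.
SAMPLE = """\
17261,Max,1965,M,AK,6
30001,Max,1949,M,CA,7
30002,Max,1950,M,CA,8
30003,Max,1983,M,CA,12
30004,Max,2001,M,CA,20
30005,Mary,1983,F,CA,30
"""


def count_max_rows(lines, name="Max", state="CA", first_year=1950, last_year=2000):
    """Count rows with the given name and state whose year is in [first_year, last_year]."""
    matches = 0
    # csv.reader does the splitting on commas for us
    for row in csv.reader(lines):
        if len(row) < 5:
            continue  # skip blank or truncated lines
        try:
            year = int(row[2])  # convert the year string to an integer
        except ValueError:
            continue  # skip a header line or a malformed year
        # Chained comparison: both bounds are checked numerically
        if row[1] == name and row[4] == state and first_year <= year <= last_year:
            matches += 1
    return matches


# io.StringIO stands in for an open file here; with a real file you would
# pass the file object returned by open(...) instead.
result = count_max_rows(io.StringIO(SAMPLE))
print(result)
```

With the sample rows above, only the 1950 and 1983 entries for CA match, so this prints 2. The numeric comparison also avoids the length caveat of the string approach: a malformed year such as "19500" would pass the string test ("1950" <= "19500" <= "2000") but is correctly rejected once converted with int().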