I'm trying to have this kind of result :
Here is the csv-file :
OsmID,NewName,IdLocal
1020287758,NN1,Id0001
1021229973,NN2,Id0002
1025409497,NN3,Id0003
I'm using the code below:
import csv
input = r'C:\Users\_M92\csvFiles\csv0001.csv'
fileRead = open(input, 'r')
with open(input, 'r') as csv:
headerLine = fileRead.readline()
header = headerLine.split(",")
#print(header)
nameIndex = header.index("OsmID")
output = {}
for line in fileRead.readlines():
values = line.split(",")
output[values[nameIndex]] = values
print(output)
And it results in the following error:
File "c:\Users\_M92\Scripts\CsvToDict.py",
line 19, in <module>
nameIndex = header.index("OsmID")
ValueError: 'OsmID' is not in list
Instead of manually splitting each line by commas, use the CSV module that you've imported. This module contains a DictReader
class that will yield dictionaries for each row. Then, you just need to add this to your output dictionary.
# Create an empty dictionary
# We will add keys to this as needed
output = {}
# Keep track of number of rows, so we can add an empty column if needed
row_count = 0
# This function adds a row to the output dictionary
def add_row(row_dict):
global row_count # Need to declare this as global because we're assigning to the variable in this function
if not row_dict: return # If row is empty, do nothing
for k, v in row_dict.items():
# Loop over all key-value pairs in the row to add
if k not in output: # If the output doesn't contain this column, create a blank column
output[k] = [None] * row_count
output[k].append(v) # Append the value to the correct column in output
row_count += 1
input_file = r'C:\Users\_M92\csvFiles\csv0001.csv'
with open(input_file, 'r') as fh:
reader = csv.DictReader(fh) # Create a DictReader
for row in reader:
add_row(row) # Add every row to the output
This gives the following output
:
{'OsmID': ['1020287758', '1021229973', '1025409497'],
'NewName': ['NN1', 'NN2', 'NN3'],
'IdLocal': ['Id0001', 'Id0002', 'Id0003']}
Note: I removed the blank lines in the input csv you provided, but it doesn't make a difference to the program, since a blank line will yield an empty dictionary from DictReader
, and add_row
doesn't do anything with empty dicts
Note 2: You could discard the row_count
variable if you dynamically count the number of rows like so:
def add_row(row_dict):
row_count = 0
for first_key, first_val in output.items():
row_count = len(first_val)
break # We can just break out here because all keys should have the same number of values
# Create keys that do not yet exist in output but do exist in the new row
existing_keys = set(output.keys())
new_row_keys = set(row_dict.keys())
keys_to_create = new_row_keys - existing_keys
for key in keys_to_create:
output[key] = [None] * row_count
# Append to each column in output
for key in output:
output[key].append(row_dict.get(key, None)) # If the key doesn't exist in the current row, append None