I am creating a Python program that takes a CSV file as input (the file location is passed on the command line). Before doing any preprocessing, it should check whether the content of the file is in a specific format and, if not, raise an exception telling the user to choose a correct file.
The content should be something like this:
Sr.no . Codes . v1 . v2 . v3 . v4 . ... v300
1 . code1 . val1 . val2 . val3 . NA . ... NA
2 . code2 . val4 . NA . NA . NA . ... NA
3 . code3 . val5 . val6 . NA . NA . ... NA
4 . code4 . val7 . val8 . val9 . NA . ... NA
.
.
Basically, it should be a CSV file whose first two columns are Sr.no and Codes, followed by 300 value columns (v1 to v300). In each row the actual values come first, and the remaining columns up to v300 are filled with 'NA'.
If the user uploads something like this:
Sr.no . Codes . v1 . v2 . v3 . . . . . . v300
1 . code1 . NA . val1 . NA . . . . . . NA
2 . code2 . val2 . val3 . NA . . . . . . NA
it should raise an exception, because in the row with Sr.no = 1 there is a value in column v2 even though column v1 is NA.
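Put as a rule: within the value columns of a row, once an NA appears, every later cell must also be NA. A minimal sketch of that check for a single row, where `values_then_na` is a hypothetical helper name and the cells are assumed to be already-split strings:

```python
def values_then_na(cells):
    """Return True if no real value follows an 'NA' in the list of cells."""
    seen_na = False
    for cell in cells:
        if cell.strip() == "NA":
            seen_na = True
        elif seen_na:
            # a non-NA value after an NA breaks the rule
            return False
    return True


print(values_then_na(["val1", "val2", "NA", "NA"]))  # True
print(values_then_na(["NA", "val1", "NA"]))          # False
```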
I want to know how I can assert that the content of the file is in this format using Python (a sample code snippet would be helpful). I would also appreciate sources where I can learn how to validate file content, not just for this format but for generic formats as well.
So far I have the following, and I need to complete the assert_format function:
import sys
import csv

def assert_format(file_name):
    csv_file = open(file_name)
    reader = csv.reader(csv_file)
    #code to check format
    return True

file_name = sys.argv[1]
if assert_format(file_name):
    print("format is correct")
else:
    print("choose correct file")
Thanks in advance!
See if this fits your requirement:
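import sys
import csv

def assert_format(file_name):
    # Python 3: open in text mode with newline='' as the csv module expects
    with open(file_name, newline='') as csvfile:
        # '.' matches the sample as shown; use the default ',' for a comma-separated file
        reader = csv.reader(csvfile, delimiter='.')
        next(reader, None)  # skip the header row
        for row in reader:
            seen_na = False
            # skip the Sr.no and Codes columns; check only the value columns
            for cell in row[2:]:
                if cell.strip() == 'NA':
                    seen_na = True
                elif seen_na:
                    # a value appears after an NA in the same row
                    return False
    return True

file_name = sys.argv[1]
if assert_format(file_name):
    print("format is correct")
else:
    print("choose correct file")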
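Since the question asks for an exception rather than a True/False result, a variant could raise an error that names the offending line and column, and also check the column count. This is only a sketch under some assumptions: `FormatError`, `validate_csv`, and the `n_values` parameter are hypothetical names, the file is assumed to be comma-separated by default, and it targets Python 3.

```python
import csv


class FormatError(ValueError):
    """Raised when the file does not match the expected layout."""


def validate_csv(file_name, n_values=300, delimiter=","):
    """Raise FormatError on the first problem found; return None if the file is fine."""
    with open(file_name, newline="") as csv_file:
        reader = csv.reader(csv_file, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            raise FormatError("file is empty")
        # two leading columns (Sr.no, Codes) plus the value columns
        if len(header) != n_values + 2:
            raise FormatError(
                f"expected {n_values + 2} columns, found {len(header)}"
            )
        # the header is line 1, so data rows start at line 2
        for line_no, row in enumerate(reader, start=2):
            if len(row) != len(header):
                raise FormatError(
                    f"line {line_no}: expected {len(header)} cells, found {len(row)}"
                )
            seen_na = False
            for col, cell in zip(header[2:], row[2:]):
                if cell.strip() == "NA":
                    seen_na = True
                elif seen_na:
                    raise FormatError(
                        f"line {line_no}: value in column {col.strip()} after an NA"
                    )
```

In the script, the call would be wrapped in `try: validate_csv(sys.argv[1])` with an `except FormatError as err:` branch that prints the message, so the user sees which line and column caused the rejection.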