I am able to read multiple CSV files from an S3 bucket with boto3 in Python and combine them into a single pandas DataFrame. However, some of the folders contain empty files, which results in the error "No columns to parse from file". Can we skip those empty files in the code below?
import io

import boto3
import pandas as pd

s3 = boto3.resource('s3')
bucket = s3.Bucket('testbucket')
prefix_objs = bucket.objects.filter(Prefix="extracted/abc")
prefix_df = []
for obj in prefix_objs:
    key = obj.key
    body = obj.get()['Body'].read()
    temp = pd.read_csv(io.BytesIO(body), header=None, encoding='utf8', sep=',')
    prefix_df.append(temp)
I have used this answer: https://stackoverflow.com/questions/52855221/reading-multiple-csv-files-from-s3-bucket-with-boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('testbucket')
prefix_objs = bucket.objects.filter(Prefix="extracted/abc")
prefix_df = []
for obj in prefix_objs:
    key = obj.key
    body = obj.get()['Body'].read()
    try:
        temp = pd.read_csv(io.BytesIO(body), header=None, encoding='utf8', sep=',')
    except pd.errors.EmptyDataError:
        # Empty file: nothing to parse, skip it
        continue
    prefix_df.append(temp)
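A bare `except` will also silently swallow genuine errors (bad credentials, malformed CSVs, network failures). A narrower sketch is to check for zero-byte bodies up front and catch only `pd.errors.EmptyDataError` from the parse. The helper below demonstrates this with in-memory bytes standing in for S3 object bodies (`read_csv_bytes` is a hypothetical name, not part of boto3 or pandas):

```python
import io

import pandas as pd


def read_csv_bytes(body: bytes):
    """Parse CSV bytes into a DataFrame; return None for empty files."""
    if not body:
        # Zero-byte file: nothing to parse
        return None
    try:
        return pd.read_csv(io.BytesIO(body), header=None, encoding='utf8', sep=',')
    except pd.errors.EmptyDataError:
        # File had bytes but no parseable columns (e.g. only whitespace)
        return None


# In-memory stand-ins for S3 object bodies; the middle one is empty
bodies = [b"1,a\n2,b\n", b"", b"3,c\n"]
frames = [df for df in (read_csv_bytes(b) for b in bodies) if df is not None]
combined = pd.concat(frames, ignore_index=True)
print(len(combined))  # 3 rows: two from the first file, one from the third
```

In the S3 loop you can additionally skip empty objects without downloading them at all, since `ObjectSummary` exposes the size: `if obj.size == 0: continue` before calling `obj.get()`.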