I'm relatively new to pandas and am trying to delete every row from file XYZ that is also present in file ABC.
Code:
import pandas as pd
# Read the two CSV files
clm1 = pd.read_csv('ABC.csv')
clm2 = pd.read_csv('XYZ.csv')
# Prints file length
print('Main file clm2: '+ str(len(clm2['image_url'])))
print('Referral file clm1: ' + str(len(clm1['Input.image_url'])))
for index1 in clm1.index:
    for index2 in clm2.index:
        if clm2['image_url'][index2] == clm1['Input.image_url'][index1]:
            print("Entered into deletion condition!!")
            print(clm2['image_url'][index2])
            print(clm1['Input.image_url'][index1])
            print('\n \n')
            clm2.drop(clm2['image_url'][index2], axis=0, inplace=True)
            print('Deleted!!')
print('Main file clm2: ' + str(len(clm2['image_url'])))
On entering the deletion condition, it prints the lines below correctly:
print(clm2['image_url'][index2])
print(clm1['Input.image_url'][index1])
print('\n \n')
But I get an error on this line:
clm2.drop(clm2['image_url'][index2], axis=0, inplace=True)
Error says:
  File "compare_delete_imagelinks.py", line 19, in <module>
    clm2.drop(clm2['image_url'][index2], axis=0, inplace=False)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/frame.py", line 3940, in drop
    errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/generic.py", line 3780, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/generic.py", line 3812, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 4965, in drop
    '{} not found in axis'.format(labels[mask]))
KeyError: "['https://Xxxxxxx.216PPU~V.JPG'] not found in axis"
How can I fix this?
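DataFrame.drop with axis=0 expects index labels, not column values, so passing the URL string raises the KeyError unless the URLs are the index. Loading the column you compare on as the index avoids this. This should work if your CSV files look like this: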
XYZ.csv:
name,value
a,1
b,2
c,3
d,4
e,5
f,6
ABC.csv:
name,value
a,1
b,2
c,3
d,4
Code:
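import pandas as pd

# Load both files with 'name' as the index so rows can be dropped by label
xyz = pd.read_csv("XYZ.csv", index_col='name')
abc = pd.read_csv("ABC.csv", index_col='name')

# Drop every row of xyz whose index label also appears in abc
for i in abc.index:
    if i in xyz.index:
        xyz.drop(i, axis=0, inplace=True)

print(xyz)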
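If you would rather avoid the explicit loop, a boolean mask built with Series.isin is one possible alternative. The following is only a sketch: it reuses the column names from the question (image_url and Input.image_url), but the two small DataFrames and their example.com URLs are made-up stand-ins for ABC.csv and XYZ.csv, and the label column is hypothetical.

```python
import pandas as pd

# Made-up stand-ins for ABC.csv (clm1) and XYZ.csv (clm2); URLs are placeholders
clm1 = pd.DataFrame({
    'Input.image_url': ['http://example.com/a.jpg', 'http://example.com/b.jpg'],
})
clm2 = pd.DataFrame({
    'image_url': ['http://example.com/a.jpg',
                  'http://example.com/b.jpg',
                  'http://example.com/c.jpg'],
    'label': [1, 2, 3],  # hypothetical extra column, carried along unchanged
})

# True for each row of clm2 whose URL also appears in clm1
in_both = clm2['image_url'].isin(clm1['Input.image_url'])

# Keep only the rows of clm2 that are NOT in clm1, then renumber the index
remaining = clm2[~in_both].reset_index(drop=True)

print(remaining)
```

With real files you would replace the two DataFrame constructors with the pd.read_csv calls from the question. This compares on the URL column only, so rows that share a URL but differ in other columns are still removed.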