I am filtering two .csv files and deleting the email IDs that are common to both. I am able to get the total after the deletion, but is there any option in pandas that gives how many rows were deleted?
Using MySQL: delete a from data a, data1 b where a.email=b.email; select row_count(); How can this be done using pandas?
import pandas as pd
colnames=['id','emailid']
data=pd.read_csv("input.csv",names=colnames,header=None)
colnames=['email']
data1= pd.read_csv("compare.csv",names=colnames,header=None)
emailid_suppress1=data1['email'].str.lower()
suppress_md5=data[~data['emailid'].isin(emailid_suppress1)]
print(suppress_md5.count())
I believe you need sum of the True values, which are processed like 1s:
data = pd.DataFrame({'id':list('abcde'), 'emailid':list('klmno')})
print (data)
  id emailid
0  a       k
1  b       l
2  c       m
3  d       n
4  e       o
data1 = pd.DataFrame({'email':list('ABCKLDEFG')})
print (data1)
  email
0     A
1     B
2     C
3     K
4     L
5     D
6     E
7     F
8     G
emailid_suppress1=data1['email'].str.lower()
print ((~data['emailid'].isin(emailid_suppress1)).sum())
3
suppress_md5=data[~data['emailid'].isin(emailid_suppress1)]
print (suppress_md5)
  id emailid
2  c       m
3  d       n
4  e       o
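Note that the 3 printed above is the number of rows that are kept. As a small sketch on the same sample data, the number of removed rows can also be derived as the difference between the row counts before and after filtering (the names kept and removed are only illustrative):

```python
import pandas as pd

data = pd.DataFrame({'id': list('abcde'), 'emailid': list('klmno')})
data1 = pd.DataFrame({'email': list('ABCKLDEFG')})

emailid_suppress1 = data1['email'].str.lower()

# Rows whose emailid is NOT in the suppression list survive the filter
kept = data[~data['emailid'].isin(emailid_suppress1)]

# Removed rows = rows before filtering - rows after filtering
removed = len(data) - len(kept)
print(removed)
```

For this sample data it prints 2, because only k and l appear in the suppression list.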
EDIT: To count the matching rows (the ones that get removed), drop the ~:
print ((data['emailid'].isin(emailid_suppress1)).sum())
2
suppress_md5=data[data['emailid'].isin(emailid_suppress1)]
print (suppress_md5)
  id emailid
0  a       k
1  b       l
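If both the filtered frame and the count are needed together, the two steps could be wrapped in one helper. This is only a sketch: drop_suppressed is a hypothetical name, not a pandas function, and lowercasing both sides is an assumption made here so that differently cased addresses are treated as equal (the code above lowercases only the suppression list).

```python
import pandas as pd

def drop_suppressed(data, suppress, col='emailid'):
    """Return (filtered frame, number of rows removed). Hypothetical helper."""
    # Assumption: compare case-insensitively by lowercasing both sides
    mask = data[col].str.lower().isin(suppress.str.lower())
    # mask.sum() counts the True values, i.e. the rows being dropped
    return data[~mask], int(mask.sum())

# Sample data with mixed case in both frames
data = pd.DataFrame({'id': list('abcde'), 'emailid': list('KlmnO')})
data1 = pd.DataFrame({'email': list('ABCKLDEFG')})

result, n_removed = drop_suppressed(data, data1['email'])
print(n_removed)
print(result)
```

Here n_removed plays the role of row_count() after the delete in the MySQL version, and result holds the remaining rows.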