mysql python-2.7 pandas pymysql

How to get the number of rows deleted using pandas


I filter two .csv files and delete the rows whose email IDs appear in both. I can get the total number of rows left after the deletion, but is there a way to get how many rows were deleted using pandas?

In MySQL I would do this: delete a from data a, data1 b where a.email=b.email; select row_count(); How can the same be done using pandas?

import pandas as pd

colnames=['id','emailid']

data=pd.read_csv("input.csv",names=colnames,header=None)

colnames=['email']

data1= pd.read_csv("compare.csv",names=colnames,header=None)

emailid_suppress1=data1['email'].str.lower()

suppress_md5=data[~data['emailid'].isin(emailid_suppress1)]

print suppress_md5.count()

Solution

  • I believe you need the sum of the True values in the boolean mask, since each True is processed like 1. With ~ the mask marks the rows that are kept:

    data = pd.DataFrame({'id':list('abcde'), 'emailid':list('klmno')})
    print (data)
      id emailid
    0  a       k
    1  b       l
    2  c       m
    3  d       n
    4  e       o
    
    data1 = pd.DataFrame({'email':list('ABCKLDEFG')})
    print (data1)
      email
    0     A
    1     B
    2     C
    3     K
    4     L
    5     D
    6     E
    7     F
    8     G
    
    emailid_suppress1=data1['email'].str.lower()
    
    print ((~data['emailid'].isin(emailid_suppress1)).sum())
    3
    
    suppress_md5=data[~data['emailid'].isin(emailid_suppress1)]
    print (suppress_md5)
      id emailid
    2  c       m
    3  d       n
    4  e       o
    

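    EDIT: To count the rows that are removed rather than kept, drop the ~ so the mask marks the matching rows: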

    print ((data['emailid'].isin(emailid_suppress1)).sum())
    2
    
    suppress_md5=data[data['emailid'].isin(emailid_suppress1)]
    
    print (suppress_md5)
      id emailid
    0  a       k
    1  b       l
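
Putting the pieces above together, here is a minimal sketch of two ways to get a row_count()-style number: summing the boolean mask, and comparing the frame length before and after filtering. It reuses the toy frames from the answer in place of input.csv and compare.csv. Lower-casing data['emailid'] as well is an assumption added here for a case-insensitive match (the original code lower-cases only the suppression list), and the names mask, deleted_by_mask and deleted_by_len are made up for illustration.

```python
import pandas as pd

# Toy data standing in for input.csv and compare.csv (same values as the answer above).
data = pd.DataFrame({'id': list('abcde'), 'emailid': list('klmno')})
data1 = pd.DataFrame({'email': list('ABCKLDEFG')})

# Lower-case both sides so the comparison ignores case.
# Assumption: the original snippet lower-cases only the suppression list.
emailid_suppress1 = data1['email'].str.lower()
mask = data['emailid'].str.lower().isin(emailid_suppress1)

# Rows that survive the filter (the "after deletion" frame).
suppress_md5 = data[~mask]

# Option 1: each True in the mask counts as 1, so the sum is the number of removed rows.
deleted_by_mask = int(mask.sum())

# Option 2: difference in row counts before and after filtering.
deleted_by_len = len(data) - len(suppress_md5)

print("rows deleted (mask sum): %d" % deleted_by_mask)
print("rows deleted (length difference): %d" % deleted_by_len)
```

Both numbers should agree. The length difference also works when the filtering is done some other way (for example with a merge), as long as the original frame is still available.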