I have a DataFrame like this:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([['apple', 'golden', 3], ['apple', 'green', 6], ['banana', 'golden', 9], ['apple', 'golden', 5], ['apple', 'green', 6], ['banana', 'golden', 6]]),
                  columns=['Column1', 'Column2', 'Column3'])
df
Column1 Column2 Column3
0 apple golden 3
1 apple green 6
2 banana golden 9
3 apple golden 5
4 apple green 6
5 banana golden 6
I want to compare each row of "Column1" with the previous row and store the result in a new Column4: True if the value differs from the previous row, False if it does not.
Column1 Column2 Column3 Column4
0 apple golden 3 False
1 apple green 6 False
2 banana golden 9 True
3 apple golden 5 True
4 apple green 6 False
5 banana golden 6 True
Lastly, wherever the comparison result is True, I want to add the Column1 value to a list.
list = ['banana']
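Compare the shifted values with the original Column1 for inequality, using fillna to replace the first (missing) shifted value with the original Column1 value so that the first row is never flagged: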
df['Column4'] = df.Column1.shift().fillna(df.Column1).ne(df.Column1)
print (df)
Column1 Column2 Column3 Column4
0 apple golden 3 False
1 apple green 6 False
2 banana golden 9 True
3 apple golden 5 True
4 apple green 6 False
5 banana golden 6 True
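To make the one-liner easier to follow, here is a minimal sketch that splits it into its three steps. The intermediate names shifted and filled are only for illustration, and the frame is reduced to Column1 since the other columns do not affect the result.

```python
import pandas as pd

# Only Column1 is needed to illustrate the comparison.
df = pd.DataFrame({
    'Column1': ['apple', 'apple', 'banana', 'apple', 'apple', 'banana'],
})

# Step 1: move every value down one row; the first row has no predecessor and becomes NaN.
shifted = df['Column1'].shift()

# Step 2: fill that NaN with the row's own value, so the first row compares equal to itself.
filled = shifted.fillna(df['Column1'])

# Step 3: element-wise "not equal" gives True where a row differs from the previous one.
df['Column4'] = filled.ne(df['Column1'])

result = df['Column4'].tolist()
print(result)
```

Without the fillna step the first row would be compared against NaN and come out as True, which is not what the expected output shows.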
For the list, don't use list as the variable name, because it shadows the Python built-in of the same name:
L = df.loc[df['Column4'], 'Column1'].unique().tolist()
print (L)
['banana', 'apple']
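If both steps are needed in several places, they could be wrapped in a small helper. This is only a sketch: changed_values is a made-up name, not a pandas function, and it assumes the input is a Series without missing values.

```python
import pandas as pd


def changed_values(series):
    """Return the distinct values that differ from the preceding row,
    in order of first appearance (hypothetical helper, not part of pandas)."""
    # True where a row differs from the previous one; the first row is never flagged.
    changed = series.shift().fillna(series).ne(series)
    # Boolean indexing keeps the flagged rows; unique() drops repeats and keeps order.
    return series[changed].unique().tolist()


fruits = pd.Series(['apple', 'apple', 'banana', 'apple', 'apple', 'banana'])
L = changed_values(fruits)
print(L)
```

Note that unique() keeps the values in the order they first appear, which is why 'banana' comes before 'apple' here.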