I have a DataFrame and I want to generate a new one by changing the values in just one column, while keeping the original DataFrame intact. I have tried mask, where, and iloc, but the original DataFrame always changes.
import pandas as pd
data = {
"age": [50, 40, 30, 40, 20, 10, 30],
"qualified": [True, False, False, False, False, True, True]
}
df = pd.DataFrame(data)
newdf = df
newdf["age"] = newdf["age"].where(newdf["age"] > 30, 2)
print(newdf)
print(df)
Result:
   age  qualified
0   50       True
1   40      False
2    2      False
3   40      False
4    2      False
5    2       True
6    2       True
   age  qualified
0   50       True
1   40      False
2    2      False
3   40      False
4    2      False
5    2       True
6    2       True
Is there some way to change these values and keep the original?
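Use df.copy(deep=True). The assignment newdf = df does not create a new DataFrame; it only gives the same object a second name, so changes made through newdf also appear in df.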
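A minimal sketch of why the attempt in the question changed both frames, assuming nothing beyond pandas itself. Plain assignment only binds a second name to the same object, while copy() builds an independent one. The names alias and independent are illustrative and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({"age": [50, 40, 30], "qualified": [True, False, False]})

# Plain assignment does not copy: both names refer to the same DataFrame object.
alias = df
print(alias is df)  # True

# copy() (deep=True is the default) builds a new DataFrame with its own data.
independent = df.copy()
print(independent is df)  # False

# Overwriting a column on the copy leaves the original untouched.
independent["age"] = 0
print(df["age"].tolist())  # [50, 40, 30]
```

The identity check with is makes it easy to confirm, before modifying anything, whether two variables share one DataFrame.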
What is the difference between a deep copy and a shallow copy?
import pandas as pd
import numpy as np
data = {
"age": [50, 40, 30, 40, 20, 10, 30],
"qualified": [True, False, False, False, False, True, True]
}
df = pd.DataFrame(data)
# deep copy
newdf = df.copy(deep=True)
newdf["age"] = np.where(newdf["age"] > 30, newdf["age"], 2)
print(newdf)
   age  qualified
0   50       True
1   40      False
2    2      False
3   40      False
4    2      False
5    2       True
6    2       True
print(df)
   age  qualified
0   50       True
1   40      False
2   30      False
3   40      False
4   20      False
5   10       True
6   30       True
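As a possible alternative (not part of the answer above), DataFrame.assign returns a new DataFrame with the given column replaced, so no explicit copy step is needed. This is a sketch using the same data as the question, with Series.where swapped in for np.where.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [50, 40, 30, 40, 20, 10, 30],
    "qualified": [True, False, False, False, False, True, True],
})

# Series.where keeps values where the condition holds and uses 2 elsewhere.
# assign() returns a new DataFrame, so df itself is not modified.
newdf = df.assign(age=df["age"].where(df["age"] > 30, 2))

print(newdf["age"].tolist())  # [50, 40, 2, 40, 2, 2, 2]
print(df["age"].tolist())     # [50, 40, 30, 40, 20, 10, 30]
```

Whether to prefer assign or an explicit copy() is mostly a matter of style. assign suits one-off column changes, while copy() is clearer when several in-place edits follow.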