I have a data set with 40 columns, and I want to define a function that operates on the columns.
For example,
import pandas as pd

p = {'val1': [10, 20, 30, 40],
     'val2': [15, 25, 35, 45]}
data = pd.DataFrame(p, columns=['val1', 'val2'])
data
With this data, I computed the percentage increase of the last column over the column before it:
inc = 100*((data.iloc[:, -1]/ data.iloc[:, -1-1])-1)
inc
The result is:
0 50.000000
1 25.000000
2 16.666667
3 12.500000
dtype: float64
I want to select the maximum value and its index, so I did the following:
(inc.idxmax(), max(inc))
I obtained the following result
(0, 50.0)
Now I define a function:
def increase(column):
    inc = 100*((data.iloc[:, -column]/ data.iloc[:, -column-1])-1)
    return (inc.idxmax(), max(inc))
I select the columns counting backwards from the last one, and I want to apply this function to all my columns:
new_data = data.apply(increase)
When I run this, I get the error:
IndexError: positional indexers are out-of-bounds
If I use applymap, I get the same error.
What can I do?
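If I understand correctly, you want to use more than one column inside the function. apply is not the right tool here because it works on one column or one row at a time (axis=0 or axis=1). It passes the column itself to the function as a Series, not a column position, so -column cannot be used as a positional index in iloc.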
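To see what apply actually hands to the function, here is a minimal sketch using the sample data from the question. The name received is just an illustrative variable, not something from the original post.

```python
import pandas as pd

data = pd.DataFrame({'val1': [10, 20, 30, 40],
                     'val2': [15, 25, 35, 45]})

# apply (default axis=0) calls the function once per column and passes
# the column itself, so we record the type of the argument it receives.
received = data.apply(lambda col: type(col).__name__)
print(received)
```

Each entry is the string 'Series', which shows that the function never receives an integer column number.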
My suggestion is to use a for loop and pass two adjacent columns to the function on each iteration, like this:
def increase(col1, col2):
    inc = 100 * ((col2 / col1) - 1)
    return (inc.idxmax(), max(inc))

lst = []
for i in range(len(data.columns)):
    j = i + 1
    if j < len(data.columns):
        col1 = data[data.columns[i]]
        col2 = data[data.columns[j]]
        lst.append(increase(col1, col2))
pd.DataFrame(lst)
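As a possible alternative to the loop, the same pairwise comparison can be sketched without iterating, by dividing every column except the first by the column before it. This is only a sketch under the assumption that all columns are numeric; the names prev, curr and summary are illustrative and not from the original post.

```python
import pandas as pd

data = pd.DataFrame({'val1': [10, 20, 30, 40],
                     'val2': [15, 25, 35, 45]})

prev = data.iloc[:, :-1]   # every column except the last
curr = data.iloc[:, 1:]    # every column except the first

# Divide by position rather than by label: to_numpy() drops the column
# names so that each column is matched with the one before it.
inc = 100 * (curr / prev.to_numpy() - 1)

# One row per column pair, labelled by the later column of the pair.
summary = pd.DataFrame({'idxmax': inc.idxmax(), 'max': inc.max()})
print(summary)
```

With the two-column sample there is a single row, labelled val2, holding the row index of the largest increase and the increase itself. With 40 columns there would be 39 rows.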