I have two dataframes in 2 separate .csv files the first one being:
Apple.count.out Orange.count.out Banana.count.out
Total stock 1000 500 1000
average_price 2 2 0.5
I have casted the first one into a dictionary
fruit={'apple':1000,'orange':500,'banana':1000}
df2 consists of a list of store_name containing the stock of individual fruits and as follows:
apple orange banana
store_a 60 25 100
store_b 100 25 200
store_c 200 100 200
store_d 300 300 200
My task is to use use the value from the dictionary to normalise the stocks in individual store for respective food as the total number of stock across different fruits are different in the first place. I think of matching the dictionary name the retrieve the key and writing a function to loop every column to divide the count in every single store by the total number of count(the value in the dictionary) if the column name matches the key. Turning something into this:
apple orange banana
store_a 0.06 0.05 0.1
store_b 0.10 0.05 0.2
store_c 0.20 0.20 0.2
store_d 0.30 0.60 0.2
The i need to log10 transform every value. I tried using the following the function:
import numpy as np
import pandas as pd
fruits = ['apple','orange','banana']
df2[fruits] = df2[fruits].apply(lambda x: (x/t) if t = df2.columns.values for t,u in fruit.items())
df2[fruits] = np.log10(df2[fruits])
But the function seems to not working properly and i am not sure whether this function is elegant enough? Can i have some suggestion on how to improve it please?
And as i hard-coded the dictionary for the first df, is there another way of constructing a function casting them into a dictionary? Thank you very much in advance.
You can use:
fruit = df1.rename(columns=lambda x: x.split('.')[0].lower()).loc['Total stock']
out = df2 / fruit
Output:
>>> out
apple orange banana
store_a 0.06 0.05 0.1
store_b 0.10 0.05 0.2
store_c 0.20 0.20 0.2
store_d 0.30 0.60 0.2
>>> np.log10(out)
apple orange banana
store_a -1.221849 -1.301030 -1.00000
store_b -1.000000 -1.301030 -0.69897
store_c -0.698970 -0.698970 -0.69897
store_d -0.522879 -0.221849 -0.69897