
How to create a new column by dividing two columns in Graphlab SFrame?


Given a Graphlab SFrame as such:

+-------+------------+---------+-----------+
| Store |    Date    |  Sales  | Customers |
+-------+------------+---------+-----------+
|   1   | 2015-07-31 |  5263.0 |   555.0   |
|   2   | 2015-07-31 |  6064.0 |   625.0   |
|   3   | 2015-07-31 |  8314.0 |   821.0   |
|   4   | 2015-07-31 | 13995.0 |   1498.0  |
|   3   | 2015-07-20 |  4822.0 |   559.0   |
|   2   | 2015-07-10 |  5651.0 |   589.0   |
|   4   | 2015-07-11 | 15344.0 |   1414.0  |
|   5   | 2015-07-23 |  8492.0 |   833.0   |
|   2   | 2015-07-19 |  8565.0 |   687.0   |
|   10  | 2015-07-09 |  7185.0 |   681.0   |
+-------+------------+---------+-----------+
[986159 rows x 4 columns]

How do I add a "Sales per Customer" column by dividing Sales by Customers for each row?

I have tried the following, but it doesn't work (sf is my SFrame):

sf['salespercustomer'] = sf.apply(lambda x: sf['Sales']/sf['Customers'])

Interestingly, I get an SArray as output with:

sf['Sales'] / sf['Customers']

But that doesn't really help me add the column back to sf, so this doesn't work either =( :

sf['salescustomer'] = sf['Sales'] / sf['Customers']

Solution

  • The last line of code should do the trick. Note that you said your SFrame is called sf, not train; when I run it against sf, it works fine.
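
As for why the first attempt in the question fails: the lambda ignores its row argument x and divides the whole columns instead of the values in that row. Below is a minimal sketch of a row-wise version, assuming SFrame.apply passes each row to the function as a dict keyed by column name. The rows are modelled as plain dicts so the snippet runs without GraphLab installed. The helper name sales_per_customer and the choice to return None when Customers is 0 are illustrative assumptions, not part of the original answer.

```python
# Rows are modelled as plain dicts, mirroring what SFrame.apply is assumed
# to hand to the function. The sample values are taken from the table in the question.
rows = [
    {"Store": 1, "Date": "2015-07-31", "Sales": 5263.0, "Customers": 555.0},
    {"Store": 2, "Date": "2015-07-31", "Sales": 6064.0, "Customers": 625.0},
]


def sales_per_customer(row):
    """Return Sales / Customers for one row, or None when Customers is 0."""
    customers = row["Customers"]
    if customers == 0:
        return None  # assumption: treat zero-customer rows as missing
    return row["Sales"] / customers


# Apply the function row by row, using the row argument rather than whole columns.
ratios = [sales_per_customer(row) for row in rows]

# With a real SFrame (not executed here), the same function would be used as:
#   sf['salespercustomer'] = sf.apply(sales_per_customer)
# The column-wise form from the question does the same division in one step:
#   sf['salespercustomer'] = sf['Sales'] / sf['Customers']
```

The column-wise division is usually the simpler choice; the row-wise function is mainly useful when each row needs extra logic, such as the zero check above.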