I have a dataframe that has a few IDs and then a column for money, like this:
Id1 Id2 Id3 Money
1 10 13 10000
2 15 12 12500
3 20 11 60000
I need a script to randomly select rows until I hit $80M in money. I'm assuming a while loop such as...
while sum(money) < 80000000:
    df.sample()
To perhaps rephrase your question a bit, it seems that you're looking for a random sample of rows such that the sum of Money is < 80000000. One way to do that would be to use .sample() to shuffle the rows, combined with .cumsum():
>>> reordered = df.sample(n=df.shape[0])
>>> lim = reordered[reordered.Money.cumsum() < 80000000]
This will sample without replacement, since .sample() defaults to replace=False, so no row is picked twice.
This is perhaps not the most memory-efficient approach compared with taking rows one by one, but it should do the trick for a dataframe of a reasonable size.
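If memory does become a concern, taking rows one by one could look roughly like the sketch below. It is only an illustration: the helper name sample_until_limit, its parameters, and the scaled-down limit of 80000 are my own assumptions, chosen so the three example rows from the question actually hit the cutoff. It also assumes Money is never negative, so the running total only grows and the cutoff matches the cumsum() < limit filter.

```python
import numpy as np
import pandas as pd

def sample_until_limit(df, limit, column="Money", seed=None):
    """Hypothetical helper: visit rows in random order and stop
    before the running total of `column` would reach `limit`."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(df))  # random row positions, no repeats
    total = 0
    chosen = []
    for pos in order:
        value = df[column].iloc[pos]
        if total + value >= limit:
            break  # same cutoff as `cumsum() < limit`
        total += value
        chosen.append(pos)
    return df.iloc[chosen]

# Example data from the question; the limit is scaled down for illustration.
df = pd.DataFrame({
    "Id1": [1, 2, 3],
    "Id2": [10, 15, 20],
    "Id3": [13, 12, 11],
    "Money": [10000, 12500, 60000],
})
picked = sample_until_limit(df, limit=80000, seed=0)
```

This avoids building a shuffled copy of the whole dataframe, although it still holds one array of row positions. Note that, like the cumsum() version, it stops at the first row that would push the total past the limit rather than trying later, smaller rows.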