I have an inventory journal that contains products and their running inventory quantity (resulting_qty), as well as the loss/gain recorded every time inventory is added or subtracted (delta_qty).
The issue is that inventory records are not updated daily; they are only written when a change in inventory occurs. This makes it difficult to extract the total inventory quantity for all items on a given day, because some items have no record on certain days even though they do have available inventory (their last recorded resulting_qty was greater than 0). Logically, this means an item went without a change in quantity for a number of days equal to the gap between the max date and its last recorded date.
My data looks something like this, except that in reality there are thousands of product IDs:
| date | timestamp | pid | delta_qty | resulting_qty |
|------------|---------------------|-----|-----------|---------------|
| 2017-03-06 | 2017-03-06 12:24:22 | A | 0 | 0.0 |
| 2017-03-31 | 2017-03-31 02:43:11 | A | 3 | 3.0 |
| 2017-04-08 | 2017-04-08 22:04:35 | A | -1 | 2.0 |
| 2017-04-12 | 2017-04-12 18:26:39 | A | -1 | 1.0 |
| 2017-04-19 | 2017-04-19 09:15:38 | A | -1 | 0.0 |
| 2019-01-16 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-19 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-05 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-22 | 2019-04-22 11:06:33 | B | -1 | 1.0 |
| 2019-04-23 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-09 | 2019-05-09 16:25:41 | C | 2 | 2.0 |
Essentially, I need to make the data look something more like the table below, so that I can simply pull a date and get the total inventory for that day by grouping on date (e.g. df.groupby('date').resulting_qty.sum()).
Note that I removed the pid = A rows due to character limitations, but I hope you get the idea:
| date | timestamp | pid | delta_qty | resulting_qty |
|------------|---------------------|-----|-----------|---------------|
| 2019-01-16 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-17 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-18 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-19 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-20 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-21 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-22 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-23 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-24 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-25 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-26 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-27 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-28 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-29 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-30 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-01-31 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-01 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-02 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-03 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-04 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-05 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-06 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-07 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-08 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-09 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-10 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-11 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-12 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-13 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-14 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-15 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-16 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-17 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-18 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-19 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-20 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-21 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-22 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-23 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-24 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-25 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-26 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-27 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-02-28 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-01 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-02 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-03 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-04 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-05 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-06 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-07 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-08 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-09 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-10 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-11 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-12 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-13 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-14 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-15 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-16 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-17 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-18 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-19 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-20 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-21 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-22 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-23 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-24 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-25 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-26 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-27 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-28 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-29 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-30 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-03-31 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-04-01 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-04-02 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-04-03 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-04-04 | 2019-01-16 23:37:17 | B | 0 | 0.0 |
| 2019-04-05 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-06 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-07 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-08 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-09 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-10 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-11 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-12 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-13 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-14 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-15 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-16 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-17 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-18 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-19 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-20 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-21 | 2019-04-05 16:40:32 | B | 2 | 2.0 |
| 2019-04-22 | 2019-04-22 11:06:33 | B | -1 | 1.0 |
| 2019-04-23 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-04-24 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-04-25 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-04-26 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-04-27 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-04-28 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-04-29 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-04-30 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-01 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-02 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-03 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-04 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-05 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-06 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-07 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-08 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-09 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-05-10 | 2019-04-23 13:23:17 | B | -1 | 0.0 |
| 2019-01-19 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-20 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-21 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-22 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-23 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-24 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-25 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-26 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-27 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-28 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-29 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-30 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-01-31 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-01 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-02 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-03 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-04 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-05 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-06 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-07 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-08 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-09 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-10 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-11 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-12 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-13 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-14 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-15 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-16 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-17 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-18 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-19 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-20 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-21 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-22 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-23 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-24 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-25 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-26 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-27 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-02-28 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-01 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-02 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-03 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-04 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-05 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-06 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-07 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-08 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-09 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-10 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-11 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-12 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-13 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-14 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-15 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-16 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-17 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-18 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-19 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-20 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-21 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-22 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-23 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-24 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-25 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-26 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-27 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-28 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-29 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-30 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-03-31 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-01 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-02 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-03 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-04 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-05 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-06 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-07 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-08 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-09 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-10 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-11 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-12 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-13 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-14 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-15 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-16 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
| 2019-04-17 | 2019-01-19 09:40:38 | C | 0 | 0.0 |
So far, I have written a series of loops that generate a date range between the min date of each product's lifecycle and the max date across all products. If there is no record for a given date, I append the last recorded row's values as a new row with the new date. I collect everything in lists and then build a new dataframe from the updated lists. The code is terribly slow and takes 2+ hours to complete on the full dataset:
date_list = []
pid_list = []
time_stamp_list = []
delta_qty_list = []
resulting_qty_list = []

timer = len(test.product_id.unique().tolist())
counter = 0
for product in test.product_id.unique().tolist():
    counter += 1
    print((counter / timer) * 100)  # crude progress indicator
    temp_df = test.query(f'product_id == {product}', engine='python')
    for idx, date in enumerate(pd.date_range(temp_df.index.min(), test.index.max()).tolist()):
        min_date = temp_df.index.min()
        if date.date() == min_date:
            # first day of the product's lifecycle: copy the recorded row
            date2 = min_date
            pid = temp_df.loc[date2]['product_id']
            timestamp = temp_df.loc[date2]['timestamp']
            delta_qty = temp_df.loc[date2]['delta_qty']
            resulting_qty = temp_df.loc[date2]['resulting_qty']
            date_list.append(date2)
            pid_list.append(pid)
            delta_qty_list.append(delta_qty)
            time_stamp_list.append(timestamp)
            resulting_qty_list.append(resulting_qty)
        else:
            if date.date() in temp_df.index:
                # a record exists for this date: copy it
                date2 = date.date()
                pid = temp_df.loc[date2]['product_id']
                timestamp = temp_df.loc[date2]['timestamp']
                delta_qty = temp_df.loc[date2]['delta_qty']
                resulting_qty = temp_df.loc[date2]['resulting_qty']
                date_list.append(date2)
                pid_list.append(pid)
                delta_qty_list.append(delta_qty)
                time_stamp_list.append(timestamp)
                resulting_qty_list.append(resulting_qty)
            elif date.date() > date2:
                # no record for this date: carry the last recorded values forward
                date_list.append(date.date())
                pid_list.append(pid)
                time_stamp_list.append(timestamp)
                delta_qty_list.append(delta_qty)
                resulting_qty_list.append(resulting_qty)
            else:
                pass
Can someone please help me understand the right way to approach this? I'm 100% sure this is not the best approach.
Thank you.
The idea here is to reindex the DataFrame to fill the gaps you have.
Set up a DataFrame generated from your sample:
import pandas as pd
from io import StringIO
buffer = StringIO()
buffer.write('''\
date|timestamp|pid|delta_qty|resulting_qty
2017-03-06|2017-03-06 12:24:22|A|0|0.0
2017-03-31|2017-03-31 02:43:11|A|3|3.0
2017-04-08|2017-04-08 22:04:35|A|-1|2.0
2017-04-12|2017-04-12 18:26:39|A|-1|1.0
2017-04-19|2017-04-19 09:15:38|A|-1|0.0
2019-01-16|2019-01-16 23:37:17|B|0|0.0
2019-01-19|2019-01-19 09:40:38|C|0|0.0
2019-04-05|2019-04-05 16:40:32|B|2|2.0
2019-04-22|2019-04-22 11:06:33|B|-1|1.0
2019-04-23|2019-04-23 13:23:17|B|-1|0.0
2019-05-09|2019-05-09 16:25:41|C|2|2.0
''')
buffer.seek(0)
df = pd.read_csv(buffer, sep='|', parse_dates=['date', 'timestamp'])
First, we generate a new, gap-less index between the min and max dates for each product. This has the effect of producing no rows for a product after its last existing update, per your example. However, this step is easily customizable to your exact requirements: e.g., if you want the dates to span from the first product entry to today, you can just set start and end manually, as sketched after the code below.
from itertools import chain, cycle

# per-product min and max recorded dates
date_ranges = df.groupby('pid').agg({'date': ['min', 'max']})

# (pid, date) pairs covering every day of each product's lifecycle
pairs = (zip(cycle([pid]), pd.date_range(start, end))
         for pid, (start, end) in date_ranges.iterrows())
new_index = pd.Index(chain.from_iterable(pairs), name=['pid', 'date'])
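For example, a minimal sketch of that manual variant, spanning each product from its first entry through today rather than stopping at its last update (the use of pd.Timestamp.today() and the reuse of date_ranges here are my assumptions, not part of the original):
# hypothetical variant: extend every product's range through today
today = pd.Timestamp.today().normalize()
pairs = (zip(cycle([pid]), pd.date_range(start, today))  # the recorded end date is ignored
         for pid, (start, end) in date_ranges.iterrows())
new_index = pd.Index(chain.from_iterable(pairs), name=['pid', 'date'])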
Then we apply the new index. Here we have two options:
1. fill all the new rows with the values from the last update
2. fill delta_qty with 0 and the remaining columns per the last update (this deviates from your request, but seemed logical and is only a minor change)
In either case, the two essential concepts are the .reindex method and the .fillna method. We can use reindex to expand the dense DataFrame to include all the dates while leaving the new rows sparse, and then fill in the NaNs with the proper data. Since we're forward-padding from the last update, we want to specify method='ffill' per the docs.
# this fills the rows per last update
results = df.set_index(['pid', 'date'])\
.reindex(new_index).reset_index()
results.fillna(method='ffill', inplace=True)
This returns
pid date timestamp delta_qty resulting_qty
0 A 2017-03-06 2017-03-06 12:24:22 0.0 0.0
1 A 2017-03-07 2017-03-06 12:24:22 0.0 0.0
2 A 2017-03-08 2017-03-06 12:24:22 0.0 0.0
3 A 2017-03-09 2017-03-06 12:24:22 0.0 0.0
.. .. ... ... ... ...
24 A 2017-03-30 2017-03-06 12:24:22 0.0 0.0
25 A 2017-03-31 2017-03-31 02:43:11 3.0 3.0
.. .. ... ... ... ...
29 A 2017-04-04 2017-03-31 02:43:11 3.0 3.0
for pid == 'A'.
For the second option, fill delta_qty with 0 first, then forward-fill the remaining columns:
results = df.set_index(['pid', 'date'])\
.reindex(new_index).reset_index()
results['delta_qty'].fillna(0, inplace=True)
results.fillna(method='ffill', inplace=True)
This returns:
pid date timestamp delta_qty resulting_qty
0 A 2017-03-06 2017-03-06 12:24:22 0.0 0.0
1 A 2017-03-07 2017-03-06 12:24:22 0.0 0.0
2 A 2017-03-08 2017-03-06 12:24:22 0.0 0.0
3 A 2017-03-09 2017-03-06 12:24:22 0.0 0.0
.. .. ... ... ... ...
24 A 2017-03-30 2017-03-06 12:24:22 0.0 0.0
25 A 2017-03-31 2017-03-31 02:43:11 3.0 3.0
.. .. ... ... ... ...
29 A 2017-04-04 2017-03-31 02:43:11 0.0 3.0
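With either result, the daily total from your question reduces to a plain groupby over the expanded frame. A minimal usage sketch, reusing the results frame from above (daily_totals is just an illustrative name):
# sum inventory across all products for each calendar day
daily_totals = results.groupby('date')['resulting_qty'].sum()

# pull the total for any single day, e.g.:
print(daily_totals.loc['2019-04-06'])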