python pandas coalesce

Pandas Coalesce Multiple Columns, NaN


I want to coalesce 4 columns using pandas. I've tried this:

final['join_key'] = final['book'].astype('str') + final['bdr'] + final['cusip'].fillna(final['isin']).fillna(final['Deal'].astype('str')).fillna(final['Id'])

When I use this it returns:

+-------+--------+-------+------+------+------------+------------------+
| book  |  bdr   | cusip | isin | Deal |     Id     |     join_key     |
+-------+--------+-------+------+------+------------+------------------+
| 17236 | ETFROS |       |      |      | 8012398421 | 17236.0ETFROSnan |
+-------+--------+-------+------+------+------------+------------------+

The field Id is not properly appending to my join_key field.

Any help would be appreciated, thanks.

Update:

+------------+------+------+-----------+--------------+------+------------+----------------------------+
|  endOfDay  | book | bdr  |   cusip   |     isin     | Deal |     Id     |          join_key          |
+------------+------+------+-----------+--------------+------+------------+----------------------------+
| 31/10/2019 |   15 | ITOR | 371494AM7 | US371494AM77 |  161 | 8013210731 | 20191031|15|ITOR|371494AM7 |
| 31/10/2019 |   15 | ITOR |           |              |      | 8011898573 | 20191031|15|ITOR|          |
| 31/10/2019 |   15 | ITOR |           |              |      | 8011898742 | 20191031|15|ITOR|          |
| 31/10/2019 |   15 | ITOR |           |              |      | 8011899418 | 20191031|15|ITOR|          |
+------------+------+------+-----------+--------------+------+------------+----------------------------+

df['join_key'] = ("20191031|" + df['book'].astype('str') + "|" + df['bdr'] + "|" + df[['cusip', 'isin', 'Deal', 'id']].bfill(1)['cusip'].astype(str))

For some reason this code isn't picking up Id as part of the key.


Solution

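  • The chained fillna calls for cusip are more complicated than they need to be. They also fail here because final['Deal'].astype('str') turns missing values into the literal string 'nan', so the last fillna(final['Id']) has nothing left to fill. You can replace the chain with a row-wise bfill over the candidate columns and take the first column of the result: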

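    # Back-fill across columns so 'cusip' holds the first non-null of cusip, isin, Deal, Id.
    # axis must be passed by keyword in recent pandas versions.
    final['join_key'] = (final['book'].astype('str') +
                         final['bdr'] +
                         final[['cusip', 'isin', 'Deal', 'Id']].bfill(axis=1)['cusip'].astype(str))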
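
If the key still comes out blank, as in the update, one possible cause is that the empty cells hold empty strings rather than NaN, which bfill does not treat as missing. The sketch below uses made-up sample data modelled on the question's tables (the values and the empty-string cell are assumptions, not the asker's real data) and converts blank strings to NaN before coalescing:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the "" in isin stands in for a blank cell
# that was read as an empty string instead of NaN.
final = pd.DataFrame({
    "book": [15, 17236],
    "bdr": ["ITOR", "ETFROS"],
    "cusip": ["371494AM7", np.nan],
    "isin": ["US371494AM77", ""],
    "Deal": [161, np.nan],
    "Id": [8013210731, 8012398421],
})

candidates = ["cusip", "isin", "Deal", "Id"]

# Work on an object-dtype copy so strings and numbers can share a row.
cols = final[candidates].astype(object)

# Treat empty or whitespace-only strings as missing.
cols = cols.replace(r"^\s*$", np.nan, regex=True)

# Back-fill along each row; the first column then holds the first
# non-missing value among the candidates.
coalesced = cols.bfill(axis=1).iloc[:, 0]

final["join_key"] = final["book"].astype(str) + final["bdr"] + coalesced.astype(str)
print(final[["book", "bdr", "join_key"]])
```

Note that the column names passed to the selection are case-sensitive, so 'Id' and 'id' are different columns.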