I have a table like this:
|-----|-----|-----|
| A | B | C |
|-----|-----|-----|
| | 5 | |
|-----|-----|-----|
| 1 | | |
|-----|-----|-----|
| | 5 | |
|-----|-----|-----|
| | | 2 |
|-----|-----|-----|
| | | 2 |
|-----|-----|-----|
where each column in the desired range has only one integer in its row. I want to merge these columns into a single new column that would look like this:
|-----|-----|-----| |-----|
| A | B | C | | Z |
|-----|-----|-----| |-----|
| | 5 | | → | 5 |
|-----|-----|-----| |-----|
| 1 | | | → | 1 |
|-----|-----|-----| |-----|
| | 5 | | → | 5 |
|-----|-----|-----| |-----|
| | | 2 | → | 2 |
|-----|-----|-----| |-----|
| | | 2 | → | 2 |
|-----|-----|-----| |-----|
I have been searching, but the closest solution I can find is doing something like:
df.iloc[:,some_column:another_column].apply( lambda x: "".join(x.astype(str)), axis=1)
However, this also concatenates "NaN"s from the blank cells, which is obviously undesirable.
How might I get my desired output?
I think it is what you want.
import pandas as pd
df = pd.DataFrame({"A":[np.nan, 1, np.nan, np.nan, np.nan],
"B": [5, np.nan, 5, np.nan, np.nan]})
df['Z'] = df.sum(axis = 1)
Alternatively, you can use
df['Z'] = df.max(axis = 1)
Which might be safer if (per chance) you have multiple non-NULL values and just want one of them (the largest one in this case).