python numpy time-series missing-data

Remove NaN rows from two NumPy arrays with different dimensions in Python


I would like to remove NaN elements from a pair of NumPy arrays with different dimensions using Python. One array has shape (8, 3) and the other has shape (8,). If at least one NaN appears in a row of either array, the entire row needs to be removed from both. However, I ran into issues because the two arrays have different dimensions.

For example,

[1.7 2.3 3.4] 4.2

[2.3 3.4 4.2] 4.6

[3.4 nan 4.6] 4.8

[4.2 4.6 4.8] 4.6

[4.6 4.8 4.6] nan

[4.8 4.6 nan] nan

[4.6 nan nan] nan

[nan nan nan] nan

I want it to become

[1.7 2.3 3.4] 4.2

[2.3 3.4 4.2] 4.6

[4.2 4.6 4.8] 4.6

This is my code, which generates the sequence data:

from numpy import array

def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

n_steps = 3
sequence = df_sensorRefill['sensor'].to_list()
X, y = split_sequence(sequence, n_steps)

Thanks


Solution

  • You could use np.isnan() and np.any() to find the rows that contain NaNs, and np.delete() to remove those rows from both arrays.
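
The suggestion above could be sketched as follows. This is a minimal example, not the only way to do it: the arrays are typed in by hand to mirror the data in the question, and the names bad_rows, bad_idx, X_clean and y_clean are made up for illustration. The key idea is to build a single row mask from both arrays, so the same rows are dropped from X and y.

```python
import numpy as np

# Sample data typed in by hand to mirror the example in the question.
X = np.array([
    [1.7, 2.3, 3.4],
    [2.3, 3.4, 4.2],
    [3.4, np.nan, 4.6],
    [4.2, 4.6, 4.8],
    [4.6, 4.8, 4.6],
    [4.8, 4.6, np.nan],
    [4.6, np.nan, np.nan],
    [np.nan, np.nan, np.nan],
])
y = np.array([4.2, 4.6, 4.8, 4.6, np.nan, np.nan, np.nan, np.nan])

# True for every row where X or y holds at least one NaN.
# np.isnan(X).any(axis=1) reduces the (8, 3) mask to shape (8,),
# so it can be combined element-wise with the mask for y.
bad_rows = np.isnan(X).any(axis=1) | np.isnan(y)

# Convert the boolean mask to integer row indices for np.delete.
bad_idx = np.where(bad_rows)[0]
X_clean = np.delete(X, bad_idx, axis=0)
y_clean = np.delete(y, bad_idx, axis=0)

print(X_clean)
print(y_clean)
```

Boolean indexing with the inverted mask, X[~bad_rows] and y[~bad_rows], should give the same result without calling np.delete().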