I'm working on a project and need to do some data preprocessing. I have a DataFrame that looks like this (a simplified example):
index | pixels
0 | 10 20 30 40
1 | 11 12 13 14
I want to convert it to a NumPy array of shape (2, 2, 2, 1). The dtype of the pixels column is object. Is there a way to do this without loops? I have a 28k-row DataFrame with large images, and looping over it takes too long on my machine.
Use str.split + astype + to_numpy + reshape:
a = (
    df['pixels'].str.split(' ', expand=True)
    .astype(int).to_numpy()
    .reshape((2, 2, 2, 1))
)
a:
[[[[10]
   [20]]

  [[30]
   [40]]]


 [[[11]
   [12]]

  [[13]
   [14]]]]
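For the real data, the row count and image size will differ from the toy example. Here is a minimal sketch of the same chain wrapped in a helper, assuming every row holds exactly height * width space-separated integers. The function name pixels_to_array and its parameters are hypothetical, chosen only for illustration. Passing -1 as the first dimension lets NumPy infer the number of images from the data.

```python
import numpy as np
import pandas as pd


def pixels_to_array(df, height, width, column="pixels"):
    """Convert space-separated pixel strings to an array of shape (n, height, width, 1).

    Hypothetical helper: assumes every row has exactly height * width integers.
    """
    # One column per pixel, then cast the whole frame to int in one step.
    flat = df[column].str.split(" ", expand=True).astype(int).to_numpy()
    if flat.shape[1] != height * width:
        raise ValueError(
            f"expected {height * width} pixels per row, got {flat.shape[1]}"
        )
    # -1 lets NumPy infer the number of images from the row count.
    return flat.reshape(-1, height, width, 1)


df = pd.DataFrame({"pixels": ["10 20 30 40", "11 12 13 14"]})
a = pixels_to_array(df, height=2, width=2)
print(a.shape)
```

The explicit size check is there so that a wrong height or width raises a readable error up front instead of a less obvious reshape failure.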
Complete Working Example:
import pandas as pd

df = pd.DataFrame({'pixels': ['10 20 30 40', '11 12 13 14']})

a = (
    df['pixels'].str.split(' ', expand=True)
    .astype(int).to_numpy()
    .reshape((2, 2, 2, 1))
)
print(a)
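With large images, expand=True builds an intermediate DataFrame with one object column per pixel, which can get heavy. A possible alternative, sketched here under the assumption that every row has the same number of pixels, is to join all rows into one string, split once, and let NumPy parse the tokens. I have not benchmarked this against the version above, so treat it as something to try rather than a guaranteed speedup. The height and width variables are placeholders for your real image size.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'pixels': ['10 20 30 40', '11 12 13 14']})

# Placeholder image size; replace with the real dimensions.
height, width = 2, 2

# Join every row into one long string, then split into a flat token list.
tokens = ' '.join(df['pixels']).split()

# Parse the tokens as integers and let -1 infer the number of images.
b = np.array(tokens, dtype=np.int64).reshape(-1, height, width, 1)
print(b.shape)
```

This relies on the rows all having the same length: a short row would not raise an error by itself, it would just shift pixels between images, or fail in reshape if the total count does not divide evenly.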