I am trying to read a file that contains strings, floats, and integers arranged as a matrix.
I tried the following code:
import numpy as np
with open('data.dat', 'r') as f:
    input_data = f.readlines()
for p in input_data:
    pizza_details = p.split(",")
    print(pizza_details[1][0])
# pizza = [[1 3389.0 36]
#          [2 3148.0 28]
#          [3 3012.0 40]
#          [4 3321.0 61]
#          [5 1761.0 41]]
Assuming that print(pizza_details[1][0]) is the part that already works, do you want to get the data into a numpy array?
It seems you didn't initialize pizza_details, so it gets overwritten on every iteration and only the last row is left in it after the loop. Initialize it as an empty list before the loop, then append each split row to it inside the loop.
import numpy as np
with open('data.dat', 'r') as f:
    input_data = f.readlines()
pizza_details = []
for p in input_data:
    # strip the trailing newline before splitting on commas
    pizza_details.append(p.strip().split(","))
Now this is a list of lists:
print(pizza_details)
> [['1', '3389.0', '36'], ['2', '3148.0', '28'], ['3', '3012.0', '40'], ['4', '3321.0', '61'], ['5', '1761.0', '41']]
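Since the question mentions mixed types, here is a minimal sketch of converting each field while building the list, in case the integer columns should stay integers rather than all becoming floats. The in-memory `input_data` and the `column_types` tuple are assumptions for illustration (integer, float, integer); they are not taken from the actual file.

```python
# Hypothetical in-memory stand-in for the lines read from data.dat
input_data = ["1,3389.0,36\n", "2,3148.0,28\n", "3,3012.0,40\n"]

# Assumed column types: integer, float, integer
column_types = (int, float, int)

typed_rows = []
for line in input_data:
    # Drop the trailing newline, then split on commas
    fields = line.strip().split(",")
    # Apply the matching converter to each field
    typed_rows.append([convert(field) for convert, field in zip(column_types, fields)])

print(typed_rows[0])  # [1, 3389.0, 36]
```

A plain Python list keeps a separate type per element, which a regular float numpy array does not.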
Now convert it to a numpy 2D array (all values are still strings):
pizza_details = np.array(pizza_details)
print(pizza_details)
> [['1' '3389.0' '36']
>  ['2' '3148.0' '28']
>  ['3' '3012.0' '40']
>  ['4' '3321.0' '61']
>  ['5' '1761.0' '41']]
Now set the type to float so you can use the numerical values:
pizza_details = pizza_details.astype(float)
print(pizza_details)
> [[1.000e+00 3.389e+03 3.600e+01]
>  [2.000e+00 3.148e+03 2.800e+01]
>  [3.000e+00 3.012e+03 4.000e+01]
>  [4.000e+00 3.321e+03 6.100e+01]
>  [5.000e+00 1.761e+03 4.100e+01]]
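As an alternative to the manual loop, np.loadtxt can read and convert in one step. This is only a sketch: it assumes the file is purely numeric and comma-separated, and it writes a hypothetical sample file to a temporary directory so that it runs on its own.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample content mirroring the rows shown above
sample = "1,3389.0,36\n2,3148.0,28\n3,3012.0,40\n4,3321.0,61\n5,1761.0,41\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.dat")
    with open(path, "w") as f:
        f.write(sample)
    # loadtxt splits each line on the delimiter and converts to float by default
    pizza_details = np.loadtxt(path, delimiter=",")

print(pizza_details.shape)  # (5, 3)
print(pizza_details[1][0])  # row 1, column 0
```

Note that pizza_details[1][0] now indexes row 1, column 0 of the array, not the first character of a field as in the original loop. If a column held non-numeric text, this call would raise an error with the default float dtype.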