I would like to make my Python 3 program much faster. Any ideas would be appreciated.
Background
I am using Python 3 to visualize a 3D dataset calculated by a Fortran 90 program.
When I write the calculated 3D matrix to a text file, I have to write it in the shape of a 2D matrix.
The output structure is shown below:
Value, x, y, z
e.g.
123443.0, 1, 1, 1
123343.0, 1, 1, 2
134554.0, 1, 1, 3
Value is an element of the 3D matrix; x, y, and z give the position of that element in the 3D matrix.
To read this file, I use the Python 3 code below.
import numpy as np
import pandas as pd
input_path = "/user_path/3D_CUBE_data.txt"
#READ DATASET
read_1 = np.loadtxt(input_path)
read_2 = pd.DataFrame(read_1, columns=["data", "x", "y", "z"])
print("input_shape=", read_2.shape)
#CREATE A 3D MATRIX
#(dis, tra, rows)
dis = 41
tra = 18
rows = 4096
data_1 = np.zeros( shape = (dis, tra, rows) )
x_l = list(range(1,dis+1))
y_l = list(range(1,tra+1))
z_l = list(range(1,rows+1))
#ASSIGN VALUES IN THE 3D MATRIX
for y in y_l:
    for x in x_l:
        line_1 = read_2[(read_2["x"] == int(x)) & (read_2["y"] == int(y))]
        data_1[x-1, y-1, :] = line_1.loc[:, ["data"]].T
I think the assignment step is slow, probably because of the two nested for-loops.
Question
How can I speed up this process in Python 3?
You can do this without a Pandas DataFrame by using NumPy's ravel_multi_index function,
which converts your (x, y, z) coordinates into indices of a flattened version of the target matrix:
import numpy as np
# read in the comma-separated values
inputdata = np.loadtxt(input_path, delimiter=",")
# extract the values and their coordinates
values = inputdata[:, 0]
coords = inputdata[:, 1:].astype(int) - 1 # subtract 1 due to indices starting at 0
# create your matrix
dis = 41
tra = 18
rows = 4096
data_1 = np.zeros(shape=(dis, tra, rows))
# fill in your matrix at the appropriate coordinates
data_1.flat[np.ravel_multi_index(coords.T, data_1.shape)] = values
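To make the indexing step above more concrete, here is a minimal sketch on a tiny synthetic dataset. The small dimensions (2, 3, 4) and the names `filled_flat`, `filled_direct`, `flat_idx`, and `manual_idx` are made up for illustration and are not part of the original code. It checks that, for a default C-ordered array, the flat index is (x * tra + y) * rows + z, and that indexing the 3D array directly with three index arrays should give the same result:

```python
import numpy as np

# Hypothetical small dimensions standing in for (dis, tra, rows)
dis, tra, rows = 2, 3, 4

# Synthetic rows in the file's layout: value, x, y, z (1-based coordinates)
x, y, z = np.meshgrid(
    np.arange(1, dis + 1), np.arange(1, tra + 1), np.arange(1, rows + 1),
    indexing="ij",
)
values = np.arange(dis * tra * rows, dtype=float)
inputdata = np.column_stack([values, x.ravel(), y.ravel(), z.ravel()])

# Shuffle the rows to show that the row order in the file does not matter
rng = np.random.default_rng(0)
inputdata = rng.permutation(inputdata)

values = inputdata[:, 0]
coords = inputdata[:, 1:].astype(int) - 1  # convert to 0-based indices

# Approach from the answer: flat indices via ravel_multi_index
filled_flat = np.zeros((dis, tra, rows))
flat_idx = np.ravel_multi_index(coords.T, filled_flat.shape)
filled_flat.flat[flat_idx] = values

# The same flat index written out by hand for a C-ordered array
manual_idx = (coords[:, 0] * tra + coords[:, 1]) * rows + coords[:, 2]

# Alternative: index the 3D array directly with three index arrays
filled_direct = np.zeros((dis, tra, rows))
filled_direct[coords[:, 0], coords[:, 1], coords[:, 2]] = values

print(np.array_equal(flat_idx, manual_idx))
print(np.array_equal(filled_flat, filled_direct))
```

Both approaches fill the whole matrix in a single vectorized assignment, so neither needs the nested Python loops or a boolean mask over the full table on every iteration.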