python floating-point type-conversion tensor scientific-notation

How can I parse a text file with scientific notation (in tensor format) and convert the values to floats?

I have multiple txt files in the format:

[tensor([[1.7744e+02, 4.7730e+02, 1.2396e+02, 1.1678e+02, 5.9988e-01],
         [7.8410e+02, 1.7532e+02, 6.2769e+02, 2.1083e+02, 9.9969e-01],
         device='cuda:0')]

I want to remove tensor, [], (), the commas and device='cuda:0', and convert the scientific notation to decimal to get the output as:

177.44 477.30 123.96 116.78 0.59988
784.10 175.32 627.69 210.83 0.99969

This is my program:

import os

for i in os.listdir():
    if i.endswith(".txt"):
        with open(i, "r+") as f:
            content = f.readlines()

            f.truncate(0)
            f.seek(0)

            for line in content:
                if not line.startswith("[tensor(["):
                    f.write(line)
                elif not line.startswith('        '):
                    f.write(line)
                elif not line.startswith("device='"):
                    f.write(line)

The tensor text is gone, but all the other characters remain. How can I remove them (and also the whitespace at the beginning of each line)?

Solution

Hi, you can leverage numpy.matrix's ability to parse a string laid out like an array and build a matrix from it. If you need an array rather than a matrix, convert the result with numpy.array.
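
Before the full solution, here is a quick sketch of how np.matrix reads a string. The values are made up for illustration and are not taken from the question. Commas or spaces separate columns and semicolons separate rows. Square brackets are dropped before parsing, so nested brackets on their own do not create rows.

```python
import numpy as np

# Semicolons separate rows; commas or spaces separate columns.
two_rows = np.matrix("1.5e+01, 2; 3, 4")
print(two_rows.shape)  # (2, 2)

# Square brackets are stripped before parsing, so this is read as one row.
one_row = np.matrix("[[1.5e+01, 2], [3, 4]]")
print(one_row.shape)  # (1, 4)
```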

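import numpy as np

# Data definition
data = """[tensor([[1.7744e+02, 4.7730e+02, 1.2396e+02, 1.1678e+02, 5.9988e-01],
         [7.8410e+02, 1.7532e+02, 6.2769e+02, 2.1083e+02, 9.9969e-01],
         device='cuda:0')]"""

# Cleaning step: remove whitespace, the tensor wrapper and the device suffix
elementsToRemove = ['\n', ' ', '[tensor(', 'device=', "'cuda:0')"]

cleanData = data
for el in elementsToRemove:
    cleanData = cleanData.replace(el, '')

# np.matrix ignores square brackets and splits rows on ';',
# so mark the row boundaries explicitly (otherwise the result is a single row)
cleanData = cleanData.replace('],[', ';')

# Convert to numeric using np.matrix, then to a plain array if needed
numericData_matrix = np.matrix(cleanData)  # shape (2, 5)
numericData_array = np.array(numericData_matrix)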
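
If NumPy is not needed, a regular expression can do the same job with the standard library only. The following is a minimal sketch and not part of the original answer. The helper names (extract_rows, to_floats, rows_to_text, convert_file) are made up for illustration. It assumes that each tensor row sits on its own line and that every value is printed in scientific notation. Decimal is used for the text output so that the digits are kept exactly as written in the file.

```python
import re
from decimal import Decimal
from pathlib import Path

# Matches values written in scientific notation, e.g. 1.7744e+02 or 5.9988e-01.
SCI_NUMBER = re.compile(r"[-+]?\d+\.\d+[eE][-+]?\d+")


def extract_rows(text):
    """Return the number tokens of each line, skipping lines without numbers."""
    rows = []
    for line in text.splitlines():
        tokens = SCI_NUMBER.findall(line)
        if tokens:  # the "device='cuda:0')]" line yields no tokens
            rows.append(tokens)
    return rows


def to_floats(rows):
    """Convert the token rows into rows of Python floats."""
    return [[float(token) for token in row] for row in rows]


def rows_to_text(rows):
    """Render each row as space-separated plain decimals, one row per line."""
    return "\n".join(
        " ".join(format(Decimal(token), "f") for token in row) for row in rows
    )


def convert_file(path):
    """Overwrite a file with its cleaned, decimal-notation content."""
    path = Path(path)
    rows = extract_rows(path.read_text())
    path.write_text(rows_to_text(rows) + "\n")


sample = """[tensor([[1.7744e+02, 4.7730e+02, 1.2396e+02, 1.1678e+02, 5.9988e-01],
         [7.8410e+02, 1.7532e+02, 6.2769e+02, 2.1083e+02, 9.9969e-01],
         device='cuda:0')]"""

rows = extract_rows(sample)
values = to_floats(rows)  # nested list of floats, two rows of five values
print(rows_to_text(rows))
# 177.44 477.30 123.96 116.78 0.59988
# 784.10 175.32 627.69 210.83 0.99969
```

To process a whole directory, call convert_file on each path yielded by Path(".").glob("*.txt"). Since it overwrites the files in place, as the loop in the question does, it is prudent to try it on copies first.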

Hope this solves your problem!