I have multiple .txt files in this format:
[tensor([[1.7744e+02, 4.7730e+02, 1.2396e+02, 1.1678e+02, 5.9988e-01],
[7.8410e+02, 1.7532e+02, 6.2769e+02, 2.1083e+02, 9.9969e-01],
device='cuda:0')]
I want to remove tensor, the brackets [] and (), and device='cuda:0', and convert the scientific notation to decimal, to get this output:
177.44 477.30 123.96 116.78 0.59988
784.10 175.32 627.69 210.83 0.99969
This is my program:
import os

for i in os.listdir():
    if i.endswith(".txt"):
        with open(i, "r+") as f:
            content = f.readlines()
            f.truncate(0)
            f.seek(0)
            for line in content:
                if not line.startswith("[tensor(["):
                    f.write(line)
                elif not line.startswith(' '):
                    f.write(line)
                elif not line.startswith("device='"):
                    f.write(line)
The word tensor is gone, but all the other characters remain. How can I remove them (and also the whitespace at the beginning of each line)?
Hi, you can leverage numpy.matrix, which can parse a string of numbers into a matrix. If you need an array rather than a matrix, convert the result with numpy.array.
import numpy as np

# Data definition
data = """[tensor([[1.7744e+02, 4.7730e+02, 1.2396e+02, 1.1678e+02, 5.9988e-01],
[7.8410e+02, 1.7532e+02, 6.2769e+02, 2.1083e+02, 9.9969e-01],
device='cuda:0')]"""

# Cleaning step: remove "tensor", the device suffix, and all whitespace
elementsToRemove = ['\n', ' ', '[tensor(', 'device=', "'cuda:0')"]
cleanData = data
for el in elementsToRemove:
    cleanData = cleanData.replace(el, '')

# Convert to numeric using np.matrix. When parsing a string, np.matrix drops
# the brackets and only starts a new row at ';', so this yields a single row.
numericData_matrix = np.matrix(cleanData)
# Reshape into rows of five values to recover the original layout
numericData_array = np.array(numericData_matrix).reshape(-1, 5)
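If the goal is the plain-text layout from the question (one row per line, values separated by spaces, no scientific notation), a regular expression is an alternative worth considering. The following is only a minimal sketch: the function name tensor_text_to_lines and the decimals parameter are made up for illustration, and it assumes every tensor row sits on its own line, as in the sample.

```python
import re

# Matches numbers in plain or scientific notation, e.g. 1.7744e+02 or 9.9969e-01.
NUMBER = re.compile(r"[-+]?\d+\.?\d*(?:[eE][-+]?\d+)?")


def tensor_text_to_lines(text, decimals=5):
    """Return one space-separated string of decimal numbers per tensor row.

    Hypothetical helper: assumes each row of the printed tensor is on its own line.
    """
    rows = []
    for line in text.splitlines():
        # Drop the device suffix first so the 0 in 'cuda:0' is not read as a value.
        line = re.sub(r"device='[^']*'", "", line)
        values = [float(tok) for tok in NUMBER.findall(line)]
        if values:  # skip lines that contain no numbers at all
            rows.append(" ".join(f"{v:.{decimals}f}" for v in values))
    return rows


sample = """[tensor([[1.7744e+02, 4.7730e+02, 1.2396e+02, 1.1678e+02, 5.9988e-01],
        [7.8410e+02, 1.7532e+02, 6.2769e+02, 2.1083e+02, 9.9969e-01],
       device='cuda:0')]"""

for row in tensor_text_to_lines(sample):
    print(row)
```

To apply this to the files, the text of each .txt file could be read with pathlib.Path.read_text(), passed through the helper, and the joined rows written back with Path.write_text(). Because only the numbers are extracted, the leading whitespace and the brackets disappear without being listed one by one.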
Hope this solves your problem!