Tags: python, pandas, numpy, csv, variable-assignment

Python ValueError: could not convert string to float: '1,000000E+06'


I have a list of around 70 string values, all in scientific notation ending in E+05 or E+06, which need to be converted to float (whenever I try to print the values, the exponent, i.e. the factor of 100000 or 1000000, seems to be ignored).

import numpy as np
import pandas as pd

raw_data = 'EISTest16_10CV_01.txt'  # name of your file
# Define the columns from our txt table
columns = ["Pt.", "Frequency", "Z'", "Z''", "Frequency (Hz)", "|Z|", "theta"]
# Read the txt file; skiprows=7 skips the header lines
# (sep="\t" or " " evaluates to just "\t", so it is written that way here)
data = pd.read_csv(raw_data, names=columns, sep="\t", skiprows=7)

frequency = np.asarray(data["Frequency"])

#print("first 3 frequencies: ")
#test = '{0:,.2f}'.format(float(frequency[0]))

#floatfrequency = float(frequency[1])
#print(floatfrequency)

The four commented-out lines at the bottom are where my problem is. How can I convert the str to float? Or how can I read the E+06 from a list (or str)?

Some similar answers suggest using strip(). Should I use it, for example, to remove the E+05 and then multiply by 100000 manually?

I also tried: test = '{0:,.2f}'.format(float(frequency[0]))

but the same error occurs.


Solution

  • The problem is the comma used as the decimal separator. float() expects a point, so '1,000000E+06' cannot be parsed. Replace the commas with points like this:

    df['YourColumn'] = df['YourColumn'].str.replace(',', '.').astype(float)
    

    Or you can tell pandas about the decimal comma when reading the file, like this:

    data = pd.read_csv(raw_data, names=columns, sep="\t", skiprows=7, decimal=",")
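
Both options can be tried without the original file. The following is a minimal sketch, assuming a tab-separated layout with decimal commas; the `sample` string and the two-column `columns` list are made up for illustration and stand in for the real data. It also shows the same idea for a single plain string, which does not need strip() or any manual multiplication.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: tab-separated, decimal commas.
sample = "1\t1,000000E+06\n2\t5,000000E+05\n"
columns = ["Pt.", "Frequency"]

# Option 1: read the column as strings, then swap the decimal comma for a point.
df = pd.read_csv(io.StringIO(sample), names=columns, sep="\t")
df["Frequency"] = (
    df["Frequency"].str.replace(",", ".", regex=False).astype(float)
)

# Option 2: let the parser treat "," as the decimal separator while reading.
df2 = pd.read_csv(io.StringIO(sample), names=columns, sep="\t", decimal=",")

# A single string converts the same way; float() understands the E+06 part.
value = float("1,000000E+06".replace(",", "."))

print(df["Frequency"].tolist())
print(df2["Frequency"].tolist())
print(value)
```

Option 2 is usually the tidier choice when every numeric column in the file uses a decimal comma, since no per-column conversion is needed afterwards.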