Tags: python, numpy, alignment, sequence

numpy.loadtxt "could not convert string to float"


A minimal example that reproduces this is:

    import numpy
    numpy.loadtxt("data.txt", delimiter='\t')

with data.txt being:

    A   R   N   D   C   Q
A   5   -2  -1  -2  -1  -1
R   -2  7   -1  -2  -4  1
N   -1  -1  7   2   -2  0

When running the code I get a ValueError:

[root@mycomp]$ python Needleman-Wunsch.py 
Traceback (most recent call last):
    File "Needleman-Wunsch.py", line 92, in <module>
        (alignedSeq1, alignedSeq2) = computeFMatrix(seq1, seq2, -6)
    File "Needleman-Wunsch.py", line 34, in computeFMatrix
        similarityMatrixMap = readBLOSUM50("BLOSUM50.txt")
    File "Needleman-Wunsch.py", line 16, in readBLOSUM50
        similarityMatrix = np.loadtxt(fileName, delimiter='\t')
    File "/usr/local/lib/python2.7/site-packages/numpy/lib/npyio.py", line 827, in loadtxt
        items = [conv(val) for (conv, val) in zip(converters, vals)]    
    ValueError: could not convert string to float: A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V

The original BLOSUM50.txt file and the full code are available from the link above.

Erasing the first line of BLOSUM50.txt gave the same error.


Solution

  • You can just replace the loadtxt with

    numpy.genfromtxt("data.txt", delimiter='\t', skip_header=True)[:, 1:]
    

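    loadtxt fails because both the header row and the first column contain letters, which cannot be converted to float; that is also why erasing only the first line gives the same error. The genfromtxt call skips the header row (the column names), parses the row labels in the first column as nan, and then slices that column off with [:, 1:].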
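
    As a rough check of the approach above, here is a minimal sketch that runs it on a small in-memory table instead of a file. The `text` string is a made-up, tab-separated stand-in for data.txt (only three rows and columns), and the names `m1` and `m2` are just for illustration. It also shows one possible alternative that stays with loadtxt by using its `skiprows` and `usecols` parameters, assuming the number of numeric columns is known in advance.

```python
import io
import numpy as np

# Hypothetical tab-separated stand-in for data.txt (3x3 excerpt).
text = (
    "\tA\tR\tN\n"
    "A\t5\t-2\t-1\n"
    "R\t-2\t7\t-1\n"
    "N\t-1\t-1\t7\n"
)

# genfromtxt: skip the header row; the row labels in column 0 are
# parsed as nan, so slice that column off afterwards.
m1 = np.genfromtxt(io.StringIO(text), delimiter="\t", skip_header=1)[:, 1:]

# loadtxt alternative: skip the header row and read only the numeric
# columns, so the letters are never passed to float().
m2 = np.loadtxt(io.StringIO(text), delimiter="\t", skiprows=1, usecols=(1, 2, 3))

print(m1)
print(np.array_equal(m1, m2))
```

    Both calls should give the same 3x3 float array. For the full BLOSUM50 file, `usecols` would have to cover all of the numeric columns, for example `range(1, 21)` if there are 20 of them as the traceback header suggests.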