
Sorting a csv file in Python with sorted() returns values in programmer DESC order, not time DESC order


I don't believe I'm doing anything overly complex. I'm presorting a large CSV data file because its rows arrive in random time order. The index is correct, but the returned order is off.

    sortedList = sorted(reader, key=operator.itemgetter(1))

So instead of sorting like [-100 -10 -1 0 10 100 5000 6000], I get [-1 -10 -100 0 100 5000 60].

I tried both the lambda function examples and itemgetter, but I don't really know where to go from there.

Thanks for the help.

The answer to my question is in the comments. The numerical value was being sorted as a string and not as a number. I didn't know that the key function passed to sorted() could convert the value to a numeric type. This code works as I intended:

    sortedList = sorted(reader, key=lambda x: float(x[1]))
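
For illustration, here is a minimal sketch of the difference. The data is hypothetical: the labels in column 0 and the in-memory `io.StringIO` source are assumptions standing in for the original file.

```python
import csv
import io

# Hypothetical sample data: column 0 is a label, column 1 is the numeric field.
raw = "a,100\nb,-10\nc,5000\nd,-1\ne,0\nf,10\ng,-100\nh,6000\n"

rows = list(csv.reader(io.StringIO(raw)))

# csv.reader yields every field as a str by default, so this compares text.
as_text = [r[1] for r in sorted(rows, key=lambda r: r[1])]

# Converting inside the key function makes the comparison numeric.
as_numbers = [r[1] for r in sorted(rows, key=lambda r: float(r[1]))]

print(as_text)
print(as_numbers)
```

The first sort compares the fields character by character, which is why `-1` lands before `-10` and `-100`. The second sort compares the converted floats and gives the expected ascending order.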

Solution

  • Judging from the output you show, the values are being sorted as strings rather than as numbers.

    So you could do:

        sortedList = sorted(reader, key=lambda t: int(t[1]))

    or

        sortedList = sorted(reader, key=lambda t: float(t[1]))

    Or better, make sure that reader yields numbers rather than strings in the first place, perhaps by passing quoting=csv.QUOTE_NONNUMERIC as a fmtparam to the reader (see http://docs.python.org/library/csv.html#csv.QUOTE_NONNUMERIC).
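
A sketch of the QUOTE_NONNUMERIC approach, assuming a hypothetical file in which text fields are quoted and numeric fields are not. With this setting the reader converts every unquoted field to a float, so an unquoted non-numeric field raises ValueError. It only fits files laid out this way.

```python
import csv
import io

# Hypothetical data: text fields are quoted, numeric fields are unquoted.
raw = '"a",100\n"b",-10\n"c",5000\n"d",-1\n'

# QUOTE_NONNUMERIC tells the reader to convert every unquoted field to float.
reader = csv.reader(io.StringIO(raw), quoting=csv.QUOTE_NONNUMERIC)
rows = list(reader)

# Column 1 already holds floats, so no conversion is needed in the key.
sorted_rows = sorted(rows, key=lambda r: r[1])

for row in sorted_rows:
    print(row)
```

If the file does not quote its text fields, converting in the key function with int() or float() is likely the simpler option.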