Search code examples
juliajulia-dataframe

Methoderror: no method matching isless when using sort() on Dataframe in Julia


I have an excelfile, which I read into a Dataframe.

using DataFrames, XLSX
df = DataFrame(XLSX.readtable("Stocks/WIHL-Wihlborgs.xlsx", "PriceMonth")...)

│ Row │ Date       │ Openprice │ Highprice │ Lowprice │ Closeprice │ Volume   │
│     │ Any        │ Any       │ Any       │ Any      │ Any        │ Any      │
├─────┼────────────┼───────────┼───────────┼──────────┼────────────┼──────────┤
│ 1   │ 2020-12-23 │ 189.1     │ 189.7     │ 170.3    │ 181.5      │ 4170122  │
│ 2   │ 2020-11-30 │ 160.0     │ 191.6     │ 158.0    │ 189.1      │ 8006506  │
│ 3   │ 2020-10-30 │ 178.4     │ 184.3     │ 151.2    │ 160.6      │ 6760931  │
│ 4   │ 2020-09-30 │ 138.5     │ 178.8     │ 137.9    │ 177.5      │ 9005351  │
│ 5   │ 2020-08-31 │ 147.2     │ 152.1     │ 137.2    │ 138.3      │ 4865386  │

sort(df)

But when I try to use the sort()function on this DateFrame, I get LoadError: MethodError: no method matching isless(::String, ::Dates.Date)

I think this is weird as I have another tab in the same excelfile where I have prices/day, and that works okey to sort. And the only difference that I see, is that in the Excel file, the prices/month tab has the Date field as Date, and the prices/day tab has the Date field as Any. But in the DataFrame, both tabs the Date field as Any.

I have a hard time understanding why the prices/day tab is working with sort() and the prices/month tab isn't, instead it gives me the LoadError.


Solution

  • Changing the columns to their appropriate data type should fix this (infer_eltypes):

    using DataFrames, XLSX
    
    df = DataFrame(XLSX.readtable("./dat.xlsx", "Sheet1", infer_eltypes=true)..., )
    #5×6 DataFrame
    # Row │ Date        Openprice  Highprice  Lowprice  Closeprice  Volume  
    #     │ Date        Any        Float64    Any       Float64     Int64   
    #─────┼─────────────────────────────────────────────────────────────────
    #   1 │ 2020-12-23  189.1          189.7  170.3          181.5  4170122
    #   2 │ 2020-11-30  160            191.6  158            189.1  8006506
    #   3 │ 2020-10-30  178.4          184.3  151.2          160.6  6760931
    #   4 │ 2020-09-30  138.5          178.8  137.9          177.5  9005351
    #   5 │ 2020-08-31  147.2          152.1  137.2          138.3  4865386
    

    Now, sorting the Dates works fine:

    df[sortperm(df.Date),1:2]
    #5×2 DataFrame
    # Row │ Date        Openprice 
    #     │ Date        Any       
    #─────┼───────────────────────
    #   1 │ 2020-08-31  147.2
    #   2 │ 2020-09-30  138.5
    #   3 │ 2020-10-30  178.4
    #   4 │ 2020-11-30  160
    #   5 │ 2020-12-23  189.1
    

    Setting the type of Dates manually is also possible:

    using Dates
    
    df.Date = Dates.Date.(string.(df.Date), "yyyy-mm-dd")
    #5-element Array{Date,1}:
    # 2020-12-23
    # 2020-11-30
    # 2020-10-30
    # 2020-09-30
    # 2020-08-31