
R: Problem with bandwidth selection for kernel density estimation


I want to calculate the optimal bandwidth for my kernel density estimation. I have a .csv file with two columns (longitude and latitude). I tried several functions, but each one fails with a different error. So far I have tried:

h.amise(x, deriv.order = 0)

which gives me the following error: argument 'x' must be numeric and need at least 3 data points. However, I checked my data frame and its columns are numeric.

Then I tried:

dpik(x)

which gives me the following error: 'list' object cannot be coerced to type 'double'

Is it wrong to use a .csv read in as a data frame with two columns, or what else could be the issue?


Solution

  • Based on your sample data:

    library(kedd)
    library(KernSmooth)
    
    h.amise(x$long)
    # 
    # Call:     Aymptotic Mean Integrated Squared Error
    # 
    # Derivative order = 0
    # Data: x$long (10 obs.);   Kernel: gaussian
    # AMISE = 0.02761525;   Bandwidth 'h' = 1.57264
    
    h.amise(x$lat)
    # 
    # Call:     Aymptotic Mean Integrated Squared Error
    # 
    # Derivative order = 0
    # Data: x$lat (10 obs.);    Kernel: gaussian
    # AMISE = 0.01352194;   Bandwidth 'h' = 3.37266
    
    dpik(x$long)
    # [1] 0.4912055
    dpik(x$lat)
    # [1] 1.079109
    

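    Both functions take a single numeric vector as input. Your x is a data frame, which R treats as a list of columns, so passing it whole triggers the errors you saw. Pass one column at a time, as above, and read the manual pages ?h.amise and ?dpik for the details.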
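
To see why the whole data frame is rejected while a single column works, here is a minimal sketch. The data frame below is a made-up stand-in for your CSV (the values and the `set.seed` call are assumptions for illustration, not your data), and it only uses `dpik()` from KernSmooth:

```r
library(KernSmooth)  # provides dpik()

# Hypothetical stand-in for the CSV: made-up coordinates, not the asker's data
set.seed(1)
x <- data.frame(long = rnorm(10, mean = 10, sd = 2),
                lat  = rnorm(10, mean = 50, sd = 3))

is.list(x)          # a data frame is a list of columns
is.numeric(x)       # FALSE, even though every column is numeric
is.numeric(x$long)  # TRUE: one column is a plain numeric vector

# Passing the whole data frame fails; try() captures the error instead of stopping
res <- try(dpik(x), silent = TRUE)
failed <- inherits(res, "try-error")

# Passing one column at a time returns one bandwidth per coordinate
h_long <- dpik(x$long)
h_lat  <- dpik(x$lat)
```

A quick `str(x)` after `read.csv()` is a cheap way to confirm both the column names and that each column was read as numeric before calling a bandwidth selector.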