Search code examples
kdb

How do I calculate the standard deviation of a table in kdb


I have a table of returns that I want to calculate standard deviation on. My columns look like

`day1`day2`day3 

How can I calculate the standard deviation of each column efficiently?

I know there's a dev function. However, unlike avg, dev cannot be called on the entire table Any help will be much appreciated!

Thank you!


Solution

  • Yes dev doesn't support a table as input but you can use dev on individual columns in a select query:

    q)t:([]day1:til 10;day2:2*til 10;day3:3*til 10)
    q)t
    day1 day2 day3
    --------------
    0    0    0
    1    2    3
    2    4    6
    3    6    9
    4    8    12
    5    10   15
    6    12   18
    7    14   21
    8    16   24
    9    18   27
    
    q)select dev day1, dev day2, dev day3 from t
    day1     day2     day3
    --------------------------
    2.872281 5.744563 8.616844
    

    Edt: If unsure how to create a dynamic query with functional form use parse:

    q)parse"select dev day1 from t"
    
    ?
    `t
    ()
    0b
    (,`day1)!,(dev;`day1)
    

    It is useful for creating the code for multiple columns:

    // [table;where;by;cols]
    ?[t;();0b;
    
        raze { (enlist x)!enlist (dev;x) } each `day1`day2`day3]
    
    day1     day2     day3
    --------------------------
    2.872281 5.744563 8.616844
    
    

    or since you have 100+ columns, use cols with except to get all columns you want to get the standard deviation of and ignore the columns you don't

    ?[t;();0b;
    
        raze { (enlist x)!enlist (dev;x) } each except[cols[t];`columns`to`ignore]