I am trying to find a quick way to transform time series data imported from a relational database (from a single SQL query) in the form:
ticker  date  price     num_tickers  num_dates
------  ----  --------  -----------  ---------
t001    d1    pr001_d1  k            n
t001    d2    pr001_d2  k            n
...
t001    dn    pr001_dn  k            n
...
t002    d1    pr002_d1  k            n
t002    d2    pr002_d2  k            n
...
t002    dn    pr002_dn  k            n
...
t00k    d1    pr00k_d1  k            n
t00k    d2    pr00k_d2  k            n
...
t00k    dn    pr00k_dn  k            n
(where I have included the last 2 columns so the number of tickers and dates are known without iterating through the data)
which gets imported into Mathematica in the form
data = {{'t001',d1,pr001d1,k,n},{'t001',d2,pr001d2,k,n},...,{'t001',dn,pr001dn,k,n},
{'t002',d1,pr002d1,k,n},{'t002',d2,pr002d2,k,n},...,{'t002',dn,pr002dn,k,n},
...
{'t00k',d1,pr00kd1,k,n},{'t00k',d2,pr00kd2,k,n},...,{'t00k',dn,pr00kdn,k,n}}
But I need it in the form:
tickers = {'t001','t002',...,'t00k'}
dates = {d1,d2,...,dn}
timeseries ={{pr001_d1,pr002_d1,...,pr00k_d1},
{pr001_d2,pr002_d2,...,pr00k_d2},
...
{pr001_dn,pr002_dn,...,pr00k_dn}}
I could do this by brute force, looping through everything, but I know that Mathematica has some very powerful list manipulation functions (with which I'm not that familiar), and I was hoping that someone might know a slick way of doing this. Thanks!
You want to split the data according to the first element, which is some kind of label. Use SplitBy, like so:
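Module[{split = SplitBy[data, First]},
 tickers = split[[All, 1, 1]];
 dates = split[[1, All, 2]];
 (* split[[All, All, 3]] is tickers x dates; transpose to get one row per date *)
 timeseries = Transpose[split[[All, All, 3]]];]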
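To check the shapes, here is a small worked sketch of the same approach. The sample values (two tickers, three dates, integer dates, made-up prices) are assumptions for illustration only, not data from the question. Because the requested layout has one row per date, the ticker-by-date matrix of prices is transposed at the end. Note that SplitBy only groups consecutive rows, so this assumes the query returns the rows ordered by ticker, as in the question.

```mathematica
(* Hypothetical sample in the same layout as the query result:
   {ticker, date, price, num_tickers, num_dates} *)
data = {{"t001", 1, 10.0, 2, 3}, {"t001", 2, 10.5, 2, 3}, {"t001", 3, 11.0, 2, 3},
        {"t002", 1, 20.0, 2, 3}, {"t002", 2, 20.5, 2, 3}, {"t002", 3, 21.0, 2, 3}};

(* One sublist per run of consecutive rows sharing the same ticker *)
split = SplitBy[data, First];

tickers = split[[All, 1, 1]]      (* {"t001", "t002"} *)
dates = split[[1, All, 2]]        (* {1, 2, 3} *)

(* Prices come out as tickers x dates; transpose to dates x tickers *)
timeseries = Transpose[split[[All, All, 3]]]
(* {{10., 20.}, {10.5, 20.5}, {11., 21.}} *)
```

If the rows might not arrive grouped by ticker, GatherBy[data, First] could be used in place of SplitBy, since it collects matching rows regardless of their position.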