Search code examples
sqlpostgresqlfinancecomputational-finance

pl/python receiving and outputting a set of rows


Problem description: I'm trying to define a pl/python aggregator that receives a set of rows and returns a set of rows, for every set of rows generated from a query. The input set of rows should be a group generated with the group by statement in a query. I'm trying to find my way through the syntax, but I'm having a hard time with that. pl/pgsql functions are welcome, but I'm afraid they may lack the expressive power to do this.

Context description: I have a table with three columns: ticker symbol, date and value. I'm trying to calculate the macd for each ticker symbol and date according to the values I have. I can group and sort the table by date, then all I would have to do is for each group calculate the value of the aggregator for that date according to the current value for each ticker and the previous value for the aggregator, which I guess I will store in SD, then all I have to return is a set of rows with that value, and possibly the ticker to make a join, but I think it should be possible to preserve the order and save the time of a join.

This is how the table would be created:

create table candles(ticker text, day date, value real,
  primary key (ticker, day));

Here you can find an example in Java (pseudocode-ish) that shows how the groups (group by day) should be processed after sorted by ticker. In Java these groups are represented as an ArrayList, since they are sorted by ticker, the same position corresponds to the same ticker (database is checked on generation so no records are missing) and since they are grouped by day every group has the values for the same day and the next day than the previous call.

PD: I've also added the code to calculate MACD in pl/pgsql but for this I order by ticker and day, I'd like to order by day and do all the calculations for all the tickers in a loop, because that should be more efficient, specially considering the final output has to be ordered by day to be consumed in this way.

If there is anything that is not completely clear please ask me, I'm trying to be clear and concise.

Thank you very much.


Solution

  • Sorry if the description was confusing. I solved the problem I was facing by:

    1. creating arrays from the values I wanted to group by using a subquery. select array_agg(x) from y group by z order by v;
    2. creating a function that accepts the arrays, just by adding [] to the parameters description. And an aggregator for that function.
    3. using the aggregator and the subquery in the final query.

    It was fairly easy, but I didn't know about the array type and all its possibilities. Sorry and thank you.