I have a table with data like the tableA example below. the date column is formatted as a string.
the close column is an integer, ticker is formatted as string.
I'm trying to run the query below on a mysql database and it is taking a very long time.
is there anything I can do to speed this up, like changing the format of the date column, or adding
indices or primary keys? The combination of ticker and date should be a unique value, and the date field is a timestamp, it's just currently formatted as string.
code:
select avg((a.close-b.close)/b.close) as avg_annual_returns,
a.ticker
from tableA a
join tableA b
on cast(a.date as date)=date_add(cast(b.date as date),interval 365 DAY)
and a.ticker=b.ticker
where b.close is not null
group by a.ticker
tableA
+--------+-----+------+
|date |close|ticker|
+--------+-----+------+
|2/1/2019|5 |abc |
+--------+-----+------+
|2/3/2019|7 |efd |
+--------+-----+------+
|2/4/2019|3 |hij |
+--------+-----+------+
update answer:
select ticker,date, ( -1 +
a.close / max(a.close) over (partition by ticker
order by date
range between interval 365 day preceding and interval 365 day preceding
)
) as annual_returns
from tableA a
) b where annual_returns is not null
group by ticker
If you want the difference from a year ago, then use window functions. Before that, though, fix the data model! Do not store dates as strings. So:
alter table talbeA modify column date date;
Then to get the close from a year ago:
select( -1 +
a.close / max(a.close) over (partition by ticker
order by date
range between interval 365 day preceding and interval 365 day preceding
)
)
from tablea a;
You don't have to worry about NULL
values because AVG()
ignores them.
Here is a db<>fiddle.