I have a very big table (over 140 mil of rows and size of around 230 gb). There is a column, called "date_dim" with a clustered index asc. In the table are stored all data from 2015.
My target is now to get the first date of a month that data are stored. So it could be that in January the data started at the 5th. So I have to build a 'dynamic' query.
To retrieve this data, I created following query:
SELECT
*
FROM (
SELECT
date_dim
,row_num = ROW_NUMBER() OVER (PARTITION BY MONTH(date_dim) ORDER BY date_dim ASC)
FROM myTable_2015 WITH (NOLOCK)
) AS s
WHERE
row_num = 1
ORDER BY
1
Now my Issue is that the query has a very long time to get the data. Is there maybe an other option to build a query with better performance ? Do you have any Performance advise for selecting such big tables for my specific case ?
Actually I cannot make any changes on the table itself, only 'Read' Access...
If I understand your question, two quick options
Select BOMonth = min(date_dim)
From myTable_2015
Group By year(date_dim),month(date_dim)
Or even
Select BOMonth = min(date_dim)
From (select distinct date_dim from myTable_2015 ) A
Group By year(date_dim),month(date_dim)