I have a report that runs daily, and I want to send its output to a CSV file. The job is lengthy and new data is generated while it is executing, so from time to time some of that data is missed.
Is there a way to cross-check on a daily basis that no data from the previous day has been lost? Perhaps with a tick or cross at the end of each row to show whether the data has been exported to the CSV?
I am working with sensitive information, so I can't share any of the report details.
This is a fairly common question. Without specifics, it's very hard to give you a concrete answer - but here are a few solutions I've used in the past.
Typically, such reports have "grand total" lines - your widget report might be broken down by month, region, sales person, product type, etc. - but you usually have a "total widgets sold" line. If that total can be calculated with a quick query (you may need to remove joins and other refinements), then run that query again after you've generated the report data and compare its result with the grand total printed on the report. If the two totals are different, you know that the data changed while the report was running.
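A minimal sketch of the grand-total comparison, in T-SQL. The table widget_sales and the columns quantity and sale_date are hypothetical stand-ins for whatever your report uses, and @reportTotal is a placeholder you would fill with the total the finished report printed.

```sql
-- Assumed names: widget_sales, quantity, sale_date (not from the original report).
DECLARE @dayStart date = DATEADD(day, -1, CAST(GETDATE() AS date));  -- start of yesterday
DECLARE @dayEnd   date = CAST(GETDATE() AS date);                    -- start of today

-- Placeholder: replace 0 with the grand total taken from the finished report.
DECLARE @reportTotal int = 0;

-- Re-run the cheap "grand total" query once the report has finished.
DECLARE @currentTotal int;
SELECT @currentTotal = COALESCE(SUM(quantity), 0)
FROM widget_sales
WHERE sale_date >= @dayStart AND sale_date < @dayEnd;

IF @currentTotal <> @reportTotal
    PRINT 'Totals differ: data changed while the report was running.';
ELSE
    PRINT 'Totals match.';
```

This only tells you that something changed, not which rows were missed, so it works best as a cheap first check.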
Another option - SQL Server specific - is to calculate a checksum over the data you're reporting on, once before the reporting run starts and once after it ends. If the checksum changes between the start and end of the run, you know the data changed in the meantime.
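One way the checksum comparison could look, using SQL Server's CHECKSUM_AGG and BINARY_CHECKSUM functions. The table widget_sales and the column sale_date are assumptions for illustration. Checksums can collide, so a matching value is strong evidence rather than proof; the row count is compared as well to catch some of those cases.

```sql
-- Assumed names: widget_sales, sale_date (not from the original report).
DECLARE @dayStart date = DATEADD(day, -1, CAST(GETDATE() AS date));
DECLARE @dayEnd   date = CAST(GETDATE() AS date);

DECLARE @beforeSum int, @beforeRows int, @afterSum int, @afterRows int;

-- Snapshot taken before the reporting run starts.
SELECT @beforeSum  = COALESCE(CHECKSUM_AGG(BINARY_CHECKSUM(*)), 0),
       @beforeRows = COUNT(*)
FROM widget_sales
WHERE sale_date >= @dayStart AND sale_date < @dayEnd;

-- ... the reporting run goes here ...

-- Same query again once the run has finished.
SELECT @afterSum  = COALESCE(CHECKSUM_AGG(BINARY_CHECKSUM(*)), 0),
       @afterRows = COUNT(*)
FROM widget_sales
WHERE sale_date >= @dayStart AND sale_date < @dayEnd;

IF @beforeSum <> @afterSum OR @beforeRows <> @afterRows
    PRINT 'Data changed during the reporting run.';
ELSE
    PRINT 'No change detected.';
```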
Finally - and most dramatically - if the report's accuracy is critical, you can store the fact that a particular row was included in a reporting run. This makes your report much more complex, but it allows you to be clear that you've included all the data you need. For instance:
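-- Step 1: record which rows this reporting run covers
insert into reporting_history
select @reportID, widget_sales_id
from widget_sales
--- reporting logic here
-- Step 2: build the report only from the rows recorded for this run
select widgets.cost
from widgets inner join widget_sales on ...
inner join reporting_history on reporting_history.widget_sales_id = widget_sales.widget_sales_id
and reporting_history.report_id = @reportID -- report_id is an assumed column name
---- all your other logic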
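With a history table like that in place, the daily cross-check from the question becomes a query for rows that never made it into any run. This is a sketch under assumptions: the sale_date column and the widget_sales_id column on reporting_history are hypothetical names, so adjust them to your schema.

```sql
-- Assumed names: widget_sales.sale_date, reporting_history.widget_sales_id.
DECLARE @dayStart date = DATEADD(day, -1, CAST(GETDATE() AS date));
DECLARE @dayEnd   date = CAST(GETDATE() AS date);

-- One line per row from yesterday, flagged as exported or missing
-- (the "tick or cross" at the end of each row).
SELECT ws.widget_sales_id,
       CASE WHEN EXISTS (SELECT 1
                         FROM reporting_history AS rh
                         WHERE rh.widget_sales_id = ws.widget_sales_id)
            THEN 'exported'
            ELSE 'missing'
       END AS export_status
FROM widget_sales AS ws
WHERE ws.sale_date >= @dayStart AND ws.sale_date < @dayEnd
ORDER BY ws.widget_sales_id;
```

Rows flagged as missing can then be picked up by the next run, or exported separately.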