Tags: visualization, azure-data-explorer, kql, azure-log-analytics, azure-log-analytics-workspace

Log Analytics :: Find Avg CPU usage of an Azure SQL Database for the last month


Log Analytics comes with queries that you can use out of the box, like Avg CPU usage:

(screenshot of the built-in Avg CPU usage query)

GOAL: I would like to target a specific database and see the Avg CPU consumption for the last 30 days with a granularity of 5 minutes.

My Kusto query is this:

AzureMetrics
| where ResourceProvider == "MICROSOFT.SQL" // /DATABASES
| where TimeGenerated >= ago(30day)
| where MetricName in ('cpu_percent') 
| where Resource contains 'MyDatabaseName'
| parse _ResourceId with * "/microsoft.sql/servers/" Resource  // extract Resource name from _ResourceId
| summarize CPU_Maximum_last15mins = max(Maximum), CPU_Minimum_last15mins = min(Minimum), CPU_Average_last15mins = avg(Average) by bin(TimeGenerated, 5m)
| render timechart

The query is pulling data, but in the timechart I only have one red line.

(screenshot of the resulting timechart, showing a single red line)

What am I doing wrong?


Solution

  • The built-in query is fine for displaying the stats, such as minimum, maximum and average, but if you just want the average, then we need to modify the query:

    AzureMetrics
    | where TimeGenerated >= ago(30d)
    | where ResourceProvider == "MICROSOFT.SQL" // /DATABASES
    | where Resource == "PURVIEWNINJASQL"
    | where MetricName in ('cpu_percent') 
    | parse _ResourceId with * "/microsoft.sql/servers/" Resource 
    | summarize CPU_Average = avg(Average) by Resource, bin(TimeGenerated, 5m) 
    | render timechart 
    

    In my example, I've only changed the summarize line to remove the minimum and maximum. I've also included Resource, which has been parsed to include both the server and database name; otherwise your aggregations could be averaging across multiple databases with the same name.
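
    To sanity-check what that parse produces, you can run it against a sample value. The resource ID below is entirely made up (placeholder subscription, resource group, server and database names); it just mirrors the shape of a real _ResourceId:

    // Hypothetical _ResourceId, only to illustrate what the parse keeps
    print _ResourceId = "/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/my-rg/providers/microsoft.sql/servers/myserver/databases/mydatabase"
    | parse _ResourceId with * "/microsoft.sql/servers/" Resource
    | project Resource
    // Resource now contains "myserver/databases/mydatabase"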

    If you still see the same chart, and your data has been ingested into the Log Analytics workspace over that time range, then the chart is likely correct. You can switch to the results pane and view the raw output of your query in a grid view.
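
    If you prefer to keep everything in one query, you can also drop the render and look at the most recent bins directly in the grid. This is just the query above reshaped for a tabular check:

    AzureMetrics
    | where TimeGenerated >= ago(30d)
    | where ResourceProvider == "MICROSOFT.SQL" // /DATABASES
    | where Resource == "PURVIEWNINJASQL"
    | where MetricName in ('cpu_percent')
    | parse _ResourceId with * "/microsoft.sql/servers/" Resource
    | summarize CPU_Average = avg(Average) by Resource, bin(TimeGenerated, 5m)
    | sort by TimeGenerated desc
    | take 20 // most recent twenty 5-minute bins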

    In your chart, you can see one large spike and a smaller one as well. It is possible that the average CPU sits much closer to 0, but the outlier spiking to 65% / 70% stretches the chart's scale so you can't see the normal trend. You can run this query, which uses series_decompose_anomalies to strip out the anomalies and show the trend without the anomalous spikes:

    AzureMetrics
    | where TimeGenerated >= ago(30d)
    | where ResourceProvider == "MICROSOFT.SQL" // /DATABASES
    | where Resource == "PURVIEWNINJASQL"
    | where MetricName in ('cpu_percent') 
    | parse _ResourceId with * "/microsoft.sql/servers/" Resource 
    | make-series CPU_Average = avg(Average) default = 0 on TimeGenerated step 5m by Resource
    | extend series_decompose_anomalies(CPU_Average)
    | project TimeGenerated, CPUAverageBaseline = series_decompose_anomalies_CPU_Average_baseline
    | render timechart
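
    If you also want to see which bins were flagged as anomalous, rather than just the cleaned baseline, one option is to name the function's three outputs explicitly and expand the series back into rows. This is only a sketch of that approach (AnomalyFlag, AnomalyScore and Baseline are just names I've given the outputs):

    AzureMetrics
    | where TimeGenerated >= ago(30d)
    | where ResourceProvider == "MICROSOFT.SQL" // /DATABASES
    | where Resource == "PURVIEWNINJASQL"
    | where MetricName in ('cpu_percent')
    | parse _ResourceId with * "/microsoft.sql/servers/" Resource
    | make-series CPU_Average = avg(Average) default = 0 on TimeGenerated step 5m by Resource
    | extend (AnomalyFlag, AnomalyScore, Baseline) = series_decompose_anomalies(CPU_Average)
    | mv-expand TimeGenerated to typeof(datetime), CPU_Average to typeof(double), AnomalyFlag to typeof(double)
    | where AnomalyFlag != 0 // keep only the bins flagged as anomalous
    | project Resource, TimeGenerated, CPU_Average
    | sort by TimeGenerated desc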
    

    If you are concerned that data isn't being ingested, you can run this query to check the number of samples returned in each five-minute period. If it drops to 0, then no data was ingested for that period. Don't worry about the odd spike or trough:

    AzureMetrics
    | where TimeGenerated >= ago(30d)
    | where ResourceProvider == "MICROSOFT.SQL" // /DATABASES
    | where Resource == "PURVIEWNINJASQL"
    | where MetricName in ('cpu_percent') 
    | parse _ResourceId with * "/microsoft.sql/servers/" Resource 
    | make-series count() default = 0 on TimeGenerated step 5m by Resource
    | render timechart 
    

    I hope this helps

    Alistair