Using Azure SQL Query Performance Insight To Pinpoint Bottlenecks

Back in the day, Microsoft SQL Server Tuning Wizard along with the SQL Server Profiler was the best way to track performance of SQL queries. In production, you might even add in custom perfmon metrics to the mix. But these days, Azure SQL has you covered with an extremely powerful query performance insights tool that does all of the heavy lifting for you.

Accessing Query Performance Insights

On an Azure SQL Database, simply access the Query Performance Insight tool under the Intelligent Performance sub-heading. Note that this is at the database level, not the server level. While some metrics (Such as DTU/CPU) can be tracked at the server level, when looking at individual queries, we have to look at each database individually.

From here, we can access :

  • Resource Consuming Queries – These are queries that cost the most resource (CPU, Data) as a *sum* of all queries. That means even if a query is performant, but is executed often, it may appear in this list.
  • Long Running Queries – These are queries that take the most time to execute, but again are the *sum* of all queries. So even if a query returns fast, if it’s called often, it will appear in this list.
  • Custom – This is where we can create custom reports to better drill down into poorly performing SQL queries. This is generally our best bet at finding bad queries.

Selecting any query allows you to view the actual query text :

As well as the average CPU, Data, Duration and execution count over the time period :

Importantly, there is also a chart below which allows you to track during hour intervals the same metrics. This can help you pinpoint certain times of day that may be more problematic for certian SQL queries :

Overall, utilizing this data can go a long way to giving you very simple metrics to act upon, all with very digestible queries, charts, and graphs.

The thing to note with all of these graphs, is that there isn’t one single metric that will be able to tell you the exact performance issues with your application. For example, a SQL query may run 100 times across 100 different users in your application, but is only non-performant on a single user (Maybe they have far more data than all the others). If you look at the average of all of these queries, it may look perfectly fine, whereas sorting by “max” may pinpoint that at times, this query is non performant.

Custom Queries To Utilize

Earlier, we talked a little bit about how using Custom queries were the best way to diagnose performance issues. Here’s some of the queries that I utilize to better understand the performance of my Azure SQL Databases, and what I’m looking for when running them.

Execution Count Metrics

I utilize the Execution Count metric to understand if there are additional caching needs for my application. A good example is if every page load requires you to return how many “unread notifications” a user has in your system. Or maybe every page load, we check the current logged in user in the database.

For the former (notifications), maybe we can cache this value so we don’t hit the database so often for something that isn’t *too* important. For example, if a user gets a notification, does their notification count really need to increase in real time, or is it OK to be cached every 30 seconds?

For the latter, sometimes there isn’t anything you can do. Checking whether someone’s JWT/Authentication Cookie corresponds with a valid user in the database is probably unavoidable.

But what I try to look for is outliers and things that really don’t need to happening in real time.

Duration/CPU Average

I utilize both CPU and Duration average to find queries that have the slowest average time of executing. But we need to be careful here, because sometimes the queries in these reports truly are slow, but are unavoidable. A good example might be generating an admin report that happens once per week. Sure, we could offload this to something better at number crunching, but if it’s getting ran once a week, it’s probably not a big issue.

The real gold finds are when we can take a query that appears on the slowest average duration and on the execution count report. This means not only is it one of the slowest queries overall, but it’s also getting executed often. Sometimes the “sum” query aggregation can help you here, but not always, so I often run the two independently.

Duration/CPU Max

Finally, I utilize the Duration and CPU max to find outliers in queries that may not on average be slow, but are slow under certain conditions. Often these can be a bit of a guess. When looking at a query within the Azure Portal, you won’t be able to see the query parameters. Therefore you can’t always know the exact conditions that caused the query to slow down, but often you can start making educated guesses, and from there do test scenarios locally.

Really, what you look for out of queries from this panel are queries you wouldn’t expect to be slow, but could under certain conditions be loading a lot of data. A good example might be a user on an ecommerce site who buys things regularly. They may have hundreds or even thousands of “orders” attached to their user, but the average user may only have a couple. Here we may see the query show up here due to the max duration being extremely long for that one customer, but not show up on the average report.

Azure SQL Performance Recommendations

Spend any time using Azure SQL and you’re going to run across it’s own “Performance recommendations” engine. These are performance recommendations (generally indexes), that Azure recommends periodically to improve your applications performance. Personally, I don’t utilize them that much, and here’s why :

  • Generally speaking, Azure Performance Recommendations mostly end up recommending you create indexes. While this can be helpful, for the most part if you are watching your slow running queries using the Query Performance Insights tool, you’re going to find them yourself anyway and probably have a better understanding of the actual issue.
  • The recommendation engine also can update your database behind the scenes without you having to lift a finger. This is bad. In most scenarios, you’re going to want to add that missing index in your own source control. It’s very rare that I accept a chance via this performance recommendation engine, and let Azure implement it for me.
  • The performance recommendations don’t take into business logic, or domain knowledge into account. There may be specific reasons why queries are acceptably slow, and/or it may only be slow in some use cases which you are happy with.

In general, I think that the performance recommendations are a helpful tool for any developer, but maybe not as automated as it appears on the surface. Generally, I’ve had to go away and validate it’s findings and then implement the changes myself rather than the one click tool.

Leave a Reply

Your email address will not be published.