Performance tuning in Azure SQL Database can feel like solving a puzzle—get it right, and your queries fly; get it wrong, and you're stuck waiting. Whether you're dealing with sluggish reports or slow transaction processing, the right indexing strategy, partitioning, or columnstore indexing can make all the difference.
In this post, I'll walk you through real-world techniques I've used to speed up Azure SQL databases—without the fluff. We'll cover smart indexing strategies, effective partitioning for large tables, and the strategic use of Clustered Columnstore indexes.
Indexing: The Right Way
Indexes are essential database structures that help your queries find data quickly, similar to how a book's index helps you locate specific information without reading every page. However, too many indexes can slow down write operations since each index must be updated whenever data changes.
Clustered vs. Non-Clustered Indexes
A Clustered Index (CI) physically sorts the table's data by the indexed key. Since a table can only have one clustered index, choose it wisely—usually an ID column or date field that's frequently used for access patterns. For example, in an Orders table, the OrderID typically makes an excellent clustered index:
```sql
-- Good for Orders table where OrderID is the primary lookup
CREATE CLUSTERED INDEX IX_Orders_OrderID ON Orders(OrderID);
```
When you create this index, SQL Server physically reorganizes the table data to be stored in the order of OrderID values. This makes retrieving records by OrderID extremely efficient but means any other access pattern will require additional work.
A Non-Clustered Index (NCI) creates a separate lookup structure without rearranging the underlying table data. It's useful for speeding up searches on columns that aren't part of the clustered index. For instance, if customers frequently search by email, this index can speed up queries:
```sql
-- Speeds up customer email searches
CREATE NONCLUSTERED INDEX IX_Customers_Email ON Customers(Email);
```
This index allows SQL Server to quickly find a customer by email without scanning the entire table. You can have multiple non-clustered indexes on a table, each optimizing different query patterns.
The "Covering Index" Technique
When a query frequently retrieves specific columns after filtering, a covering index can dramatically improve performance by including those columns in the index itself. This avoids "key lookups"—expensive operations where SQL Server must jump from the index to the main table data.
```sql
-- Covers common Order queries filtering by CustomerID
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON Orders(CustomerID)
INCLUDE (OrderDate, TotalAmount);
```
This index allows SQL Server to completely satisfy queries that filter by CustomerID and only need OrderDate and TotalAmount—without touching the clustered index or table data. For applications that frequently run queries like SELECT OrderDate, TotalAmount FROM Orders WHERE CustomerID = 123, this can provide significant performance improvements.
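One way to confirm the index is actually covering is to compare logical reads with SET STATISTICS IO, which reports page reads per table in the Messages tab:

```sql
-- Compare logical reads before and after creating the covering index;
-- a covered query should touch far fewer pages and show no key lookups
SET STATISTICS IO ON;

SELECT OrderDate, TotalAmount
FROM Orders
WHERE CustomerID = 123;

SET STATISTICS IO OFF;
```

Checking the actual execution plan for the absence of a Key Lookup operator tells the same story graphically.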
Filtered Indexes for Specific Conditions
Filtered indexes include only the rows that match a specified condition, making them smaller and more efficient for queries that always include that condition.
```sql
-- Smaller, faster index for active products only
CREATE NONCLUSTERED INDEX IX_Active_Products
ON Products(ProductName)
WHERE IsActive = 1;
```
This technique is particularly valuable when:
- A large percentage of your data has one value (like inactive products)
- Queries against that subset are common
- You want to reduce the storage and maintenance overhead of indexing the entire table
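A query that repeats the filter condition can be satisfied by the smaller index (table and column names follow the example above):

```sql
-- The WHERE clause matches the index filter, so the optimizer
-- can use IX_Active_Products instead of scanning the whole table
SELECT ProductName
FROM Products
WHERE IsActive = 1
  AND ProductName LIKE 'Widget%';
```

One caveat: parameterized queries (e.g. `WHERE IsActive = @active`) often cannot use a filtered index, because the optimizer cannot guarantee the parameter value matches the filter at plan-compilation time.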
Monitoring and Maintaining Indexes
Unused indexes waste space and slow down write operations. To identify which indexes are being used (and which aren't), use the following query:
```sql
-- Find unused indexes (modify the table name)
SELECT
    OBJECT_NAME(i.object_id) AS TableName,
    i.name AS IndexName,
    us.user_seeks,
    us.user_scans,
    us.user_lookups,
    us.user_updates  -- High updates? Index may hurt writes
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats us
    ON us.object_id = i.object_id
   AND us.index_id = i.index_id
   AND us.database_id = DB_ID()
WHERE OBJECT_NAME(i.object_id) = 'Orders';  -- Change to your table
```

Note the LEFT JOIN: an index that has never been touched since the last restart has no row in sys.dm_db_index_usage_stats at all, so an inner join would silently hide exactly the indexes you're hunting for. Those appear here with NULL usage counts.
This query shows how many times each index has been used in different operations. Low seek/scan/lookup counts combined with high update counts indicate an index that's costing more than it's helping.
Over time, indexes can become fragmented, reducing their effectiveness. You can rebuild them to restore performance:
```sql
-- Rebuild a single index
ALTER INDEX IX_Orders_OrderID ON Orders REBUILD;

-- Rebuild all indexes on a table (use caution in production environments)
ALTER INDEX ALL ON Orders REBUILD;
```
Index rebuilding reorganizes the physical storage of the index, but it can be resource-intensive. Consider scheduling rebuilds during off-peak hours for production systems.
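For moderate fragmentation, REORGANIZE is a lighter-weight, always-online alternative to REBUILD. A common rule of thumb (reorganize between roughly 5% and 30% fragmentation, rebuild above that) can be checked against the sys.dm_db_index_physical_stats DMV:

```sql
-- Measure fragmentation, then choose REORGANIZE (lighter) or REBUILD (thorough)
SELECT i.name AS IndexName, ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Orders'), NULL, NULL, 'LIMITED') ps
JOIN sys.indexes i
    ON ps.object_id = i.object_id AND ps.index_id = i.index_id;

-- Lighter-weight option for moderate fragmentation
ALTER INDEX IX_Orders_OrderID ON Orders REORGANIZE;
```

REORGANIZE only compacts and reorders existing pages, so it uses fewer resources and can be interrupted safely, but it does not update statistics the way a rebuild does.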
Partitioning: Managing Large Tables Effectively
When tables grow to millions of rows, operations like reporting queries or maintenance tasks become increasingly difficult. Table partitioning helps by dividing large tables into smaller, more manageable pieces based on a partition key (often a date column).
Time-Based Partitioning
For most business applications, time-based partitioning provides the most benefit since data often has a natural time dimension. Implementing partitioning requires three steps. First, create a partition function that defines the boundaries between partitions:
```sql
CREATE PARTITION FUNCTION PF_OrderDateRange (DATE)
AS RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01', '2025-01-01');
```
This function creates four partitions: orders before 2023, orders from 2023, orders from 2024, and orders from 2025 onward. The RANGE RIGHT specification means each boundary value belongs to the partition on its right.
Next, create a partition scheme that maps partitions to filegroups:
```sql
CREATE PARTITION SCHEME PS_OrderDateRange
AS PARTITION PF_OrderDateRange
TO ([PRIMARY], [PRIMARY], [PRIMARY], [PRIMARY]);
```
In this simple example, all partitions are stored on the PRIMARY filegroup. In more advanced scenarios, you might place different partitions on different filegroups, potentially on different storage devices.
Finally, create or alter your table to use the partition scheme:
```sql
CREATE TABLE Orders (
    OrderID INT,
    OrderDate DATE,
    CustomerID INT,
    TotalAmount DECIMAL(10,2)
) ON PS_OrderDateRange(OrderDate);
```
The table is now partitioned by OrderDate. When you run a query with a date filter, SQL Server can use partition elimination to ignore partitions that don't match the filter:
```sql
-- This query only needs to scan one partition
SELECT *
FROM Orders
WHERE OrderDate BETWEEN '2023-01-01' AND '2023-12-31';
```
Partitioning provides several benefits:
- Improved query performance through partition elimination
- Faster maintenance operations that can work on one partition at a time
- Efficient archiving of old data by switching out entire partitions
- Better management of very large tables
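The archiving benefit deserves a concrete example. Switching a partition out is a metadata-only operation that completes almost instantly regardless of row count. This sketch assumes an empty archive table (here called Orders_Archive) with an identical schema, on the same filegroup:

```sql
-- Instantly move the oldest partition (orders before 2023) into an archive table.
-- Orders_Archive must match the Orders schema exactly and sit on the same filegroup.
ALTER TABLE Orders SWITCH PARTITION 1 TO Orders_Archive;
```

No data is physically copied; SQL Server simply reassigns ownership of the partition's pages, which is why partitioning is so effective for sliding-window retention schemes.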
However, partitioning is not a magic solution for all performance problems and adds complexity to your database design. It works best when:
- Tables exceed several million rows
- Data has a clear partitioning key that aligns with common query filters
- You need to archive or manage data in large chunks
Clustered Columnstore Indexes: An Analytics Powerhouse
Traditional row-based storage works well for transactional workloads with frequent single-row operations. However, analytical queries that scan and aggregate large amounts of data benefit from a completely different approach: columnar storage.
A Clustered Columnstore Index (CCI) stores data by column rather than by row, dramatically improving performance for analytical queries while also providing exceptional compression.
Converting a table to use columnstore is straightforward:
```sql
-- Transforms the entire table into columnar storage
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Orders ON Orders;
```
This replaces the table's existing storage structure with columnar format. The transformation is significant—data is now stored by column, compressed, and optimized for scan operations.
Columnstore indexes excel in the following scenarios:
- Data Warehousing and Reporting: Queries that scan large portions of a table and perform aggregations (SUM, AVG, COUNT) see dramatic performance improvements, often 10-100x faster.
- Large Tables: Tables with millions of rows benefit most from columnstore compression and query optimization.
- Batch Processing: Systems that load data in large batches rather than row-by-row operations.
- Storage Efficiency: Columnstore typically achieves 5-10x compression compared to row storage, reducing both storage costs and I/O requirements.
A typical reporting query showing the power of columnstore:
```sql
-- This query might be 50x faster with a columnstore index
SELECT
    DATEPART(YEAR, OrderDate) AS OrderYear,
    DATEPART(MONTH, OrderDate) AS OrderMonth,
    SUM(TotalAmount) AS MonthlyRevenue
FROM Orders
GROUP BY DATEPART(YEAR, OrderDate), DATEPART(MONTH, OrderDate)
ORDER BY OrderYear, OrderMonth;
```
When to Avoid Columnstore
Despite their advantages, columnstore indexes aren't appropriate for all workloads:
- OLTP Systems with Single-Row Operations: Applications that frequently update individual rows will perform poorly with columnstore indexes, which are optimized for batch operations.
- Small Tables: The overhead of columnstore may not be justified for tables with fewer than a million rows.
- Highly Volatile Data: Tables where data changes constantly throughout the day may not benefit from columnstore.
If your workload includes a mix of analytical and transactional queries, consider a hybrid approach: keep the main tables optimized for transactions, and use columnstore indexes on reporting copies or aggregate tables.
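One way to get that hybrid on a single table is a nonclustered columnstore index, which leaves the rowstore table in place for transactional work while giving analytical queries a columnar copy of selected columns:

```sql
-- Rowstore table stays as-is for single-row operations;
-- reporting queries scan the compressed columnar copy instead
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Reporting
ON Orders (OrderDate, CustomerID, TotalAmount);
```

This is the pattern Microsoft calls real-time operational analytics; the trade-off is extra write overhead, since every modification now maintains both storage formats.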
Practical Optimizations for Azure SQL
Beyond the major strategies already discussed, several Azure SQL-specific features can boost performance with minimal effort.
Enable and Use Query Store
Query Store captures query execution statistics and plans over time, acting like a flight recorder for your database. This feature is invaluable for identifying performance regressions and troubleshooting intermittent issues.
Query Store is enabled by default on Azure SQL databases, but it's worth verifying it hasn't been turned off:

```sql
ALTER DATABASE [YourDB] SET QUERY_STORE = ON;
```
Once enabled, Query Store allows you to:
- Compare performance before and after changes
- Identify queries that have regressed in performance
- Force specific execution plans for critical queries
- Analyze workload patterns over time
The data collected by Query Store can be viewed through Management Studio or queried directly from system views. This information provides insights into which queries might benefit most from the optimization techniques we've discussed.
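As a sketch of querying those views directly, the following finds the queries consuming the most CPU, using the standard Query Store catalog views:

```sql
-- Top 10 queries by total CPU time recorded in Query Store
SELECT TOP 10
    qt.query_sql_text,
    SUM(rs.avg_cpu_time * rs.count_executions) AS total_cpu_time
FROM sys.query_store_query_text qt
JOIN sys.query_store_query q  ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan p   ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id
GROUP BY qt.query_sql_text
ORDER BY total_cpu_time DESC;
```

The same views expose duration, logical reads, and memory grants, so the ORDER BY can be swapped to match whatever resource you're constrained on.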
Manual Statistics Updates
While SQL Server automatically maintains statistics, these automatic updates use sampling rather than analyzing all data. For critical queries on large tables, occasionally updating statistics with a full scan can improve plan quality:
```sql
UPDATE STATISTICS Orders WITH FULLSCAN;
```
This ensures the query optimizer has accurate information about data distribution, leading to better execution plans. Consider scheduling full statistics updates during maintenance windows, especially after large data loads or changes.
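To decide whether an update is due, you can check how stale each statistics object is via sys.dm_db_stats_properties:

```sql
-- How stale are the statistics on Orders?
-- modification_counter = rows changed since the last update
SELECT s.name, sp.last_updated, sp.rows, sp.modification_counter
FROM sys.stats s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
WHERE s.object_id = OBJECT_ID('Orders');
```

A high modification_counter relative to the row count is a good signal that the optimizer is working from an outdated picture of the data.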
Automatic Tuning
Azure SQL Database offers automatic tuning capabilities that can automatically identify and fix performance issues:
- Automatic plan correction identifies queries with regressed performance and reverts to better-performing plans
- Index management can suggest, create, and verify the impact of new indexes
- Automated statistics maintenance ensures optimal query performance
To enable automatic tuning through the Azure portal:
- Navigate to your database
- Select "Automatic tuning" under Performance
- Enable the desired options
This feature leverages machine learning and continuous monitoring to optimize your database with minimal human intervention—essentially applying the techniques we've discussed automatically.
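If you prefer T-SQL over the portal, the same options can be set per database. FORCE_LAST_GOOD_PLAN drives automatic plan correction, while CREATE_INDEX and DROP_INDEX control automatic index management:

```sql
-- Enable automatic plan correction and index management via T-SQL
ALTER DATABASE CURRENT SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);
ALTER DATABASE CURRENT SET AUTOMATIC_TUNING (CREATE_INDEX = ON, DROP_INDEX = ON);
```

Current settings and the actions taken so far are visible in sys.database_automatic_tuning_options and in the portal's tuning history.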
Final Thoughts
Optimizing Azure SQL Database performance isn't about applying every possible technique—it's about understanding your workload and applying the right optimizations strategically. The most effective approach combines:
- Smart indexing tailored to your specific query patterns
- Partitioning only when table size and access patterns justify the complexity
- Columnstore indexes for analytical workloads with large data volumes
- Azure-specific features like Query Store and Automatic Tuning
Always remember to test performance optimizations in a non-production environment first, as techniques that benefit one workload might degrade another. Performance tuning is iterative—measure, optimize, and measure again to ensure your changes have the desired effect.
What performance challenges are you facing with your Azure SQL databases? Each database workload has unique characteristics, and sometimes the most effective solutions come from combining these techniques in creative ways.