From Rows to Pages: The Hidden Chaos Behind SQL Server’s Sampling Methods
Learn about the TABLESAMPLE option in T-SQL and uncover some of the pitfalls of assuming this works as you think it does.
2025-08-22
2,082 reads
When a SQL Server Express-based factory app started crawling, the culprit wasn’t hardware or network — it was a decades-old WHILE loop migrated from C/C++ to SQL. This real-world story breaks down how procedural habits, memory grants, and lack of window functions nearly derailed a production floor.
2025-07-28
14,242 reads
It was the week before Black Friday, the biggest online ad rush of the year. Our US-based ad-tech platform was gearing up for an insane traffic spike. Hundreds of real-time campaigns were about to go live across multiple brands, each with thousands of user sessions flowing through our system. Every incoming user impression […]
2025-07-22
2,304 reads
In this article, I wanted to test a common assumption we DBAs make – that adding INCLUDE columns to indexes is harmless. I created a FULL recovery test database with a realistic wide Orders table containing extra large VARCHAR columns to simulate an ERP workload. I ran updates and measured transaction log backup sizes before and after adding INCLUDE columns to a nonclustered index. The results shocked me. The update without INCLUDE columns generated a 10 MB log backup, while the same update with INCLUDE columns produced over 170 MB – a 17x increase in log volume. I explain why this happens: INCLUDE columns are physically stored in index leaf rows, so updates affecting them write bigger log records. I also clarify that updating key columns generates even more log than INCLUDE updates because it involves row movement (delete + insert), but INCLUDE updates still cost more log than if those columns weren’t indexed at all. The takeaway is clear – INCLUDE columns are powerful, but they silently increase transaction log generation, impacting backup sizes, replication lag, and DR readiness. Always measure their real cost before deploying to production.
2025-07-18
660 reads
This article dives deep into CXPACKET and CXCONSUMER in SQL Server, explaining how to simulate each, when they appear, and why they matter. Using live execution plans, wait monitoring, and worker thread diagnostics, we uncover how uneven parallelism triggers thread sync waits, and how SQL Server sometimes hides real issues behind innocent-looking CXCONSUMER waits. Includes step-by-step queries, tuning tips, and a real-world scenario where repartition streams quietly ruined performance.
2025-07-07
3,385 reads
Misusing MAXDOP can silently kill performance across your SQL Server. In this deep dive, we uncover how one bad query caused CPU meltdown, run real-world tests, and show how tuning—not parallelism—often holds the true fix.
2025-06-23
3,853 reads
This article examines how tempdb is affected by recursive queries, using a few different methods.
2025-05-23
1,981 reads
Learn how to safely remove a SQL Server .ndf data file without any downtime using DBCC SHRINKFILE (EMPTYFILE). This hands-on tutorial walks through real-world Azure-based setup, data redistribution, and storage cleanup — ideal for DBAs managing enterprise SQL Server environments.
2025-05-16
2,293 reads