Well, here it is again. The second Tuesday of the month, which means that it’s T-SQL Tuesday. This month, Dev Nambi (blog | twitter) is hosting the party, and he has decided that the topic is to be on Assumptions. Specifically, he wants us to write about:
A big assumption you encounter at work, one that people are uncomfortable talking about. Every team has an elephant in the room.
One assumption that I have seen repeatedly is “if there is a table with a clustered index, I can select all of the rows out of that table and they will be in the order of the clustered index without having to specify an ORDER BY clause”.
So, let’s test this out. Remember, to prove that this assumption is false, all that is needed is to get the results out of order.
First up for this test: create and populate a table.
-- let's use a safe database USE tempdb; GO -- create a table to play around with IF OBJECT_ID('tempdb..temp') IS NOT NULL DROP TABLE temp; CREATE TABLE temp ( RowID INTEGER PRIMARY KEY CLUSTERED, MyGUID UNIQUEIDENTIFIER DEFAULT (NEWID())); -- populate the table with an ascending number -- let's try a million rows, generated with a dynamic tally table WITH Tens (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), Hundreds(N) AS (SELECT 1 FROM Tens t1, Tens t2), Millions(N) AS (SELECT 1 FROM Hundreds t1, Hundreds t2, Hundreds t3), Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions) INSERT INTO temp (RowID) SELECT N FROM Tally; -- now, checkpoint the database (it's in simple, so all pages will now be "clean") CHECKPOINT; -- and let's remove the clean pages from memory DBCC DROPCLEANBUFFERS;
Now let’s just do a few random searches on this data.
-- let's select some data from the table SELECT RowID, MyGUID FROM temp WHERE RowID BETWEEN 100 AND 200; -- and some more data SELECT RowID, MyGUID FROM temp WHERE RowID BETWEEN 1000 AND 1200; -- and some more data SELECT RowID, MyGUID FROM temp WHERE RowID BETWEEN 5000 AND 5100; -- notice that all of these are ordered?
Now let’s run our query against the entire table.
-- now, let's get some more data SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; SELECT RowID, MyGUID FROM temp;
On my system, I get the following results:
As you can see, we went from value 1953 to 867133. As can be plainly seen, the assumption is false.
So, what happened? And are you complaining about that READ UNCOMMITTED transaction isolation level? Well, it turns out that this isolation level isn’t necessary to reproduce this – this query will also reproduce the same result set:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED; SELECT RowID, MyGUID FROM temp WITH (TABLOCKX);
Are you wondering what is happening here? Well, anytime that you are in the read uncommitted isolation level (you might be familiar with it as the NOLOCK query hint), you are telling SQL that you don’t care about data consistency, so SQL will read the data as fast as it can. Which happens to be an Allocation Order Scan… it will read the pages from the data file in the order that they were allocated for this table. And, for higher isolation levels, anytime that SQL can be sure that the data won’t change underneath it (such as when you have an exclusive lock on the table, or reading from a read-only filegroup), it will also perform an Allocation Order Scan.
Can we see that this is what is happening? Sure we can!
First off, re-run this query with the Actual Execution Plan turned on. You’ll get an execution plan like this:
Notice that the scan is not ordered. (If you re-run any of the previous queries, you will see that they are ordered.) With an unordered scan, SQL will get the data as fast as it can since it doesn’t need to follow an index… and in this case, it is an Allocation Order Scan (if you run this query without the TABLOCKX query hint in the READ COMMITTED isolation level, it will still be unordered, but it can’t do an Allocation Order Scan since there isn’t a guarantee that the data can’t be changed). This can include reading pages that are already in memory prior to others (I’ve seen this happen before, and that is why I included those previous queries).
But let’s continue to see if these results are indeed in allocation order. In a separate query window, run this query (undocumented command, but Paul Randal blogged about it here) to see the pages that are allocated for this table:
DBCC IND (tempdb, 'temp', -1)
The results that I get are (unnecessary columns are removed for the sake of brevity):
And it continues, but this is enough. We are interested in the rows where PageType = 1; these are the pages where the data is stored.
Side note: Do you see a few rows where the next logical page (NextPagePID) is not the next physical page? What we have here is a fragmented index.
So, let’s modify the query to return what file, page and slot that the row is being returned from:
SELECT RowID, MyGUID, sys.fn_PhysLocFormatter(%%physloc%%) FROM temp WITH (TABLOCKX);
Here we added a function to the result. The %%physloc%% system variable returns the row/page/slot that a row is on, but in a binary format. The sys.fn_PhysLocFormatter function transforms that to something easier to read, in the format “File:Page:Slot”. (Again, this is undocumented, and Paul Randal blogged about this also here). When this query is run, browse through the results, and you will see the pages being returned in the same order as they were allocated in (from the DBCC IND statement earlier). On my system, the results that I get are:
Browse through this result set for the earlier rows and you will see the pages returned in the same order of the physical file allocation of these pages.
In summary: under certain conditions, SQL will choose to perform an Allocation Order Scan of the data. This is not the same as the physical order of the pages in the file, but the order that the pages were actually allocated for this data. The results will be in the order of the pages, and an assumed order is not be guaranteed without using an ORDER BY clause.