Blog Post

SQL 101 : Without “ORDER BY”, order is not guaranteed.

,

It’s as simple as that. If you don’t specify ORDER BY, then the order of your result set is not guaranteed. Oh sure , there may be situations where the data looks like it is being returned ordered, but this is not really the case. So, for example :

CREATE TABLE #Example
    (
    row_id  INTEGER
    );
GO    
INSERT  #Example 
select top(10) ROW_NUMBER() over (order by (Select null))
from syscolumns
go
SELECT  E.row_id
FROM    #Example E;

On a number of occasions on different forums, the suggestion has been made that row_id will be in order. Certainly it is here (or if not, let me know ). Is that enough to say "It is in order" ?

insert into #Example(row_id) values(-1)

go

SELECT E.row_id

FROM #Example E;

Now, that’s out of order. Ahh , you may say, "That’s a heap table. A clustered key will define the order. After all, the data is already in that order, so no need for an order by. It’s going to start at row 1 and scan all the way through". NO, parallelism can cause multiple streams to start at multiple points in the table. Connor Cunningham goes through that scenario

here .

Ok , what if you have a small number of rows and parallelism is not an issue. Now we don’t need the order by.

How about this ?

Drop TABLE #Example
go
CREATE TABLE #Example
    (
    row_id  INTEGER primary key,
    uuid    uniqueidentifier not null
    );
GO    
Create unique index idxExample on #Example(uuid)
go
INSERT  #Example 
select top(10) 
       ROW_NUMBER() over (order by (Select null)),
       NEWID()
from syscolumns
go
SELECT  *
FROM    #Example E;

That’s going to scan the clustered key , right ? Nope, at least not on my machine.

OrderScan

It produces a scan of the NON-clustered index. A discussion of why is not for now, but an illustration of my previous point.

So , to avoid using ORDER BY , we need:

A) A non heap table

B) To avoid parallelism

C) A clustered index

D) No non clustered indexes

During a recent discussion Paul White posted this example

CREATE TABLE dbo.Example
    (
    row_id  INTEGER IDENTITY NOT NULL PRIMARY KEY CLUSTERED,
    data    AS CONVERT(VARCHAR(11), row_id) PERSISTED NOT NULL UNIQUE
    );
GO    
INSERT  dbo.Example (data)
DEFAULT VALUES;
GO 10
SELECT  E.row_id, E.data
FROM    dbo.Example E;

Again that is now out of order due to the index to maintain the uniqueness on the data column. If I need to add a column like this to a table, the issue of non – ordered data would be low down on my list of thoughts.

All of these scenarios, you can, perhaps legitimately, argue don’t apply to your system and that you *know* you can meet all of these conditions. Will that always be true? Are you really happy with a table that you can’t add an index to? Even so, the engine may well change in the future and completely destroy your assumption. Then what will happen? Will you be able to confidently find and fix all the statements? That’s a big price to pay to save an extra few keystrokes now.

Just add ORDER BY now, and sleep safe at night.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating