lmarkum (6/19/2015)
I have an UPDATE statement that joins two table by SendId. One table, I'll call it T1, has a clustered index on SendId ASC. The other table I will call T2 also has a clustered index on SendID ASC. All the columns from T2 are used to update T1. The execution plan shows a Clustered index scan on T2 and a Clustered Index Seek on T1 going into a Nested Loops inner join. Immediately following is a Distinct Sort that is done on SendId ASC. Why the Distinct SORT if the tables are already ordered by SendID?
I think you've more or less answered your question with some encouragement from other folks. Whether you have or not, here's a demo which clearly shows the point - without a unique index on the source set, SQL Server can't determine if it contains dupes or not, so dedupes with a DISTINCT SORT.
DROP TABLE #a; DROP TABLE #b;
WITH
e1 AS (SELECT d.n FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) d (n)),
e2 AS (SELECT e1.n FROM e1 CROSS JOIN e1 b),
e4 AS (SELECT e2.n FROM e2 CROSS JOIN e2 b),
iT AS (SELECT n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM e4 CROSS JOIN e2 CROSS JOIN e1)
SELECT
n,
c1 = NEWID(),
c2 = DATEADD(millisecond,CHECKSUM(NEWID()),GETDATE()),
c3 = ABS(CHECKSUM(NEWID()))
INTO #a
FROM iT
CREATE CLUSTERED INDEX cx_Thing ON #a (n)
SELECT TOP 24 *
INTO #b
FROM #a
CREATE CLUSTERED INDEX cx_NotUnique ON #b (n)
-- Check the execution plan: DISTINCT SORT
UPDATE a SET
c1 = b.c1,
c2 = b.c2,
c3 = b.c3
FROM #a a
INNER JOIN #b b ON b.n = a.n
DROP INDEX cx_NotUnique ON #b
CREATE UNIQUE CLUSTERED INDEX cx_Unique ON #b (n)
-- Check the execution plan: NO SORT
UPDATE a SET
c1 = b.c1,
c2 = b.c2,
c3 = b.c3
FROM #a a
INNER JOIN #b b ON b.n = a.n
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden