• Hi Paul.

    Please consider the following code, using the sample data from your blog:

    WITH RowIDs

    AS (

    SELECT DD.id

    FROM [20091201_ties].t_distinct DD

    WHERE DD.lorderer =

    (

    SELECT MIN(DI.lorderer)

    FROM [20091201_ties].t_distinct DI

    WHERE DI.glow = DD.glow

    )

    )

    SELECT COUNT(*),

    SUM(LEN(LookUp.stuffing))

    FROM RowIDs

    CROSS

    APPLY (

    SELECT stuffing

    FROM [20091201_ties].t_distinct TD

    WHERE TD.id = RowIDs.id

    ) LookUp;

    On my 2005 installation, this runs slower than DISTINCT:

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 1 ms.

    (????? ??????????: 1)

    Table 't_distinct'. Scan count 11, logical reads 2250, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 241 ms.

    for DISTINCT,

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 1 ms.

    (????? ??????????: 1)

    Table 't_distinct'. Scan count 1, logical reads 2188, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 359 ms, elapsed time = 356 ms.

    for your query.

    I created the index with id explicitly added, and it's used in both queries. The optimizer built the same plan as shown in your post.

    Segment operator still requires a full index scan which is costly. The sample table contains only a million records and the whole index fits into the cache, so the performance difference it not so obvious, but as the index size grows, the extra physical reads make the query perform much worse.

    CTE query was intended to get rid of the full index scans. It completes in only IndexDepth page reads per distinct record. Even for billion rows, the index depth will be around 6, and with 100 groupers, the same query would compete in 600 page reads from the index. Even if all these reads are physical reads, this is not that much.

    Of course, having a dedicated table instead of the CTE, as Matt suggested, will improve the query yet a little more.

    One other thing. I notice that the index definitions given in your blog effectively INCLUDE the id column since that is the clustering key for the table. The id therefore only exists at the leaf level of the index, which makes it unavailable at higher levels. (As an aside, the name of the index seems to imply that id is a key column, rather than an include.)

    I've heard this before, but my googlefu was not enough for me to be able to find any reference which proves or disproves it 🙂

    But let us conduct a little experiment:

    SET NOCOUNT OFF

    CREATE TABLE t_clustered (

    id INT NOT NULL,

    val INT NOT NULL,

    lid INT NOT NULL,

    CONSTRAINT PK_clustered_id PRIMARY KEY (id),

    CONSTRAINT UX_clustered_lid UNIQUE (lid)

    )

    CREATE INDEX ix_clustered_val ON t_clustered (val)

    CREATE INDEX ix_clustered_val__lid ON t_clustered (val) INCLUDE (lid)

    GO

    WITH q (id) AS

    (

    SELECT 1

    UNION ALL

    SELECT id + 1

    FROM q

    WHERE id < 1000000

    )

    INSERT

    INTO t_clustered (id, val, lid)

    SELECT id, 1, id

    FROM q

    OPTION (MAXRECURSION 0)

    This table has 1,000,000 records, clustered PK on id and two indexes: one on val only, another one on val covering lid.

    lid holds the same values as id, and is marked UNIQUE

    Val is the only declared key column in both indexes, and the only value of val is 1.

    Let us issue the following query which searches both for val and lid:

    SELECT val, lid

    FROM t_clustered WITH (INDEX (ix_clustered_val__lid))

    WHERE val = 1

    AND lid = 987654

    , which forces use of the index, with the following plan:

    |--Index Seek(OBJECT:([test].[dbo].[t_clustered].[ix_clustered_val__lid]), SEEK:([test].[dbo].[t_clustered].[val]=(1)), WHERE:([test].[dbo].[t_clustered].[lid]=(987654)) ORDERED FORWARD)

    Formally, it's an index seek, but since val = 1 holds for every record in the table and lid is not included in the key, the engine had to do the full index scan which required reading of all index pages:

    SQL Server parse and compile time:

    CPU time = 2 ms, elapsed time = 2 ms.

    (????? ??????????: 1)

    Table 't_clustered'. Scan count 1, logical reads 1862, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 62 ms, elapsed time = 75 ms.

    But if we issue a similar query against (val, id)

    SELECT val, id

    FROM t_clustered WITH (INDEX (ix_clustered_val))

    WHERE val = 1

    AND id = 987654

    , we get this plan:

    |--Index Seek(OBJECT:([test].[dbo].[t_clustered].[ix_clustered_val]), SEEK:([test].[dbo].[t_clustered].[val]=(1) AND [test].[dbo].[t_clustered].[id]=(987654)) ORDERED FORWARD)

    , and the query completes in 3 page reads (which is the index depth):

    SQL Server parse and compile time:

    CPU time = 2 ms, elapsed time = 2 ms.

    (????? ??????????: 1)

    Table 't_clustered'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    id, I remind, is not declared a part of the index, but we see that it is used by the Seek operator against the index, and the query does not have to fetch all index pages. Somehow, 3 page reads is enough for the engine to locate the required record.

    I cannot figure out how would this be possible if id were not a part of the key.

    Could you please provide any reference that would show the key layout of a secondary index in a clustered table?