• autoexcrement (12/6/2013)


    I can't rival the other guys on this board, but I am using this as practice for myself and would like (constructive but friendly) feedback on the solution I came up with if anyone is willing. 🙂

    Actually you have done quite well my young Jedi! Witness the following 1,000,000 row test harness incorporating the 4 solutions suggested.

    CREATE TABLE [Person]

    (

    IPCode INT

    );

    CREATE TABLE [Profile]

    (

    IPCode INT

    , ID NVARCHAR(32)

    );

    --CREATE INDEX id1 ON [Profile](IPCode, ID);

    --INSERT INTO [Person] VALUES (1);

    --INSERT INTO [Person] VALUES (2);

    --INSERT INTO [Person] VALUES (3);

    --INSERT INTO [Person] VALUES (4);

    WITH Tally (n) AS

    (

    SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns a CROSS JOIN sys.all_columns b

    )

    INSERT INTO [Person]

    SELECT n

    FROM Tally;

    --INSERT INTO [Profile] VALUES (1,'AAAA');

    --INSERT INTO [Profile] VALUES (1,'AAAA');

    --INSERT INTO [Profile] VALUES (2,'BBBB');

    --INSERT INTO [Profile] VALUES (2,'BBBB');

    --INSERT INTO [Profile] VALUES (2,'BBBC');

    --INSERT INTO [Profile] VALUES (3,'CCCC');

    --INSERT INTO [Profile] VALUES (3,'CCCC');

    --INSERT INTO [Profile] VALUES (3,'CCCC');

    --INSERT INTO [Profile] VALUES (4,'DDDD');

    --INSERT INTO [Profile] VALUES (4,'DDDA');

    --INSERT INTO [Profile] VALUES (4,'DDDB');

    --INSERT INTO [Profile] VALUES (4,'DDDD');

    --INSERT INTO [Profile] VALUES (4,'DDDD');

    WITH Tally (n) AS

    (

    SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns a CROSS JOIN sys.all_columns b

    )

    INSERT INTO [Profile]

    SELECT a.n, RIGHT(10000+b.n/20, 4)

    FROM Tally a

    CROSS APPLY

    (

    SELECT n

    FROM Tally

    WHERE n <= 100

    ) b;

    PRINT 'Alan.B';

    SET STATISTICS IO, TIME ON;

    WITH

    xxx AS

    (

    SELECTpe.IPCode AS pe_ip,

    pr.IPCode AS pr_ip,

    pr.ID,

    DENSE_RANK() OVER (PARTITION BY pr.IPCode ORDER BY ID) AS x

    FROM Person pe

    CROSS APPLY [Profile] pr

    WHERE pe.IPCode=pr.IPCode

    ),

    qualifiers AS

    (

    SELECT DISTINCT pr_ip

    FROM xxx

    WHERE x>1

    )

    SELECT q.pr_ip , xxx.ID

    FROM qualifiers q

    CROSS APPLY xxx

    WHERE q.pr_ip=xxx.pr_ip

    GROUP BY q.pr_ip, ID;

    SET STATISTICS IO, TIME OFF;

    PRINT 'Luis Cazares';

    SET STATISTICS IO, TIME ON;

    SELECT DISTINCT IPCode, ID

    FROM Profile p

    WHERE EXISTS( SELECT IPCode

    FROM Profile x

    WHERE p.IPCode = x.IPCode

    GROUP BY IPCode HAVING COUNT( DISTINCT ID) > 1)

    SET STATISTICS IO, TIME OFF;

    PRINT 'Dwain.C';

    SET STATISTICS IO, TIME ON;

    SELECT a.IPCode, ID

    FROM Profile a

    JOIN

    (

    SELECT IPCode

    FROM [Profile]

    GROUP BY IPCode

    HAVING MAX(ID) <> MIN(ID)

    ) b ON a.IPCode = b.IPCode

    GROUP BY a.IPCode, ID;

    SET STATISTICS IO, TIME OFF;

    PRINT 'Autoexcrement';

    SET STATISTICS IO, TIME ON;

    WITH CTE1 AS

    (

    SELECT IPCode, ID

    FROM [Profile]

    GROUP BY IPCode, ID

    ),

    CTE2 AS

    (

    SELECT IPCode

    FROM CTE1

    GROUP BY IPCode

    HAVING COUNT(*) > 1

    )

    SELECT IPCode, ID

    FROM [Profile]

    WHERE IPCode IN

    (SELECT IPCode FROM CTE2)

    GROUP BY IPCode, ID;

    SET STATISTICS IO, TIME OFF;

    GO

    DROP TABLE [Person];

    DROP TABLE [Profile];

    Without the INDEXI've proposed the results are as follows:

    Alan.B

    (60000 row(s) affected)

    Table 'Person'. Scan count 10, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Profile'. Scan count 10, logical reads 6192, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 4461 ms, elapsed time = 4203 ms.

    Luis Cazares

    (60000 row(s) affected)

    Table 'Profile'. Scan count 10, logical reads 6192, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 2073 ms, elapsed time = 938 ms.

    Dwain.C

    (60000 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Profile'. Scan count 10, logical reads 6192, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 3526 ms, elapsed time = 1330 ms.

    Autoexcrement

    (60000 row(s) affected)

    Table 'Profile'. Scan count 10, logical reads 6192, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 2199 ms, elapsed time = 892 ms.

    With the INDEX, the results are this:

    Alan.B

    (60000 row(s) affected)

    Table 'Person'. Scan count 10, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Profile'. Scan count 10, logical reads 6192, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 4791 ms, elapsed time = 3098 ms.

    Luis Cazares

    (60000 row(s) affected)

    Table 'Profile'. Scan count 2, logical reads 7508, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 405 ms, elapsed time = 470 ms.

    Dwain.C

    (60000 row(s) affected)

    Table 'Profile'. Scan count 10, logical reads 7776, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 998 ms, elapsed time = 584 ms.

    Autoexcrement

    (60000 row(s) affected)

    Table 'Profile'. Scan count 2, logical reads 7508, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 405 ms, elapsed time = 456 ms.

    So yours is the fastest in both cases, no doubt due to the fact that you aggregate the aggregate results.

    Nicely done!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St