• dwain.c (7/4/2013)


    I'm always interested in cases like this so with more time on my hands than I care to admit, I came up with the following test harness:

    CREATE TABLE #TABLE1 ([filename] varchar(15), jobname varchar(15)

    ,PRIMARY KEY (jobname, [filename]));

    CREATE TABLE #TABLE2 ([filename] varchar(15), SuccessFailIndicator bit, checkdate datetime

    ,PRIMARY KEY ([filename], checkdate));

    GO

    insert into #table1 values('file1.txt', 'job1');

    insert into #table1 values('file2.txt', 'job1');

    insert into #table1 values('file3.txt', 'job1');

    WITH Tally (n) AS (

    SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns a CROSS JOIN sys.all_columns b)

    INSERT INTO #TABLE1

    SELECT 'f' + CAST(b.n AS VARCHAR(2)) + RIGHT('00000000' + CAST(a.n AS VARCHAR(9)), 9) + '.txt'

    ,'j' + RIGHT('00000000' + CAST(a.n AS VARCHAR(9)), 9)

    FROM Tally a

    CROSS APPLY (SELECT TOP (1+ABS(CHECKSUM(NEWID()))%9) n FROM Tally) b;

    insert into #TABLE2 values('file1.txt', 1, '6/5/2013');

    insert into #TABLE2 values('file2.txt', 0, '6/4/2013');

    insert into #TABLE2 values('file3.txt', 1, '6/1/2013');

    insert into #TABLE2 values('file1.txt', 0, '6/1/2013');

    insert into #TABLE2 values('file2.txt', 0, '6/1/2013');

    WITH Tally (n) AS (

    SELECT TOP 10 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns a)

    INSERT INTO #TABLE2

    SELECT [filename], ABS(CHECKSUM(NEWID()))%2

    ,DATEADD(day, -(n-1), '6/5/2013')

    FROM #TABLE1 a

    CROSS APPLY (SELECT TOP (1+ABS(CHECKSUM(NEWID()))%8) n FROM Tally) b

    WHERE [filename] NOT IN ('file1.txt','file2.txt','file3.txt');

    --SELECT * FROM #TABLE1;

    --SELECT * FROM #TABLE2;

    DECLARE @jobname VARCHAR(15);

    DBCC FREEPROCCACHE;

    PRINT 'MARK';

    SET STATISTICS TIME ON;

    WITH CTE1 AS (

    SELECT Filename,SuccessFailIndicator,checkdate,

    ROW_NUMBER() OVER(PARTITION BY Filename ORDER BY checkdate DESC) AS rn

    FROM #TABLE2)

    SELECT @jobname=t1.jobname

    FROM #TABLE1 t1

    LEFT OUTER JOIN CTE1 c ON c.Filename = t1.Filename

    AND c.SuccessFailIndicator = 1

    AND c.rn=1

    GROUP BY t1.jobname

    HAVING COUNT(t1.Filename) = COUNT(c.Filename)

    ORDER BY t1.jobname;

    SET STATISTICS TIME OFF;

    DBCC FREEPROCCACHE;

    PRINT 'DWAIN';

    SET STATISTICS TIME ON;

    SELECT @jobname=jobname

    FROM #TABLE1 a

    CROSS APPLY (

    SELECT TOP 1 SuccessFailIndicator=1-SuccessFailIndicator

    FROM #TABLE2 b

    WHERE a.[Filename] = b.[Filename]

    ORDER BY checkdate DESC

    ) b

    GROUP BY a.jobname

    HAVING SUM(b.SuccessFailIndicator) = 0

    ORDER BY a.jobname;

    SET STATISTICS TIME OFF;

    GO

    DROP TABLE #TABLE1;

    DROP TABLE #TABLE2;

    Which yielded these results:

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    MARK

    SQL Server Execution Times:

    CPU time = 4352 ms, elapsed time = 2032 ms.

    Warning: Null value is eliminated by an aggregate or other SET operation.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    DWAIN

    SQL Server Execution Times:

    CPU time = 1763 ms, elapsed time = 450 ms.

    Note that parallelism played a role in both queries because CPU > Elapsed time.

    Interesting! However, the two queries don't give the same results. If you run this

    delete from TABLE2 where Filename<>'file3.txt'

    on the OPs sample data so that job1 only has one out of three successfully completed files, your query still reports the job as successful

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537