• Mark-101232 (7/4/2013)


    Interesting! However, the two queries don't give the same results. If you run this

    delete from TABLE2 where Filename<>'file3.txt'

    on the OPs sample data so that job1 only has one out of three successfully completed files, your query still reports the job as successful

    Degenerative case huh? Sorry I missed that in the requirements. Minor fix though I think will get it:

    SELECT jobname

    FROM #TABLE1 a

    OUTER APPLY (

    SELECT TOP 1 SuccessFailIndicator=1-SuccessFailIndicator

    FROM #TABLE2 b

    WHERE a.[Filename] = b.[Filename]

    ORDER BY checkdate DESC

    ) b

    GROUP BY a.jobname

    HAVING SUM(ISNULL(b.SuccessFailIndicator, 1)) = 0;

    Test harness is now:

    CREATE TABLE #TABLE1 ([filename] varchar(15), jobname varchar(15)

    ,PRIMARY KEY (jobname, [filename]));

    CREATE TABLE #TABLE2 ([filename] varchar(15), SuccessFailIndicator bit, checkdate datetime

    ,PRIMARY KEY ([filename], checkdate));

    GO

    insert into #table1 values('file1.txt', 'job1');

    insert into #table1 values('file2.txt', 'job1');

    insert into #table1 values('file3.txt', 'job1');

    WITH Tally (n) AS (

    SELECT TOP 50000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns a CROSS JOIN sys.all_columns b)

    INSERT INTO #TABLE1

    SELECT 'f' + CAST(b.n AS VARCHAR(2)) + RIGHT('00000000' + CAST(a.n AS VARCHAR(9)), 9) + '.txt'

    ,'j' + RIGHT('00000000' + CAST(a.n AS VARCHAR(9)), 9)

    FROM Tally a

    CROSS APPLY (SELECT TOP (1+ABS(CHECKSUM(NEWID()))%9) n FROM Tally) b;

    insert into #TABLE2 values('file1.txt', 1, '6/5/2013');

    insert into #TABLE2 values('file2.txt', 0, '6/4/2013');

    insert into #TABLE2 values('file3.txt', 1, '6/1/2013');

    insert into #TABLE2 values('file1.txt', 0, '6/1/2013');

    insert into #TABLE2 values('file2.txt', 0, '6/1/2013');

    WITH Tally (n) AS (

    SELECT TOP 10 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM sys.all_columns a)

    INSERT INTO #TABLE2

    SELECT [filename], ABS(CHECKSUM(NEWID()))%2

    ,DATEADD(day, -(n-1), '6/5/2013')

    FROM #TABLE1 a

    CROSS APPLY (SELECT TOP (1+ABS(CHECKSUM(NEWID()))%8) n FROM Tally) b

    WHERE [filename] NOT IN ('file1.txt','file2.txt','file3.txt');

    --SELECT * FROM #TABLE1;

    --SELECT * FROM #TABLE2;

    DECLARE @jobname VARCHAR(15);

    DBCC FREEPROCCACHE;

    PRINT 'MARK';

    SET STATISTICS TIME ON;

    WITH CTE1 AS (

    SELECT Filename,SuccessFailIndicator,checkdate,

    ROW_NUMBER() OVER(PARTITION BY Filename ORDER BY checkdate DESC) AS rn

    FROM #TABLE2)

    SELECT @jobname=t1.jobname

    --SELECT t1.jobname

    FROM #TABLE1 t1

    LEFT OUTER JOIN CTE1 c ON c.Filename = t1.Filename

    AND c.SuccessFailIndicator = 1

    AND c.rn=1

    GROUP BY t1.jobname

    HAVING COUNT(t1.Filename) = COUNT(c.Filename);

    SET STATISTICS TIME OFF;

    DBCC FREEPROCCACHE;

    PRINT 'DWAIN';

    SET STATISTICS TIME ON;

    --SELECT jobname

    SELECT @jobname=jobname

    FROM #TABLE1 a

    OUTER APPLY (

    SELECT TOP 1 SuccessFailIndicator=1-SuccessFailIndicator

    FROM #TABLE2 b

    WHERE a.[Filename] = b.[Filename]

    ORDER BY checkdate DESC

    ) b

    GROUP BY a.jobname

    HAVING SUM(ISNULL(b.SuccessFailIndicator, 1)) = 0;

    SET STATISTICS TIME OFF;

    GO

    DROP TABLE #TABLE1;

    DROP TABLE #TABLE2;

    The results aren't much different than before.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    MARK

    SQL Server Execution Times:

    CPU time = 13993 ms, elapsed time = 6558 ms.

    Warning: Null value is eliminated by an aggregate or other SET operation.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    DWAIN

    SQL Server Execution Times:

    CPU time = 7831 ms, elapsed time = 2105 ms.

    The reason (at least by the execution plan on my machine) why the SELECT TOP 1 within APPLY is faster on my machine is because it is able to do a clustered index seek on TABLE2 (scan on TABLE1) while the JOIN does two clustered index scans.

    Edit Corrected (removed) an ORDER BY I left in to check the results and bumped up the size of the test harness a bit.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St