How to fix this query?

  • This query has been driving me nuts and need some help please.  I have a working example below but, to try to explain, I collect data on what databases have been used or not since the last service start/restart.  So, to sum the Days Uptime, I need to sum on a Group based on the SQL Server instance name and the database name.   However, the tricky part is that there are some basic rules to identify 'unused' databases in a sequence of periods of uptime and what rows to sum.   For example:-

    a) If there are n rows for a Server and Database representing periods of uptime and the given database has never been marked as Used on any row, return and sum all the rows in that group
    b) If a given database has been used in a previous period of uptime but not marked as used in more recent period periods of uptime, only sum the rows AFTER the last period of uptime where it was used.
    c) In any case, if the most recent row for a given Server and Database shows the database has been used, don't sum or return any rows for it.

    If you refer to the query I've developed below, it's using an INNER JOIN.  I've tried using a LEFT JOIN but that isn't the solution.   I can see the issue in the query and that is that the subquery on which the INNER JOIN is made is based on a clause, WHERE Used = 1.   This works for groups that have at least one record where Used = 1 but, of course, zero records are return for groups of rows where all of those rows have Used = 0.  In these cases, instead of summing all the rows because all are Used = 0, those Databases are not returned/summed at all.   So, at this point, I'm guessing there is a smart way to write this query to avoid the problem but I don't know what it is!  Would appreciate some assistance with this.



    IF EXISTS (SELECT OBJECT_ID('TEMPDB..#tbla')) DROP TABLE #tbla
    GO
    CREATE TABLE #tbla
    (
    SQLInstance varchar(16),
    DBName varchar(8),
    SvcStartDate datetime,
    Used bit,
    DaysUptime SMALLINT
    )

    SET NOCOUNT ON

    -- DB1 is all Used = 1 so ignore completely
    INSERT INTO #tbla VALUES ('SQL1', 'DB1','2018-06-14 04:14:25', 1, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB1','2018-06-15 04:14:00', 1, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB1','2018-06-16 02:10:10', 1, 1)

    -- DB2 is 0 in every case so all rows should be included (summing the last column)
    INSERT INTO #tbla VALUES ('SQL1', 'DB2','2018-06-14 04:14:25', 0, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB2','2018-06-15 04:14:00', 0, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB2','2018-06-16 02:10:10', 0, 1)

    -- DB3 has the middle of the 3 rows where Used = 1 but the most recent row is 1 so ignore/exclude all.
    INSERT INTO #tbla VALUES ('SQL1', 'DB3','2018-06-14 04:14:25', 1, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB3','2018-06-15 04:14:00', 0, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB3','2018-06-16 02:10:10', 1, 1)

    -- DB4 shows Used = 0 on the most recent row but Used = 1 on the two preceding rows - therefore only include the most recent row where Used = 0
    INSERT INTO #tbla VALUES ('SQL1', 'DB4','2018-06-14 04:14:25', 1, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB4','2018-06-15 04:14:00', 1, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB4','2018-06-16 02:10:10', 0, 1)


    -- DB5 shows the two most recent rows where Used = 0 so these can be included but the oldest row has Used = 1 so this is to be included
    INSERT INTO #tbla VALUES ('SQL1', 'DB5','2018-06-14 04:14:25', 1, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB5','2018-06-15 04:14:00', 0, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB5','2018-06-16 02:10:10', 0, 1)

    -- DB6, include ALL rows as all are consecutive where Used = 0
    INSERT INTO #tbla VALUES ('SQL2', 'DB6','2018-04-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB6','2018-05-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB6','2018-06-01 01:00:00', 0, 20)
    INSERT INTO #tbla VALUES ('SQL2', 'DB6','2018-06-21 05:12:11', 0, 1)

    -- DB7, The most recent shows Used = 1 so exlude all rows
    INSERT INTO #tbla VALUES ('SQL2', 'DB7','2018-04-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB7','2018-05-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB7','2018-06-01 01:00:00', 0, 20)
    INSERT INTO #tbla VALUES ('SQL2', 'DB7','2018-06-21 05:12:11', 1, 1)

    -- DB8, The most recent row has Used = 0 so is to be included but the preceding row has Used = 1 so all preceding rows excluded regardless.
    INSERT INTO #tbla VALUES ('SQL2', 'DB8','2018-04-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB8','2018-05-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB8','2018-06-01 01:00:00', 1, 20)
    INSERT INTO #tbla VALUES ('SQL2', 'DB8','2018-06-21 05:12:11', 0, 1)  --<< only row to be included

    -- select the source data
    SELECT * FROM #tbla ORDER BY SQLInstance, DBName, SvcStartDate desc

    -- select the max SvcStartDate in each group where Used = 1
    -- This query doesn't take into account cases where the Used values in a group are ALL equal to zero.  For example, SQL1 and DB2
    SELECT
     a.SQLInstance,
     a.DBName,
     Max(a.SvcStartDate) as MaxSvcStartDate,
     Sum(a.DaysUpTime) as SumDaysUptime
    FROM
     #tbla a
     INNER JOIN  (
      SELECT 
       SQLInstance,
       DBName,
       max(svcStartDate) as MaxsvcStartDate
      FROM
       #tbla
      WHERE
       Used = 1
      GROUP BY
       SQLInstance,
       DBName
     ) b
     ON a.SQLInstance = b.SQLInstance AND a.DBName = b.DBName AND a.SvcStartDate > b.MaxsvcStartDate
    WHERE
     a.Used = 0
    GROUP BY
     a.SQLInstance,
     a.DBName
    ORDER BY
     SQLInstance,
     DBName


    -- result set is not correct as required - Inner Join - The 3 rows returns are correct but missing DB2 and DB6 (ie. DB2 and DB6 have all rows Used = 0)
    /*
    SQLInstance DBName MaxSvcStartDate               SumDaysUptime
    SQL1             DB4         2018-06-16 02:10:10.000   1
    SQL1             DB5         2018-06-16 02:10:10.000   2
    SQL2             DB8         2018-06-21 05:12:11.000   1
    */

    -- Required result should be
    /*
    SQLInstance DBName MaxSvcStartDate              SumDaysUptime
    SQL1            DB2         2018-06-16 02:10:10.000   6
    SQL1            DB4         2018-06-16 02:10:10.000   1
    SQL1            DB5         2018-06-16 02:10:10.000   2
    SQL2            DB6         2018-06-21 05:12:11.000   81
    SQL2            DB8         2018-06-21 05:12:11.000   1
    */

  • Updating for anyone interested, if I insert a dummy/seed row in the input table set to USED = 1, DaysUpTime = 0, my query returns the correct results.  It's a messy workaround and I would rather not use it.


    IF EXISTS (SELECT OBJECT_ID('TEMPDB..#tbla')) DROP TABLE #tbla
    GO
    CREATE TABLE #tbla
    (
    SQLInstance varchar(16),
    DBName varchar(8),
    SvcStartDate datetime,
    Used bit,
    DaysUptime SMALLINT
    )

    SET NOCOUNT ON

    -- DB1 is all Used = 1 so ignore completely
    INSERT INTO #tbla VALUES ('SQL1', 'DB1','1753-01-01', 1, 0)  --<< Seed row as a workaround, if zero for DaysUptime, doesn't affect correct sum
    INSERT INTO #tbla VALUES ('SQL1', 'DB1','2018-06-14 04:14:25', 1, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB1','2018-06-15 04:14:00', 1, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB1','2018-06-16 02:10:10', 1, 1)

    -- DB2 is 0 in every case so all rows should be included (summing the last column)
    INSERT INTO #tbla VALUES ('SQL1', 'DB2','1753-01-01', 1, 0) --<< Seed row as a workaround, if zero for DaysUptime, doesn't affect correct sum
    INSERT INTO #tbla VALUES ('SQL1', 'DB2','2018-06-14 04:14:25', 0, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB2','2018-06-15 04:14:00', 0, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB2','2018-06-16 02:10:10', 0, 1)

    -- DB3 has the middle of the 3 rows where Used = 1 but the most recent row is 1 so ignore/exclude all.
    INSERT INTO #tbla VALUES ('SQL1', 'DB3','1753-01-01', 1, 0) --<< Seed row as a workaround, if zero for DaysUptime, doesn't affect correct sum
    INSERT INTO #tbla VALUES ('SQL1', 'DB3','2018-06-14 04:14:25', 1, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB3','2018-06-15 04:14:00', 0, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB3','2018-06-16 02:10:10', 1, 1)

    -- DB4 shows Used = 0 on the most recent row but Used = 1 on the two preceding rows - therefore only include the most recent row where Used = 0
    INSERT INTO #tbla VALUES ('SQL1', 'DB4','1753-01-01', 1, 0) --<< Seed row as a workaround, if zero for DaysUptime, doesn't affect correct sum
    INSERT INTO #tbla VALUES ('SQL1', 'DB4','2018-06-14 04:14:25', 1, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB4','2018-06-15 04:14:00', 1, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB4','2018-06-16 02:10:10', 0, 1)


    -- DB5 shows the two most recent rows where Used = 0 so these can be included but the oldest row has Used = 1 so this is to be included
    INSERT INTO #tbla VALUES ('SQL1', 'DB5','1753-01-01', 1, 0) --<< Seed row as a workaround, if zero for DaysUptime, doesn't affect correct sum
    INSERT INTO #tbla VALUES ('SQL1', 'DB5','2018-06-14 04:14:25', 1, 4)
    INSERT INTO #tbla VALUES ('SQL1', 'DB5','2018-06-15 04:14:00', 0, 1)
    INSERT INTO #tbla VALUES ('SQL1', 'DB5','2018-06-16 02:10:10', 0, 1)

    -- DB6, include ALL rows as all are consecutive where Used = 0
    INSERT INTO #tbla VALUES ('SQL2', 'DB6','1753-01-01', 1, 0) --<< Seed row as a workaround, if zero for DaysUptime, doesn't affect correct sum
    INSERT INTO #tbla VALUES ('SQL2', 'DB6','2018-04-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB6','2018-05-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB6','2018-06-01 01:00:00', 0, 20)
    INSERT INTO #tbla VALUES ('SQL2', 'DB6','2018-06-21 05:12:11', 0, 1)

    -- DB7, The most recent shows Used = 1 so exlude all rows
    INSERT INTO #tbla VALUES ('SQL2', 'DB7','1753-01-01', 1, 0) --<< Seed row as a workaround, if zero for DaysUptime, doesn't affect correct sum
    INSERT INTO #tbla VALUES ('SQL2', 'DB7','2018-04-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB7','2018-05-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB7','2018-06-01 01:00:00', 0, 20)
    INSERT INTO #tbla VALUES ('SQL2', 'DB7','2018-06-21 05:12:11', 1, 1)

    -- DB8, The most recent row has Used = 0 so is to be included but the preceding row has Used = 1 so all preceding rows excluded regardless.
    INSERT INTO #tbla VALUES ('SQL2', 'DB8','1753-01-01', 1, 0) --<< Seed row as a workaround, if zero for DaysUptime, doesn't affect correct sum
    INSERT INTO #tbla VALUES ('SQL2', 'DB8','2018-04-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB8','2018-05-01 01:00:00', 0, 30)
    INSERT INTO #tbla VALUES ('SQL2', 'DB8','2018-06-01 01:00:00', 1, 20)
    INSERT INTO #tbla VALUES ('SQL2', 'DB8','2018-06-21 05:12:11', 0, 1)  --<< only row to be included

    -- select the source data
    SELECT * FROM #tbla ORDER BY SQLInstance, DBName, SvcStartDate desc

  • SELECT a.SQLInstance,
       a.DBName,
       Max(a.SvcStartDate) as MaxSvcStartDate,
       Sum(a.DaysUpTime) as SumDaysUptime
    FROM #tbla a
    WHERE NOT EXISTS(SELECT *
          FROM #tbla b
          WHERE b.Used=1
           AND b.SvcStartDate > a.SvcStartDate
           AND b.SQLInstance=a.SQLInstance
           AND b.DBName = a.DBName)
    AND a.Used=0
    GROUP BY

    a.SQLInstance,
    a.DBName
    ORDER BY
    SQLInstance,
    DBName

  • Jonathan AC Roberts - Tuesday, June 19, 2018 5:52 AM

    SELECT a.SQLInstance,
       a.DBName,
       Max(a.SvcStartDate) as MaxSvcStartDate,
       Sum(a.DaysUpTime) as SumDaysUptime
    FROM #tbla a
    WHERE NOT EXISTS(SELECT *
          FROM #tbla b
          WHERE b.Used=1
           AND b.SvcStartDate > a.SvcStartDate
           AND b.SQLInstance=a.SQLInstance
           AND b.DBName = a.DBName)
    AND a.Used=0
    GROUP BY

    a.SQLInstance,
    a.DBName
    ORDER BY
    SQLInstance,
    DBName

    Excellent, thanks.  Works perfectly.  Well done.

  • clive.tech - Tuesday, June 19, 2018 6:10 AM

    Jonathan AC Roberts - Tuesday, June 19, 2018 5:52 AM

    Excellent, thanks.  Works perfectly.  Well done.

    Thanks, It's nice to see a well presented question with the set up of data and an example of what's required.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply