outer join problem

  • I want to return a list of all user databases in an instance with their datafile size and latest backup size. I want one row for each database whether it has ever had a full backup or not. the query below only returns databases that have a full backup.

    select a.name, str(cast(sum(size) as numeric)*8192/1048576) as 'data file(s) size in MB',

    floor(b.backup_size/1048576) as 'backup size in MB'

    from master..sysdatabases a left outer join msdb..backupset b

    on a.name = b.database_name

    join master..sysaltfiles c on a.dbid = c.dbid and groupid != 0

    where a.dbid > 4 and

    b.backup_start_date = (select max(backup_start_date) from msdb..backupset where database_name = a.name and type = 'D')

    group by a.name,b.backup_size

    order by a.name

    I have tried splitting the query up into 2 simpler queries with an intermediate temp table, but end up with exactly the same results.

    where am I going wrong, is it to do with the subquery?

    ---------------------------------------------------------------------

  • george sibbald (4/26/2009)


    I want to return a list of all user databases in an instance with their datafile size and latest backup size. I want one row for each database whether it has ever had a full backup or not. the query below only returns databases that have a full backup.

    select a.name, str(cast(sum(size) as numeric)*8192/1048576) as 'data file(s) size in MB',

    floor(b.backup_size/1048576) as 'backup size in MB'

    from master..sysdatabases a left outer join msdb..backupset b

    on a.name = b.database_name

    join master..sysaltfiles c on a.dbid = c.dbid and groupid != 0

    where a.dbid > 4 and

    b.backup_start_date = (select max(backup_start_date) from msdb..backupset where database_name = a.name and type = 'D')

    group by a.name,b.backup_size

    order by a.name

    I have tried splitting the query up into 2 simpler queries with an intermediate temp table, but end up with exactly the same results.

    where am I going wrong, is it to do with the subquery?

    The problem is not the subquery, but the fact that the where clause with the subquery is doing an equals with a field that will be null when there is no backup. NULL does not equal NULL.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • i know once wayne identified it, you got it, but here's an example of what i tested as a result of reading this thread:

    SELECT

    A.NAME,

    STR(CAST(SUM(SIZE) AS NUMERIC)*8192/1048576) AS 'DATA FILE(S) SIZE IN MB',

    FLOOR(B.BACKUP_SIZE/1048576) AS 'BACKUP SIZE IN MB'

    FROM MASTER..SYSDATABASES A

    LEFT OUTER JOIN MSDB..BACKUPSET B

    ON A.NAME = B.DATABASE_NAME

    INNER JOIN MASTER..SYSALTFILES C

    ON A.DBID = C.DBID

    -- AND GROUPID != 0

    LEFT OUTER JOIN (SELECT

    MAX(BACKUP_START_DATE) AS BACKUP_START_DATE,

    DATABASE_NAME

    FROM MSDB..BACKUPSET

    WHERE TYPE = 'D'

    GROUP BY DATABASE_NAME

    ) X

    ON X.DATABASE_NAME = A.NAME

    AND B.BACKUP_START_DATE = X.BACKUP_START_DATE

    WHERE A.DBID > 4

    GROUP BY

    A.NAME,

    B.BACKUP_SIZE

    ORDER BY A.NAME

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell,

    Second LEFT JOIN leave a possibility for the query to return all Backup History records with different Backup_Size. As a result you'll have multiple records for the same DATABASE_NAME.

    You need to change it to INNER JOIN.

    Like this:

    SELECT

    A.NAME,

    STR(CAST(SUM(SIZE) AS NUMERIC)*8192/1048576) AS 'DATA FILE(S) SIZE IN MB',

    FLOOR(B.BACKUP_SIZE/1048576) AS 'BACKUP SIZE IN MB'

    FROM (SELECT

    MAX(BACKUP_START_DATE) AS BACKUP_START_DATE,

    DATABASE_NAME

    FROM MSDB..BACKUPSET

    WHERE TYPE = 'D'

    GROUP BY DATABASE_NAME

    ) X

    INNER JOIN MSDB..BACKUPSET B ON X.DATABASE_NAME = B.DATABASE_NAME

    AND B.BACKUP_START_DATE = X.BACKUP_START_DATE

    RIGHT OUTER JOIN MASTER..SYSDATABASES A ON A.NAME = B.DATABASE_NAME

    INNER JOIN MASTER..SYSALTFILES C ON A.DBID = C.DBID

    -- AND GROUPID != 0

    WHERE A.DBID > 4

    GROUP BY

    A.NAME,

    B.BACKUP_SIZE

    ORDER BY A.NAME

    This will return only 1 latest record for each database.

    _____________
    Code for TallyGenerator

  • cheers guys, appreciate it. I get it, get all the databases from sysdatabases first by not excluding in the where clause, rather add another join to get just the latest backup . All grist to the learning mill.

    I have a day off today (yea), but will run these first chance I get.

    BTW, I was excluding groupid = 0 (log files) because I wanted an idea of databases sized much larger than strictly necessary compared to actual data contained (backup size), and log file size is not relevant to that.

    ---------------------------------------------------------------------

  • Thanks guys, worked a treat. Sergiy you were right about the inner join.

    ---------------------------------------------------------------------

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply