Retrieve records with (more than one row of same type)

  • For example I create some fake data about database files and retrieve only the databases with more than one transaction log file.

    I was trying to test my T-SQL skills and see what was the best way of doing this. Below is the first two that came to mind. Any help or advise would be appreciated!

    CREATE TABLE #stuff (

    [databaseid]TINYINT,

    [databasename]VARCHAR(50) NOT NULL,

    [filename]VARCHAR(50) NOT NULL UNIQUE,

    [filetype]VARCHAR(50) NOT NULL

    )

    INSERT INTO #stuff VALUES (1, 'games', 'game.mdf', 'data')

    INSERT INTO #stuff VALUES (1, 'games', 'game.ldf', 'log')

    INSERT INTO #stuff VALUES (2, 'movies', 'movies.mdf', 'data')

    INSERT INTO #stuff VALUES (2, 'movies', 'movies_1.ldf', 'log')

    INSERT INTO #stuff VALUES (2, 'movies', 'movies_2.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n.mdf', 'data')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log1.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log2.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log3.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log4.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log5.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log6.ldf', 'log')

    INSERT INTO #stuff VALUES (4, 'music', 'music.mdf', 'data')

    INSERT INTO #stuff VALUES (4, 'music', 'music.ndf', 'data')

    INSERT INTO #stuff VALUES (4, 'music', 'music.ldf', 'log')

    SELECTs.[databaseid]

    ,COUNT(*) AS qty

    INTO#stuffaggregate

    FROM#stuff s

    WHEREs.[filetype] = 'log'

    GROUP BY s.[databaseid]

    HAVING COUNT(*) > 1

    ORDER BY s.[databaseid]

    SELECTs1.[databaseid]

    ,s1.[databasename]

    ,s1.[filename]

    ,s1.[filetype]

    FROM#stuff s1

    WHEREs1.[databaseid] IN (SELECT s2.[databaseid] FROM #stuffaggregate s2)

    ORDER BY s1.[databasename] ASC

    SELECTs1.[databaseid]

    ,s1.[databasename]

    ,s1.[filetype]

    FROM#stuff s1

    WHEREs1.[databaseid] IN (SELECT s2.[databaseid] FROM #stuff s2 WHERE s2.[filetype]= 'log' GROUP BY s2.[databaseid] HAVING COUNT(*) > 1)

    TRUNCATE TABLE #stuffaggregate

    DROP TABLE #stuffaggregate

    TRUNCATE TABLE #stuff

    DROP TABLE #stuff

  • Here is another query:

    CREATE TABLE #stuff (

    [databaseid]TINYINT,

    [databasename]VARCHAR(50) NOT NULL,

    [filename]VARCHAR(50) NOT NULL UNIQUE,

    [filetype]VARCHAR(50) NOT NULL

    )

    INSERT INTO #stuff VALUES (1, 'games', 'game.mdf', 'data')

    INSERT INTO #stuff VALUES (1, 'games', 'game.ldf', 'log')

    INSERT INTO #stuff VALUES (2, 'movies', 'movies.mdf', 'data')

    INSERT INTO #stuff VALUES (2, 'movies', 'movies_1.ldf', 'log')

    INSERT INTO #stuff VALUES (2, 'movies', 'movies_2.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n.mdf', 'data')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log1.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log2.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log3.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log4.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log5.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log6.ldf', 'log')

    INSERT INTO #stuff VALUES (4, 'music', 'music.mdf', 'data')

    INSERT INTO #stuff VALUES (4, 'music', 'music.ndf', 'data')

    INSERT INTO #stuff VALUES (4, 'music', 'music.ldf', 'log')

    GO

    WITH AggData AS (

    SELECT

    COUNT(*) OVER (PARTITION BY databaseid, filetype) FileCnt,

    databaseid,

    filetype

    FROM

    #stuff

    )

    SELECT

    databaseid,

    databasename,

    filename,

    filetype

    FROM

    #stuff s

    WHERE

    EXISTS(SELECT 1 FROM AggData ag WHERE ag.databaseid = s.databaseid AND ag.filetype = 'log' AND ag.FileCnt > 1)

    ;

    GO

    DROP TABLE #stuff;

    GO

  • I personally usually just do them in a single build, like so:

    SELECT

    s.[databaseid]

    ,s.[databasename]

    ,s.[filename]

    ,s.[filetype]

    FROM

    #stuff AS s

    JOIN

    (SELECT DatabaseID FROM #stuff GROUP BY DatabaseID HAVING COUNT(*) > 1) AS drv

    ONs.DatabaseID = drv.DatabaseID

    That said, it'll depend on how large your dataset is and if you have indexes optimized to DatabaseID or whatever you're linking on for quick-counts. A temp table with alternative indexing can help if the data-set's large enough.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Beautiful! I have no experience with partition. Does it have a better execution plan too? I could find out myself but I'm on an iPhone right now.

    smallmoney (8/14/2012)


    For example I create some fake data about database files and retrieve only the databases with more than one transaction log file.

    I was trying to test my T-SQL skills and see what was the best way of doing this. Below is the first two that came to mind. Any help or advise would be appreciated!

    CREATE TABLE #stuff (

    [databaseid]TINYINT,

    [databasename]VARCHAR(50) NOT NULL,

    [filename]VARCHAR(50) NOT NULL UNIQUE,

    [filetype]VARCHAR(50) NOT NULL

    )

    INSERT INTO #stuff VALUES (1, 'games', 'game.mdf', 'data')

    INSERT INTO #stuff VALUES (1, 'games', 'game.ldf', 'log')

    INSERT INTO #stuff VALUES (2, 'movies', 'movies.mdf', 'data')

    INSERT INTO #stuff VALUES (2, 'movies', 'movies_1.ldf', 'log')

    INSERT INTO #stuff VALUES (2, 'movies', 'movies_2.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n.mdf', 'data')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log1.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log2.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log3.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log4.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log5.ldf', 'log')

    INSERT INTO #stuff VALUES (3, 'pr0n', 'pr0n_log6.ldf', 'log')

    INSERT INTO #stuff VALUES (4, 'music', 'music.mdf', 'data')

    INSERT INTO #stuff VALUES (4, 'music', 'music.ndf', 'data')

    INSERT INTO #stuff VALUES (4, 'music', 'music.ldf', 'log')

    SELECTs.[databaseid]

    ,COUNT(*) AS qty

    INTO#stuffaggregate

    FROM#stuff s

    WHEREs.[filetype] = 'log'

    GROUP BY s.[databaseid]

    HAVING COUNT(*) > 1

    ORDER BY s.[databaseid]

    SELECTs1.[databaseid]

    ,s1.[databasename]

    ,s1.[filename]

    ,s1.[filetype]

    FROM#stuff s1

    WHEREs1.[databaseid] IN (SELECT s2.[databaseid] FROM #stuffaggregate s2)

    ORDER BY s1.[databasename] ASC

    SELECTs1.[databaseid]

    ,s1.[databasename]

    ,s1.[filetype]

    FROM#stuff s1

    WHEREs1.[databaseid] IN (SELECT s2.[databaseid] FROM #stuff s2 WHERE s2.[filetype]= 'log' GROUP BY s2.[databaseid] HAVING COUNT(*) > 1)

    TRUNCATE TABLE #stuffaggregate

    DROP TABLE #stuffaggregate

    TRUNCATE TABLE #stuff

    DROP TABLE #stuff

  • Ah man I forgot we can replace subqueries with joins which is faster than subqueries

    Evil Kraig F (8/14/2012)


    I personally usually just do them in a single build, like so:

    SELECT

    s.[databaseid]

    ,s.[databasename]

    ,s.[filename]

    ,s.[filetype]

    FROM

    #stuff AS s

    JOIN

    (SELECT DatabaseID FROM #stuff GROUP BY DatabaseID HAVING COUNT(*) > 1) AS drv

    ONs.DatabaseID = drv.DatabaseID

    That said, it'll depend on how large your dataset is and if you have indexes optimized to DatabaseID or whatever you're linking on for quick-counts. A temp table with alternative indexing can help if the data-set's large enough.

  • Evil Kraig F (8/14/2012)


    I personally usually just do them in a single build, like so:

    SELECT

    s.[databaseid]

    ,s.[databasename]

    ,s.[filename]

    ,s.[filetype]

    FROM

    #stuff AS s

    JOIN

    (SELECT DatabaseID FROM #stuff GROUP BY DatabaseID HAVING COUNT(*) > 1) AS drv

    ONs.DatabaseID = drv.DatabaseID

    That said, it'll depend on how large your dataset is and if you have indexes optimized to DatabaseID or whatever you're linking on for quick-counts. A temp table with alternative indexing can help if the data-set's large enough.

    Not quite right. Should only return values where there are more than 1 log files.

  • you're right. i couldn't test this because i'm on a iphone right now but if he added WHERE filetype = 'log' in his subquery on the join then it should be correct results?

    JOIN

    (SELECT DatabaseID FROM #stuff WHERE filetype = 'log' GROUP BY DatabaseID HAVING COUNT(*) > 1) AS drv

    ONs.DatabaseID = drv.DatabaseID

    Lynn Pettis (8/14/2012)


    Evil Kraig F (8/14/2012)


    I personally usually just do them in a single build, like so:

    SELECT

    s.[databaseid]

    ,s.[databasename]

    ,s.[filename]

    ,s.[filetype]

    FROM

    #stuff AS s

    JOIN

    (SELECT DatabaseID FROM #stuff GROUP BY DatabaseID HAVING COUNT(*) > 1) AS drv

    ONs.DatabaseID = drv.DatabaseID

    That said, it'll depend on how large your dataset is and if you have indexes optimized to DatabaseID or whatever you're linking on for quick-counts. A temp table with alternative indexing can help if the data-set's large enough.

    Not quite right. Should only return values where there are more than 1 log files.

  • smallmoney (8/14/2012)


    you're right. i couldn't test this because i'm on a iphone right now but if he added WHERE filetype = 'log' in his subquery on the join then it should be correct results?

    DERP! :blush:

    Yeaaaah, just put your where clause in. 🙂


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply