Better Way to Perform this Query

  • Hello Everyone

    I hope that everyone is having a very good day.

    I need to write a query to select row counts, but this seems like a lot of scanning of the rows. Is there a better way to write a query that does this? There are approx 5 million rows in the table

    DECLARE @abc int

    DECLARE @def int

    DECLARE @Ghi int

    DECLARE @jkl int

    DECLARE @mno int

    DECLARE @pqr int

    DECLARE @stu int

    DECLARE @vw int

    DECLARE @xzy int

    SET @abc = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%abc%')

    SET @def = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%def%')

    SET @Ghi = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%ghi%')

    SET @jkl = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%jkl%')

    SET @mno = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%mno%')

    SET @pqr = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%pqr%')

    SET @stu = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%stu%')

    SET @vw = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%vw%')

    SET @xzy = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%xyz%')

    Thank You in advance for all your help, advice and suggestions.

    Andrew SQLDBA

  • You can try something like this (requires only a single table scan):

    SELECT

    @abc = COUNT(CASE WHEN ColumnName LIKE '%abc%' THEN 1 END),

    @def = COUNT(CASE WHEN ColumnName LIKE '%def%' THEN 1 END),

    @Ghi = COUNT(CASE WHEN ColumnName LIKE '%ghi%' THEN 1 END),

    @jkl = COUNT(CASE WHEN ColumnName LIKE '%jkl%' THEN 1 END),

    @mno = COUNT(CASE WHEN ColumnName LIKE '%mno%' THEN 1 END),

    @pqr = COUNT(CASE WHEN ColumnName LIKE '%pqr%' THEN 1 END),

    @stu = COUNT(CASE WHEN ColumnName LIKE '%stu%' THEN 1 END),

    @vw = COUNT(CASE WHEN ColumnName LIKE '%vw%' THEN 1 END),

    @xzy = COUNT(CASE WHEN ColumnName LIKE '%xyz%' THEN 1 END)

    FROM

    TableName

  • Thank You Peter

    Very handy to know.

    Andrew SQLDBA

  • AndrewSQLDBA (1/9/2013)


    Hello Everyone

    I hope that everyone is having a very good day.

    I need to write a query to select row counts, but this seems like a lot of scanning of the rows. Is there a better way to write a query that does this? There are approx 5 million rows in the table

    DECLARE @abc int

    DECLARE @def int

    DECLARE @Ghi int

    DECLARE @jkl int

    DECLARE @mno int

    DECLARE @pqr int

    DECLARE @stu int

    DECLARE @vw int

    DECLARE @xzy int

    SET @abc = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%abc%')

    SET @def = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%def%')

    SET @Ghi = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%ghi%')

    SET @jkl = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%jkl%')

    SET @mno = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%mno%')

    SET @pqr = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%pqr%')

    SET @stu = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%stu%')

    SET @vw = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%vw%')

    SET @xzy = (SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%xyz%')

    Its worth noting that the following is sargable.

    LIKE '%abc%'

    Question:

    Are you really trying to find columnNames with abc in their name, def in their name, etc... Or are you trying to get a count of all columns whose name begins with [a-c] for @abc; a count of names beginning with [d-f] for @def, etc, etc... If that's what you are looking for, you can try the query below. It's a way to pass a variable or parameter to NTILE and do what you are doing dynamically (for the most part, some assembly required).

    DECLARE @groups int = 9;

    ;WITH

    asciichar(n,c) AS

    (SELECT (65), CHAR(65)

    UNION ALL

    SELECT n+1, CHAR(n+1) FROM asciichar WHERE n+1<=90),

    groups AS

    (SELECTNTILE(@groups) OVER (ORDER BY n) AS groupid, * FROM asciichar ),

    ntileMatrix AS

    (SELECTgroupid,

    MIN(c) OVER (PARTITION BY groupid)+'-'+

    MAX(c) OVER (PARTITION BY groupid) AS [group],

    n AS [ASCII],

    c AS [CHAR]

    FROM groups ),

    people AS

    (SELECTLEFT(LastName,1) AS c1, LastName+', '+FirstName AS Name

    FROM AdventureWorks2008R2.person.person )

    SELECT nm.[group], p.Name

    FROM ntileMatrix nm

    CROSS JOIN people p

    WHERE [CHAR]=c1

    The code above will produce this (truncated for reading):

    group Name

    ----- -------------------------------------

    A-C Abbas, Syed

    A-C Abel, Catherine

    A-C Abercrombie, Kim

    ....

    Y-Z Zukowski, Jake

    Y-Z Zwilling, Michael

    Y-Z Zwilling, Michael

    You can change it to this:

    DECLARE @groups int = 9;

    ;WITH

    asciichar(n,c) AS

    (SELECT (65), CHAR(65)

    UNION ALL

    SELECT n+1, CHAR(n+1) FROM asciichar WHERE n+1<=90),

    groups AS

    (SELECTNTILE(@groups) OVER (ORDER BY n) AS groupid, * FROM asciichar ),

    ntileMatrix AS

    (SELECTgroupid,

    MIN(c) OVER (PARTITION BY groupid)+'-'+

    MAX(c) OVER (PARTITION BY groupid) AS [group],

    n AS [ASCII],

    c AS [CHAR]

    FROM groups ),

    people AS

    (SELECTLEFT(LastName,1) AS c1, LastName+', '+FirstName AS Name

    FROM AdventureWorks2008R2.person.person ),

    Totals AS

    (

    SELECT nm.[group], p.Name

    FROM ntileMatrix nm

    CROSS JOIN people p

    WHERE [CHAR]=c1 )

    SELECT [group], COUNT([group]) AS groupCount

    FROM Totals

    GROUP BY [group]

    To get this:

    group groupCount

    ----- -----------

    A-C 3502

    D-F 1111

    G-I 2572

    J-L 2347

    M-O 2061

    P-R 3195

    S-U 2800

    V-X 1552

    Y-Z 832

    Again, some assembly required for what you are doing....

    What's cool is you can change the parameter or variable to dynamically change your groups like so:

    DECLARE @groups int = 3;

    and get this:

    group groupCount

    ----- -----------

    A-I 7185

    J-R 7603

    S-Z 5184

    Edit: Typo's

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan - That's a very intriguing use of NTILE!

    I would recommend, however that you change the way you construct your asciichar table:

    ;WITH asciichar(n, c) AS (

    SELECT n=64+number, CHAR(64+number)

    FROM [master].dbo.spt_values Tally

    WHERE [Type] = 'P' AND Number BETWEEN 1 AND 26)

    SELECT *

    FROM asciichar


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Alan.B (1/9/2013)


    AndrewSQLDBA (1/9/2013)


    Its worth noting that the following is sargable.

    LIKE '%abc%'

    Did you mean "NOT" SARGable because it sure doesn't look SARGable from here. 😉 And, I agree... nice use of NTILE.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/9/2013)


    Alan.B (1/9/2013)


    AndrewSQLDBA (1/9/2013)


    Its worth noting that the following is sargable.

    LIKE '%abc%'

    Did you mean "NOT" SARGable because it sure doesn't look SARGable from here. 😉

    Is that because of the % at the beginning of the string?


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Yes.

    It would also give the wrong answer, IMHO.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • dwain.c (1/9/2013)


    Alan - That's a very intriguing use of NTILE!

    I would recommend, however that you change the way you construct your asciichar table:

    ;WITH asciichar(n, c) AS (

    SELECT n=64+number, CHAR(64+number)

    FROM [master].dbo.spt_values Tally

    WHERE [Type] = 'P' AND Number BETWEEN 1 AND 26)

    SELECT *

    FROM asciichar

    Thank you, and thank you. I'm still just learning Windows functions (have been studying them for ~6 months now give or take.) For me it's been easy to find a uses for ROW_NUMBER , RANK and DENSE_RANK but this was a rare case where I found some use for NTILE; this was originally a dynamic SQL query.

    I agree that I should have used a tally table. I've been writing CTE's for counting for awhile and can do so while sleeping. I still fumble around with the tally table and, in this case was in a hury to post my code. I just updated my query to include the tally table as you showed above: 303 reads is now 146 reads. 🙂

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Jeff Moden (1/9/2013)


    Alan.B (1/9/2013)


    AndrewSQLDBA (1/9/2013)


    Its worth noting that the following is sargable.

    LIKE '%abc%'

    Did you mean "NOT" SARGable because it sure doesn't look SARGable from here. 😉 And, I agree... nice use of NTILE.

    Thank you very much Jeff. One year ago I had never heard of windows functions, set-based SQL, or a tally table. I think I'm starting to get it and hope to someday be able to write queries like you, Dwain, Lynn and others on SSC.

    Yes - I meant NOT SARGable.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B (1/10/2013)


    I agree that I should have used a tally table. I've been writing CTE's for counting for awhile and can do so while sleeping. I still fumble around with the tally table and, in this case was in a hury to post my code. I just updated my query to include the tally table as you showed above: 303 reads is now 146 reads. 🙂

    You might want to take a look at this article.

    http://www.sqlservercentral.com/articles/T-SQL/74118/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (1/10/2013)


    Alan.B (1/10/2013)


    I agree that I should have used a tally table. I've been writing CTE's for counting for awhile and can do so while sleeping. I still fumble around with the tally table and, in this case was in a hury to post my code. I just updated my query to include the tally table as you showed above: 303 reads is now 146 reads. 🙂

    You might want to take a look at this article.

    http://www.sqlservercentral.com/articles/T-SQL/74118/[/url]

    Thank you Sean. Yes, I have a few times - it's in my favorites folder and I have been putting what I have learned to use. You won't see any more counting CTE's from me. This morning before work for example, after a lot of effort, I finally figured out how to get the Levenshtein Edit Distance between 2 strings without a loop (just for fun because it's something I've never seen done).

    I came up with this:

    -- strings to compare

    DECLARE@s1 varchar(8000)='diner',

    @s2 varchar(8000)='dinerr';

    DECLARE @Ld int=ABS(LEN(@s1)-LEN(@s2));

    IF ((@s1=@s2) OR ((ISNULL(LEN(@s1)*LEN(@s2),0)=0))) BEGIN GOTO LD END;

    DECLARE@minlen int=CASE WHEN LEN(@s1)>LEN(@s2) THEN LEN(@s2) ELSE LEN(@s1) END;

    ;WITH

    nums(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM [master].dbo.spt_values Tally),

    matrix AS (SELECT SUBSTRING(@s1,n,1) s1, SUBSTRING(@s2,n,1) s2 FROM nums WHERE n<=@minlen)

    SELECT @Ld+=COUNT(*) FROM matrix WHERE s1<>s2;

    LD:

    SELECT @Ld AS LD;

    I actually posted this as a script and hope it gets approved. The Tally table is new to me, I did not get it at first but now I totally understand what all the hype is about. Long live the Tally table!

    Edit: typo.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B

    (just for fun because it's something I've never seen done

    Here, here brother. That's what its all about. Well, a lot of it anyway.

    Greg
    _________________________________________________________________________________________________
    The glass is at one half capacity: nothing more, nothing less.

  • Alan.B (1/10/2013)


    This morning before work for example, after a lot of effort, I finally figured out how to get the Levenshtein Edit Distance between 2 strings without a loop (just for fun because it's something I've never seen done).

    Heh... if that's how you warm up for the day, then you've got me beat by a mile.

    You can just bet I'm going to do a deep dive on your rendition of this famous problem... especially since you did it with a Tally Table. Thanks for doing it and thanks for posting it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/10/2013)


    Alan.B (1/10/2013)


    This morning before work for example, after a lot of effort, I finally figured out how to get the Levenshtein Edit Distance between 2 strings without a loop (just for fun because it's something I've never seen done).

    Heh... if that's how you warm up for the day, then you've got me beat by a mile.

    You can just bet I'm going to do a deep dive on your rendition of this famous problem... especially since you did it with a Tally Table. Thanks for doing it and thanks for posting it.

    I for one have never heard of this famous "Levenshtein Edit Distance" problem, but you can rest assured that now I'm going to have to take a look at it as well! 😀

    http://en.wikipedia.org/wiki/Levenshtein_distance


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply