Short circuit SQL query

  • Hello. I pull data from a SQL 2000 environment via SSIS into a 2008 DB of ours.

    Select Distinct RCD.Variable4

    FROM Route_Call_Detail RCD

    WHERE RCD.[DateTime] > '12/4/10' AND RCD.[DateTime] < '12/5/10'

    AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this

    --AND CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999

    This works fine, purely numeric values returned for Variable4 1st 6 chars. Uncomment that last line and I get this error:

    Syntax error converting the varchar value 'AZ' to a column of data type int.

    There are rows where Variable4 = 'AZ' but I thought those would be short circuited by the statement above. It works in test :crazy: Anyhow, I tried putting this in a case statement like some people have offered but no luck. Keep in mind the query runs in 2000 before getting over to us (we're fortunately not the keepers of the 2000 data).

    Thanks,

    Ken

  • i think you'll just have to wrap a section of the query into a subquery, since the data has to be filtered on the LIKE statement first;

    because the second part uses a function (CONVERT)...it's not SARG-able, so a short circuit base don the index seeks cannot occur, right?

    SELECT * FROM (

    Select

    Distinct RCD.Variable4

    FROM Route_Call_Detail RCD

    WHERE RCD.[DateTime] > '12/4/10'

    AND RCD.[DateTime] < '12/5/10'

    AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this

    ) MyAlias

    WHERE CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • SQL makes no guarantee as to what order expressions will be evaluated in and, as such, one cannot assume that any form of 'short-circuiting' will always occur, especially when dealing with data columns and not just constant expressions.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Lowell (12/6/2010)


    i think you'll just have to wrap a section of the query into a subquery, since the data has to be filtered on the LIKE statement first;

    because the second part uses a function (CONVERT)...it's not SARG-able, so a short circuit base don the index seeks cannot occur, right?

    SELECT * FROM (

    Select

    Distinct RCD.Variable4

    FROM Route_Call_Detail RCD

    WHERE RCD.[DateTime] > '12/4/10'

    AND RCD.[DateTime] < '12/5/10'

    AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this

    ) MyAlias

    WHERE CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999

    Makes sense but surprisingly it didn't change anything.

    Ken

  • You've run into a common issue, and it's a natural one to make. The Where clause limits the data, then we do the rest...

    Sadly, this isn't true. You can't make the optimizer run the WHERE first. Not directly. You have no control after you declare the results you want without hinting. Note, this is key. You declare the results you want, you don't give SQL a process, not directly.

    The two ways out of this are to build a subquery with the where clause before you do any conversion and then use the OPTION (FORCE ORDER) tag (which I don't recommend without thorough knowledge of the impact), or to drop the data you're going to process to a #TMP and then do conversions after that.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • ken.trock (12/6/2010)


    Lowell (12/6/2010)


    i think you'll just have to wrap a section of the query into a subquery, since the data has to be filtered on the LIKE statement first;

    because the second part uses a function (CONVERT)...it's not SARG-able, so a short circuit base don the index seeks cannot occur, right?

    SELECT * FROM (

    Select

    Distinct RCD.Variable4

    FROM Route_Call_Detail RCD

    WHERE RCD.[DateTime] > '12/4/10'

    AND RCD.[DateTime] < '12/5/10'

    AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this

    ) MyAlias

    WHERE CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999

    Makes sense but surprisingly it didn't change anything.

    Ken

    Heh, what I get for leaving the window open a bit. 🙂 Use OPTION (FORCE ORDER) at the tail of that, last piece of the statement.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Or, as an (untested) alternative.

    Select

    Distinct RCD.Variable4

    FROM Route_Call_Detail RCD

    WHERE RCD.[DateTime] > '12/4/10'

    AND RCD.[DateTime] < '12/5/10'

    AND CASE WHEN SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' THEN CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) ELSE NULL END BETWEEN BETWEEN 5011 AND 7999

    )

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ken.trock (12/6/2010)


    Hello. I pull data from a SQL 2000 environment via SSIS into a 2008 DB of ours.

    Select Distinct RCD.Variable4

    FROM Route_Call_Detail RCD

    WHERE RCD.[DateTime] > '12/4/10' AND RCD.[DateTime] < '12/5/10'

    AND SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' --Thanks to Jeff Moden for this

    --AND CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) BETWEEN 5011 AND 7999

    This works fine, purely numeric values returned for Variable4 1st 6 chars. Uncomment that last line and I get this error:

    Syntax error converting the varchar value 'AZ' to a column of data type int.

    There are rows where Variable4 = 'AZ' but I thought those would be short circuited by the statement above. It works in test :crazy: Anyhow, I tried putting this in a case statement like some people have offered but no luck. Keep in mind the query runs in 2000 before getting over to us (we're fortunately not the keepers of the 2000 data).

    Thanks,

    Ken

    Ken,

    A LIKE that starts with a '%' won't be able to utilize an index on the column, which may make a noticeable impact. Since you want the first six characters to be digits, you might want to change it to:

    AND RCD.Variable4 LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]%'

    which is SARGable and thus will utilize an index.

    This might also fix the issue you were having... if sql is already doing a scan on the table because of this, it would just go ahead and compare the other column at the same time. You would probably still need to put it into a CTE or sub-query as Lowell showed you.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • You're checking the first six characters are digits, so why not:

    WHERE LEFT(RCD.Variable4, 4) BETWEEN '5011' AND '7999'

    which can potentially make use of an index?


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • GilaMonster (12/6/2010)


    Or, as an (untested) alternative.

    Select

    Distinct RCD.Variable4

    FROM Route_Call_Detail RCD

    WHERE RCD.[DateTime] > '12/4/10'

    AND RCD.[DateTime] < '12/5/10'

    AND CASE WHEN SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' THEN CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) ELSE NULL END BETWEEN BETWEEN 5011 AND 7999

    )

    Yes, this works! I was trying to do something like this myself but couldn't quite get there. Thanks everyone who replied.

    Ken

  • ken.trock (12/6/2010)


    GilaMonster (12/6/2010)


    Or, as an (untested) alternative.

    Select

    Distinct RCD.Variable4

    FROM Route_Call_Detail RCD

    WHERE RCD.[DateTime] > '12/4/10'

    AND RCD.[DateTime] < '12/5/10'

    AND CASE WHEN SUBSTRING(RCD.Variable4, 1, 6) NOT LIKE '%[^0-9]%' THEN CONVERT(INT, SUBSTRING(RCD.Variable4, 1, 6)) ELSE NULL END BETWEEN BETWEEN 5011 AND 7999

    )

    Yes, this works! I was trying to do something like this myself but couldn't quite get there. Thanks everyone who replied.

    Watch the performance. Probably want an index on Datetime include Variable4 for this. No point in having Variable4 in the key, that predicate's not exactly SARGable.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Watch the performance. Probably want an index on Datetime include Variable4 for this. No point in having Variable4 in the key, that predicate's not exactly SARGable.

    This is actually data from a 3rd party. It can be customized but we don't control it. They at least have an index for us on DateTime (I know, not the greatest name for a column).

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply