Demystifying the use of CASE in an ORDER BY statement

  • ben.kimball (4/26/2012)


    Seems a good example of how tired and overloaded SQL syntax has become.

    There's no overloading here. The only issue is with people assuming that CASE is overloaded when it isn't.

    SQL syntax has become too obtuse for what we need it to do these days.

    Who's "we", kemosabe? More to the point, what do you need it to do that results in using "obtuse" syntax?

    Need a more powerful way to talk to the database. I'm ready for native C#/Linq queries on SQL...

    So first you say SQL is too obtuse and then you say you want Linq!? That's it! Goodnight! Thanks for the laughs!

  • ben.kimball (4/26/2012)


    Seems a good example of how tired and overloaded SQL syntax has become.

    SQL syntax has become too obtuse for what we need it to do these days. Need a more powerful way to talk to the database. I'm ready for native C#/Linq queries on SQL without installing .NET assemblies and this kludgy 'Write it in C#/Linq, which is then translated to T-SQL'. Yuck.

    You might make a case for that in some situations, but this is not a particular good example. What you are referencing to is simply written without proper understanding. The result is needles complex code. But you can write obfuscated code for simple tasks in any language, SQL is unfortunatly no exception to this rule.

  • Maybe it is me overreacting here, if so I stand corrected, but I think this article should have been screened out and not get published.

    I do not know the formal process by which these articles get published, but I do know this article will not have demystified anything! Given the replies of some I could argue the opposite is actually been happening, despite the best efforts of some participants (like me) to correct things. As we all probably know, the initial impression is what often counts . Starters to T-SQL, reading the article could have learned a bad trick and on top got a false impression to how SQL Server actually works, sure to be leading to further mistakes in the future.

    This could have been avoided IMHO.

    As for the writer, it takes balls to write an article and put it online for everyone to say their thing about. Respect for that and I hope you have learned from it and the replies. But next time let others check your next article first as it is better to screen out errors before going public. It can be just as good for learning from it!

  • Dynamic SQL opens the door to injection attacks unless you are very careful: something like using a function which has parameters to generate your SQL string which is then run using execsql all encapsulated in its own proc. But there are limits to this - the SQL cannot be more than 4000 Nchar (doesnt seem a small limit, but you'd be surprised). Best avoided completely.

  • dadam (4/26/2012)


    If you're confused by any of this .... try putting the same CASE statement in the query as a column so you can see what's going on.

    GENIUS!!

    BRILLIANT!!

    I'm seriously reflecting on my SQL skills and rethinking what I know because in all my years of working with SQL I've never considered putting my tricky CASE "order by" statements into the SELECT statement for review....I'm mildly embarrassed about this but not enough to stop me from expressing my excitement of this epiphany! Thanks for posting!

  • ianp (4/26/2012)


    Dynamic SQL opens the door to injection attacks unless you are very careful: something like using a function which has parameters to generate your SQL string which is then run using execsql all encapsulated in its own proc. But there are limits to this - the SQL cannot be more than 4000 Nchar (doesnt seem a small limit, but you'd be surprised). Best avoided completely.

    Not true.... nvarchar(max) is valid for sp_executesql

    Something like this via Dynamic is not hard to write, and works better than convoluted case statements. Of course somebody is going to tell me to make a different stored proc for each sort case, but if you don't want to do that, or can't for some reason, something LIKE this will work (If I were putting this into production I'd also write tests for the first parameter being null or invalid but the second being fine and then adjust how I wrote the order by into the dynamic sql).

    CREATE PROCEDURE dynamicorder

    @sortkeya VARCHAR(50),

    @sortkeyaorder VARCHAR(4),

    @sortkeyb VARCHAR(50),

    @sortkeyborder VARCHAR(4)

    AS

    SET NOCOUNT ON

    DECLARE @ORDERBY NVARCHAR(MAX)

    DECLARE @SQL NVARCHAR(MAX) =N'

    SELECT

    Employee.EmpID,

    Employee.EmpName,

    Employee.EmpDOB

    FROM Employee '

    IF @sortkeya = 'EmpDOB' AND @sortkeyaorder = 'ASC'

    SET @ORDERBY = N' ORDER BY Employee.EmpDOB ASC'

    IF @sortkeya = 'EmpDOB' AND @sortkeyaorder = 'DESC'

    SET @ORDERBY = N' ORDER BY Employee.EmpDOB DESC'

    IF @sortkeya = 'EmpName' AND @sortkeyaorder = 'ASC'

    SET @ORDERBY = N' ORDER BY Employee.EmpName ASC'

    IF @sortkeya = 'EmpName' AND @sortkeyaorder = 'DESC'

    SET @ORDERBY = N' ORDER BY Employee.EmpName DESC'

    IF @sortkeyb = 'EmpDOB' AND @sortkeyborder = 'ASC'

    BEGIN

    IF @ORDERBY IS NOT NULL

    SET @ORDERBY += N' ,Employee.EmpDOB ASC'

    ELSE

    SET @ORDERBY = N' ORDER BY Employee.EmpDOB ASC'

    END

    IF @sortkeyb = 'EmpDOB' AND @sortkeyborder = 'DESC'

    BEGIN

    IF @ORDERBY IS NOT NULL

    SET @ORDERBY += N' ,Employee.EmpDOB DESC'

    ELSE

    SET @ORDERBY = N' ORDER BY Employee.EmpDOB DESC'

    END

    IF @sortkeyb = 'EmpName' AND @sortkeyborder = 'ASC'

    BEGIN

    IF @ORDERBY IS NOT NULL

    SET @ORDERBY += N' ,Employee.EmpName ASC'

    ELSE

    SET @ORDERBY = N' ORDER BY Employee.EmpName ASC'

    END

    IF @sortkeyb = 'EmpName' AND @sortkeyborder = 'DESC'

    BEGIN

    IF @ORDERBY IS NOT NULL

    SET @ORDERBY += N' ,Employee.EmpName DESC'

    ELSE

    SET @ORDERBY = N' ORDER BY Employee.EmpName DESC'

    END

    EXEC sp_executesql @SQL

    Edit: adjusted to handle parameters like I said above.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • OK, maybe I shouldn't have used the term 'overloaded', as that has connotations that I did not mean to imply. I meant it more like they have stuck so many features on top of the SQL syntax over the decades that it's just a complete hack these days. SQL's basic syntax is too close to BASIC from the early 1970s. Gets as weird as VB.NET syntax has became over the years. I grew up coding in BASIC, but grew out of it and fully shun it these days.

    You apparently haven't done much work with C# and Linq then, or you would not find it humorous, but instead insightful. It doesn't have to be Linq or even C#, but it needs to be a higher level language. I can create object-oriented solutions so much quicker and cleaner with C# and Entity Framework than I ever could with oldschool ADO.NET and Stored Procs. Only times I've used Stored Procs in the past year and a half are for situations requiring Full Text Search and a few for needing top notch performance, so using Stored Procs with extra tricks thrown in like Compiler Hints and other things that haven't been translated to good EF/Linqiness.

    At least the 2008 management studio, as well as RedGate's offerings, allow for Intellisense-like assistance while coding. But it's still a band-aid solution. Shoehorning advanced queries into any flavor SQL has become a giant pain. It's hacky, it leads to non-determism like shown in this article with 'Well, it might work this way, or it might work that way...'. (Although the author's approach is not the same approach I would take to solve the same sorting issue.)

    Not as big of a deal, but they GOTTA get rid of those ugly BEGIN/END structures. At least alias them to { and } by default.

  • Oh wow!

    I was just getting ready to say "Wait, WUT?" to your statement about being able to use more than 4000 characters in nvarchar(max). Somehow, I have always had it stuck in my head that 'max' was just a shorthand for 4000.

    So I went and doublechecked the definition of varchar(max) to make sure I knew what I was talking about, and was very happily surprised to find out that you were right! What I never realized until now was that MAX actually means 231-1 bytes!! If using an integer value, then the highest you can go is 4000. Holy carp!

    I'm glad I DIDN'T know that back when I was stuck in the dynamic SQL mindset or I'd have been stuck there longer!! Dynamic SQL = BAD/Slow/Insecure/Sign of an Inexperienced Developer.

  • ben.kimball (4/26/2012)


    Dynamic SQL = BAD/Slow/Insecure/Sign of an Inexperienced Developer.

    As usual...it depends :).

    Mostly on when to use it, because if the nature of the problem you try to solve is dynamic, construction SQL code dynamically makes perfect sense and anything else would be quite dumb.

    Just as when to use it, there is to how do you use it when you do need to! Coding dynamic SQL in SQL is a rather crude, ugly and ineffective way of doing things IMHO. Building a dynamic statement in the application layer makes more sense as that can be much easier as you do not need to concatenate strings everywhere.

    Here is how some dynamic code would look like in ColdFusion (high level tag based language for web development):

    <cfquery name="local.query" datasource="#GetDatasource()#">

    select

    N.IdNews

    , N.Title

    , N.Date

    from

    dbo.Commission as C

    inner join dbo.News as N on N.IdCommission = C.IdCommission

    where

    N.IdRegion = 0

    <cfif not local.Secure>

    and N.IsSecure = 0

    </cfif>

    and C.IdCommission in ( <cfqueryparam cfsqltype="cf_sql_smallint" list="yes" value="#arguments.IdCommissionList#"> )

    order by

    N.Date desc

    , N.IdNews desc

    ;

    </cfquery>

    As you can see, the dynamic sql code that is conditionally added, does in no way obstruct the main layout and readability of the query. You also see how a list of comma separated IDs is passed into the query in a type safe manner. Passing a single ID trough is even less work and just as safe!

    Not using dynamically build SQL when you really need to or not having the right tools to pass data values safely, is what can make things messy. Not using dynamically build SQL in and of itself.

  • I can sorta barely make sense of that, but not a CF guy, so taking your word for how it communicates with the back end SQL server. 🙂

    My statement was more directed at when people dynamically generate SQL inside of the SQL Stored Proc. Drives me insane, or the procedural guys who write all their Sproc code using cursors and loops. Arrrgghh!! Yes, I once was one of them, and it's a really difficult concept to explain why you shouldn't do that and explain how you should do it different. I was stuck there for a few years and aggravated several people in the mean time.

    Then one day, click, you have that epiphany and the better way finally makes sense. Until then, you see the code that's written correctly and participates with the compiler/query plan cache, and wonder why people write stuff that way, as it seems more trouble than it's worth. It's also hard to let go of thinking you need to have 'full control' of what SQL Server does (As if you ever did) and finally realize that it's a pidgin to just precisely explain what data you are wanting and just trust SQL to determine the best way to get you that data and let it do it's thing. (In most cases, and usually when it does something unexpected it turns out that it was you, not it, that was approaching the task incorrectly.)

  • ben.kimball (4/26/2012)


    Oh wow!

    I was just getting ready to say "Wait, WUT?" to your statement about being able to use more than 4000 characters in nvarchar(max). Somehow, I have always had it stuck in my head that 'max' was just a shorthand for 4000.

    So I went and doublechecked the definition of varchar(max) to make sure I knew what I was talking about, and was very happily surprised to find out that you were right! What I never realized until now was that MAX actually means 231-1 bytes!! If using an integer value, then the highest you can go is 4000. Holy carp!

    I'm glad I DIDN'T know that back when I was stuck in the dynamic SQL mindset or I'd have been stuck there longer!! Dynamic SQL = BAD/Slow/Insecure/Sign of an Inexperienced Developer.

    That all falls into the "it depends" The above code I wrote in a stored procedure will run relatively well and via using exec sp_executesql to run it, I'll get plans into the cache and they will get re-use because of the way I wrote it. It's also relatively injection proof because I never concatenate on the values passed in as parameters.

    Keep in mind we pass nvarchar(max) so you only get half as many characters, but if you're generating 230 ish characters into a single dynamic query you've got other issues.

    There's a reason that veteran DBAs still prefer Stored Procedures. Many reasons in fact. And you can LINQ to them as well. Keeping the Database as it's own API means that if you have Database logic problems with a query you don't have to redeploy a DLL, just fix a stored procedure or UDF.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • peter-757102 (4/26/2012)


    ben.kimball (4/26/2012)


    Dynamic SQL = BAD/Slow/Insecure/Sign of an Inexperienced Developer.

    As usual...it depends :).

    Mostly on when to use it, because if the nature of the problem you try to solve is dynamic, construction SQL code dynamically makes perfect sense and anything else would be quite dumb.

    Just as when to use it, there is to how do you use it when you do need to! Coding dynamic SQL in SQL is a rather crude, ugly and ineffective way of doing things IMHO. Building a dynamic statement in the application layer makes more sense as that can be much easier as you do not need to concatenate strings everywhere.

    Here is how some dynamic code would look like in ColdFusion (high level tag based language for web development):

    As you can see, the dynamic sql code that is conditionally added, does in no way obstruct the main layout and readability of the query. You also see how a list of comma separated IDs is passed into the query in a type safe manner. Passing a single ID trough is even less work and just as safe!

    Not using dynamically build SQL when you really need to or not having the right tools to pass data values safely, is what can make things messy. Not using dynamically build SQL in and of itself.

    Dynamically creating that query will also avoid plan re-use. The IN clause will be different every time. In that case, if I were using SQL 2008 I'd pass in a Table Parameter to the stored proc with a list of values and join onto it or use it in an exists test.

    With SQL 2005 or earlier I'd pass in a comma separated list of values and use a tally table to break it up into a temp table or table variable, and use that for the exists test or as a join.

    You could also pass in an xml document with a list of values in it and shred those out, done properly that's a very fast solution as well.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Ultimately to sort data, the best solution is still to return it unsorted to the client, and let the client sort it rather than force the database to sort it anyway.

    But sometims you gots to do what you gots to do.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • In the last example, could it not have been done with TWO case statements instead of four so long as each case statement contained only references to the same column name? So one CASE for DOB and one for Name?

  • I saw a very clever use of ROW_NUMBER( ) OVER (ORDER BY CASE WHEN ....) as RowNum,

    in a CTE WITH-clause in a stored proc, where the CASE used an input param to determine which column and whether ASC or DESC to order by.

    You can follow this up with a final ORDER BY RowNum. You could also use RANK( ) or DENSE_RANK( ) functions if that offered value.

Viewing 15 posts - 16 through 30 (of 63 total)

You must be logged in to reply to this topic. Login to reply