The T-SQL Paradigm

  • BWAA-HAAA!!! I've heard about a bazillion C programmers ask why T-SQL can't be more like C. My question would be, why can't C be more like T-SQL? 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jason Miller (4/3/2009)


    GSquared (4/3/2009)


    Quite obviously, it is meant to be contained in the Group By, so why not implicitly include it? Sure, that may be a violation of the standard, but in my opinion, the standard is flawed on this point.

    I guess I'm strange, I prefer an language/environment where explicit definitions are required.

    Nah. I prefer explicit in any code I'm going to reuse at all. I'm just lazy about my one-off scripts and wish a few shortcuts would work.

    None of my complaints about T-SQL/ANSI SQL are major. Just a few minor wish-list items. T-SQL gets the job done, gets it done well, and is easy for me to write, debug, etc.

    There are some things in the interface that would be time-savers. In most cases, Redgate and/or ApexSQL have tools that will do all of those things, so that's good enough for me.

    I'd much rather MS spend their time on even better database tools, performance coding, etc., on the major stuff, than that they spend time and resources on Management Studio. If I had to choose between aggregate partitioning and intellitype, I'd definitely choose aggregate partitioning.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Maybe Wittgenstein's Tractatus should be required reading before criticising a language.

    Isn't any human language a reflection of the social structure in which it is evolving?

    Or did the chicken come first, and the social structure derives itself from the evolving constructs of the language?

    Aren't there concepts in, say Finnish, that can't be rendered directly into English, and vice versa?

    Does that make either Finnish or English inherently better or worse than the other?

    Or does it just mean that Finnish is really good for dealing with Finnish social constructs and ...?

    Isn't any programming language a specialised subset of a human language with a set of constructs that reflect the language creator's social domain?

    Isn't it the task of a programming language to render explicable the binary interface to yet another set of constructs created using yet another programming language?

    What difference does it make what the verbs and nouns and all the other parts of speech are, provided everyone using the same language can comprehend the ideas being expressed in a more or less similar manner?

    And produce similar results of course.

    Peter Edmunds ex-Geek

  • YES, T-SQL is a strange and somewhat poor programming language. I tend to think in terms of sets, not in terms of procedural programming, and yet there are still things that have always baffled me in T-SQL.

    From my earliest days of SQL 7, I wanted to pass parameters to views. (Not using parms in a Where clause on the result, but a parm that the view would use.) Oh, a stored procedure could do that, eh? Well, it seems strange to call a stored procedure to act like what is essentially a function in other languages.

    Leaving out CLR functions, which are often not recommended for performance reasons: Up until SQL 2008, you couldn't pass an object of any type (such as a table) to a stored procedure. Table-valued functions are better, and yet some long-time SQL programmers don't seem to like table-valued functions (or any functions at all).

    In most programming languages, functions are used when you want to take an object (the domain of the function, in mathematical terms) and return a result (the range of the function) which should also be an object. The objects of the language can be passed in as parameters. Think VB .NET 2002, or even APL. SQL 6.5, 7, 2000, and 2005 can't do this.

    SQL procedures seemed more like "scripts" or "batch files" to me, to do procedural things, not to manipulate data arguments and return results. But I know that most real work is done all over the world in T-SQL using stored procedures. EVEN THOUGH stored procedures aren't first-class programming constructs (until SQL 2008), able to accept a table as a parameter.

    Some of these things can be accomplished in SQL 2005 using scalar functions on each data column. Although Microsoft once claimed in writing that using scalar user-defined functions is not supposed to incur a performance penalty, the truth is that they do. (People say that you shouldn't do anything in SQL one record at a time. I don't, but I realize that at the very lowest level, SQL itself has to move data one record at a time. Consider the times when you are using CASE statements in a Select; SQL obviously has to process these one record at a time in order to do the comparisons.)

    Why can't expressions be used anywhere a T-SQL variable can be used? For example, in the first parm of a RAISERROR call. The first parm must be a string, an error number, or a local variable. The language SHOULD (IMHO) be able to accept any SQL expression in any place where a local variable can be used. But it can't.

    These are the kinds of things that frustrate me, and they don't seem to be well-documented -- we just have to learn them by doing.

    Why are there so few examples ANYWHERE, such as in BOL or in the vastness of the Internet, showing an Inner Join on TWO conditions (not just one), or joins that use anything besides equality? I had to search high and low before I knew you could join on more than one condition. (Why does everyone insist on aliasing tables, often to one-letter names? Table aliases are usually not required, but all examples use them, and novices get the impression that they are important. They obscure what tables are being manipulated.)

    And I have a long-running conversation (more than a year) with the BOL writers: the WHILE keyword is simply documented WRONG, WRONG, WRONG. Here is the SQL 2005 doc for WHILE from the November 2008 update:

    WHILE Boolean_expression

    { sql_statement | statement_block }

    [ BREAK ]

    { sql_statement | statement_block }

    [ CONTINUE ]

    { sql_statement | statement_block }

    That tells me that WHILE can be followed by a statement or a statement block, then (and only then) you may have the BREAK keyword, and then another statement or statement block is REQUIRED (the curly braces tell you that), then you can have an optional CONTINUE statement, then one more REQUIRED statement or statement block. If you use both BREAK and CONTINUE, then obviously BREAK has to come before CONTINUE, and there must be one statement or statement block between them. Break and Continue can each be used only once, and they aren't mentioned as living inside a statement block at all.

    Clearly this documentation is wrong. And it has been wrong forever.

    It's documented better in SQL 2008, but it's still not right.

    Enough for now...

    David Walker

  • Yes, the documentation on T-SQL (and pretty much everything else that has to do with anything ever touched by human beings) could stand improvement. In large, major ways. The only solution to that is keep on communicating on the subject.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • David Walker (4/3/2009)


    Why can't expressions be used anywhere a T-SQL variable can be used? For example, in the first parm of a RAISERROR call. The first parm must be a string, an error number, or a local variable. The language SHOULD (IMHO) be able to accept any SQL expression in any place where a local variable can be used. But it can't.

    Being a lover of the C family of languages I have to say this is one of my biggest gripes as well and makes the language design of T-SQL seem like an afterthought. And yeah, it does feel more like a batch or scripted language.

  • Mark Cook (4/3/2009)


    Like so many others have said, T-SQL excels in data extraction and manipulation. ...what it's designed for, it's great. And I've got a lot more to learn about it.

    😛

    I so agree.

    I'm also an older coder some 20+ years, and in a small shop.

    I love seeing some of the older tools being mentioned.

    One other thing at adds to the complexity is poor table design and/or modifications compounded by time. For example Consider a database designed several years ago and ported forward (perhaps it had security written into the code that may exist in other layers now, or some replication to compensate for a performance issue in sql v1, ... )

    As much as we would like to, many of us do not have the time to rewrite the tables to take advantage of all the new features.

    "Any problem can be overcome given enough time and money or until something useful needs to get done."

    -- Optimist with experience and still learning

  • GSquared (4/3/2009)


    T-SQL is clunky in certain regards. For example, why doesn't this work:

    select Col1, count(*)

    from dbo.MyTable;

    SQL Server will raise an error that Col1 is neither contained in a Group By nor an aggregate. Quite obviously, it is meant to be contained in the Group By, so why not implicitly include it? Sure, that may be a violation of the standard, but in my opinion, the standard is flawed on this point.

    I mean no disrespect but I do hope you are joking about why GROUP BYs arein't implicit. While in your example you could say that the logical Grouping is on Col1, not every Query with 1 or more aggregates in it is like this. When you have more then 1 aggregate and one non-aggregated value the order of the GROUP BY is very important. Take these 2 query's for example:

    /*Query #1*/

    SELECT col1, Col2, Count(col3)

    FROM dbo.TABLE1

    GROUP BY col1, col2

    /*Query #2*/

    SELECT col1, Col2, Count(col3)

    FROM dbo.TABLE1

    GROUP BY col2, col1

    The resulting set will not always be the same between the 2 depending on the data in TABLE1. This is why a GROUP BY is required and not implicit.

    Kindest Regards,

    Just say No to Facebook!
  • And yeah, it does feel more like a batch or scripted language.

    I hate to break it to you, but T-SQL IS a scripting language. It's a language for scripting SQL statements. SQL, on the other hand, is a Domain Specific Language for extracting data out of relational databases. Neither is, or was ever meant to be, a full-fledged programming language. I suspect (without really looking into it) that, even combined, they are not Turing complete. That's why front ends are written in something else.

    Coming from a programming background, my biggest issue learning SQL was that you couldn't abstract as much as I would like: passing a table name, or a column name for example. But you get used to the quirks and you figure it out.

    --

    JimFive

  • Anyone remember this?

    ON ERROR BEGIN;

    PUT DATA;

    SIGNAL FINISH;

    END;

    Then as to the APL dig that was Steve Jones. Even the great Admiral Hopper would never acknowledge having anything to do with APL it was in majority developed at FSU while she was running things down there. Then again she took credit for the term "bug" and we have heard from Phil Factor on that one.

    SQL as a bad language. No. Try doing APL without the magic type ball. I helped a friend debug a program that took a free form trinomial and factored into two binomials keeping the same variables. We did it in COBAL. I worked at a CPA firm and we wrote all the programs in FORTRAN. I try to stay language independent if not language agnostic.

    I'm still in the mode of using SQL to get me sets of data an then letting my application go through the rows. For SQL operations stay set based. I just finished reading "Thinking In Sets" by Joe Celko. If you think T-SQL is bad our Joe has to stay with pure SQL. The book makes me glad of several things. One of them being that is folk dealing with SQL Server have it nice.

    ATBCharles Kincaid

  • I believe you'll find most who complain about SET based languages are those who first learned procedural based language(s) and therefore have a hard time transitioning to a SET BASED approach. The same can often be said for the reverse where some SQL pros complain about how PROCEDURAL based languages work although the Procedural programmers complaining about SQL most certainly out rank the reverse by 4 to 1 if not more.

    ATTN Procedural developers who complain about how T-SQL works:

    When you go to a toolbox, not the one in your IDE but a real one that holds a wrench, hammer, screwdriver and the like, do you pick up the hammer and complain about it being limited in it what it can do, asking why can't the hammer also tighten a screw like the screw driver? Or do you just take the hammer and drive in the screw with it like the way you handle set based programming with your procedural based langauge?

    Kindest Regards,

    Just say No to Facebook!
  • David Walker (4/3/2009)


    Leaving out CLR functions, which are often not recommended for performance reasons:

    Must be the nuances of the language... I've been able to beat virtually every CLR with the exception of a well written RegEx clr and one type of hierarchical file handler... and I darn near tied on the RegEx for performance.

    The reason why people think that CLR's offer a performance advantage is because they don't know how to write what they want in T-SQL and get performance out of it.

    Table-valued functions are better, and yet some long-time SQL programmers don't seem to like table-valued functions (or any functions at all).

    Heh... in which cases are they better? They typically are the cause of performance problems in the face of any scalability. And, they have a huge number of limitations compared to TempTables. The idea the Table variables are "memory only" and, thus, faster than Temp Tables which are supposed "disk only" is a huge and absolutely incorrect myth.

    For any given situation, you won't actually know which is faster until you do a test... but that's also a problem... while Temp Tables will persist for debugging purposes in QA or SSMS, table variables do not which makes them a bit more difficult to troubleshoot because you basically have to rerun everything instead of just running something the persisted result sets stored in a Temp Table.

    Table aliases are usually not required, but all examples use them, and novices get the impression that they are important. They obscure what tables are being manipulated

    But, done properly, make the query rather self documenting.

    It's documented better in SQL 2008, but it's still not right.

    BWAA-HAA!!! Yeah... I'm sure that the C documentation is 100% correct and that it has all of the examples you'd ever need as well. 😉

    It's totally amazing to me that this whole thread is taking place... you wouldn't use a Porsche to bring a picket fence home and you wouldn't take your sweetie out to a fine anniversary dinner in your stake-body truck...

    I think anyone who doesn't like T-SQL should probably leave it to the ones that do. 😉 To that end, I don't like C... and I don't use it. :hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • When you go to a toolbox, not the one in your IDE but a real one that holds a wrench, hammer, screwdriver and the like, do you pick up the hammer and complain about it being limited in it what it can do, asking why can't the hammer also tighten a screw like the screw driver? Or do you just take the hammer and drive in the screw with it like the way you handle set based programming with your procedural based langauge?

    Great example!! 😉 Just what I was saying before, use the right tool for the right job! Too many people hammer in the screw then complain about the difficulty they are having or that the results are less than expected. Hmmm. I wonder why?:ermm:

  • YSLGuru (4/3/2009)


    GSquared (4/3/2009)


    T-SQL is clunky in certain regards. For example, why doesn't this work:

    select Col1, count(*)

    from dbo.MyTable;

    SQL Server will raise an error that Col1 is neither contained in a Group By nor an aggregate. Quite obviously, it is meant to be contained in the Group By, so why not implicitly include it? Sure, that may be a violation of the standard, but in my opinion, the standard is flawed on this point.

    I mean no disrespect but I do hope you are joking about why GROUP BYs arein't implicit. While in your example you could say that the logical Grouping is on Col1, not every Query with 1 or more aggregates in it is like this. When you have more then 1 aggregate and one non-aggregated value the order of the GROUP BY is very important. Take these 2 query's for example:

    /*Query #1*/

    SELECT col1, Col2, Count(col3)

    FROM dbo.TABLE1

    GROUP BY col1, col2

    /*Query #2*/

    SELECT col1, Col2, Count(col3)

    FROM dbo.TABLE1

    GROUP BY col2, col1

    The resulting set will not always be the same between the 2 depending on the data in TABLE1. This is why a GROUP BY is required and not implicit.

    Trust me on this one: I'm very familiar with how Group By works. I have more than a little familiarity with T-SQL.

    At the same time, I think an implicit Group By, taking the same non-aggregate columns as the Select clause, in the same sequence, would be handy in certain circumstances.

    There are a number of other things of a similar nature that I would like to have, which, if misused, would be bad, but if used correctly, would be nice. Not necessary, not critical, not even important, just "nice".

    Yes, that opens the door to misusing these things. Considering all the harm that can already be done by people who don't pay attention to how "or" and "and" work in Where clauses, don't understand that cursors are generally a bad idea, don't understand how to avoid RBAR, don't understand proper indexing, don't understand normalization, don't understand denormalization, don't understand data modeling, don't understand data types, et al, ad naseum, I'm not worried about a few minor problems with these things. I just think they would be handy.

    So, no, I'm not kidding.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Hope I don't misattribute anything here...

    Heh... in which cases are [table-valued functions] better? They typically are the cause of performance problems in the face of any scalability

    I meant better in the sense that you can pass things around more easily, not better in performance. Although, table-valued functions can't accept a table as a parm, they just return a table as the result. Why can't a table-valued function take a table as a parm? If T-SQL were a first-class language, this would be possible.

    But, done properly,

    make the query rather self documenting.

    No. NO!!! NO!!!! The table name itself is the most accurate representation of what's being joined. IF, and I say IF, a table alias name would EXPOUND on what subset of the table was being joined, you might be right, but SHOW ME AN EXAMPLE in the wild where that is done. I have NEVER seen that done. NEVER.

    It's always "Trades T Inner Join Customers C" or even worse, "Trades A Inner Join Customers B". Just Say No to table aliases! When I'm reading the guts of the join, is C the customers or the clients? Is A the account table or the activity table? I have to mentally resolve the aliases, and leaving them OUT would be 1000% better. I would like to find a tool that would un-alias all of the crap like this for me.

    Yeah... I'm sure that the C documentation is 100% correct and that it has all of the examples you'd ever need as well. 😉

    I don't write in C, or C++, or C#, so I can't say. I write code in T-SQL these days. 🙂 And sometimes VB.NET 2008, but not often.

    The SQL engine is terrific forr what it does -- have you seen the bookstore sample (Barnes and Noble, I think) that Microsoft used when testing SQL 2005, which included some billion-record tables? -- but the T-SQL language is not what I would call a first-class programming language. I didn't start the topic, I just replied with my thoughts.

    David Walker

Viewing 15 posts - 46 through 60 (of 266 total)

You must be logged in to reply to this topic. Login to reply