Guest Editorial: On writing SQL

  • Phil Factor (12/12/2008)


    You sit down at the keyboard and you say to yourself, How in the blazes am I going to test this routine I haven't yet written? Once you've worked out how you can test it, Then you write it. Surely the principles of Test Driven Development (TDD) have been around for years?

    Well, precisely. I was taught TDD at school a zillion years ago, although they didn't call it that. And even now, the second thing I do is figure out how I'm going to test the thing I'm writing. The first is where the data is coming from.

    A brief example: I almost always test a query by running it and getting the rowcount, then sticking DISTINCT on it and running it again to see if the rowcount is different. If it is, I've got dupes and I've got to go back, something's wrong. I've been doing that since long before anybody heard of agile development.

  • WILLIAM MITCHELL (12/12/2008)


    On a high level, I consider the stored procedure to be what in electronics is called a black box - you define the inputs and the expected outputs.

    All of the following would be done in the development environment using a recent backup from production.

    Begin a new query, DECLARE the input parameters and SET those parameters to sample values. Write some T-SQL using those parameters to produce the desired results.

    For INSERT, UPDATE or DELETE sp's I would also insert BEGIN TRAN and ROLLBACK TRAN with a few "SELECT *" statements before and after the T-SQL so I can verify the data changes.

    At this point you have a working batch & you can determine if results & performance meet the objectives.

    Once the batch is working, comment out the DECLARE's and SET's and add the CREATE PROC statement at the top of the code. Create & then execute the sp with various combinations of parameters to verify that the outputs are still correct. I would also run it with some inappropriate input values to see if I can break it.

    After all that, you can remove any TRAN and SELECT * statements and the sp is ready for QA testing.

    William's method is exactly the way I approach it. When I'm writing a many-stepped procedure with lots of lines of code that will run from sub-procedure calls, I put the code in-line until I'm comfortable with it's function, then move it to a sub-proc. I find using labels and GOTOs handy during the creation process as well.

    Tom Garth
    Vertical Solutions[/url]

    "There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves." -- Will Rogers
  • This year I had to rewrite some SQL consisting of >7000 lines of code embeded in a DTS package, producing a huge table (> 160 cols, 1,7 million rows). The old code used three nested cursors and an enormous amount of variables in an RBAR way, each day refilling the complete table. No documentation available, just the code and the data (yeah, that).

    I ended up analysing, writing code and testing for each col individualy, until my code reproduced the table in a more efficient way. While the task as a whole looked daunting, breaking it down into individual sub-problems and testing each while writing the code for it made handling it possible for me.

    Cheers!

  • The first thing I do when I have to shepherd someone into the strange and wonderful world of writing SQL code is force them to turn off whatever GUI they're leaning on.

    Learning to write join statements (and then complicated join statements) along with subqueries gets them to understand what's happening when they draw the little line between columns.

    After that I make sure they're testing segments, such as "What does that subquery return?" when they're building up to the larger procedures.

    Unfortunately we're living in a GIU, hand-holding world and us old codger command line programmers are getting to be an endangered species. But occasinally we're dragged back out into the light to show how things work, instead of how to put boxes together on a screen.

    Yes, I'm a bit biased towards being able to actually write my code. SSIS is a very difficult transition for me since I have to translate what I want to do into a task/transformation and even then I tend to ask "Why does it take four steps to do what I could do in one statement?"

    For reasons best left unsaid we're moving away from stored procedures and towards SSIS whenever practical. But I still test my logic flow in the server management studio before trying to write it in SSIS. Some habits I won't even try to break.

  • Great editorial and interesting responses. Thanks for the break, Phil, I had run snowboarding yesterday 🙂

    I wonder if we'll start to change the way we write procedures in the future? I tend to grab sets of results as I write, checking that the set I'm thinking of is what I'm pulling into some operation, but I also tend to hold large sets in my head as well. That kind of comes with practice, and it's hard to teach someone. You really almost have to get them to think in different terms.

    I wonder if we'll get tabbed grids of data at some point as we write to help us visualize what's happening and if that will help.

  • When I was first studying programming 25ish years ago, my instructors had us approach it from sort of an outside in perspective: given this input, needing this output, how do you get from A to B? I also think in tables, and somewhat less in joins, but I have good mental visualization of relational operations, and that has served me quite well through the years.

    It's not easy getting traditionally-trained programmers to give up row-by-row processing paradigms and think in sets, describing good debugging techniques adds to that joy. I look forward to following this thread to get some more ideas on the subject.

    -----
    [font="Arial"]Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson[/font]

  • To describe how I develop or debug SQL as a process, I'd have to say what I do is simmilar to test driven development, but not nearly as formal.

    I start by undertanding the environment I'm in (in this case the data). For example, if I'm working on a query about customer orders, I'll first find out how many customers there are, and how many orders there are, etc.

    Once I have a grasp of the environment, I'll break down the overall task into bite sized pieces, and then piece by piece add more to the whole query or stored procedure as I resolve each bite, making sure my results after each piece added are reasonable for my knowledge of the environment.

    When assembling the pieces, I try to stick to a few rules, like select columns from the table with the smallest level of granularity that I can, join each table to the table with the smallest level of granularity that I can, do aggregations in a view or subquery (derived table) before joining the aggregated table to the rest of the data to simplify costly complex group by clauses, and others.

    After putting all the pieces together, then I'll check for acceptable performance and efficency of execution, and of course compare if the results are reasonable for my knowledge of the environment.

  • I always develop from the inside out. First write subqueries and test selection criteria, then test any complex computed columns then combine the simplest joins and any criterion (inner) joins, then add on the outer joins and any temp tables. This approach allows me to check plans and processing times as well as validate outputs as I go and greatly simplifies performance tuning later on. Like others, I think this aligns with the agile paradigm nicely as it attacks the most vulnerable points of development first and tests them in a real-time manner. (though I admit I am not an expert on the paradigm)

    I like the analogies of smelling trouble and visualizing joins. I learned SQL on DB2 v5 and all we had by way of explain was a flat grid of ones and zeros. I am still slow on adoption of the nifty GUI technologies because I know how to write the code. In fact, I just turned off intellisense as its assumptions annoy me.

    😎 Kate The Great :w00t:
    If you don't have time to do it right the first time, where will you find time to do it again?

  • When writing/testing an SP, I always start by writing the procedure's interface along with the paramaters, initial output are the paramater values. I then write a query that calls the outside SP through its interface. This is how the world will be working with my SP.

    As I write subs, I'll test these as well through their interfaces (as opposed to testing them from within.).

    This coming from someone who discovered SQL Server Express and thought, "Hey, how cool is this!". I don't work for the company's IT department, I'm not a "trained" developer/DBA, but I hack away to fill a niche that the overworked, understaffed IT couldn't fill.

    My $0.02 worth.

    3c

  • Phil Factor (12/12/2008)


    I was getting into an argument with some Agile XP people who were making out that they were the first to see the central importance of unit-testing (snip)

    I don't want to start an argumment with anyone, but I have pretty much the same issue with pretty much all of Agile methodology (such that I understand it correctly). For example, the client-centered focus. The importance of obtaining clent business knowledge was flat hammered into my head back in 1989, when I started in this industry. I do like the Agile emphasis on this and other issues, such as unit testing, but I must agree with you Phil, much of Agile is nothing new.

  • You said, "The march of technology makes bone-heads of us all." I take your meaning but I must disagree. The fact that changes in technology make our knowledge obsolete indicates that much of what we know really isn't knowledge at all. I believe this is one of the fundamental reasons that technical and non-technical people so frequently cannot connect at a personal level. Technical people sometimes scoff at, say an art history major, but at least what that person studies and contemplates will still be true a thousand years from now. Much of what we study will be obsolete within the decade.

    Some may not care about this, but I found this realization somewhat depressing.

  • Your question was not just about stored procs, but about "tricks to being a highly productive SQL Programmer", so I'll think outside of the box.

    #1 - I frequently use information_schema views to query tables, columns and routines. Finding objects in the database is lightening fast for me.

    For example, to find all views containing a dateAdded column:

    select * from information_schema.columns

    where column_name = 'dateAdded' and table_name like 'vwc%'

    -- assuming you use vwc as a prefix to your views

    #2 - Scripting out SQL statements via SQL statements, for instance:

    select 'select * from ' + tablename + ' AS ' + Name + '_table where dateCreated < getdate()' from fooTable

    -- assumes you have a table named footTable which contains table names and that each of THOSE tables has a dateCreated column 😉

    which results in something like this:

    select * from tblUser AS tblUser_table where dateCreated < getdate()

    select * from tblTask AS tblTask_table where dateCreated < getdate()

    ...

    that sample is simplistic, but I use the approach everyday to support customers or aide my development efforts.

    #3 - sp_helptext

    Other developers go looking for a stored proc via object explorer (maybe even using the new filtering feature in the object explorer). It's faster for me to type:

    (control+t for text results mode)

    sp_helptext spxCalculateYearlyBonus

    and then copy/paste the proc into the query window.

    -- if you don't know the exact name of the proc, then query information_schema.routines using LIKE

    There are so many more - (have you used binary_checksum() or the new EXCEPT statement? Again, knowing the TSQL language, the system procs (like sp_helptext), and knowing the development tool (be it Visual Studio, BIDS or Management Studio) is the real trick.

    Anyone can use a table saw, but you'd be amazed at the versatility of the tool once you start using jigs, dado blades and cross cut slides! I always loved the architectual and foundational aspects of home building as a metaphor to building software 😛

  • I've had a fascinating time reading these comments, and learned a huge amount in the process. I'm sure that everyone else who reads these comments will learn something too. I've found it very encouraging to read of so many people who program the same way as I do, or who have interesting variations that I hadn't thought of. Many thanks to everyone who contributed.

    I may have been a bit hard on the Agile developers. I've actually learned a lot from the youngsters who are trying out Agile methodologies, but it is a weird feeling to be sat down and told the importance of Unit testing, regression testing and user acceptance testing as if it were something new, like social networking, and something that I, with my graying hair, would be unacquainted with. The'll soon be introducing me to the Beatles, Starsky and Hutch, and flared trousers.

    I re-read 'The Mythical Man-Month, by Fred Brooks, recently. His suggestions, made in 1974, on development methodologies, still sound radical.

    On the subject of the friend who could whistle ASCII text into an acoustic coupler, He was not spinning a yarn. I stood over him and monitored the output on a VDU, and out came characters between 0 and 127, all over the screen. The only problem was that there were a lot of control characters (below 32). It put me in mind of the attempts of Archie of the wonderful Don Marquis books of Archie and Mehetabel where Archie the cockroach jumped up and down on the typewriter keys. If he was really on form, he got very close to writing text that could be mistaken for English, but his downfall was in inserting space characters, which required an accuracy of whistle that was beyond human skill.

    Best wishes,
    Phil Factor

  • When the heck did fletching become obsolete?!? And here I spent the last twenty years honing my sons, one to be an expert fletcher and the other to be an expert bowyer, figuring that the natural synergy would be a big money-maker for them. Imagine my erubescence!

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Heh... I've found that helical fletching makes a high velocity pork chop sound really cool as well as enhancing the flight path... haven't gotten into the bowyer mode of building launchers because, unless the bone is just right, pork chops are difficult to properly set a nock in. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 73 total)

You must be logged in to reply to this topic. Login to reply