Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

T-SQL Tuesday #21 - Yes, Today is Wednesday!

 T-SQL Tuesday #21 - A Day Late and Totally Full of It

Holy Crap!  What in the world are we talking about today?  Well it’s TSQL Tuesday #21, and we’re posting this on Wednesday, and I’m feeling pretty crappy about the whole thing.  So, we have:

Select T-SQL + [Crap] =Crap Code. 

Today, “crap” is a reserved word.  If you don’t like talking about crap, then blame the moderator,  founder and host of today’s T-SQL Tuesday, er, Wednesday, Adam Machanic.  :-)  Adam invited us all to the blog party to talk about our crap code.  As per Adam’s definition,

“Crap code. We’ve all seen it. We’ve all created it. We’re all guilty. Yes, even you. Sometimes our crap is purposeful—the ugly, “temporary” hack. Sometimes we produce crap because we simply don’t know any better. But there is no excuse good enough. As professionals, we must strive to rid ourselves of bad habits. And the only way to learn the difference is to see lots, and lots, and lots of examples.”

But, even with all this crap surrounding us, and infiltrating our SQL Servers and blog, there is redemption here. We must talk about it here, why is (or was) it crap? Why did you do it? And how did you learn from your mistake?

The story you're about to hear is true!  I've omitted the names to protect the innocent - or embarass them.  It is a tale that is, kinda crappy.

Well, I was browsing the SQL Server forums online, and a poor unsuspecting poster was looking for a little help with his query.  By no means a T-SQL titan – a title that I’ll bestow on other SQL Gurus. He wanted to simply find out the best way to take his existing code, and filter it by date range.  It was a hack, not the best code, but it worked.  He even suggested there must be a better way to do this, a better way to aggregate the data.  Hey, there always is a better way.

Interestingly, he used the “cookie jar” as a sample data set and query: 

SELECT A.JAR1, B.JAR2 FROM
(select COUNT(cookie_id)as JAR1 from cookies
where ctype_id in (19683,19633,18334,13343,19645)
AND crdate between '3-15-11' and '4-15-11'

) AS A, 
(select COUNT(cookie_id) as JAR2 from cookies
where ctype_id in (25682)

AND crdate between '3-15-11' and '4-15-11'
) AS B


So, when I scrolled down to see one of the responses, I was shocked to see an extremely angry and excoriating reply, in the manner a child get scolded by a parent when he or she has done something really naughty. Talk about getting your hands caught in the cookie jar!

 This reply was actually by a well-known T-SQL Guru in the community, who’s helped countless folk in the forums. But really, I think someone needs to teach him some bedside manners!  A little forum etiquette for those of lesser T-SQL prowess than this guy.

Now don’t get me wrong, the answer was firm and certain, and definitely filled with useful expert knowledge that he could’ve padded with a smiley.  (He actually used the fiery angry emoticon!) He could’ve have been soft but mentoring, but hells no, he wasn’t going to have any of this crap code in his forum.

Yes, the Sheriff of T-SQL was guns blazing, but offered some very good tips on code formatting.  Talk about a public flogging, his reply was enough to make many of us feel this small.  He was a yellin, and a screamin, and even a braggin about his perfect code. Figured the OP was a crappin in his pants.

Here’s the reply, Word-for-word:

“First of all, NEVER EVER DO A BETWEEN ON DATES!  Heh... yeah... I'm yelling here. The reason why to never do a BETWEEN on dates is simply to make your code more "bullet proof" if someone decides they want to start adding times to the cookie entries.

Second, there's absolutely no need to dip the table twice if you do a little "pre-aggregation" using some CROSS TAB technology. You'll see what I mean in the code below.

Third, get into the habit of using the 2-part naming convention for all tables. It'll make your code a bit faster if it's being called from a GUI or in a UDF and, again, it'll make your code more "bullet proof" if some "smart-guy" decides to start using multiple schemas.

Here's the code. Since you didn't include any test data in your original post (see the first link [REDACTED] below for the
correct way to do that), I've not tested the code.

--===== Find two types of cookies in a date range and return the counts for each type in the date

     -- range and the percentage of the total of the two types that each type represents

WITH

ctePreAggregate AS

( --=== Using a single pass on the table, pre-aggregate all the data we need for the date range

     -- using a CROSS TAB.

 SELECT Jar1  = SUM(CASE WHEN ctype_id IN (19683,19633,18334,13343,19645)       THEN 1 ELSE 0 END),

        Jar2  = SUM(CASE WHEN ctype_id IN (29682)                               THEN 1 ELSE 0 END),

        Total = SUM(CASE WHEN ctype_id IN (19683,19633,18334,13343,19645) THEN 1 ELSE 0 END)

   FROM dbo.cookies

  WHERE crdate >= @StartDate AND crdate < DATEADD(dd,1,@EndDate) --Includes all times on @EndDate

    AND ctype_id IN (29683,29633,98334,83343,29645,29682)

) --=== Return all values including the requested percentage calculations.

 SELECT Jar1, Jar2, Total,

        Jar1Percent = ISNULL((Jar1*100.0)/NULLIF(Total,0),0.0), --Prevents "Divide-by-Zero" errors

        Jar2Percent = ISNULL((Jar2*100.0)/NULLIF(Total,0),0.0)  --Prevents "Divide-by-Zero" errors

   FROM ctePreAggregate

;


As a bit of a side-bar, find a good "indented" standard for formatting your code. It'll make you look better in the eyes of your peers, it gives you a chance to review your own code for errors, and it'll make life a whole lot easier when you actually have to read the code to troubleshoot it or make modifications in the future. Don't forget to comment your code so you can remember what you did six months from now. My recommendation there is to comment every SELECT, INSERT, UPDATE, and DELETE even if they're in a sub-query so you don't actually have to read the code to figure out where you need to make a fix or a change. We reduced research time for many code modifications from an average of 2 days down to just a couple of minutes by using proper comments in the code. Documenting even obvious "trick code" (like I did with some of the "tail" comments) will also help the next person. It takes only a minute or two to properly comment the code as you write it and it saves days in troubleshooting time. Quite the ROI if I do say so myself.

Wow, isn’t that something!  He even went on to say how this all inspired him to do a blog – can’t wait to see that one.  I bet the OP really learned something, and will never make the mistake of using BETWEEN again. (Heck, he’ll probably never even use T-SQL again!)

I thought this was pretty amusing, and an example of crap code, being, well beaten the crap out of. I hope the OP has recovered.  He’s probably no longer doing SQL, and doing something that he doesn’t get yelled much for, like cleaning toilets off-hours.  Well, enough crap code for today.  This has been another fun edition of T-SQL Tuesday on Wednesday.  I guess that’s the way the cookie crumbles.

You can follow all the T-SQL Tuesday fun on Twitter using the hashtag #TSQL2SDay

Thanks, Adam Machanic for hosting the party! 

 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

You can follow me on Twitter |PearlKnows, and check out our new website for our product(s) and services at http://www.pearlknows.com

 

 

Comments

Posted by Jason Brimhall on 10 August 2011

For the record, he went back and added a smiley to the original post after that comment about yelling ;)

Posted by R L Reid on 12 August 2011

I have no idea what real world thing those literal numbers represent, but they make me nervous.  Maybe because I'm dealing with a few dozen old scripts that all have something like "and c_code in (12, 15, 65, 76)" - and now there's a 31 that needs to be part of the selection. (The c_codes are a subset of what's in a relation, but there's no in-database differentiation between this group and the rest).

I usually refer to this as "data-in-logic" - does anyone have another term for it?  

Posted by J on 12 August 2011

I think the term "data-in-logic" is a good term and should be conined usage to describe this scenario.  

Another term (not good for coining) is matching to this article (Crap Code).  Hard coding values in a proc, view, function etc is bad practice.  The other name for it, "Candidate for re-write" so that the "Hard Coded" values can be changed to a config or parameter table.  Such as {

instead of

FROM MyCrapCodeTable

WHERE MyVagueColumn IN (1,2,3,4,5,6,7,8,9)

One can write

FROM MyFormerCrapCodeTable oq

WHERE EXISTS (

   SELECT null

   FROM MyIncludeList iq

   WHERE iq.MyIncludeListValue = oq.MyVagueColumn

)

This makes adding values to include or removing values more simple since it isn't a code change but a table insert, update and/or delete.  The include list can include a description so that "VagueColumn" and its value is less vague.

Posted by John Dempsey on 15 August 2011

Maybe I'm seeing something that is here, but isn't the "Corrected" code excerpt posted above wrong?  I see two things there that don't seem to "add up".

1.  In the CTE calculation for Total isn't it missing the cytype_id = 29682 for Jar#2?  It seems to me that the total should include all types in order to perform the percentage calculation later.  Am I wrong in my analysis?

2.  If they are going to correct someone on the BETWEEN shouldn't they also keep the DATEADD function out of the WHERE clause also?  Perhaps, creating a variable to store the end date (@EndDate) and using that after the "<" operator.  It was my understanding a function on the WHERE clause makes a the filter Non-Sargable.  

Please correct me if I am wrong on my analysis of the corrected code.

Thanks.

Leave a Comment

Please register or log in to leave a comment.