Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

Hanging WHILE loop Expand / Collapse
Author
Message
Posted Sunday, May 11, 2014 11:32 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:34 PM
Points: 36,978, Visits: 31,498
tombiagioni1983 (5/7/2014)
Thanks everyone for the quick responses and the helpful attitude! I come from a C# programming background where iterative loops aren't as costly, So i'll definitely be reading that article! I'll add in the additional BEGIN/END statements and report back.

Thanks again everyone!


Iterative loops in T-SQL aren't costly. Just the explicit ones are.

One of the things that most books and courses on SQL Server and, in particular, the T-SQL Language fail miserably in is how to loop. In courses like those for C#, the first thing they do is teach you how to setup the working environment. The next thing they do is teach you how to print "Hello World" to demonstrate that your working environment is setup. After teaching you some syntax, functions, and a couple of other things, they teach you how to write a loop that will count from, say, 1 to 100 because looping is the backbone of every program.

T-SQL isn't any different except you don't have to write the loops. For example, instead of opening a file, checking to see if any rows exist, reading a row, doing something with it, and looping back to see if another row exists, all you have to do in T-SQL is write a SELECT statement. Behind the scenes, SELECT statements aren't a whole lot more than a machine-language-level loop that works much like I just described. Thanks to a fellow by the name of R.Barry Young, I call those hidden, very high performance loops behind every SELECT a "Pseudo Cursor" and, yes, they are controllable.

If you look at Luis' code, it uses such a Pseudo Cursor to replace your explicit cursor (While Loop, in this case). It's MUCH faster than the While Loop because it only has to make 1 execution plan and then SQL Server loops behind the scenes at machine language speeds. With a While Loop, it has to make or reuse an execution plan for each and every iteration even if it decided that there's an execution plan that it can reuse. The While loop also has to take separate locks for every iteration instead of locking what it needs all at once like the SELECT does. All of that takes time.

My best suggestion to make the paradigm shift from writing procedural code in a language like C# is in my signature line below. First, make the realization the T-SQL will build the loops it needs for you and second, like the tagline says, "Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column" and let T-SQL figure out how to express what you want done to each row.

The really cool part about all of this is that you don't actually have to use anything from the table to use the table to loop. All you need is the "presence of rows" to get the job done. You'll find an example of this in the Tally Table article that was previously cited above by rhythmk but, if you have the time, try the following out. It'll help you make the paradigm shift I'm talking about. If you can make that shift, your life will be golden not only for counting but for many other areas of T-SQL, as well.

What does the following statement do?

 SELECT *
FROM sys.all_columns
;


Other than offending most people's sensibilities about the use of "*" returning to much data (too many columns), most people would say that the code returns all columns of the table and they'd be partially correct. The other thing that it does is, because there is no limit expressed in the code, it also returns all of the rows. If you look in the lower right corner of the screen, you'll see the number of rows the query returned and, for this table, I absolutely guarantee that number will always be more than 4,000. Write the number of rows returned down. We'll need it again.

Next experiment. What does the following code do? And... what doesn't it do?

 SELECT 1
FROM sys.all_columns
;


The obvious thing that it does is that it returns a 1 for every row in the table. You can easily verify that by comparing the number you wrote down in the first experiment with the number of rows this query returned.

The question is, what didn't it do?

The answer is that it didn't use ANY data from the table. It only used the "presence of rows" from the table. To translate that, we used the Pseudo Cursor behind the scenes of the SELECT as a loop to return a certain number of ones, that number being the count of all the rows in the table.

Now, let's do something a bit more practical. Let's count from 1 to however many rows there are in the table. Don't think of it as creating X rows with an incremental count. Think of it as creating a column that has an incremental count up to a limit. Let T-SQL worry about how to create the rows.

 SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns
;


On my little 2005-based test machine, that returns a column of numbers number from 1 to 4,168 even on a brand new, totally empty database. Your number may differ a bit but it will always be > 4,000.

That does bring up the next question. Let's just say that's the largest table in the database and we want to count to, say, 20,000. In C#, you would simply change the max limit of your While loop and you'd be done. In T-SQL, we actually have to provide a larger "Presence of Rows". In C#, if the largest number you could count to was 4,000, you could do it by making a nested loop where each iteration of an outer loop would cause the inner loop to iterate the 4,000 times it's supposed to.

In T-SQL, we can build such a nested loop simply by doing a CROSS JOIN. To keep us from building the 16 Million rows that a nested loop of 4,000 in each loop would deliver, we have to provide a "Stop" value just like you would have to do in C# on the inner loop. Instead of having to worry how to do that in a loop, we just tell T-SQL to stop when it has 20,000 values in the column. We use TOP for this and let T-SQL figure out the rest. Rather than defining the content of each row and evaluating each row for the max value we want, we simply define what we want in the column (as we did before) and when to quit. Like this...

 SELECT TOP 20000
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;


Taking the next step, think of the CROSS APPLY that Luis used as a "For Each Row". His use of CROSS APPLY replaces the nested SELECT in your WHILE loop so that for each of the 1,000 items selected from the Chest table, it'll return X number of rows from the Items table depending on the random number generator that you used (ABS(CHECKSUM(NEWID()))%range+offset). Behind the scenes, it does exactly the same looping that you did in your WHILE loop but in a more "compiled" fashion rather than an "interpretive" fashion.

Again, the key to success for a great deal of what can be done in T-SQL is to understand that T-SQL loops behind the scenes in a very effective manner and your goal is simply to let T-SQL do that for you instead of overriding what it's good at. Tell T-SQL what you want it to do and not how to do it. In other words (again), "Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
"


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1569639
Posted Sunday, May 11, 2014 11:37 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:34 PM
Points: 36,978, Visits: 31,498
Luis Cazares (5/11/2014)
Your query looks just like mine but using a while loop instead of a single operation. Why would you do that.


Because and like he said, he's a C# procedural programmer just making his first move from a procedural language to a declarative language and hasn't yet made the necessary paradigm shift. Based on how he formatted the code and the fact that he understands the use of NEWID() in conjuction with modulus to produce a constrained random integer, he'll get it. It's just going to take some time like the first time most of us started working with T-SQL.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1569643
Posted Sunday, May 11, 2014 7:32 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, May 18, 2014 3:51 PM
Points: 4, Visits: 13
Thanks Jeff for the detailed insight on the necessary shift in programmatic perspective, I truly appreciate the help!

Right now I'm taking the training courses offered through CBT Nuggets on 70-461 and I can see that there is a nugget entirely devoted to row-based transactions vs. set-based transactions, so I'm confident I'll shortly have a better understanding of how a T-SQL programmer needs to think relative to relative to relational databases (and set theory, at a higher level).

For a better understanding of what I've been trying to accomplish, I'll post some results below:

Re: The ChestItems table, It's purpose is to be a one-to-many join table between Chests and Items. A chest can have many Items, but an Item can have one Chest:
ChestItemsID ChestID     ItemID
------------ ----------- -----------
1 1 60002
2 1 60014
3 1 60015
4 1 60021
5 1 60031
6 1 60038
7 2 60004
8 2 60007
9 2 60012
10 2 60019


The purpose of the WHILE loop around the INSERT was to ensure that I was putting x Items into the same chest. If there is a better method of implementation, please share it with me. I am humbled by the help I've already received and appreciate any more that comes my way.
Post #1569662
Posted Sunday, May 11, 2014 8:03 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:34 PM
Points: 36,978, Visits: 31,498
tombiagioni1983 (5/11/2014)
If there is a better method of implementation, please share it with me.


There's is indeed. See Luis' code, which he previously posted. Remember that CROSS APPLY is akin to "For Each Row".


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1569667
Posted Sunday, May 11, 2014 8:17 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 2:51 PM
Points: 1,703, Visits: 4,490
Here is a quick "translation" to play around with

USE tempdb
Go

DECLARE @CHEST_COUNT INT = 1000;

/* Seed for the Inline Tally */
;WITH TN(N) AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1)) AS X(N))

SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ChestItemsID
,CHEST.N AS ChestID
,ITEM.N AS ItemID
FROM
(
SELECT TOP (@CHEST_COUNT) ROW_NUMBER() OVER
(ORDER BY (SELECT NULL)) AS N FROM TN T1, TN T2
, TN T3, TN T4, TN T5, TN T6, TN T7, TN T8, TN T9
) AS CHEST
CROSS APPLY
(
SELECT TOP ((ABS(CHECKSUM(NEWID())) % (CHEST.N * 2)) + 1) ROW_NUMBER() OVER
(ORDER BY (SELECT NULL)) + 6000 AS N FROM TN T1, TN T2
, TN T3, TN T4, TN T5, TN T6, TN T7, TN T8, TN T9
) AS ITEM;

Result example
ChestItemsID  ChestID   ItemID
------------- --------- -------
1 1 6001
2 2 6001
3 2 6002
4 2 6003
5 2 6004
6 3 6001
7 4 6001
8 4 6002
9 4 6003
10 4 6004
11 4 6005
12 4 6006
13 4 6007
Post #1569671
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse