query puzzle

Question

query puzzle

David Walker-409102

SSC Veteran

Points: 234
More actions
April 14, 2017 at 3:02 pm

#324365

Hello - I have an existing SQL database that keeps track of student information. We use a field called STU.NID to store a unique Network ID number (a 5 digit, sequential number).
I’m trying to write a query that will determine if a student is missing a value for STU.NID. If so, the student’s record should be updated with the next NID number in the sequence.
At this point the query throws an error about the stu.nid in line three not being bound. Is that because WHILE only works with variables? Once I determine what the next NID is, how to update just one student with that one NID then check and go through the loop again?

I’ve read that loops take up a lot of resources and should generally be avoided in SQL. Can anyone suggest a non-loop solution to this puzzle?
Thank you in advance for your feedback
David
DECLARE @nextNID smallint
select NID from STU
while stu.nid = ' '
BEGIN
set @nextNID = ((SELECT MAX(NID) FROM STU) + 1)
select top 1 STU.ID, STU.NID
              FROM STU
                     WHERE
                           STU.NID = ' ' and
                           stu.sc in (11,12,14,15,16,17,27,28) and
                           stu.tg = ' '
update STU
              set NID = @nextNID
END

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Alan Burstein SSC Guru Points: 61136 More actions · Answer 1

Yes, loops bad. Set-based good.

First for some sample data:
IF OBJECT_ID('tempdb..#stu') IS NOT NULL DROP TABLE #stu; CREATE TABLE #stu(nid int NULL, student_name varchar(100) NOT NULL); INSERT #stu VALUES (1000, 'billy'), (1109, 'sue'), (NULL, 'fred'), (NULL, 'pete'); GO

Now we have:
SELECT * FROM #stu;
nid student_name ----------- ------------ 1000 billy 1109 sue NULL fred NULL pete

Now for a loop-free solution:
WITH maxid AS (SELECT maxid = MAX(nid) FROM #stu), -- get the highest nid value orphins AS ( SELECT #stu.*, new_id = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))+maxid -- starting at the highest nid, add 1 FROM #stu CROSS JOIN maxid WHERE nid IS NULL ) UPDATE orphins SET nid = new_id; -- set the nid to be the new_id value

Now we have:
nid student_name ----------- ------------- 1000 billy 1109 sue 1110 fred 1111 pete

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

David Walker-409102 SSC Veteran Points: 234 More actions · Answer 2

Thank you for your reply, Alan. I modified your query to populate the temp table with actual student ID numbers. When I run, it does create the temp table, and it does include all of the student numbers but when it comes to the second part it doesn't populate any NIDs. Then I realized that in your example you seeded the NID field with a couple of demo NIDs. So, for the lowest ID# I set the NID = to 11111 and tried to run the second part (after the table creation) but it doesn't do anything, - no errors, just 0 rows affected. Any suggestions?
thanks again 🙂
David

heres the updated script...

IF OBJECT_ID('tempdb..#stu1') IS NOT NULL DROP TABLE #stu1 CREATE TABLE #stu1 (NID varchar(255) NULL, ID int NOT NULL) INSERT #stu1 (NID, ID) select NID, ID FROM STU GO

WITH maxid AS (SELECT maxid = MAX(nid) FROM #stu1), -- get the highest nid value orphins AS ( SELECT #stu1.*, new_id = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))+maxid -- starting at the highest nid, add 1

FROM #stu1 CROSS JOIN maxid WHERE nid IS NULL ) UPDATE orphins SET nid = new_id; -- set the nid to be the new_id value

--*************************** update #stu1 set NID = 11111 where ID = 1101700

select top 3 * from #stu1

drew.allen SSC Guru Points: 76944 More actions · Answer 3

Alan.B - Friday, April 14, 2017 3:17 PM
Yes, loops bad. Set-based good.
First for some sample data:
IF OBJECT_ID('tempdb..#stu') IS NOT NULL DROP TABLE #stu; CREATE TABLE #stu(nid int NULL, student_name varchar(100) NOT NULL); INSERT #stu VALUES (1000, 'billy'), (1109, 'sue'), (NULL, 'fred'), (NULL, 'pete'); GO
Now we have:
SELECT * FROM #stu;
nid student_name ----------- ------------ 1000 billy 1109 sue NULL fred NULL pete
Now for a loop-free solution:
WITH maxid AS (SELECT maxid = MAX(nid) FROM #stu), -- get the highest nid value orphins AS ( SELECT #stu.*, new_id = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))+maxid -- starting at the highest nid, add 1 FROM #stu CROSS JOIN maxid WHERE nid IS NULL ) UPDATE orphins SET nid = new_id; -- set the nid to be the new_id value
Now we have:
nid student_name ----------- ------------- 1000 billy 1109 sue 1110 fred 1111 pete

This approach requires reading the table twice: once to find the max nid and once to find the null records. Here is an approach that only requires reading the table once.

; WITH orphans AS ( SELECT *, MAX(nid) OVER() + ROW_NUMBER() OVER(ORDER BY nid) AS new_nid FROM #stu ) UPDATE o SET nid = new_nid FROM orphans o WHERE nid IS NULL ;

I should mention that this working correctly depends on the fact that NULL values are sorted before non-NULL values.

I also corrected the spelling of orphans.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

David Walker-409102 SSC Veteran Points: 234 More actions · Answer 4

thanks again - i see the problem, in our database the NID column does not allow NULLS. So I changed

WHERE nid IS NULL
to
WHERE nid = ' '
and it works great!
thank you for your help 🙂

David

jcelko212 32090 SSCrazy Eights Points: 9294 More actions · Answer 5

David Walker-409102 - Friday, April 14, 2017 3:02 PM

>> I have an existing SQL database that keeps track of student information. We use a field .sic. called STU.NID to store a unique network_id number (a 5 digit, sequential number). <<

Please look up what the term "field" means in SQL. You've confused with a column. Why are you still writing in all uppercase and using short names? I used to do that back in the 1960's when we were Fortran programmer and we had limited space on punchcards. You also don't know a row is not anything like a record! That usually covered the first week of any decent SQL class. I also see you expect us to write the DDL that you fail to post; this is not just counterproductive, it's very very rude.

>> Iâ€™m trying to write a query that will determine if a student is missing a value for STU.NID. If so, the studentâ€™s record [sic] should be updated with the next NID number in the sequence. <<

Since you didn't post any DDL, we don't know for using the CREATE SEQUENCE statement in your code. Again, sequence has a special meaning in SQL.

What is really worse is that you don't understand that, SQL is a declarative language. Declarative languages do not use while loops; that was your old COBOL and FORTRAN stuff which is basically what you're trying to write.

CREATE SEQUENCE network_id_seq
AS SMALLINT
START WITH 00000
INCREMENT BY 1
MINVALUE 00000
MAXVALUE 99999;

CREATE TABLE STU -- what is a STU?
(..
network_id SMALLINT NOT NULL
CHECK (network_id BETWEEN 00000 AND 99999)
DEFAULT NEXT VALUE FOR network_id_seq,
..

Having said all of this, the use of numeric values for identifiers is wrong. Get any book on basic data modeling. We do not do math on identifiers; what is the square root of your credit card number? With a little work. You can cast the small integer into a string of five digits (these things are called tag numbers in data modeling) and use it as default in your STU table. By the way, what the heck is an STU? I'm assuming that it's a meaningful name in your trade, but it really doesn't mean much to anybody else. Professionals write code that can be maintained by newcomers who just to the job; amateurs write code that is cryptic and gives them job security 🙂

Please post DDL and follow ANSI/ISO standards when asking for help.

drew.allen SSC Guru Points: 76944 More actions · Answer 6

David Walker-409102 - Monday, April 17, 2017 4:32 PM
thanks again - i see the problem, in our database the NID column does not allow NULLS. So I changed
WHERE nid IS NULL
to
WHERE nid = ' '
and it works great!
thank you for your help 🙂
David

Is your nid column (N)(VAR)CHAR? This would cause implicit conversion when adding the row number and another when converting back to a string. This could cause a conversion error if your nid contains any value that is non-numeric. You could also run into problems where the nid value is unique, but the converted value is not, e.g., '1', '01', and '001'.

Drew

PS: Ignore Joe on this. He's overly concerned with minutiae.

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Mr. Brian Gale SSC-Insane Points: 24689 More actions · Answer 7

jcelko212 32090 - Tuesday, April 18, 2017 10:36 AM