When I was in college, and even high school, all of my computer science classes required me to build algorithms. Often they were simple things, like implement a sort, or reverse a string, or shuffle a deck of cards. Those seemingly silly and trivial exercises, however, build the skills of pattern recognition and implementation in computer science. Sometimes I think we don’t do enough of that for people that are tackling computer careers these days.

I saw a post from someone that had an incrementing column, an identity, that impacted another field. Basically whenever the first column reached “10”, you wanted to add one to the second column.

Easy, right? I think so, and to show someone how they might create an update statement, or even see the pattern, I built a quick tally table.

SELECT Top 205 IDENTITY(INT,1,1) as N INTO Tally FROM master.dbo.syscolumns SC1, master.dbo.syscolumns SC2

From there, I then looked at the pattern. Every 10 items, I need to add one. That’s a pattern, and the way that pattern is easily discerned in math is with a modulo operation. To the rest of the world, that’s a remainder. If you look at the pattern of remainders of an increment divided by 10, it’s this:

n modulo

———– ———–

1 1

2 2

3 3

4 4

5 5

6 6

7 7

8 8

9 9

10 0

11 1

12 2

13 3

14 4

15 5

16 6

17 7

18 8

19 9

20 0

21 1

22 2

from this code:

SELECT Top 205 IDENTITY(INT,1,1) as N INTO Tally FROM master.dbo.syscolumns SC1, master.dbo.syscolumns SC2 SELECT n , n % 10 FROM Tally DROP TABLE tally

That mans that we can see each time there is a zero remainder, we want to perform an increment. So essentially if you detect an update, do a modulo, and get a zero, then you update the next column.

It gets a little more complicated if there can be multiple rows updated or added at once, but here is the overall code that essentially builds a table of numbers that increment for each 10 on the previous value.

SELECT Top 205 IDENTITY(INT,1,1) as N INTO Tally FROM master.dbo.syscolumns SC1, master.dbo.syscolumns SC2 DECLARE @b INT SELECT @b = 1 SELECT n , n % 10 , @b , n , @b + (1 * (n / 10)) 'col b' , CASE WHEN (n % 10) = 0 THEN 'add 1' ELSE '' END FROM Tally DROP TABLE tally

You end up with this:

n n col b

———– ———– ———– ———– ———– —–

1 1 1 1 1

2 2 1 2 1

3 3 1 3 1

4 4 1 4 1

5 5 1 5 1

6 6 1 6 1

7 7 1 7 1

8 8 1 8 1

9 9 1 9 1

10 0 1 10 2 add 1

11 1 1 11 2

12 2 1 12 2

13 3 1 13 2

14 4 1 14 2

15 5 1 15 2

16 6 1 16 2

17 7 1 17 2

18 8 1 18 2

19 9 1 19 2

20 0 1 20 3 add 1

21 1 1 21 3

22 2 1 22 3

23 3 1 23 3

If I had started at zero, you’d see a more traditional increment of 0 for column b to start with.

Filed under: Blog Tagged: syndicated, T-SQL

## Posted by Joe Celko on 23 April 2011

Two weeks ago I did a job interview at a local company. The first step was to log onto a website and take a test. The first question involved a product id that was made from the permutations of {0,1,2,3,4,5} with restrictions among the different positions in the string (first digit must be twice the second digit, etc). The questions were along the lines of "Can 3 ever be to right of 4?" and so forth.

I did it in SQL (of course!) and I noticed that my mindset had changed since my freshman algorithms class. In my youth, I would have worked each question by itself, applying the restrictions originally given. very deductive, proof oriented because I was a math major. We know that product ids have to start with (42****) or (21****), etc.

Today, I wrote a SELECT and generated all valid product ids (less than ten of them), then visually searched the listing. Brute force, data oriented. I even turned in my statement at the live interview.

They then puled out a PC and asked me to do "FizzBuzz", which is a famous interview question. Write a program in your favorite language that prints the numbers from 1 to 100. But for multiples of three print "Fizz" instead of the number and for the multiples of five print "Buzz". For numbers which are multiples of both three and five print "FizzBuzz". (www.codinghorror.com/.../why-cant-programmers-program.html).

Nobody does it in SQL. But it was easy with a Series table and a CASE expression. The hard part was guessing if a CTE or a materialized Series table was better, how to scale it or generalize it to other divisors.

In SQL, I was worried about the system as a whole -- a Series table can be shared among invocations of the procedure, while a CTE would re-create the same data over and over.

It is interesting to feel the mindset changes..

## Posted by Joe Celko on 23 April 2011

Two weeks ago I did a job interview at a local company. The first step was to log onto a website and take a test. The first question involved a product id that was made from the permutations of {0,1,2,3,4,5} with restrictions among the different positions in the string (first digit must be twice the second digit, etc). The questions were along the lines of "Can 3 ever be to right of 4?" and so forth.

I did it in SQL (of course!) and I noticed that my mindset had changed since my freshman algorithms class. In my youth, I would have worked each question by itself, applying the restrictions originally given. very deductive, proof oriented because I was a math major. We know that product ids have to start with (42****) or (21****), etc.

Today, I wrote a SELECT and generated all valid product ids (less than ten of them), then visually searched the listing. Brute force, data oriented. I even turned in my statement at the live interview.

They then puled out a PC and asked me to do "FizzBuzz", which is a famous interview question. Write a program in your favorite language that prints the numbers from 1 to 100. But for multiples of three print "Fizz" instead of the number and for the multiples of five print "Buzz". For numbers which are multiples of both three and five print "FizzBuzz". (www.codinghorror.com/.../why-cant-programmers-program.html).

Nobody does it in SQL. But it was easy with a Series table and a CASE expression. The hard part was guessing if a CTE or a materialized Series table was better, how to scale it or generalize it to other divisors.

In SQL, I was worried about the system as a whole -- a Series table can be shared among invocations of the procedure, while a CTE would re-create the same data over and over.

It is interesting to feel the mindset changes..