Blog Post

Generate a random number of children for each parent

,

I was asked an interesting question the other day.

Is it possible to get a different random number of rows back from each application of a cross apply?

The purpose is to create some random demo/testing information. This is one of those cases where an example may be needed for this to make sense. So, let’s say we have a group of students each of whom will have signed up for some classes. In order to create some test data, we want each student to be assigned to a different random number of classes.

CREATE TABLE Students (
StudentId INT NOT NULL IDENTITY(1,1),
FirstName varchar(50),
LastName varchar(50)
);
CREATE TABLE Classes (
ClassId INT NOT NULL IDENTITY(1,1),
Name varchar(50)
);
INSERT INTO Students VALUES
('Bob','Smith'),
('Joe','Jones'),
('Chris','Cross'),
('Amy','Fisher'),
('Barbara','Marshal')
INSERT INTO Classes VALUES
('Math 101'), ('English 101'), ('Spanish 101'),
('Theater 101'), ('Music 101'), ('Robotics 101'),
('History 101'), ('Biology 101'), ('Programming 101'),
('Math 201'), ('English 201'), ('Spanish 201'),
('Theater 201'), ('Music 201'), ('Robotics 201'),
('History 201'), ('Biology 201'), ('Programming 201');

So what I want to get, is a random selection of classes for each student. And not just a random set of values, but a random number of them. To start with I’ll be using TOP and (ABS(CHECKSUM(NEWID()) % 5)) to generate a random number number of rows. I’m also using CROSS APPLY because that will call the subquery once for each row returned by the outer query. At least that’s the way I understand it.

SELECT Students.FirstName, Students.LastName, Classes.Name
FROM Students
CROSS APPLY (SELECT TOP (ABS(CHECKSUM(NEWID()) % 4) + 1) * 
FROM Classes) Classes
ORDER BY Students.FirstName, Students.LastName, Classes.Name;

So far so good. Unfortunately this way everyone ended up with Math 101 since, even though there isn’t an order specified it’s still most likely to pull in the order the rows were inserted. So let’s try throwing a ORDER BY NEWID() to get a random order of the rows as well.

SELECT Students.FirstName, Students.LastName, Classes.Name
FROM Students
CROSS APPLY (SELECT TOP (ABS(CHECKSUM(NEWID()) % 4) + 1) * 
FROM Classes ORDER BY NEWID()) Classes
ORDER BY Students.FirstName, Students.LastName, Classes.Name;

Well, better, but now every student is getting the same number of rows and the same classes. But at least it’s a different set of classes each time I run the query. Off guess it’s because I’ve got NEWID() in the subquery twice now, but I can’t be sure. What I did notice though is that if I make the subquery correlated (use some value from the outer query) it fixed it.

SELECT Students.FirstName, Students.LastName, Classes.Name
FROM Students
CROSS APPLY (SELECT TOP (ABS(CHECKSUM(NEWID()) % 4) + 1) * 
FROM Classes ORDER BY NEWID(), Students.FirstName) Classes
ORDER BY Students.FirstName, Students.LastName, Classes.Name;

And now I get a random set of classes for each student. Probably not something I’ll have to do very often but it did make for an interesting exercise.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating