Generate a random number of children for each parent

I was asked an interesting question the other day.

Is it possible to get a different random number of rows back from each application of a cross apply?

The purpose is to create some random demo/testing information. This is one of those cases where an example may be needed for this to make sense. So, let’s say we have a group of students each of whom will have signed up for some classes. In order to create some test data, we want each student to be assigned to a different random number of classes.

CREATE TABLE Students (
	StudentId INT NOT NULL IDENTITY(1,1),
	FirstName varchar(50),
	LastName varchar(50)
	);

CREATE TABLE Classes (
	ClassId INT NOT NULL IDENTITY(1,1),
	Name varchar(50)
	);

INSERT INTO Students VALUES
	('Bob','Smith'),
	('Joe','Jones'),
	('Chris','Cross'),
	('Amy','Fisher'),
	('Barbara','Marshal');

INSERT INTO Classes VALUES
	('Math 101'), ('English 101'), ('Spanish 101'),
	('Theater 101'), ('Music 101'), ('Robotics 101'),
	('History 101'), ('Biology 101'), ('Programming 101'),
	('Math 201'), ('English 201'), ('Spanish 201'),
	('Theater 201'), ('Music 201'), ('Robotics 201'),
	('History 201'), ('Biology 201'), ('Programming 201');

So what I want is a random selection of classes for each student: not just a random set of values, but a random number of them. To start with, I’ll be using TOP and (ABS(CHECKSUM(NEWID()) % 4) + 1) to generate a random number of rows, from 1 to 4. I’m also using CROSS APPLY because that will call the subquery once for each row returned by the outer query. At least that’s the way I understand it.
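
As a quick check of the row-count expression on its own, here is a minimal sketch against the Students table created above (the alias RandomRowCount is just an illustrative name). CHECKSUM(NEWID()) yields a pseudo-random INT, % 4 reduces it to the range -3 to 3, ABS folds that to 0 to 3, and + 1 shifts it to 1 to 4. Taking the modulo before ABS also avoids the overflow error ABS raises on the smallest INT value.

```sql
-- NEWID() in a select list is evaluated per row, so each student
-- gets its own value between 1 and 4 (values can repeat by chance).
SELECT StudentId,
       FirstName,
       ABS(CHECKSUM(NEWID()) % 4) + 1 AS RandomRowCount  -- illustrative alias
FROM Students;
```

The full query then puts that same expression inside TOP: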

SELECT Students.FirstName, Students.LastName, Classes.Name
FROM Students
CROSS APPLY (SELECT TOP (ABS(CHECKSUM(NEWID()) % 4) + 1) * 
			FROM Classes) Classes
ORDER BY Students.FirstName, Students.LastName, Classes.Name;

So far so good. Unfortunately, this way everyone ended up with Math 101: even though no order is specified, the rows are still most likely to come back in the order they were inserted. So let’s try adding an ORDER BY NEWID() to get a random order of the rows as well.

SELECT Students.FirstName, Students.LastName, Classes.Name
FROM Students
CROSS APPLY (SELECT TOP (ABS(CHECKSUM(NEWID()) % 4) + 1) * 
			FROM Classes ORDER BY NEWID()) Classes
ORDER BY Students.FirstName, Students.LastName, Classes.Name;

Well, better, but now every student is getting the same number of rows and the same classes. At least it’s a different set of classes each time I run the query. Offhand, my guess is that it’s because I’ve got NEWID() in the subquery twice now, but I can’t be sure. What I did notice, though, is that if I make the subquery correlated (use some value from the outer query), that fixes it.

SELECT Students.FirstName, Students.LastName, Classes.Name
FROM Students
CROSS APPLY (SELECT TOP (ABS(CHECKSUM(NEWID()) % 4) + 1) * 
			FROM Classes ORDER BY NEWID(), Students.FirstName) Classes
ORDER BY Students.FirstName, Students.LastName, Classes.Name;

And now I get a random set of classes for each student. Probably not something I’ll have to do very often, but it did make for an interesting exercise.
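
For anyone who would rather not depend on how the optimizer treats the APPLY, here is a minimal sketch of an alternative. It is not the approach used above, and the names #StudentClassCounts, ClassCount and RowNum are hypothetical. The idea, as I understand it, is that an uncorrelated subquery leaves the optimizer free to evaluate the inner side once and reuse it. Storing one random count per student in a temp table first, and then numbering the classes randomly per student, sidesteps that question.

```sql
-- Step 1: pin one random class count (1 to 4) per student in a temp table,
-- so the value cannot be re-evaluated later in the plan.
SELECT StudentId, FirstName, LastName,
       ABS(CHECKSUM(NEWID()) % 4) + 1 AS ClassCount
INTO #StudentClassCounts
FROM Students;

-- Step 2: pair every student with every class, number the classes in a
-- random order within each student, and keep the first ClassCount of them.
SELECT r.FirstName, r.LastName, r.Name
FROM (
    SELECT s.FirstName, s.LastName, s.ClassCount, c.Name,
           ROW_NUMBER() OVER (PARTITION BY s.StudentId
                              ORDER BY NEWID()) AS RowNum
    FROM #StudentClassCounts AS s
    CROSS JOIN Classes AS c
) AS r
WHERE r.RowNum <= r.ClassCount
ORDER BY r.FirstName, r.LastName, r.Name;

-- Clean up the temp table.
DROP TABLE #StudentClassCounts;
```

The CROSS JOIN builds every student/class pair before filtering, which is fine for 5 students and 18 classes but would get expensive on large tables.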

SQLStudies

My name is Kenneth Fisher and I am Senior DBA for a large (multi-national) insurance company. I have been working with databases for over 20 years, starting with Clarion and FoxPro. I’ve been working with SQL Server for 12 years but have only really started “studying” the subject for the last 3. I don’t have any real “specialities” but I enjoy troubleshooting and teaching. Thus far I’ve earned my MCITP Database Administrator 2008, MCTS Database Administrator 2005, and MCTS Database Developer 2008. I’m currently studying for my MCITP Database Developer 2008 and should start in on the 2012 exams next year. My blog is at www.sqlstudies.com.
