SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

SQL Data Generator–Getting a value based on another column

This is a series on SQL Data Generator, covering some interesting scenarios I’ve run into. If you’ve never tried it, SQL Data Generator is a part of the SQL Toolbelt. Give it a try today with an evaluation today.

One of the things that people often want to do is generate data, but limit the generation to some data in another column. Here’s a good example. Let’s suppose I have some data that represents a balance in an account. That’s in a column we’ll call Balance. In another column, I have a status that is either OK or Overdrawn, depending on whether the Balance column is positive or negative.

If I perform a random generation on these columns, I’ll get some strange data. Sometimes the data below matches up, sometimes it doesn’t. I have positive numbers as Overdrawn and negatives as OK

2017-10-05 10_51_08-SQL Data Generator - New project _

Let’s fix that.

In Data Generator, I have a variety of choices for the generators. Let’s look at what I can do for the Status column.

2017-10-05 10_52_27-SQL Data Generator - New project _

Certainly there are RegEx and Python scripts, but there’s a nice “Cross Column” section with some examples. In this case, let’s look at the Age in Years generator. The definition is:

2017-10-05 10_53_11-SQL Data Generator - New project _

In this case, it’s a simple .NET date function and some math. I can do that. Most importantly, I can see the “Insert Column Name”, which lets me pick another column in my table.


The language of choice in SQL Data Generator is Python, specifically Iron Python. Outside of C# Datatime values, Python is needed. If you examine any of the other cross column items, you’ll see we need a main() function that returns something.

In this case, it’s a simple expression. I’ll use an If statement to check if the Balance is >= 0. Here’s a Python construct.

def main(config):
     if Balance >= 0:
         return “OK”
         return “Overdrawn”

Whitespace matters, as does indentation. If I put this in like so:

2017-10-05 10_57_57-SQL Data Generator - New project _

I’ll get this. Notice that the status is correct.

2017-10-05 10_58_05-SQL Data Generator - New project _

In a real project, you may have more complicated logic, or more likely, status values. One way to handle those is to use a Python function and return the appropriate values for your system.

You can build some complex and interesting data generation projects with SQL Data Generator. Give it a try today.

Filed under: Blog Tagged: Data Generator, Redgate, syndicated

The Voice of the DBA

Steve Jones is the editor of SQLServerCentral.com and visits a wide variety of data related topics in his daily editorial. Steve has spent years working as a DBA and general purpose Windows administrator, primarily working with SQL Server since it was ported from Sybase in 1990. You can follow Steve on Twitter at twitter.com/way0utwest


Leave a comment on the original post [voiceofthedba.com, opens in a new window]

Loading comments...