SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

SQL Data Generator–Getting a value based on another column

This is a series on SQL Data Generator, covering some interesting scenarios I’ve run into. If you’ve never tried it, SQL Data Generator is a part of the SQL Toolbelt. Give it a try today with an evaluation today.

One of the things that people often want to do is generate data, but limit the generation to some data in another column. Here’s a good example. Let’s suppose I have some data that represents a balance in an account. That’s in a column we’ll call Balance. In another column, I have a status that is either OK or Overdrawn, depending on whether the Balance column is positive or negative.

If I perform a random generation on these columns, I’ll get some strange data. Sometimes the data below matches up, sometimes it doesn’t. I have positive numbers as Overdrawn and negatives as OK

2017-10-05 10_51_08-SQL Data Generator - New project _

Let’s fix that.

In Data Generator, I have a variety of choices for the generators. Let’s look at what I can do for the Status column.

2017-10-05 10_52_27-SQL Data Generator - New project _

Certainly there are RegEx and Python scripts, but there’s a nice “Cross Column” section with some examples. In this case, let’s look at the Age in Years generator. The definition is:

2017-10-05 10_53_11-SQL Data Generator - New project _

In this case, it’s a simple .NET date function and some math. I can do that. Most importantly, I can see the “Insert Column Name”, which lets me pick another column in my table.

Python

The language of choice in SQL Data Generator is Python, specifically Iron Python. Outside of C# Datatime values, Python is needed. If you examine any of the other cross column items, you’ll see we need a main() function that returns something.

In this case, it’s a simple expression. I’ll use an If statement to check if the Balance is >= 0. Here’s a Python construct.

def main(config):
     if Balance >= 0:
         return “OK”
     else:
         return “Overdrawn”

Whitespace matters, as does indentation. If I put this in like so:

2017-10-05 10_57_57-SQL Data Generator - New project _

I’ll get this. Notice that the status is correct.

2017-10-05 10_58_05-SQL Data Generator - New project _

In a real project, you may have more complicated logic, or more likely, status values. One way to handle those is to use a Python function and return the appropriate values for your system.

You can build some complex and interesting data generation projects with SQL Data Generator. Give it a try today.


Filed under: Blog Tagged: Data Generator, Redgate, syndicated

The Voice of the DBA

Steve Jones is the editor of SQLServerCentral.com and visits a wide variety of data related topics in his daily editorial. Steve has spent years working as a DBA and general purpose Windows administrator, primarily working with SQL Server since it was ported from Sybase in 1990. You can follow Steve on Twitter at twitter.com/way0utwest

Comments

Leave a comment on the original post [voiceofthedba.com, opens in a new window]

Loading comments...