SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Building Test Data


Building Test Data

Author
Message
Steve Jones
Steve Jones
SSC Guru
SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)

Group: Administrators
Points: 540228 Visits: 20703
Comments posted to this topic are about the item Building Test Data

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)

Group: General Forum Members
Points: 790213 Visits: 45936

A couple of decades ago, long before the words "Test Driven Development" ever parted anyone's lips, we called such reliable and safe test data a "Gold Set". And, yes, if the space that they occupied could be weighed, they were worth their weight in Gold.



--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Andy Robertson
Andy Robertson
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2321 Visits: 450

Totally agree with Jeff. A good set of test data is priceless. It's also expensive to set it up. I've set up realistic test data for demo websites to the point that they want to see aggregate data improve over time. You know, just so that the client can see that things will definitely get better!
To make test data realistic is very difficult and time consuming and I've always used real data as the basis for the test data.
We set up a very useful function on our development website where the internal tester can see the external mail addresses in the notification emails. The dev web application effectively only uses our domain email addresses and adds the external email addresses to the end of the email for reference. When testing the internal client can see which external client would have received the emails so they can be sure the right people will be notified on the live site.


dsor
dsor
SSC-Addicted
SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)

Group: General Forum Members
Points: 406 Visits: 655
I think a key component in making this feasible is to stop designing systems to achieve a result and instead have them model a process accurately. It would mean we'd have specs for which data can exist and when/where it exists. Generating test data from these specs would also be relatively easy, since you're not trying to guess what the representative set in production is and instead have a recipe for creating one. Detailed data contracts for external communication would be a prerequisite for this though.
ben.kentzer
ben.kentzer
Grasshopper
Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)

Group: General Forum Members
Points: 10 Visits: 51
We use a subset of real data as a basis to ensure that linkages are all going to work, but it's then all anonymised and reviewed by a second person to ensure that it really is going to be okay. This way, the system will work in the same way as live (rejecting records that are invalid or orphaned for example) and loading data that has the full referential integrity as expected. Due to the fact that we're testing batch loads into data warehouses most of the time, we have rafts of test data (separate from the development samples that we build for the developers) with a myriad of scenarios that will ensure that we cover most bases - new scenarios are added all the time. We build processes to build the test data so that it can be regenerated on demand. We're not talking huge volumes of data here, as this is essentially functional testing at this stage.

All of our emails are sent via a separate SQL / SSIS application (we've had issues with email servers going offline just as we were about to send out a "success" email which then caused the job to keel over). This means that the email requests are dropped into a SQL table, and then batch sent later. The test version of the email system doesn't actually send emails until you ask it to (the live version is a 24x7 job, test isn't) and so we can review what should be sent before it is. The test system doesn't have authority to relay outside the organisation either as a separate safety feature. The other safety feature which helps prevent us sending out emails to the wrong people is that all email distribution lists are managed within the email app, rather than in the SSIS / SQL process. This also means that we can easily change the recipients of emails when job changes happen without having to re-deploy the system. The system just requests an email to be sent to "Client1" of type "File Loaded Successfully" and the distribution list is retrieved from the email system at send time. Given that we have a separate test system, we have a separate test email table so we have different email lists pointing to the same email type.

Ben
kerry_hood
kerry_hood
SSC-Addicted
SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)SSC-Addicted (406 reputation)

Group: General Forum Members
Points: 406 Visits: 184
Does anyone outside safety-critical industries (eg aerospace) actually work with proper test sets? Perhaps it's just my (30-year) experience, but I'm yet to work anywhere which has proper test data sets (that are not yesterdays/last weeks' backup restored), and that's across insurance, pensions, oil exploration, finance,merchant banking, and logistic sectors... everywhere wants their software yesterday and is not prepared to put in the time/money/people to test it properly.
Steve Jones
Steve Jones
SSC Guru
SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)

Group: Administrators
Points: 540228 Visits: 20703
Jeff Moden - Wednesday, June 13, 2018 9:26 PM

A couple of decades ago, long before the words "Test Driven Development" ever parted anyone's lips, we called such reliable and safe test data a "Gold Set". And, yes, if the space that they occupied could be weighed, they were worth their weight in Gold.



Yes, they are.

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
Steve Jones
Steve Jones
SSC Guru
SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)SSC Guru (540K reputation)

Group: Administrators
Points: 540228 Visits: 20703
ben.kentzer - Thursday, June 14, 2018 3:36 AM
...
- new scenarios are added all the time.

We build processes to build the test data so that it can be regenerated on demand. We're not talking huge volumes of data here, as this is essentially functional testing at this stage.

The key parts of this


Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
Eric M Russell
Eric M Russell
SSC Guru
SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)

Group: General Forum Members
Points: 96484 Visits: 13895
Frankly, when I want to unit test a stored procedure in an environment where the data, system specs, and workload as as close as possible to actual production, there have been (some) occasions where I tested it in... production. Well it's sort of in production. OK, I don't unit test this way with procedures which modify data, and I'm not talking about deploying a development version of a procedure on top of an existing production procedure. I'm simply talking about deploying a development version of a stored procedure to production in a special schema called UnitTest, and then intentionally execute it during normal business hours to confirm it's performing as expected and doesn't cause blocking issues. I'll also run the unit test under the context of an account with permissions limited only to executing procedures within the UnitTest schema. I'm not saying (when) or (where) I've done this, only that I've done it in the past.

This is more secure than copying data from production to a development or test environment, and performance tuning a stored procedure in an isolated environment is not a true test of how it will perform in production.


"The universe is complicated and for the most part beyond your control, but your life is only as complicated as you choose it to be."
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)SSC Guru (790K reputation)

Group: General Forum Members
Points: 790213 Visits: 45936
kerry_hood - Thursday, June 14, 2018 3:58 AM
Does anyone outside safety-critical industries (eg aerospace) actually work with proper test sets? Perhaps it's just my (30-year) experience, but I'm yet to work anywhere which has proper test data sets (that are not yesterdays/last weeks' backup restored), and that's across insurance, pensions, oil exploration, finance,merchant banking, and logistic sectors... everywhere wants their software yesterday and is not prepared to put in the time/money/people to test it properly.


We won't deploy to production without having gone through the full gauntlet including proper test data sets that have also gone through a strong validation gauntlet. But, I do agree... according to a lot of the posts on this very forum and others, it seems that there are a whole lot of people that view testing (especially real QA/UAT) as some sort of annoyance ("expert" arrogance) on a daily basis. Really spooky out there.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum








































































































































































SQLServerCentral


Search