One of the debates I've seen over the last few months is about test data in development environments. As I've been preparing for and learning more about the GDPR, it seems that many companies are concerned about holding sensitive data in their development systems. I think it's a valid concern, and I've often had to deal with this issue in the past, before any regulation impacted my work.
In one of my early jobs, we stored customer email addresses in a table, and our application had an email feature. Needless to say, we needed to test it, though we obviously didn't want to send emails to real customers while testing some feature. I've done that, sent those emails accidentally, and it usually resulted in a complaint and some scolding of the development staff. As a result, I learned to ensure that anytime we restored production to our QA system, we ran a script that would either change all email addresses to invalid values or reset them to something we could use in a test system. In some cases, we'd reset them all to a specific address that we could check to see whether the emails were actually sent.
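To make the idea concrete, here's a minimal sketch of the kind of scrub logic such a script might apply (the function name, the `qa+` prefix, and the test domain are all invented for this example). Hashing the original address keeps each customer's value unique, so joins and uniqueness constraints still behave, while guaranteeing mail can never reach a real inbox:

```python
import hashlib

def scrub_email(original: str, test_domain: str = "example.com") -> str:
    """Replace a real customer email with a deterministic, safe test address.

    The same input always produces the same output, so the scrub is
    repeatable across restores; different inputs produce different
    outputs, so uniqueness is preserved.
    """
    digest = hashlib.sha256(original.lower().encode("utf-8")).hexdigest()[:12]
    return f"qa+{digest}@{test_domain}"

# Two customers map to two distinct safe addresses on the test domain.
a = scrub_email("alice@realcustomer.com")
b = scrub_email("bob@realcustomer.com")
```

In a SQL Server restore job, the equivalent would typically be a single UPDATE over the customer table run immediately after the restore completes.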
In talking to many people, I've found they often don't build test data for development systems because the data isn't seen as valid. What a developer invents, or what might be randomly generated by some utility, often isn't seen as valuable. Most developers want to see real data, perhaps because they can then better relate their work on a specific feature to the actual live system. It may be easier to work with actual customers, products, accounts, etc. when talking to clients or testers, but I do think that, certainly in light of the GDPR and other regulation, there are risks here.
While many people want to just restore production to refresh environments, I think it's a poor idea to use actual sensitive data. Even if you trust your developers, there has been no shortage of attacks against development systems, and of lost laptops and files containing production data intended for developers. We just don't secure test and development environments like production, and that means we are making a fundamental error in our work habits.
I've held different views on this topic across the last couple of decades, and these days I want a mix: some realistic data and some that clearly isn't real. What I'd really like is a set of random data that is close to production, mimicking its shape and skew, but without any sensitive values. Alongside that, I'd want a set of known cases: the types of data we need to ensure work in our system, with various transactions and values designed to cover the functional edges we support.
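As a rough illustration of what "mimicking the shape and skew" could mean, here's a minimal sketch (the column and values are hypothetical): profile a production column as a frequency distribution, then sample synthetic rows from those weights instead of copying the rows themselves.

```python
import random
from collections import Counter

def build_profile(values):
    """Summarize a production column as (values, weights) -- its skew."""
    counts = Counter(values)
    total = sum(counts.values())
    items = sorted(counts.items())
    return [v for v, _ in items], [c / total for _, c in items]

def synthesize(values, n, seed=42):
    """Generate n synthetic values with roughly the same distribution,
    without reusing any production row wholesale."""
    choices, weights = build_profile(values)
    rng = random.Random(seed)  # seeded so test data is reproducible
    return rng.choices(choices, weights=weights, k=n)

# Hypothetical production column: order statuses heavily skewed to 'shipped'.
prod_status = ["shipped"] * 80 + ["pending"] * 15 + ["cancelled"] * 5
test_status = synthesize(prod_status, 1000)
```

Real columns would need per-type profilers (numeric ranges, string patterns, null rates), but the principle is the same: carry the distribution into the test environment, not the data.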
Of course, building these sets isn't always easy, and the work is never really done. As long as we write software, we will need to maintain tests and data alongside the code. I think this is a worthwhile investment in regulated industries, and likely worth making in all industries. The trouble is that it isn't interesting or fun, and so it's likely never to be done in most organizations. I'd like to change that, and I hope you would as well.
The Voice of the DBA podcast features music by Everyday Jones. No relation, but I stumbled onto them and really like the music.
What’s the top challenge faced by SQL Server professionals in 2018?
Learn how 626 SQL Server professionals monitor their estates in our new report on the State of SQL Server Monitoring. Discover the challenges currently facing the industry, and what is coming next. Download your free copy of the report
Become a more efficient SQL developer with SQL Prompt
Learn how to write SQL faster and more efficiently with these exclusive short videos from Data Platform MVPs and SQL Server experts. With SQL Prompt you can strip out the repetition of coding and write SQL 50% faster. Check out the tips
As the trend in development shifts toward more agile delivery, and customers clamor for continuous deployment, this model of monolithic code releases begins to strain. Automating your release pipeline can cut down on your release cycle time and alleviate many of the Release Day issues that arise from still trying to manually deploy code. More »
It's possible to approximate some of what a full regular expression library offers by matching certain patterns in T-SQL with the LIKE operator. This tutorial uses these expressions to filter price phrases involving alphabetic, numeric, and special characters. More »
On Monday, Grant Fritchey (b/t) put up a great post, EXTENDED EVENTS, THE SYSTEM_HEALTH SESSION, AND WAITS, that talks about how... More »
Question of the Day
Today's Question (by Steve Jones):
I have a SQL Server 2017 RTM instance that isn't very busy. I have the minimum server memory set to 8GB, but after restarting the host OS, I find that SQL Server is only using about 2GB of RAM. Why is this?
Think you know the answer? Click here, and find out if you are right.
We keep track of your score to give you bragging rights against your peers.
This question is worth 1 point in this category: Memory.
We'd love to give you credit for your own question and answer.
To submit a QOTD, simply log in to the
The 2014 release of Microsoft's SQL Server Integration Services provides enhancements for managing extraction, transformation, and load operations, plus expanded in-memory capabilities, improved disaster recovery, increased scalability, and much more. The increased functionality will streamline your ETL processes and smooth out your workflow, but the catch is that your workflow must change. New tools come with new best practices, and Professional Microsoft SQL Server 2014 Integration Services will keep you ahead of the curve. SQL Server MVP Brian Knight is the most respected name in the business, and your ultimate guide to navigating the changes to use Microsoft SQL Server Integration Services 2014 to your utmost advantage. Get your copy from Amazon today.
This newsletter was sent to you because you signed up at SQLServerCentral.com.
Feel free to forward this to any colleagues that you think might be interested.
If you have received this email from a colleague, you can register to receive it here.