I've been working through some of the GDPR legislation, trying to find ways to explain it more clearly to customers and to make sure our products make sense once this law takes effect. Redgate is focused on this area: not only do we need to be compliant ourselves, but we also want to build tools that help you stay compliant.
In Article 14, there's this text: "the controller shall provide the data subject with the following information... the existence of automated decision-making, including ... meaningful information about the logic involved." That sounds a little concerning for those of us who work with data. It's not always easy, but we can explain how a SUM or AVG function works, even with a complex OVER() clause and lots of joins and criteria.
What do we do with a model running under SQL Server Machine Learning Services? The output from those scripts is often produced by the model without any obvious way to see how the results were reached. The requirement to explain is enshrined in a law, one that many people are concerned about. With all the ways that ML and AI systems can be gamed, and the biases they may pick up from the data used to train them, I can certainly see no shortage of people asking for explanations of decisions or conclusions.
Fortunately, Article 14 also has this part: "the provision of such information proves impossible or would involve a disproportionate effort ..." That seems to give companies an out if they are using systems whose inner workings, the black box of machine learning, are poorly understood. Certainly organizations are still charged with protecting the data subject's rights and freedoms, but this seems to allow for the use of technologies that we can't quite understand.
I doubt this was the intention of the authors, though I do hope that this doesn't prevent the use of newer tools and technologies. What I'd like to see is more research into how the various algorithms we want to use for ML and AI technologies work, perhaps with some more detailed analysis of the inner workings of the models.
GDPR is going to be an interesting regulation that may have dramatic impacts on the world of data. I'm both excited and concerned to see how things move forward from here. Hopefully this results in better and more responsible data handling and doesn't degenerate into a series of long-term legal battles.
The Voice of the DBA podcast features music by Everyday Jones. No relation, but I stumbled on to them and really like the music.
Your organization’s culture of DevOps will often be the defining factor on whether you have a successful organizational change, or whether the changes that you implement impact pockets of your organization.
Technology is constantly moving forward, but it is also helpful to understand how we arrived where we are today. Joe Celko reminisces about the history of database design and how it relates to the concept of ‘Degree of Duplication’ in this article.
Database maintenance does not have to be expensive. There are free tools out there that will make your life easier. Of...
Question of the Day
Today's Question (by Steve Jones):
I have a database, called Finance, with Change Data Capture (CDC) enabled inside of it. During a DR event, we decide to restore the most recent backup of this database on another instance. How do I ensure that my CDC information is not lost?
Think you know the answer? Click here, and find out if you are right.
We keep track of your score to give you bragging rights against your peers.
This question is worth 1 point in this category: Backup and restore.
We'd love to give you credit for your own question and answer.
To submit a QOTD, simply log in to the
The Query Store changes the way you monitor performance on your databases and the way you tune the performance of those same databases. This book represents a deep dive into a large number of topics in and around the Query Store. Get your copy from Amazon today.
Yesterday's Question of the Day
(by Steve Jones):
I've got this data set:
  rank        player.name year2017 yards2017
1    1          Tom Brady     2017      4577
2    2      Philip Rivers     2017      4515
3    3   Matthew Stafford     2017      4446
4    4         Drew Brees     2017      4334
5    5 Ben Roethlisberger     2017      4251
I want to add a column to track how many yards each person is trailing the leader. How can I add a column to this data set and populate it with the number of yards behind the leader?
You can use the $ syntax to add a new column. The assignment can take place in the same command, using the expression whose result you want in the column. In this case we take the largest value and subtract each value in the column from it.
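As a minimal sketch of that answer in R: the data frame name `passing` and the new column name `yards.behind` are assumptions made for illustration, and the data frame is rebuilt here from the printout in the question so the example runs on its own.

```r
# Rebuild the data set from the question (column names as shown in the printout).
# The name `passing` is an assumption; the question does not name the data frame.
passing <- data.frame(
  rank = 1:5,
  player.name = c("Tom Brady", "Philip Rivers", "Matthew Stafford",
                  "Drew Brees", "Ben Roethlisberger"),
  year2017 = 2017,
  yards2017 = c(4577, 4515, 4446, 4334, 4251)
)

# Add the new column with $ and populate it in the same statement:
# the leader's total (the maximum) minus each player's total.
# `yards.behind` is a hypothetical column name.
passing$yards.behind <- max(passing$yards2017) - passing$yards2017

print(passing)
```

Because max() returns a single value, R recycles it across the whole column, so the leader gets 0 and every other row gets its gap to the leader.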
Ref: Adding and removing columns from a data frame - click here