Problems displaying this newsletter? View online.
SQL Server Central
Featured Contents
Question of the Day
The Voice of the DBA
 

Data Debt

I had never heard of data debt until I saw this article on the topic. In reading it, I couldn't help thinking that most everyone has data debt, it creates inefficiencies, and it's unlikely we'll get rid of it. And by the way, it's too late to get this under control. I somewhat dismissed the article when I saw this: "addressing data debt in its early stages is crucial to ensure that it does not become an overwhelming barrier to progress." I know it's a barrier, as I assume most of you also know, but it's also not stopping us. We keep building more apps, databases, and systems, and accruing more data debt. Somehow, most organizations keep running.

The description of debt might help here. How many of you have inconsistent data standards, where you might define a data element differently in different databases? Maybe you have duplicated data that is slow to update (think ETL/warehouses), maybe you have different ways of tracking a completed sale in different systems. Maybe you even store dates in different formats (int, string, or something weirder). How many of you lack some documentation on what the columns in your databases mean? Maybe I should ask the reverse, where the few of you who have complete data dictionaries can raise your hands.

For most of my career I've heard a couple of terms that I've never really seen implemented. There's the famous "single version of the truth" for a system, which seems to break down whenever we add a reporting or warehousing system. Even inside a single database, often an OLTP one, it's hard to get a truth because values are changing so fast. The other term is MDM (master data management), which promises to ensure that every element is tracked and tagged the same way. No misspelled customer names or outdated addresses. There have been no shortage of products I've seen to help people tackle this problem, but ultimately I think the amount of data debt is too high. When we realize we need MDM, we'll never pay down that debt, mostly because too many developers have too many habits and legacy ways of capturing data that will never get integrated into any MDM dictionary.

The article seems like a great academic set of principles. Make sure you label all your data. Put governance in place, with good access controls. Train workers, establish accountability to properly manage data. Invest in scalable architectures. How many of you can add scale to your system easily? It's always taken me jumping through a variety of hoops to do that. The cloud makes it easy.

For a month. Then when the bill comes, you'll be scaling back down.

Really, the chaos of the real world, where organizations are not one thing, but a large number of people and groups, each with their own goals and processes, just trying to get enough done to keep the organization moving forward is where we live. There's no real time to deal with data debt.

Except if you're the ETL person. We mostly pay you to move data around and clean it as best you can. At least then the problem remains hidden from the report readers, who trust you've actually done the T portion of ETL correctly.

Steve Jones - SSC Editor

Join the debate, and respond to today's editorial on the forums

 
 Featured Contents
SQLServerCentral Article

JSON in Microsoft SQL Server: A Comprehensive Guide

AR from SQLServerCentral

Introduction JSON (JavaScript Object Notation) has become a popular data format for storing and exchanging information. Microsoft SQL Server, starting from version 2016, introduced built-in support for JSON, allowing developers to work with JSON data more efficiently within the relational database environment. This article will explore how to store, retrieve, and manipulate JSON data in […]

External Article

How to rename a SQL Server database

Additional Articles from MSSQLTips.com

There may come a time when you want to or need to rename a database. In this tip we look at a couple different options.

Technical Article

Navigating the Database Landscape in 2025 Livestream on Feb 12

Press Release from SQLServerCentral

Join us on February 12th for the livestream: Navigating the Database Landscape in 2025: Simplifying Complexity. Discover the latest trends and insights from our 2025 report, learn new approaches for professional development, and gain valuable knowledge to stay ahead in your career.

Blog Post

From the SQL Server Central Blogs - Another AI Faux Pax

Steve Jones - SSC Editor from The Voice of the DBA

I was experimenting with a local model (article) and as a part of this, I pulled down a web interface for my model in a container. I ran it...

Blog Post

From the SQL Server Central Blogs - Data Governance: The Invisible AI Accelerator

Joyful Craftsmen from Joyful Craftsmen Blog

The post Data Governance: The Invisible AI Accelerator appeared first on Joyful Craftsmen.

Learn Microsoft Fabric: A practical guide to performing data analytics in the era of artificial intelligence

Site Owners from SQLServerCentral

A step-by-step guide to harness the power of Microsoft Fabric in developing data analytics solutions for various use cases

 

 Question of the Day

Today's question (by Steve Jones - SSC Editor):

 

Precision and Scale I

I have this number: 456.23 What is the scale of this number?

Think you know the answer? Click here, and find out if you are right.

 

 

 Yesterday's Question of the Day (by Steve Jones - SSC Editor)

Dates and Languages

The string, listopad, translates to a month name in different languages. If I were to run this code, what values are returned?

DECLARE @yourInputDate  NVARCHAR(32) = '28 listopad 2018';

SET LANGUAGE Polish;
SELECT CONVERT(DATE, @yourInputDate) AS [SL_Polish];

SET LANGUAGE Croatian;
SELECT CONVERT(DATE, @yourInputDate) AS [SL_Croatian];

SET LANGUAGE English;

Answer: Two different dates a month apart

Explanation: This string translates to either October or November (month 10 or 11) depending on the language, so two different dates are returned. Ref: Nondeterministic conversion of literal date strings into DATE values - https://learn.microsoft.com/en-us/sql/t-sql/data-types/nondeterministic-convert-date-literals?view=sql-server-ver16

Discuss this question and answer on the forums

 

 

 

Database Pros Who Need Your Help

Here's a few of the new posts today on the forums. To see more, visit the forums.


SQL Server 2016 - Administration
Over period of time application stop connecting to AG database - Hello. SQL version - 2016 , windows 2019 both are enterprise edition Over period of time application stop connecting to Alwayson availability group database. what could be reason why application stop working automatically? please suggest. Checked SQL Error logs not found any errors regarding connections. also windows event viewer not found cluster related errors. application […]
SQL Server 2016 - Development and T-SQL
Calculating Numerator and Denominator - Good evening, I have a rather peculiar issue while trying to calculate Numerator and Denominator for a Final Temporary Table I have built (using a couple of Intermediary Temporary Tables along the way).   ATTEMPT 1 The following two lines were my initial attempt (on the Final Temp): Cast(SUM(CASE When  Then 1 Else 0 […]
trying to call powershell from TSQL statement - Hello, trying to call a powershell file I created, locally on the server, but also, made sure since this has an always on cluster, to first check if its PRIMARY, BUT, what i am having trouble with, is using the xp_cmdshell command... when i run the following: EXEC xp_cmdshell 'powershell.exe -File E:\folder\RunToExport.ps1' it gives the […]
Administration - SQL Server 2014
Getting below error - An error occurred during recovery, preventing the database 'XXXXX' (28:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support. (.Net SqlClient Data Provider) (Microsoft SQL Server, Error: 3414) Version : Microsoft SQL Server 2012 (SP4-GDR) (KB4583465) - 11.0.7507.2 […]
Development - SQL Server 2014
Why is "SELECT *" about 10 times faster than "SELECT (ALL Columns)" - Dear all, I have noticed that  a "SELECT * ..."  is about 10 time faster than "SELECT Column1, Column2, ... LastColumn  FROM Table WHERE ID = (number)" 70 ms <-> 650 ms The Queryplans are equal and so are the costs - in both cases it is a grouped index scan with a Cost of […]
SQL Server 2019 - Administration
File date from Extended Event .xel files - I know this, but I do not remember it. I have scoured by saved scripts as well a lot of google-fu. I have multiple extended event files. Activity_0_133820397244150000.xel Activity_0_133820364382140000.xel Activity_0_133820332071900000.xel This portion of the file name is a representation of the date and time, correct??? 133820397244150000 133820364382140000 133820332071900000 How can I calculate this????
restore db as db_copy. resolve hardcoded references within procedures to db. - So I backup & restore db as db_copy. within both there is a proc that does: update [db]..[column] I need it to say: update [db_copy]..[column] (Simplified, above x 1000) Is there a step in backup / restore to do this?  Is there a quick resolution?  Is this poor coding in the first place?   Thank […]
SQL Server 2019 - Development
Is it possible to create a parameter and use different datasets? - Hello, I have an SSRS report, version 2019,  that has two parameters, which will call for three different datasets.  Parameter:  Continent >  3 values (Label)  = ASIA, EUROPE, SOUTH_AMERICA Parameter: Country -> cascading from the first parameter. Values are generated from the following datasets. Datasets  - DS_Asia;  Fields:  Continent, Country;  DS_Europe: Fields: Continent, Country; DS_South_America: […]
Select statement with in using multiple columns - I inherited a mess.   The previous developer did not understand how to set up a database properly. Beside not using proper data types, he did not use normalization, One of the things I need to do now is query on a table, see below, that has 4 columns now (it was multiple tables, 1 for […]
any thoughts on datasphere? - hi my boss asked me to take a look at sap's datasphere product as a reporting alternative to ssrs. does the community have any thoughts on it?   specifically how it compares to ssrs in user friendliness, time to market, cost of ownership maintenance wise?   we have maybe one user who writes ssrs.   in IT one […]
Final SELECT with WHERE based on passing in variable from Dropdown box - Is there a way to base the final select statement based on whether a user chooses, from a report dropdown box, a date range based on e.g.: PaidDate or DateOfService similar to the below (which doesn't work): SELECT CASE WHEN @dateChoice = 'DOS' THEN ( SELECT DISTINCT tp.client_id AS Member_ID ,tp.PaidDate AS Paid_Date ,tp.DOS FROM […]
AD, powershell and ssis - Hi, i have a requirement that involves visiting multiple folders in AD for active, disabled etc computers.   Other attributes  might need to be extracted as well including associated users and locations.   From what im seeing in AD,  at least sometimes the description column here has the user name.  And the folder can be the location.  […]
SQL Server 2022 - Administration
DbDefence - how does work this tool? - How it is implemented?  I have installed, tried, it is free for dbs <=30 mb. I have asked their support: MS declared XP's are legacy and do not use them, they will be removed in the future versions. It was written some versions ago. What you will do? But DbDefence support answered: MS wrote but […]
SSMS gets very messy - At work we have quite a few databases that I'm frequently browsing. This is quickly becoming a real mess when I need to switch between different SSMS'es (I need to start them with different credentials) and different databases. My questions is, do you guys have any hints to make working with lots of databases/tables within […]
SQL Server 2022 - Development
Transpose sql table rows into columns - I want to transpose sql table rows to columns. I have written a query for that already but its taking a very long time (around 15 min) as the number of rows in the table is high like 8M+. Is there any other way to write the query to get the same output with improved […]
 

 

RSS FeedTwitter

This email has been sent to {email}. To be removed from this list, please click here. If you have any problems leaving the list, please contact the webmaster@sqlservercentral.com. This newsletter was sent to you because you signed up at SQLServerCentral.com.
©2019 Redgate Software Ltd, Newnham House, Cambridge Business Park, Cambridge, CB4 0WZ, United Kingdom. All rights reserved.
webmaster@sqlservercentral.com

 

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -