SQL Server Central
The Voice of the DBA
 

Representative Data Challenges

This editorial was originally published on Jan 21, 2021. It is being re-run as today is a holiday in the US.

One of the areas where machine learning and artificial intelligence have had lots of success is image work. Whether identifying people in pictures or helping cars stay on the road and out of each other's way, this capability of computing has worked well. It's not perfect, and not necessarily as accurate as most humans, but it works well enough. Sometimes it's even better than humans.

There are issues, however, and I think some of them stem from poor data sets. Last year when the pandemic hit, education was challenged with how to conduct remote exams. While there are some solutions, they don't always work well. Sometimes the algorithms don't recognize people, especially non-Caucasians.

The issues raised reminded me of the issues with some bathroom gadgets. I have fairly dark skin, and I've always wondered why some sinks and soap dispensers wouldn't work for me. I hadn't thought much about it until I saw a few reports like the one listed above.

I don't think there is anything malicious here, but I do think teams often stick to a happy path when building a new tool. They often test it themselves, but they don't think widely about how a variety of customers will use it. While I've seen many personas, I rarely see anyone creating personas that consider something like skin color, or even a different culture. We often consider roles without deeply examining how those roles are implemented.

Whatever area we work in, we need representative data, including data that covers some of the edge or corner cases that might come up. Our dev and test areas can start with small data sets, including those that we build, but at some point we need representative data. Whether we're building OLTP software, sensors, or image recognition, our data should be well rounded.

While systems don't need to solve every issue, we ought to consider a large percentage of them. In the case of imaging, understanding the wide variety of people who might use a product would certainly seem to be important. Hopefully future teams won't make the mistake of assuming that most of their customers look exactly like them.

Steve Jones - SSC Editor

Join the debate, and respond to today's editorial on the forums

 
 Featured Contents
SQLServerCentral Article

SQL Server 2025 Standard Developer Edition

Johan Bijnens from SQLServerCentral

Learn about the new Standard Developer Edition of SQL Server 2025.

External Article

Find Missing SQL Server Data with NOT EXISTS

Additional Articles from SimpleTalk

After years of writing T-SQL a certain way, changing can be tough. When comparing tables for missing rows, developers often use LEFT JOIN.

Blog Post

From the SQL Server Central Blogs - Plan your 2026

K. Brian Kelley from Databases – Infrastructure – Security

If you don't have a plan, you'll accomplish it. That's not a good thing.

Blog Post

From the SQL Server Central Blogs - Monday Monitor Tips: Learning While Using the Tool

Steve Jones - SSC Editor from The Voice of the DBA

A customer was asking about what certain items in Redgate Monitor mean. They have a variety of skills on their staff, and they have developers accessing Redgate Monitor. This...

SQL Server 2022 Query Performance Tuning

Grant Fritchey from SQLServerCentral

Troubleshoot slow-performing queries and make them run faster. Database administrators and SQL developers are constantly under pressure to provide more speed. This new edition has been redesigned and rewritten from scratch based on the last 15 years of learning, knowledge, and experience accumulated by the author. The book includes expanded information on using extended events, automatic execution plan correction, and other advanced features now available in SQL Server.

 

 Question of the Day

Today's question (by Steve Jones - SSC Editor):

 

URL Safe or Not?

If I use BASE64_ENCODE() in SQL Server 2025, is the output URL safe by default?

Think you know the answer? Click here, and find out if you are right.

 

 

 Yesterday's Question of the Day (by Steve Jones - SSC Editor)

The Long Name

I run this code to create a table:

[Image: Create table with Unicode name]

When I check the length, I get these results:

[Image: Table with length of name shown as 132 characters]

A table name is limited to 128 characters. How does this work?

Answer: This is the actual number of bytes, not characters

Explanation: This returns bytes, not characters. The byte count here is 132, but the character count is 66.
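
The arithmetic behind that answer can be sketched outside SQL Server. The snippet below is only an illustration in Python, not the code from the question. It assumes the name is stored as UTF-16, which is how SQL Server's nvarchar-based identifiers are held, and it uses a made-up 66-character name because the original table name is not shown here.

```python
# Hypothetical 66-character table name; the real name from the question
# is not available in this text, so this one is an assumption.
name = "Ω" * 66

# Number of characters, which is what the 128-character limit counts.
char_count = len(name)

# Number of bytes when stored as UTF-16 (no byte-order mark):
# two bytes for each of these characters.
byte_count = len(name.encode("utf-16-le"))

print(char_count, byte_count)
```

Under those assumptions, the name is well inside the 128-character limit even though a byte-based length reports twice the character count.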

Discuss this question and answer on the forums

 

 

 

Database Pros Who Need Your Help

Here are a few of the new posts today on the forums. To see more, visit the forums.


Analysis Services
Connecting Power BI to SSAS and effective user not working - Hi everyone, Below is a consolidated summary of what we validated Architecture & data path The on-premises data resides in SQL Server, accessed by Power BI Service via on-premises Analysis Services (SSAS Tabular). Effective flow: Power BI Service → Power BI Gateway → SSAS Tabular → SQL Server The issue is not SQL connectivity, but authentication and […]
Anything that is NOT about SQL!
Fantasy Football 2026 - The thread for the league in 2026. Players from last year have priority.
Editorials
An SSIS Upgrade - Comments posted to this topic are about the item An SSIS Upgrade
Where Your Value Separates You from Others - Comments posted to this topic are about the item Where Your Value Separates You from Others
Your AI Successes - Comments posted to this topic are about the item Your AI Successes
Article Discussions by Author
Semantic Search in SQL Server 2025 - Comments posted to this topic are about the item Semantic Search in SQL Server 2025
Encoding URLs - Comments posted to this topic are about the item Encoding URLs
Fixing the Error - Comments posted to this topic are about the item Fixing the Error
T-SQL in SQL Server 2025: Encoding Functions - Comments posted to this topic are about the item T-SQL in SQL Server 2025: Encoding Functions
Which Table I - Comments posted to this topic are about the item Which Table I
Using Python notebooks to save money in Fabric: The Fabric Modern Data Platform - Comments posted to this topic are about the item Using Python notebooks to save money in Fabric: The Fabric Modern Data Platform
SQL Server 2022 - Administration
High Availability setup - has anyone seen this method? - Hi all, I recently moved to a new employer who have their HA setup in a way I've never seen and I'd just like to get opinions on it; I'm not saying it's right or wrong, just different (but it does appear to have caused issues). The way that I'm used to is that two […]
Certificates expired - Can't restore after creating new certificate - The previous DBA created a certificate which expired 12/31/2025. I came in hoping to have an easy day for New Year's Eve and found all of the backups were failing.  After doing the research, I found the certificate had expired. 1.  I created a new certificate since I couldn't update the expiry date. 2.  I […]
SQL Server 2022 - Development
advice on diving into devops for managing BI projects - hi , i know this is a sql server forum but i think my brain is aligned more with members of this forum than any other. we have a large netsuite migration in which multiple erps are involved.   The technologies on which those erp's run vary from db2 to all flavors of sql server based […]
Simplifying WHERE Condition with LIKE test on multiple columns - Good Evening, Is there a simpler way to rearrange the following WHERE condition: [Column_1] LIKE 'Beta08%' OR [Column_1] LIKE 'Beta11%' OR [Column_1] LIKE 'Beta16%' OR [Column_1] LIKE 'Beta17%' OR [Column_1] LIKE 'Beta15%' OR [Column_1] IN ('Beta192') OR [Column_2] LIKE 'Beta08%' OR [Column_2] LIKE 'Beta11%' OR [Column_2] LIKE 'Beta16%' OR [Column_2] LIKE 'Beta17%' OR [Column_2] LIKE […]
 

 

©2019 Redgate Software Ltd, Newnham House, Cambridge Business Park, Cambridge, CB4 0WZ, United Kingdom. All rights reserved.
webmaster@sqlservercentral.com

 
