Data Quality
Posted Monday, February 14, 2011 9:21 PM


SSC-Dedicated


Group: Administrators
Last Login: Yesterday @ 3:11 PM
Points: 31,368, Visits: 15,837
Comments posted to this topic are about the item Data Quality






Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1063982
Posted Tuesday, February 15, 2011 2:07 AM
SSC Journeyman


Group: General Forum Members
Last Login: 2 days ago @ 5:02 AM
Points: 97, Visits: 97
I agree with the points about data quality, and in certain industries it does seem that technology hasn't been fully embraced.

However, as IT professionals, we can make a significant difference with the systems we deliver by putting in strict validation and making data entry elements mandatory to prevent essential facts from being omitted.

I know making an application "idiot proof" isn't an exact science, nor is it always possible to second-guess every crazy thing an end user will try, but we can definitely make a difference.
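
As a rough illustration of the strict validation and mandatory fields mentioned above, here is a minimal Python sketch. The field names (`address`, `bedrooms`, `square_feet`, `parking_spaces`) and the range limits are assumptions made up for the example, not taken from any real listing system.

```python
# Hypothetical mandatory fields and plausible ranges for a property listing.
REQUIRED_FIELDS = ("address", "bedrooms", "square_feet", "parking_spaces")
RANGES = {
    "bedrooms": (0, 20),
    "square_feet": (100, 50_000),
    "parking_spaces": (0, 10),
}


def is_blank(value):
    """True for None and for empty or whitespace-only strings."""
    return value is None or (isinstance(value, str) and not value.strip())


def validate_listing(listing):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # Mandatory fields: essential facts must not be omitted.
    for field in REQUIRED_FIELDS:
        if is_blank(listing.get(field)):
            problems.append(f"{field} is required")
    # Range checks: reject values that are present but implausible.
    for field, (low, high) in RANGES.items():
        value = listing.get(field)
        if is_blank(value):
            continue  # already reported as missing above
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{field} must be a number")
        elif not low <= value <= high:
            problems.append(f"{field} must be between {low} and {high}")
    return problems
```

Returning the full list of problems, rather than stopping at the first one, lets the entry form show the user everything that needs fixing in one pass.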

Also, the example of estate agents (real estate agents) may be partly down to a liberal attitude to the truth in some cases - sales people, huh :)
Post #1064091
Posted Tuesday, February 15, 2011 2:59 AM
Mr or Mrs. 500


Group: General Forum Members
Last Login: Tuesday, December 16, 2014 1:51 AM
Points: 579, Visits: 1,255
It's not all data quality; much of it in the property business is "marketing" - don't reveal the negatives.

If the norm is three parking spaces and the listing doesn't say anything, people will assume the norm and view the property. Then maybe they'll still like it when they find it only has one or two spaces, whereas they would not even have viewed it if the listing had said only two spaces.

I can only speak from an England and Wales perspective (Scotland may be in the UK but it has a very different house purchasing system), but you rapidly learn to read between the lines with estate agents' blurb.

As for data quality at work, I will always highlight issues that I see in data that can be corrected, and this is usually welcomed by the relevant department because I can put data together in different ways that they cannot. I consider it a part of my DBA/developer role.
Post #1064121
Posted Tuesday, February 15, 2011 4:51 AM


Mr or Mrs. 500


Group: General Forum Members
Last Login: Saturday, September 10, 2011 8:08 AM
Points: 539, Visits: 201
My thinking is that years ago, data integrity was not considered as important as people are realizing it is today. I am just finishing up a data mining class, and one of the biggest issues with target variable results is bad data. As a programmer in the food industry, I am mindful of our control limits on some of our critical data that is needed for GMP and FDA purposes. Hopefully, years from now, when the analysts are mining data that has been entered from one of my applications, they will be happy that somebody took the time to work closely with the "key" players (namely our Quality department) and put those data validations in place.
Post #1064165
Posted Tuesday, February 15, 2011 6:04 AM
SSC Eights!


Group: General Forum Members
Last Login: Monday, December 15, 2014 9:03 AM
Points: 851, Visits: 5,596
Since excessive data validation can sometimes cause users not to enter anything, another approach is to produce regular data quality reports grouped by department, user, etc.

If management want information based on missing data, it is amazing how the quality of input can improve!
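
A report like that can be quite simple. The sketch below is only an illustration in Python: the record layout and the field names passed in (such as `department`, `phone` or `postcode`) are assumptions invented for the example.

```python
from collections import defaultdict


def missing_data_report(records, group_key, checked_fields):
    """Summarise, per group, how many of the checked values were left empty.

    Returns {group: {"records": n, "missing": m, "pct_missing": p}}.
    """
    totals = defaultdict(lambda: {"records": 0, "missing": 0})
    for record in records:
        group = record.get(group_key) or "(unknown)"
        totals[group]["records"] += 1
        totals[group]["missing"] += sum(
            1 for field in checked_fields if record.get(field) in (None, "")
        )
    report = {}
    for group, counts in totals.items():
        # Percentage of all values that could have been entered but were not.
        possible = counts["records"] * len(checked_fields)
        pct = round(100.0 * counts["missing"] / possible, 1) if possible else 0.0
        report[group] = {**counts, "pct_missing": pct}
    return report
```

Calling it once with `group_key="department"` and once with a per-user key gives the two groupings suggested above from the same data.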
Post #1064200
Posted Tuesday, February 15, 2011 8:57 AM


SSCommitted


Group: General Forum Members
Last Login: 2 days ago @ 11:21 AM
Points: 1,793, Visits: 5,044
The problem is that 3rd party websites that aggregate listings from multiple sources can't be too choosy about their data, because they have no control over the data entry etiquette of individual agents, and they don't want to drop listings just because the agent happens to be sloppy (or even a little dishonest).
One solution I'd suggest is that they rank and sort their listings based on the completeness of the records. This would involve not only checking for missing data but also flagging records that are statistically outside the norm. For example, a house in a new subdivision might be listed as having 5 bedrooms and 2,500 square feet when all the other houses have 4 bedrooms and less than 2,200 square feet. Likewise, a property might be listed as zoned for residential/business multiuse when comparing the zip code and street address against a county database suggests otherwise.
These questionable records could be sorted near the bottom of the list along with a special icon or notation advising the user of a possible discrepancy.
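
To make the ranking idea concrete, here is a minimal Python sketch. It is only one possible reading of the suggestion: the field names, the `id` key, and the choice of a simple z-score test with a threshold of 2 standard deviations are all assumptions, and the county zoning cross-check is left out because it would need an external data source.

```python
from statistics import mean, pstdev

# Hypothetical fields a listing is expected to carry.
EXPECTED_FIELDS = ("bedrooms", "square_feet", "parking_spaces", "zoning")


def completeness(listing):
    """Fraction of expected fields that are filled in (0.0 to 1.0)."""
    filled = sum(1 for name in EXPECTED_FIELDS if listing.get(name) not in (None, ""))
    return filled / len(EXPECTED_FIELDS)


def flag_outliers(listings, field, threshold=2.0, min_samples=6):
    """Return ids of listings whose value for `field` lies more than
    `threshold` standard deviations from the mean of the group."""
    values = [item[field] for item in listings if item.get(field) is not None]
    if len(values) < min_samples:
        return set()  # too few comparables to judge
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return set()  # every comparable agrees, nothing stands out
    return {
        item["id"]
        for item in listings
        if item.get(field) is not None and abs(item[field] - mu) / sigma > threshold
    }


def rank_listings(listings, numeric_fields=("bedrooms", "square_feet")):
    """Sort complete, unflagged listings first; questionable ones go last."""
    flagged = set()
    for field in numeric_fields:
        flagged |= flag_outliers(listings, field)
    ranked = sorted(
        listings,
        key=lambda item: (item["id"] in flagged, -completeness(item)),
    )
    return ranked, flagged
```

The returned `flagged` set is what a site could use to decide which listings get the warning icon. The comparison only makes sense within a group of similar properties (one subdivision, say), and it needs a reasonable number of them: with five or fewer values, no single one can exceed two population standard deviations from the mean.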
Post #1064344
Posted Tuesday, February 15, 2011 10:34 AM


SSC-Enthusiastic


Group: General Forum Members
Last Login: Tuesday, July 22, 2014 5:29 PM
Points: 127, Visits: 860
I think it may be a liability issue. Some agents leave information, like sq ft and car spaces, off the form because they do not want to be sued if the information is wrong.

"When in danger or in doubt. Run in circles, scream and shout!" TANSTAAFL
"Robert A. Heinlein"
Post #1064420
Posted Tuesday, February 15, 2011 10:37 AM
Grasshopper


Group: General Forum Members
Last Login: Monday, August 11, 2014 10:14 AM
Points: 11, Visits: 114
I work for a school district in the US that serves grades K-12 and also work with another 21 agencies in my county. The data quality is extremely varied between the districts. As California attempts to switch to a statewide student data system, it is becoming clear that many of the districts in California simply do not have the infrastructure to even manage the data and, of those that do, for many data quality has been an afterthought.

What I have realized in working with our district (which, I have to say, is consistently in the top tier regarding quality) is that it is a joint effort between the IT staff, the administration staff and the data entry staff themselves (clerks and teachers for the most part).

It is quite evident that data entry staff are only really concerned with data that affects them personally. For instance, student test scores and grades, their teachers and courses, and their attendance records are of very high quality. This is because these affect the teacher/student on a daily basis. The attendance record directly impacts the district financially.

As we started several years ago to clean up other areas, we would impress on staff their importance in the grand scheme of things but, unless they had a personal ambition to do high quality work, many felt they couldn't spend the time.

We use several third-party software systems and have many internal applications. Unfortunately, many vendors just do not put a value on data quality. Some might argue that we should switch vendors, but our vendor is actually one of the best (some others are far worse) and, given our current investment, switching would be prohibitive in both time and money. Some of our most trusted systems allow users to enter bad data all the time. Since we have many custom database services that move our data around, bad data in one system can often cause errors when moving to other systems.

One of the best things we did was to create our own internal audit tracking system. It tracks hundreds of different data points every day and emails out reports to the relevant administration staff. Having an audit system that catches errors within hours (not months) has proven to be very effective. As users start to get reports on errors they entered the day before, they start taking a personal interest in their data, and they also get better at verifying data they enter in the first place.
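
For anyone wondering what such an audit might look like in miniature, here is a hedged Python sketch. It is not the district's actual system: the three rules, the record fields (`birth_date`, `grade_level`, `enrolled_on`, `entered_by`) and the report wording are all invented for illustration, and the emailing step is left out.

```python
from collections import defaultdict
from datetime import date

# Each hypothetical rule is (description, predicate); the predicate returns
# True when the record looks wrong. Kindergarten is assumed to be grade 0.
RULES = [
    ("birth date is in the future",
     lambda r: r.get("birth_date") is not None and r["birth_date"] > date.today()),
    ("grade level outside K-12",
     lambda r: r.get("grade_level") not in range(0, 13)),
    ("no enrollment date",
     lambda r: r.get("enrolled_on") is None),
]


def run_audit(records, rules=RULES):
    """Apply every rule to every record and group the failures by the
    user who entered the record."""
    findings = defaultdict(list)
    for record in records:
        for description, is_bad in rules:
            if is_bad(record):
                user = record.get("entered_by", "(unknown)")
                findings[user].append(f"record {record.get('id')}: {description}")
    return dict(findings)


def format_report(user, problems):
    """Build the plain-text body of the daily message for one user."""
    lines = [f"Data audit for {user}: {len(problems)} issue(s) found"]
    lines += [f"  - {problem}" for problem in problems]
    return "\n".join(lines)
```

Run nightly against the previous day's entries, something along these lines would produce one short message per user, which matches the "errors caught within hours" feedback loop described above.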

We still have a long way to go, but I believe it is systems like this, which empower users and aid accountability, that help us stay consistently at the top.
Post #1064423
Posted Tuesday, February 15, 2011 10:40 AM


SSChampion


Group: General Forum Members
Last Login: Thursday, December 18, 2014 9:58 AM
Points: 13,872, Visits: 9,600
The only way I've ever seen this work is to take the data quality, analyze it, report on it, and make it part of the annual review (salary and raise issue).

That will have zero impact on salespeople, since their whole value to the company and to themselves is sales volume, but for anyone else, it can help increase quality.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #1064425
Posted Tuesday, February 15, 2011 10:47 AM


SSCommitted


Group: General Forum Members
Last Login: 2 days ago @ 11:21 AM
Points: 1,793, Visits: 5,044
What I like are the real estate websites where you can see actual floor plans or recorded video allowing a virtual tour. A picture is worth 1,000 data items.
Post #1064431