SQLServerCentral is supported by Red Gate Software Ltd.
 
 
 
 
        



Data Quality
Posted Monday, September 11, 2006 11:38 AM
SSCrazy
Comments posted to this topic are about the content Data Quality
Post #307733
Posted Monday, September 18, 2006 2:24 AM
Valued Member

Hello,

You gave the example of legacy systems not being well documented.

Sometimes, unfortunately, it can be the other way around: a change of management, a change of IT companies, and a reckless import (time pressure, you know) of originally more-or-less well-documented data into a poorly documented "new" system. My heart broke, and now, two years later, there are still all kinds of tiny little problems. But hey, the system runs! (But the data s**ks.)

Greetings,

L

Post #309268
Posted Monday, September 18, 2006 3:55 AM
Grasshopper
Thank you for writing this article. It represents another approach to what I suggested in my article printed here last year about fragmentation measurement.
I believe that data quality should be measured first by the gap between the physical device and the way users are actually using it.
The second level deals with redundancy in the data (probably checking replication).
The third level, which I believe this article is about, is the beginning of BI, because you are looking for missing data and making decisions according to the contents of the database.
This subject is very important and should be continued.

Koby Biller
www.disklace.com
Post #309281
Posted Monday, September 18, 2006 9:07 AM
SSC Rookie

I work for a Third Party Administrator. One of my primary job duties is converting data from other sources. When our company buys another company, or we get a new client, I convert their data into our system. The new clients we receive can have anywhere from 2,000 to 40,000 claims, 5,000 to 600,000 payments, and 5,000 to 600,000 notes.

I receive data in every format you can imagine. I have found the following steps useful when converting new data.

 

  1. Import the data into Conversion Database worktables.
  2. Place an identity column on the end of each record.
  3. Set up staging tables that are similar to your production tables. These staging tables have blank primary key fields that will be filled in when you import the data into a pre-production or production database.
  4. In the staging area, create a standard set of stored procedures that you run against this data. The stored procedures report any inconsistencies within the data, and also inconsistencies between the new data and your production data.
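The worktable/staging split in steps 1-3 can be sketched as follows. This is only an illustration using SQLite from Python (the poster's actual system uses T-SQL stored procedures), and all table and column names here are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Worktable: raw imported rows, each tagged with an identity column (step 2)
    CREATE TABLE work_claims (
        work_id    INTEGER PRIMARY KEY AUTOINCREMENT,
        ssn        TEXT,
        first_name TEXT,
        last_name  TEXT
    );
    -- Staging table: mirrors production, but the production key is left
    -- NULL until the row is actually imported (step 3)
    CREATE TABLE stage_claims (
        prod_claim_id INTEGER,          -- filled in at import time
        work_id       INTEGER NOT NULL, -- traces back to the worktable
        ssn           TEXT,
        first_name    TEXT,
        last_name     TEXT
    );
""")

# Step 1: load the raw conversion data into the worktable.
conn.executemany(
    "INSERT INTO work_claims (ssn, first_name, last_name) VALUES (?, ?, ?)",
    [("123-45-6789", "Ann", "Smith"), ("987-65-4321", "Bob", "Jones")],
)

# Move rows to staging with the production key still blank.
conn.execute("""
    INSERT INTO stage_claims (prod_claim_id, work_id, ssn, first_name, last_name)
    SELECT NULL, work_id, ssn, first_name, last_name FROM work_claims
""")

rows = conn.execute(
    "SELECT prod_claim_id, work_id FROM stage_claims ORDER BY work_id").fetchall()
print(rows)  # production keys still NULL, worktable identities preserved
```

The identity column is what makes every staged row traceable back to its raw imported record, even before a production key exists.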

Here is a sampling of what my stored procedures look for.

  1. Does the SSN you are importing match any SSNs within your database? If so, are they the same name?
  2. Do all instances of an SSN have the same first name / last name combination?
  3. Is this a valid SSN?
  4. Do the payment detail amounts sum up to the payment header amounts?
  5. Are all From dates before To dates?
  6. Are all necessary fields filled in?
  7. Are there any invalid codes in the new data?
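A few of those checks could look like the following Python sketch. The field names and the flat data shapes are hypothetical (in practice these live in stored procedures against the staging tables), and the SSN check here is format-only:

```python
import re
from datetime import date

def validate_claim(claim, payment_details, known_names_by_ssn):
    """Run a few of the consistency checks above; returns a list of problems."""
    errors = []
    # Check 3: is this a plausibly formatted SSN?
    if not re.fullmatch(r"\d{3}-\d{2}-\d{4}", claim["ssn"]):
        errors.append("invalid SSN")
    # Checks 1-2: does this SSN already exist under a different name?
    known = known_names_by_ssn.get(claim["ssn"])
    if known is not None and known != (claim["first_name"], claim["last_name"]):
        errors.append("SSN exists with a different name")
    # Check 4: do the payment detail amounts sum to the header amount?
    if round(sum(payment_details), 2) != claim["payment_total"]:
        errors.append("details do not sum to header amount")
    # Check 5: is the From date before the To date?
    if claim["from_date"] > claim["to_date"]:
        errors.append("From date after To date")
    # Check 6: are all necessary fields filled in?
    missing = [f for f in ("ssn", "first_name", "last_name") if not claim.get(f)]
    if missing:
        errors.append("missing fields: " + ", ".join(missing))
    return errors

claim = {
    "ssn": "123-45-6789", "first_name": "Ann", "last_name": "Smith",
    "payment_total": 150.00,
    "from_date": date(2006, 1, 1), "to_date": date(2006, 6, 30),
}
# Details sum to 125, not 150, and the SSN is on file under a different name.
problems = validate_claim(claim, [100.00, 25.00], {"123-45-6789": ("Ann", "Jones")})
print(problems)
```

Reporting all inconsistencies in one pass, rather than failing on the first, is what lets the data be cleaned up in bulk before the transfer step.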
  5. Once inconsistencies have been fixed, create standard stored procedures that transfer the new data to a pre-production database. Have code in these stored procedures update the staging tables with the primary key values that are generated.
  6. Verify the data imported correctly.
  7. Have a standard test plan and signoff sheet for the business user to test the data within the application:
     a. Verify data in the claims application.
     b. Verify payment details and totals in the payments application.
     c. Verify account information in the account application.
     d. Verify payments against a loss run report and other reports.
  8. Have a policy in place that no data is imported into production until the proper signatures are received in pre-production.
  9. Run the conversion process to convert the data into production.

 

Since I have used this method of performing conversions, I am able to clean up the data in a pre-production phase, before it gets into production. Also, by using identity columns and primary key columns as explained in steps 2 and 3, I am able to trace my data back to the original source.
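That key write-back during the transfer step is what gives the traceability. A minimal SQLite sketch of the idea, again with hypothetical table names (the real version is a T-SQL transfer procedure):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stage_claims (
        prod_claim_id INTEGER,   -- NULL until the row is transferred
        work_id       INTEGER,   -- identity from the conversion worktable
        ssn           TEXT
    );
    CREATE TABLE prod_claims (
        claim_id INTEGER PRIMARY KEY AUTOINCREMENT,
        ssn      TEXT
    );
""")
conn.executemany("INSERT INTO stage_claims VALUES (NULL, ?, ?)",
                 [(1, "123-45-6789"), (2, "987-65-4321")])

# Transfer each clean staging row, then write the generated production
# key back to staging so the row stays traceable to its source record.
for work_id, ssn in conn.execute(
        "SELECT work_id, ssn FROM stage_claims "
        "WHERE prod_claim_id IS NULL").fetchall():
    cur = conn.execute("INSERT INTO prod_claims (ssn) VALUES (?)", (ssn,))
    conn.execute("UPDATE stage_claims SET prod_claim_id = ? WHERE work_id = ?",
                 (cur.lastrowid, work_id))

trace = conn.execute(
    "SELECT prod_claim_id, work_id FROM stage_claims ORDER BY work_id").fetchall()
print(trace)  # each production key maps back to a worktable identity
```

With this mapping in place, any questionable production row can be traced through staging back to the exact raw record that produced it.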

Post #309375
Posted Tuesday, September 18, 2007 10:08 PM
SSC Journeyman
Consists almost entirely of truisms: no quantification, expected norms, or constructive cleansing methodology is discussed.

Very unsatisfying read.


Post #400187