Data Quality - Addressing non-stated requirements

  • Having a local cache is sensible. The caveat to storing data locally is that you need to check the terms and conditions that apply to the data.

    There's a lot of information out there in the public domain, but some of it runs into legal grey areas. Setting up a local cache can, in effect, violate the Ts&Cs.

    Where legally valid, a weekly refresh sounds like a good policy. Again, there needs to be full disclosure of what you are doing. If you refresh on a Monday and the source data changes on a Tuesday, then the worst case is 6 days of inaccurate data. The impact of incorrect data has to be understood. In the majority of cases it won't matter, but if that data formed part of the risk factors for insurance, then the customer could be under the impression that they were insured when in fact they were not.

    I would also not limit my thinking to what could be expressed in SQL. If you have clear rules, what you have is a spec against which tests can be written in whatever tier of the application is appropriate.
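To make the two points above concrete (the worst-case staleness of a weekly refresh, and a clear rule acting as a spec that can be tested outside SQL), here is a minimal Python sketch. The function names, the 7-day constant, and the example dates are illustrative assumptions, not taken from any particular system.

```python
from datetime import date, timedelta

# Assumption: the weekly refresh policy discussed in the post.
REFRESH_INTERVAL_DAYS = 7


def days_stale(source_changed_on: date, next_refresh_on: date) -> int:
    """Whole days the cache serves outdated data after a source change."""
    return max((next_refresh_on - source_changed_on).days, 0)


def is_fresh(fetched_on: date, today: date,
             max_age_days: int = REFRESH_INTERVAL_DAYS) -> bool:
    """Hypothetical rule: a cached record is usable only while it is
    younger than max_age_days (and not dated in the future)."""
    return timedelta(0) <= today - fetched_on < timedelta(days=max_age_days)


if __name__ == "__main__":
    monday = date(2016, 8, 1)       # refresh day
    tuesday = date(2016, 8, 2)      # source changes the day after
    next_monday = monday + timedelta(days=REFRESH_INTERVAL_DAYS)

    # Worst case from the post: the cache is wrong until the next refresh.
    print(days_stale(tuesday, next_monday))

    # The rule doubles as a spec that any tier can test against.
    print(is_fresh(monday, date(2016, 8, 7)))
    print(is_fresh(monday, next_monday))
```

Because the rule is an ordinary function, the same checks could sit in a unit test suite, an ETL step, or an API layer; which tier is appropriate depends on where stale data would do the most damage.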

  • sql-troubles (7/29/2016)

    I think postcodes aren’t the best example where data quality assessment rules are concerned. Even if one succeeds in reverse-engineering the postcode rules, the coded rules have to be changed again at the next rule change. There are also cases in which the postcode system changes from the ground up. Imagine you have to code rules for postcodes from all over the world!

    At least where postcodes and address information are concerned, one should use, when possible, an address validator or raw data provided by an authorized entity. I know that such services exist. If possible, one should implement such address validators directly in the source systems. This also depends on the number of new addresses added each year. Sometimes a validator is cost-effective, other times it isn’t.

    Address validators can bring their own problems, though. I spent a long frustrating time trying to arrange internet services in an office in a newly converted building, where the postcode was changing and the providers' IT systems did not yet recognise the new code. They couldn't accept an order until they had a postcode that their system recognised. The existing postcode for the building pre-conversion wouldn't do because our new address then failed the validation against the list of valid addresses at that postcode.
