Cleaning Address Data with SSIS Using a Web Service


Dr. John Tunnicliffe
Old Hand (334 reputation)
Comments posted to this topic are about the item Cleaning Address Data with SSIS Using a Web Service

Martin Vrieze
Ten Centuries (1.3K reputation)
This looks like a viable way to clean up international addresses.

For those looking to clean up US addresses, I would proceed with a high degree of caution.

1. Is the web service CASS certified by the US Postal Service?

2. Will the cleaned addresses resolve to a "DPV" validated address?

3. How old are the addresses on file? If they are older than 6 months, I highly recommend passing them against the US Postal Service NCOA file.

4. Are your staff well versed in the intricacies of US Postal Service requirements and regulations?

I would argue that if you have any doubt about the four points above, you should seek the expertise of a good computer service bureau to perform these address hygiene functions rather than "do it yourself".

Some explanations of the above. As of August 2007, the US Postal Service requires mailers seeking any type of presort discount to apply CASS Cycle "L" to their housefiles. This requires that mailers' addresses resolve to a "Delivery Point Validation" or "DPV" address. This means that if an address does not match exactly against an address in the USPS DPV database, the Postal Service declares that address undeliverable and you will not be able to take any discounts for mailpieces sent to that address.

The US Postal Service also requires mailers seeking presort discounts to run their addresses against the National Change of Address (NCOA) database to ensure the housefile holds the most up-to-date addresses. This is required every 6 months. As a practical matter, DBAs want to maintain the most accurate addresses on the housefile and ensure that when we need to communicate with customers, leads, etc., our mail gets to the appropriate individuals. In most cases, the savings from passing the address files against the US Postal Service files will cover the cost many times over when sending any type of mail, such as billings or marketing materials.

If you are looking to clean addresses for any kind of larger-scale mailing (over a few hundred pieces), not following the items above will cost your organization significantly more postage than is necessary.

Again, I highly recommend seeking the advice of a good computer service bureau to perform US Postal Service address hygiene on your housefile.

SQL Smartie
SSC Rookie (29 reputation)
My team had to look at this earlier this year. We looked at and tried a number of solutions, including those from Melissa Data, Qualified Address and Pitney Bowes, among others. All of the above services were CASS certified and could tell us whether an address actually exists (DPV). We wouldn't even touch a service that isn't CASS certified.

Management didn't want to spend $25,000-50,000/year on address verification, so we finally went with Qualified Address because their pricing blew everyone else away: http://www.qualifiedaddress.com/Services/Address-Verification-API/Pricing/

The other cool thing they had was a JavaScript version of their address verification. We put it on our checkout, which took about 10 minutes, and now our customers certify their own addresses (because ultimately they're the authority on their own address and where we ship the product). This way we have quality data from beginning to end instead of trying to create quality from raw data. Our customer service department says we should have done this a long time ago, because the number of products returned due to incorrectly entered address data is now virtually zero.

We also looked at NCOA. The USPS charges $175,000/year for the data, so we'd definitely recommend a provider for that as well.

James Stover
Ten Centuries (1.3K reputation)
Very nice article Dr. John, thank you. I have had the (dis?)pleasure of validating and cleansing addresses more times than I care to remember.

Just wondering what the performance is like for this process. On average, how long does it take to cleanse 1,000 addresses? Assuming you have credits, is the processing time linear with each batch iteration (i.e. does it take 2x the time to do 2,000 addresses, and so on)? I suppose it's really more to do with Postcode Anywhere, but I would still like to know. Thanks again.


James Stover, McDBA


Dr. John Tunnicliffe
Old Hand (334 reputation)
When I used the UK postcode service, some batches were amazingly quick and others were inordinately slow. Talking to the very helpful guys at Postcode Anywhere, it seems the speed is more to do with the quality (or not!) of the incoming data. The poorer the quality, the more time their service takes as it has to use progressively complex algorithms to try to find a match for the address. So, if the data has the house number and a correct postcode, you get the nicely formatted address back very quickly. If the data just contains a house name and a misspelled town name, it probably will not be able to find a match.

I was not too worried about throughput as it was a one-off exercise to clean up some historic data, but I seem to remember the service ripped through 300,000+ addresses in around 15-20 minutes.

Andy Litherland
SSC Rookie (33 reputation)
This is a great article - thanks Dr. John.

We have now implemented something very similar for one of our clients, who required a .NET assembly that could perform batch cleansing and single address cleansing (for their websites). They also required me to provide a sample SSIS package that used a Script Component for asynchronous batch processing of 1,000 addresses at a time. With the help of your article, I referenced my .NET assembly and disassociated the input from the output in the script component.
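
As a rough illustration of the batching idea only (this is not the actual Script Component or assembly, which would be written in .NET), here is a minimal Python sketch that buffers incoming rows into groups of 1,000 and hands each group to a cleansing call. The `cleanse_batch` function is a hypothetical stand-in for the web service call; here it just normalises whitespace and case so the sketch runs on its own.

```python
from typing import Iterable, Iterator, List


def batches(rows: Iterable[str], size: int = 1000) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` rows from the input."""
    batch: List[str] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield batch


def cleanse_batch(addresses: List[str]) -> List[str]:
    """Hypothetical stand-in for the web service call (an assumption).

    A real implementation would send the whole batch to the provider;
    this placeholder only collapses whitespace and upper-cases the text.
    """
    return [" ".join(a.split()).upper() for a in addresses]


def cleanse_all(rows: Iterable[str], size: int = 1000) -> List[str]:
    """Cleanse every row, one batch at a time.

    Output rows are produced per batch rather than per input row, which
    mirrors the idea of decoupling the input from the output.
    """
    cleaned: List[str] = []
    for batch in batches(rows, size):
        cleaned.extend(cleanse_batch(batch))
    return cleaned
```

With 2,500 input rows and the default size, `batches` produces two full groups of 1,000 followed by one group of 500.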

I found that the Postcode Anywhere service batch cleansed 20,000 addresses in around 8 minutes.
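
To put the two timings quoted in this thread side by side, here is a small Python sketch that converts them to addresses per second. The counts and durations are the approximate figures from the posts above; the function name is just for illustration.

```python
def addresses_per_second(count: int, minutes: float) -> float:
    """Average throughput for `count` addresses processed in `minutes`."""
    return count / (minutes * 60)


# Figures as quoted in the thread (approximate recollections, not benchmarks).
one_off_fast = addresses_per_second(300_000, 15)  # 300,000 in about 15 minutes
one_off_slow = addresses_per_second(300_000, 20)  # 300,000 in about 20 minutes
batch_run = addresses_per_second(20_000, 8)       # 20,000 in about 8 minutes

print(f"one-off clean-up: {one_off_slow:.0f} to {one_off_fast:.0f} addresses/sec")
print(f"batch of 20,000:  {batch_run:.0f} addresses/sec")
```

That works out to roughly 250 to 333 addresses per second for the one-off clean-up versus roughly 42 per second for the 20,000-address run, which fits the earlier point that speed depends heavily on the quality of the incoming data.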