Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Geocode Addresses in T-SQL

By David Rueter,

This article shows how you can "geocode" street address information natively in T-SQL with MS SQL 2005, and as a byproduct, how you can validate, normalize, and parse address information.

Note that this approach will work in SQL 2008 as well, though SQL 2008 provides geospatial types and functions that could be used in lieu of the numeric data type used here.

Geocoding is the process of converting a geographical address into latitude and longitude (GPS coordinates). Geocoding address information has a number of benefits:

  • Means to interoperate with web-based mapping services
  • Means to interoperate with GPS devices
  • Makes it possible to calculate distance between addresses
  • Makes geospatial analysis possible (i.e. identifying geographically similar addresses, etc.
  • Compact way to represent a geographical location
  • And more...

Besides these core benefits, there are a number of valuable byproducts of geocoding, including:

  • Address validation
  • Address normalization
  • Street number identification
  • Zip Code lookup
  • County lookup

I'll gloss over coordinate and GPS theory. The bottom line is that in modern systems latitude and longitude are most often represented as a decimal number in the range -180.000000 to 180.000000. (These values are really degrees, from 0 to 360, but positive / negative is used to separate hemispheres.) Consequently, in SQL Server 2005, I'd suggest using a data type of numeric(9,6) for the coordinate data.

The actual work of geocoding is heavy lifting that needs to be performed by a remote service that has a database of address information. We simply want to pass the address information to such a remote service, and retrieve the XML results. There are several suitable services we could use. I have selected Google's service for this example.

So how should we call the Google geocoding service from SQL Server? We need to make an HTTP request, so we need to use one of these:

  • CLR procedure
  • External stored procedure
  • ActiveX call

All of these have pros and cons. I realize that many would recommend CLR -- but I want to write only native T-SQL code. This means that we'll be using sp_OACreate to use an ActiveX object in this example. First we need to make sure that ActiveX support is enabled in the database:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'Ole Automation Procedures', 1
RECONFIGURE

The basic theory of operation is that we will use the ServerXMLHttp object that is already installed on the server. We will instantiate that object, set the URL property, call the send method, and then retrieve the results.

The core of the procedure is this:


EXEC @Result = sp_OACreate 'MSXML2.ServerXMLHttp', @Obj OUT
EXEC @Result = sp_OAMethod @Obj, 'open', NULL, 'GET', @URL, false
EXEC @Result = sp_OAMethod @Obj, 'setRequestHeader', NULL, 'Content-Type', 'application/x-www-form-urlencoded'
EXEC @Result = sp_OAMethod @Obj, send, NULL, ''
EXEC @Result = sp_OAGetProperty @Obj, 'status', @HTTPStatus OUT
EXEC @Result = sp_OAGetProperty @Obj, 'responseXML.xml', @Response OUT
EXEC @Result = sp_OADestroy @Obj
SET @XML = CAST(@Response AS XML)
SET @GPSLatitude = @XML.value('(/GeocodeResponse/result/geometry/location/lat) [1]', 'numeric(9,6)')
SET @GPSLongitude = @XML.value('(/GeocodeResponse/result/geometry/location/lng) [1]', 'numeric(9,6)')

Please see the attached script for the complete declaration of the spGeocode stored procedure.

Once the spGeocode stored procedure has been created by executing the attached script, we can geocode an address like this:

EXEC spGeocode
@Address = '1234 N. Main Street',
@City = 'Santa Ana',
@State = 'CA'
GPSLatitude GPSLongitude City State PostalCode Address
33.755327 -117.867622 Santa Ana CA 92701 1234 N Main St

Neat! We get the Latitude and Longitude returned just as we had hoped. As an added bonus, we also have the Postal Code.

But as useful, if not more so, is the validation and scrubbing of the address. For example, suppose we didn't have separate fields for City, State and Zip. We could call the procedure as follows:

EXEC spGeocode '1234 N. Main Street, Santa Ana, CA'
GPSLatitude GPSLongitude City State PostalCode Address
33.755327 -117.867622 Santa Ana CA 92701 1234 N Main St

How about that! We get the same results back, with City, State and Postal Code split out into separate columns.

Or what if we have imprecise information, such as this?

EXEC spGeocode '123 ELm STreeT, CArlsbAd, CA'

The results are:

GPSLatitude GPSLongitude City State PostalCode Address
33.188985 -117.378193 Oceanside CA 92054 123 Elm St

Notice that the city is actually neighboring "Oceanside", and not "Carlsbad".

Finally, if you actually run this sample code, you will see three additional columns returned:

  • County
  • MapURL
  • XMLResults

MapURL is useful but not fancy: I just concatenated together a static URL to open Google Maps to the GPS coordinates that were returned to us.

MapURL
http://maps.google.com/maps?f=q&hl=en&q=33.188985+-117.378193

XMLResults contains the raw results that were returned by the geocoding service. Here is an excerpt:

<GeocodeResponse>
<status>OK</status>
<result>
<type>street_address</type>
<formatted_address>123 Elm St, Oceanside, CA 92054, USA</formatted_address>
<address_component>
<long_name>123</long_name>
<short_name>123</short_name>
<type>street_number</type>
</address_component>
<address_component>
<long_name>Elm St</long_name>
<short_name>Elm St</short_name>
<type>route</type>
</address_component>
<address_component>
<long_name>Oceanside</long_name>
<short_name>Oceanside</short_name>
<type>locality</type>
<type>political</type>
</address_component>
...

<geometry>
<location>
<lat>33.1889849</lat>
<lng>-117.3781934</lng>
</location>
<location_type>RANGE_INTERPOLATED</location_type>
...

</geometry>
<partial_match>true</partial_match>
</result>
</GeocodeResponse>

A few notes on this sample code:

  1. See http://code.google.com/apis/maps/documentation/geocoding for documentation on Google's geocoding service
  2. Even though ServerXMLHttp can return a native XML property, ActiveX / ADO does not currently support parameters of the XML, varchar(MAX) or nvarchar(MAX) types--so we are stuck using a varchar(8000) for our results, and then casting this to the XML type.
  3. I realize that the Microsoft purists will point out that a CLR procedure is at least a more modern (if not a better) way to make the HTTP call, but that seems to me to open up a whole new facet of complexity. Personally, I favor using sp_OACreate in this case so that we don't need to mess with CLR for such a simple task.
  4. I declared the stored procedure parameters as OUTPUT parameters to make it easy to store results in variables if needed, such as when looping through a table to update coordinate and postal code data for existing rows.

This example shows several useful things, including:

  • Obtaining GPS coordinates for an address
  • Validating an address
  • Normalizing address information into appropriate fields
  • Looking up postal code and county information
  • Obtaining a URL to a map of an address

This example does things in a way that is very easy for a T-SQL programmer to use and understand. This example also shows some simple real-world uses for SQL 2005 XQuery (to parse out data from the XML results), and for the sp_OAxxxxx procedures.

Think of all the fun you can have with this!

Happy geocoding!

Resources:

Geocode.sql
Total article views: 19343 | Views in the last 30 days: 83
 
Related Articles
FORUM

Geocode Addresses in T-SQL

Comments posted to this topic are about the item [B]Geocode Addresses in T-SQL[/B] Great article, in...

ARTICLE

Geocoding with SQL Server 2005

Mapping services have become quite popular on the Internet and with many applications, but in order ...

FORUM

Generate MAc addresses with a stored procedure

Generate MAc addresses with a stored procedure

ARTICLE

Dealing with Addresses

David writes about the system he put together to handle addresses and the pros and cons of various t...

FORUM

store procedure result to excel

store procedure result to excel

Tags
geocode    
geospatial    
gps    
latitude    
longitude    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones