Selecting n Closest (Geographic) Rows

Question

Selecting n Closest (Geographic) Rows

Robert A. Shannon III

SSC Veteran

Points: 254
More actions
November 5, 2013 at 2:13 pm

#282075

Hello there!
This is a high-level explanation.
I have two sets of data. One set , let's say, is employer locations. The other set, let's say, is employees' home locations. With a single select statement, I want to select the employer location details with the two (2) closest empoyees and their home locations to listed employer location.
It's (2) now but could just as easily be closest (5) or closest (10) later as requirements change. I have the lat/long on each row in each data set and I'm using the "Great Circle" formula to determine the approx. geographic distance between rows.
I do not want to use a CURSOR (blech!) over top of the employer location data set to process one row at a time, but this self-imposed single statement restriction has me tied in knots.
Thanks in advance for any direction.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply

Michael Valentine Jones SSC Guru Points: 64818 More actions · Answer 1

You could use the function on the link below to calculate the distance from work to employee home in a select statement, and just select the top 2 based on distance.

Great Circle Distance Function - Haversine Formula

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=81360

select

top 2

Distance = dbo.F_GREAT_CIRCLE_DISTANCE

(a.Work_Latitude, a.Work_Longitude1, a.Home_Latitude, a.Home_Longitude2 ),

a.*

from

MyTable a

order by

1 asc

mickyT SSChampion Points: 10360 More actions · Answer 2

Hi Robert

If you are using geographies you could use try the following. Unfortunately in 2008 it doesn't appear to make use of spatial indexes, meaning it doesn't scale very well. However in 2012 the spatial index will be used if you follow the rules specified in technet

Data setup:

create table #employer (

id int identity primary key,

placeofbusiness Geography

);

create table #employee (

id int identity primary key,

homelocation geography

);

with randomLocations as (

SELECT lat = RAND(CAST(NEWID() AS VARBINARY)) * 2.0,

long = RAND(CAST(NEWID() AS VARBINARY)) * 2.0

FROM (VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) e1 (N)

--,(VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) e2 (N) --scale up

)

INSERT INTO #employer (placeofbusiness)

SELECT Geography::Point(lat, long, 4326)

FROM randomLocations;

CREATE SPATIAL INDEX employer_sdx ON #employer(placeofbusiness) USING GEOGRAPHY_GRID;

with randomLocations as (

SELECT lat = RAND(CAST(NEWID() AS VARBINARY)) * 2.0,

long = RAND(CAST(NEWID() AS VARBINARY)) * 2.0

FROM (VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) e1 (N)

,(VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) e2 (N)

--,(VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) e3 (N) --scale up

--,(VALUES (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) e4 (N) --scale up

)

INSERT INTO #employee (homelocation)

SELECT Geography::Point(lat, long, 4326)

FROM randomLocations;

CREATE SPATIAL INDEX employee_sdx ON #employee(homelocation) USING GEOGRAPHY_GRID;

Nearest Neighbour Query

select e1.id employer_id, e2.id employee_id, e2.homelocation.STDistance(e1.placeofbusiness)

from #employer e1

cross apply (

select top(10) e3.id, e3.homelocation

from #employee e3 where e3.homelocation.STDistance(e1.placeofbusiness) is not null

order by e3.homelocation.STDistance(e1.placeofbusiness)

) e2