# Nearest Proximity For every record

• marty.seed

SSCrazy

Points: 2441

Hello!

We have a query we are trying to build with a key component. All of our locations have a lat and long entered as own values in the table. We need to find out the nearest location to EACH building from all the other buildings. Another tricky part is we have to make sure this can run across multiple SQL Server instances, we have some that are 2008 and some that are 2012.

For table structure just figure BuildingID int, Latitude decimal, Longitude decimal

Has anyone had to do this? Any help is appreciated

• pietlinden

SSC Guru

Points: 62848

Off the top of my head, you'd use a cross join so you'd get two copies of Building, and then if your buildings are really close together, you could just use Pythagorean theorem to get distance, and then use TOP 1 to get the closest one.

Yeah, I know Haversine formula would be better, probably, but if they're really close together...

• ScottPletcher

SSC Guru

Points: 98489

You'd also want to keep the closest building id in the table structure (with a datetime of when it was computed) and that distance.

The reason for that is to provide a limit for future searches.  For example, say you for building 1, you determine that building 2 is closest, and it's 102.4 miles/kms away (using the standard lat/long distance formula).  Then, when you subsequently searched for the closest building to building 2, you'd only have to look for building less than 102.4 mi/km away, because building 1 is known to be only that far.

Presumably you'd search outward from closest longitude (absolute difference in longitude) then closest latitude.  Maybe check the 20 (?) closest buildings in that search order?!

The idea, of course, is to prevent having to calc distance from each building to every other building every time, as that could take a very long time.

SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."

• Steve Jones - SSC Editor

SSC Guru

Points: 720473

Do the buildings change often? Honestly, I'd expand on Scott's solution. If there were 20 buildings, or 100, now and they rarely change, I'd just calculate all the distances now and store those. Then look up the closest location quickly in there.

The reason is that buildings and contracts are something that often takes weeks or longer to change. However, queries might happen every day. I'd just have part of the process of adding a new building include a re-calc of all distances.

• marty.seed

SSCrazy

Points: 2441

Hi guys,

Great responses and ideas. We have clients that have hundreds, sometimes thousands of buildings. We are using the data for a major report and I figure we'd need to update whenever the report is run. Here is a look at what I did to get my data, I need to double check the distances but they look correct....

GO

BEGIN

IF OBJECT_ID('tempdb..#Location') IS NOT NULL

DROP TABLE #Location;

IF OBJECT_ID('tempdb..#BuildingLocation') IS NOT NULL

DROP TABLE #BuildingLocation;

END;

CREATE TABLE [dbo].[#BuildingLocation](

[BuildingID] [int] NULL,

[Latitude] [decimal](8, 6) NULL,

[Longitude] [decimal](9, 6) NULL

) ON [PRIMARY]

GO

CREATE CLUSTERED INDEX IDX_PK_BuildingLocations ON #BuildingLocation(BuildingID, Latitude, Longitude)

CREATE TABLE [dbo].[#Location](

[BuildingID] [int] NULL,

[Latitude] [decimal](8, 6) NULL,

[Longitude] [decimal](9, 6) NULL,

[BuildingID2] [int] NULL,

[Latitude2] [decimal](8, 6) NULL,

[Longitude2] [decimal](9, 6) NULL

) ON [PRIMARY]

GO

CREATE CLUSTERED INDEX IDX_PK_Locations ON #Location(BuildingID, Latitude, Longitude, BuildingID2, Latitude2, Longitude2)

INSERT INTO #BuildingLocation

SELECT a.BuildingID, a.Latitude, a.Longitude

INSERT INTO #Location

SELECT a.BuildingID, a.Latitude, a.Longitude,

b.buildingid AS BuildingID2, b.Latitude AS Latitude2, b.Longitude AS Longitude2

FROM #BuildingLocation a

FULL OUTER JOIN #BuildingLocation b ON a.BuildingID<>b.BuildingID

--WHERE a.buildingid=667

/*

create function [dbo].[fnCalcDistanceMiles] (@Lat1 decimal(8,4), @Long1 decimal(8,4), @Lat2 decimal(8,4), @Long2 decimal(8,4))

returns decimal (8,4) as

begin

declare @d decimal(28,10)

set @Lat1 = @Lat1 / 57.2958

set @Long1 = @Long1 / 57.2958

set @Lat2 = @Lat2 / 57.2958

set @Long2 = @Long2 / 57.2958

-- Calc distance

set @d = (Sin(@Lat1) * Sin(@Lat2)) + (Cos(@Lat1) * Cos(@Lat2) * Cos(@Long2 - @Long1))

-- Convert to miles

if @d <> 0

begin

set @d = 3958.75 * Atan(Sqrt(1 - power(@d, 2)) / @d);

end

return @d

end

*/

SELECT BuildingID,Min(ABS(dbo.fnCalcDistanceMiles(Latitude,Longitude,Latitude2,Longitude2))) NearestBuilding_MILES,

Min(ABS(dbo.fnCalcDistanceMiles(Latitude,Longitude,Latitude2,Longitude2)))*1.6 NearestBuilding_KMS

FROM #Location

GROUP BY BuildingID

• pietlinden

SSC Guru

Points: 62848

Why not just a table of  (BuildingFromID, BuildingToID, Distance) ? you'd have to create a check to make sure there are no duplicates in (From,To) and (To,From).

• marty.seed

SSCrazy

Points: 2441

How does that help?

• Phil Parkin

SSC Guru

Points: 244661

marty.seed wrote:

How does that help?

Because it gives you the distance from every building to every other building.

If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

• Jeff Moden

SSC Guru

Points: 996841

Phil Parkin wrote:

marty.seed wrote:

How does that help?

Because it gives you the distance from every building to every other building.

And, if you use the right measurements, it gives you the actual driving distance instead of "as the crow flies" distance like Lat/Lon would give.  You might even be able to find something on Google Maps that would do the conversions for you.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".
"If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

• pietlinden

SSC Guru

Points: 62848

How would you get driving distance? use a web API or a .NET library to get the directions from Point A to Point B? (Is this a Solomon Rutzky question?)

• Phil Parkin

SSC Guru

Points: 244661

The Google Maps API looks like it could do it, though it's going to be an RBAR solution, by the look of it.

If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

• David Burrows

SSC Guru

Points: 64736

Jeff Moden wrote:

Phil Parkin wrote:

marty.seed wrote:

How does that help?

Because it gives you the distance from every building to every other building.

And, if you use the right measurements, it gives you the actual driving distance instead of "as the crow flies" distance like Lat/Lon would give.  You might even be able to find something on Google Maps that would do the conversions for you.

I did this once for driving distance (only a piddly 150 dealerships). Calculated the driving distance and stored as from/to and distance as already mentioned. Made queries easy and fast. Note that if you do use driving distance, to and from journeys could be different. Another advantage to this is you can alter the distance to cater for anomalies.

Far away is close at hand in the images of elsewhere.
Anon.

• jcelko212 32090

SSCrazy Eights

Points: 9021

I did a problem like this a few decades ago for emergency services. We had to include driving time and consider the time of day. There is also the problem that sometimes it's faster to get from point A to point B than it is to go from point B to point A.

But let's keep it simple. Assume that :

1) the distance from point A to point A is zero.

2) the distance from point A to point B equals the distance from point B to point A.

CREATE TABLE Map

(location_id CHAR(5) NOT NULL PRIMARY KEY,

location_lat DECIMAL (5,3) NOT NULL,

location_long DECIMAL (5,3) NOT NULL);

Now let's build trips, using a little spherical trig. I wouldn't do this in SQL because SQL is a database language and not a computational language. The computations can little tricky when you get near the poles or the equator because you need to compensate for curvature.

CREATE TABLE Trips

(start_location_id CHAR(5) NOT NULL

REFERENCES Map(location_id)

finish_location_id CHAR(5) NOT NULL

REFERENCES Map(location_id)

CHECK (start_location_id < finish_location_id),

trip_distance DECIMAL(6,3) NOT NULL

CHECK (trip_distance > 0.000)

);

I like to make my DDL do as much work as possible, so I add check constraints and references. I also don't like wasting space with redundant data, so I expect to also use (start, finish) and build (finish, start) pairs in my query or a view. This means that given a map of (n) locations, Trips can have at most ((n² /2) - n) rows. Basically, I built a square array (n x n), remove the diagonal and keep half of it.

Your proximity query is now very easy.

WITH Undirected_Trips(location_id, trip_distance)

AS (SELECT T1.start_location_id, T1.trip_distance

FROM Trips AS T1

WHERE T1.start_location_id = @my_location

UNION ALL

SELECT T2.finish_location_id, T2.trip_distance

FROM Trips AS T2

WHERE T2.finish_location_id = @my_location)

SELECT @my_location, U.location_id, U.trip distance,

DENSE_RANK(ORDER BY U.trip distance ASC)

AS proximity_order

FROM Undirected_Trips AS U;

the undirected trips are built in a CTE, then that result is passed down to a query that uses a dense rank function to handle ties with it orders them.