Nearest Proximity For every record

  • Hello!

    We have a query we are trying to build with a key component. All of our locations have a lat and long entered as own values in the table. We need to find out the nearest location to EACH building from all the other buildings. Another tricky part is we have to make sure this can run across multiple SQL Server instances, we have some that are 2008 and some that are 2012.

    For table structure just figure BuildingID int, Latitude decimal, Longitude decimal

    Has anyone had to do this? Any help is appreciated

     

  • Off the top of my head, you'd use a cross join so you'd get two copies of Building, and then if your buildings are really close together, you could just use Pythagorean theorem to get distance, and then use TOP 1 to get the closest one.

    Yeah, I know Haversine formula would be better, probably, but if they're really close together...

  • You'd also want to keep the closest building id in the table structure (with a datetime of when it was computed) and that distance.

    The reason for that is to provide a limit for future searches.  For example, say you for building 1, you determine that building 2 is closest, and it's 102.4 miles/kms away (using the standard lat/long distance formula).  Then, when you subsequently searched for the closest building to building 2, you'd only have to look for building less than 102.4 mi/km away, because building 1 is known to be only that far.

    Presumably you'd search outward from closest longitude (absolute difference in longitude) then closest latitude.  Maybe check the 20 (?) closest buildings in that search order?!

    The idea, of course, is to prevent having to calc distance from each building to every other building every time, as that could take a very long time.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Do the buildings change often? Honestly, I'd expand on Scott's solution. If there were 20 buildings, or 100, now and they rarely change, I'd just calculate all the distances now and store those. Then look up the closest location quickly in there.

    The reason is that buildings and contracts are something that often takes weeks or longer to change. However, queries might happen every day. I'd just have part of the process of adding a new building include a re-calc of all distances.

     

  • Hi guys,

    Great responses and ideas. We have clients that have hundreds, sometimes thousands of buildings. We are using the data for a major report and I figure we'd need to update whenever the report is run. Here is a look at what I did to get my data, I need to double check the distances but they look correct....

    GO

    BEGIN

    IF OBJECT_ID('tempdb..#Location') IS NOT NULL

    DROP TABLE #Location;

    IF OBJECT_ID('tempdb..#BuildingLocation') IS NOT NULL

    DROP TABLE #BuildingLocation;

    END;

    CREATE TABLE [dbo].[#BuildingLocation](

    [BuildingID] [int] NULL,

    [Latitude] [decimal](8, 6) NULL,

    [Longitude] [decimal](9, 6) NULL

    ) ON [PRIMARY]

    GO

    CREATE CLUSTERED INDEX IDX_PK_BuildingLocations ON #BuildingLocation(BuildingID, Latitude, Longitude)

    CREATE TABLE [dbo].[#Location](

    [BuildingID] [int] NULL,

    [Latitude] [decimal](8, 6) NULL,

    [Longitude] [decimal](9, 6) NULL,

    [BuildingID2] [int] NULL,

    [Latitude2] [decimal](8, 6) NULL,

    [Longitude2] [decimal](9, 6) NULL

    ) ON [PRIMARY]

    GO

    CREATE CLUSTERED INDEX IDX_PK_Locations ON #Location(BuildingID, Latitude, Longitude, BuildingID2, Latitude2, Longitude2)

    INSERT INTO #BuildingLocation

    SELECT a.BuildingID, a.Latitude, a.Longitude

    FROM vwAdHocBuilding_Common_Object_GeoLocation a

    INSERT INTO #Location

    SELECT a.BuildingID, a.Latitude, a.Longitude,

    b.buildingid AS BuildingID2, b.Latitude AS Latitude2, b.Longitude AS Longitude2

    FROM #BuildingLocation a

    FULL OUTER JOIN #BuildingLocation b ON a.BuildingID<>b.BuildingID

    --WHERE a.buildingid=667

    /*

    create function [dbo].[fnCalcDistanceMiles] (@Lat1 decimal(8,4), @Long1 decimal(8,4), @Lat2 decimal(8,4), @Long2 decimal(8,4))

    returns decimal (8,4) as

    begin

    declare @d decimal(28,10)

    -- Convert to radians

    set @Lat1 = @Lat1 / 57.2958

    set @Long1 = @Long1 / 57.2958

    set @Lat2 = @Lat2 / 57.2958

    set @Long2 = @Long2 / 57.2958

    -- Calc distance

    set @d = (Sin(@Lat1) * Sin(@Lat2)) + (Cos(@Lat1) * Cos(@Lat2) * Cos(@Long2 - @Long1))

    -- Convert to miles

    if @d <> 0

    begin

    set @d = 3958.75 * Atan(Sqrt(1 - power(@d, 2)) / @d);

    end

    return @d

    end

    */

    SELECT BuildingID,Min(ABS(dbo.fnCalcDistanceMiles(Latitude,Longitude,Latitude2,Longitude2))) NearestBuilding_MILES,

    Min(ABS(dbo.fnCalcDistanceMiles(Latitude,Longitude,Latitude2,Longitude2)))*1.6 NearestBuilding_KMS

    FROM #Location

    GROUP BY BuildingID

  • Why not just a table of  (BuildingFromID, BuildingToID, Distance) ? you'd have to create a check to make sure there are no duplicates in (From,To) and (To,From).

  • How does that help?

  • marty.seed wrote:

    How does that help?

    Because it gives you the distance from every building to every other building.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Phil Parkin wrote:

    marty.seed wrote:

    How does that help?

    Because it gives you the distance from every building to every other building.

    And, if you use the right measurements, it gives you the actual driving distance instead of "as the crow flies" distance like Lat/Lon would give.  You might even be able to find something on Google Maps that would do the conversions for you.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • How would you get driving distance? use a web API or a .NET library to get the directions from Point A to Point B? (Is this a Solomon Rutzky question?)

  • The Google Maps API looks like it could do it, though it's going to be an RBAR solution, by the look of it.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Jeff Moden wrote:

    Phil Parkin wrote:

    marty.seed wrote:

    How does that help?

    Because it gives you the distance from every building to every other building.

    And, if you use the right measurements, it gives you the actual driving distance instead of "as the crow flies" distance like Lat/Lon would give.  You might even be able to find something on Google Maps that would do the conversions for you.

    I did this once for driving distance (only a piddly 150 dealerships). Calculated the driving distance and stored as from/to and distance as already mentioned. Made queries easy and fast. Note that if you do use driving distance, to and from journeys could be different. Another advantage to this is you can alter the distance to cater for anomalies.

    Far away is close at hand in the images of elsewhere.
    Anon.

  • I did a problem like this a few decades ago for emergency services. We had to include driving time and consider the time of day. There is also the problem that sometimes it's faster to get from point A to point B than it is to go from point B to point A.

    But let's keep it simple. Assume that :

    1) the distance from point A to point A is zero.

    2) the distance from point A to point B equals the distance from point B to point A.

    CREATE TABLE Map

    (location_id CHAR(5) NOT NULL PRIMARY KEY,

    location_lat DECIMAL (5,3) NOT NULL,

    location_long DECIMAL (5,3) NOT NULL);

    Now let's build trips, using a little spherical trig. I wouldn't do this in SQL because SQL is a database language and not a computational language. The computations can little tricky when you get near the poles or the equator because you need to compensate for curvature.

    CREATE TABLE Trips

    (start_location_id CHAR(5) NOT NULL

    REFERENCES Map(location_id)

    ON DELETE CASCADE,

    finish_location_id CHAR(5) NOT NULL

    REFERENCES Map(location_id)

    ON DELETE CASCADE,

    CHECK (start_location_id < finish_location_id),

    trip_distance DECIMAL(6,3) NOT NULL

    CHECK (trip_distance > 0.000)

    );

    I like to make my DDL do as much work as possible, so I add check constraints and references. I also don't like wasting space with redundant data, so I expect to also use (start, finish) and build (finish, start) pairs in my query or a view. This means that given a map of (n) locations, Trips can have at most ((n² /2) - n) rows. Basically, I built a square array (n x n), remove the diagonal and keep half of it.

    Your proximity query is now very easy.

    WITH Undirected_Trips(location_id, trip_distance)

    AS (SELECT T1.start_location_id, T1.trip_distance

    FROM Trips AS T1

    WHERE T1.start_location_id = @my_location

    UNION ALL

    SELECT T2.finish_location_id, T2.trip_distance

    FROM Trips AS T2

    WHERE T2.finish_location_id = @my_location)

    SELECT @my_location, U.location_id, U.trip distance,

    DENSE_RANK(ORDER BY U.trip distance ASC)

    AS proximity_order

    FROM Undirected_Trips AS U;

    the undirected trips are built in a CTE, then that result is passed down to a query that uses a dense rank function to handle ties with it orders them.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply