Map IP Address to a Geographical Location

  • hi,

    yes, fine tuning the queries to get the best performance is always welcome.

    For production environment, I suggest to process/publish the conversion of IP to num during the import process. This way applications are not affected by the overhead.

    thanks for the feedback

  • ozkary (7/24/2009)


    hi,

    yes, fine tuning the queries to get the best performance is always welcome.

    For production environment, I suggest to process/publish the conversion of IP to num during the import process. This way applications are not affected by the overhead.

    thanks for the feedback

    Heh...yes... unless you have to do it on the fly during a session.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • No need to cast either

    alter function dbo.ConvertIp2Num(@ip varchar(15))

    returns bigint

    as

    begin

    return16777216.0 * parsename(@ip, 4)

    + 65536 * parsename(@ip, 3)

    + 256 * parsename(@ip, 2)

    + 1 * parsename(@ip, 1)

    end


    N 56°04'39.16"
    E 12°55'05.25"

  • And with some error checking

    alter function dbo.ConvertIp2Num(@ip varchar(15))

    returns bigint

    as

    begin

    returncase

    when @ip like '%[^0-9.]%' then NULL

    when @ip like '%.%.%.%.%' then null

    when @ip like '%..%' then null

    when @ip like '%[0-9][0-9][0-9][0-9]%' then null

    else 16777216.0 * parsename(@ip, 4)

    + 65536 * parsename(@ip, 3)

    + 256 * parsename(@ip, 2)

    + 1 * parsename(@ip, 1)

    end

    end

    Or see this topic from 2006

    http://www.sqlservercentral.com/Forums/Topic302100-145-1.aspx


    N 56°04'39.16"
    E 12°55'05.25"

  • Peso (7/25/2009)


    Or see this topic from 2006...

    Refreshers are never a bad thing. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Of course not!

    I just googled the keywords and the link popped up.


    N 56°04'39.16"
    E 12°55'05.25"

  • Heh... sorry Peter... Thought you were trying to flatten the trajectory of a pork chop for some reason. 😀 I'm a bit touchy because I'm trying to quit smoking, again. Right now, I hate just about everything... even the damned TV. I was watching "Moonshot" to try to take my mind off of smokes and guess what the freakin' astronauts and their wives were doing? SMOKING!! :sick: I can't win. 😛 Scotty, beam me the hell out of here! :pinch:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Great article!

    I had downloaded the following scripts for a very similar database schema, that speeds up "select" queries against the blocks table. It increased my "lookups per second" from around 45 to 3500, really.

    I dropped indexes on GeoLiteCity_blocks table and created the following two indexes:

    CREATE NONCLUSTERED INDEX [NCL_END_IP] ON [dbo].[GeoLiteCity_blocks]

    (

    [endIpnum] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [NCL_IND_STARTIP] ON [dbo].[GeoLiteCity_blocks]

    (

    [startIpnum] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    Then to find the LocationID or any other column for a specific IP number is as follows:

    declare @IP bigint

    set @IP=123456789

    select * from GeoLiteCity_blocks

    where startIpnum=(select max(startIpnum) from GeoLiteCity_blocks where startIpnum<=@IP)

    and endIpnum >=@IP

    Please check this on a development environment with backups and tell me the results. Any comparisons would be great actually, I never had the time to do that.

    By the way, I could not find the original source of this trick. If anyone knows, you may add it also.

  • http://www.dyndnsservices.com/knowshow.aspx?ID=4 has a complete solution for the maxmind geo city database

  • Two things on the PARSENAME conversion routines:

    First, T-SQL scalar functions have no place in this world - the routines would be much more happily expressed in an in-line table-valued function.

    Second, PARSENAME is non-deterministic, despite what Books Online says. See https://connect.microsoft.com/SQLServer/feedback/details/488058/parsename-incorrectly-documented-as-deterministic

    A deterministic iTVF IP-to-BIGINT function:

    CREATE FUNCTION [dbo].[itfn_IPv4_Octets]

    (@IP NVARCHAR(15))

    RETURNS TABLE

    WITH SCHEMABINDING

    AS

    RETURN

    SELECT octet1 = CASE WHEN LEN(@IP) - LEN(REPLACE(@IP, N'.', N'')) = 3 THEN ISNULL(CONVERT(NVARCHAR(3), SUBSTRING(@IP, 1, (CHARINDEX(N'.', @IP, 1)) - 1)), N'') ELSE N'' END,

    octet2 = CASE WHEN LEN(@IP) - LEN(REPLACE(@IP, N'.', N'')) = 3 THEN ISNULL(CONVERT(NVARCHAR(3), SUBSTRING(@IP, (CHARINDEX(N'.', @IP, 1)) + 1, (CHARINDEX(N'.', @IP, (CHARINDEX(N'.', @IP, 1)) + 1)) - (CHARINDEX(N'.', @IP, 1)) - 1)), N'') ELSE N'' END,

    octet3 = CASE WHEN LEN(@IP) - LEN(REPLACE(@IP, N'.', N'')) = 3 THEN ISNULL(CONVERT(NVARCHAR(3), SUBSTRING(@IP, (CHARINDEX(N'.', @IP, (CHARINDEX(N'.', @IP, 1)) + 1)) + 1, (CHARINDEX(N'.', @IP, (CHARINDEX(N'.', @IP, (CHARINDEX(N'.', @IP, 1)) + 1)) + 1)) - (CHARINDEX(N'.', @IP, (CHARINDEX(N'.', @IP, 1)) + 1)) - 1)), N'') ELSE N'' END,

    octet4 = CASE WHEN LEN(@IP) - LEN(REPLACE(@IP, N'.', N'')) = 3 THEN ISNULL(CONVERT(NVARCHAR(3), SUBSTRING(@IP, (CHARINDEX(N'.', @IP, (CHARINDEX(N'.', @IP, (CHARINDEX(N'.', @IP, 1)) + 1)) + 1)) + 1, LEN(@IP) - (CHARINDEX(N'.', @IP, (CHARINDEX(N'.', @IP, (CHARINDEX(N'.', @IP, 1)) + 1)) + 1)))), N'') ELSE N'' END;

    That happens to split the IP address into octets, so converting further into a BINARY(4) - as suggested earlier - is not too hard:

    SELECT CONVERT

    (

    BINARY(4),

    CHAR(FN.octet1) +

    CHAR(FN.octet2) +

    CHAR(FN.octet3) +

    CHAR(FN.octet4)

    )

    FROM [dbo].[itfn_IPv4_Octets] ('162.74.5.51') FN;

    Conversion from BIGINT to dotted-octets:

    CREATE FUNCTION dbo.IntToIP

    (@IP BIGINT)

    RETURNS TABLE

    WITH SCHEMABINDING

    AS

    RETURN

    SELECT ip_address =

    CONVERT(VARCHAR(3), ((@IP & 0xFF000000) / 256 / 256 / 256)) + '.' +

    CONVERT(VARCHAR(3), ((@IP & 0xFF0000) / 256 / 256)) + '.' +

    CONVERT(VARCHAR(3), ((@IP & 0xFF00) / 256)) + '.' +

    CONVERT(VARCHAR(3), (@IP & 0xFF));

    Notwithstanding some icky implicit conversions, that function will decode the BINARY(4) representation too.

    Now those functions might look a bit hairy, but being deterministic, they're fully foldable, so the estimated execution plans for the following statements:

    SELECT * FROM dbo.itfn_IPv4_Octets(N'162.74.5.51');

    SELECT * FROM dbo.IntToIP(0xA24A0533);

    ...are both Constant Scans, with the answers fully listed in the operator - all the maths is done at compilation time, not execution time...which I think is remarkable.

  • Jeff Moden (7/25/2009)


    ...I'm a bit touchy because I'm trying to quit smoking, again...

    Good job I'm never giving up 😉

  • Very good article.

    In practice, have you checked the accuracy against other geocoding techniques? I'm curious with users who do not posess static IP addresses whether IP resolves to the Telco location rather than the actual user.:Whistling:

    Does anyone have insight on this?

  • i must have tunnel vision this morning;

    i didn't see any BULK insert scripts for the files; after downloading the csv files, i see they are UNIX style(slash r for the row delimiter, right?), but they are also dbl-quote delimited, so i need a format file.

    did anyone already do this, or do i need to get some coffee and sweat it out?

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (4/16/2010)


    i must have tunnel vision this morning;

    i didn't see any BULK insert scripts for the files; after downloading the csv files, i see they are UNIX style(slash r for the row delimiter, right?), but they are also dbl-quote delimited, so i need a format file.

    did anyone already do this, or do i need to get some coffee and sweat it out?

    http://www.dyndnsservices.com/knowshow.aspx?ID=4 has the bulk insert/import and the required format files etc. Also importing the data can be a little like beating a dead horse unless you restore correct ASCII formatting to the files. Just open and save them using wordpad before importing.

  • Mark-101232 (7/24/2009)


    Jeff Moden (7/24/2009)


    Mark (7/13/2009)


    I think both functions can be simplified

    create function dbo.ConvertIp2Num(@ip nvarchar(15))

    returns bigint

    as

    begin

    return ((cast(parsename(@ip,4) as bigint)*256+

    cast(parsename(@ip,3) as bigint))*256+

    cast(parsename(@ip,2) as bigint))*256+

    cast(parsename(@ip,1) as bigint)

    end

    That won't return the correct answer... the octets must be multiplied by powers of 256.

    select dbo.ConvertIp2Num('192.15.10.125') -- gives 3222211197

    The nesting of the multiplys gives the effect of multiplying by powers of 256. (Look at the brackets)

    Surprisingly, the function above seemed to take longer on average than the function below (which is what was shown earlier):

    CREATE FUNCTION [dbo].[fnDot2LongIP]

    (@ipaddr varchar(15))

    RETURNS bigint

    AS

    BEGIN

    DECLARE @longip bigint

    SELECT @longip = CAST(PARSENAME(@ipaddr, 1) AS bigint) % 256

    + CAST(PARSENAME(@ipaddr, 2) AS bigint) % 256 * 256

    + CAST(PARSENAME(@ipaddr, 3) AS bigint) % 256 * 65536

    + CAST(PARSENAME(@ipaddr, 4) AS bigint) % 256 * 16777216

    RETURN @longip

    END

    As for the modulus operations, I tried the function with and without them. It seemed slightly quicker WITH them rather than without them. Does anyone else have the same results?

    [font="Tahoma"]Zycon - The Search Engine for Manufacturers[/url]. Helping Engineers and Technical Buyers Locate Aluminum Extrusions, Metal Stampers, Plastic Fabricators, Valves, and More.[/font]

Viewing 15 posts - 16 through 30 (of 52 total)

You must be logged in to reply to this topic. Login to reply