Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

Storing IPv4 Addresses for Performance Expand / Collapse
Author
Message
Posted Wednesday, August 19, 2009 8:31 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Thursday, October 9, 2014 1:02 PM
Points: 6,032, Visits: 5,284
Andrew (8/19/2009)
tdcheek (8/19/2009)
I've always liked storing IP's as binary(4) - still 4 bytes, but simpler conversions:


I like this format as well, since that's essentially what IP addresses are anyway. Like using INT and four separate TINYINT columns, it is efficient on space and has implicit validation. However, it has a few other advantages I like:

1) The queries I perform are very simple and efficient. Most of them are sargable. You don't have the problem of half the address space being negative that you have with INT. It is very easy to say:

WHERE @IPAddress BETWEEN RangeLowerLimit AND RangeUpperLimit

3) It's relatively simple to convert to dotted notation if necessary.

Andrew


I think the conversion to dotted notation is pretty easy either way, but I agree we get some built-in validation. I think the INT solution could work in a WHERE clause, but I think it would be more complex and have to take into account that and 128.0.0.0 we went negative, that being said, I think the binary(4) method is probably better but still all in all a thought provoking article.

CEWII
Post #773572
Posted Wednesday, August 19, 2009 8:41 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, December 22, 2010 12:12 AM
Points: 21, Visits: 81
tdcheek (8/19/2009)
I've always liked storing IP's as binary(4)

I had considered using binary(4) however since bitwise operations cannot be performed on binary, this made integer my final choice.

HansVE (8/19/2009)
1. You do not know of subnetting?

Subnetting can easily be achieved using bitmasking on integer IPs. The following will yield 192.168.1.0 :
SELECT [dbo].[IPAddressToString]([dbo].[IPAddressToInteger]('192.168.1.1') & [dbo].[IPAddressToInteger]('255.255.255.0'))

HansVE (8/19/2009)
A "ip_int_to_ip_string computed column" will impede performance, as will the functions in the article.

Although using multiple tinyint fields requires no computational conversion, it does require that every ORDER BY, UNION, WHERE, etc operation be done four times, whilst a single integer field requires only one operation in all instances.

Also, if you wish to submit your IP address as a varchar(15) to the stored procedure to be then stored as four tinyint field within the table, you will require the same string splitting conversion as a single integer. Selecting rows to return IP addresses as varchar(15) will also require similar conversions. The only computational difference is mostly the avoidance of a rather efficient bitmask operation, which in my opinion, far outweights the costs of multiple operations when joining, filtering or ordering tables by IP addresses stored as four tinyint fields.
Post #773580
Posted Wednesday, August 19, 2009 8:45 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 9:29 AM
Points: 4,436, Visits: 6,337
could you please redo your testing using a char(17) datatype instead of a varchar(17)?

Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #773588
Posted Wednesday, August 19, 2009 8:48 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, December 22, 2010 12:12 AM
Points: 21, Visits: 81
TheSQLGuru (8/19/2009)
could you please redo your testing using a char(17) datatype instead of a varchar(17)?

I'm not sure what would be the point since IP addresses (when used as strings) are not fixed length.
Post #773590
Posted Wednesday, August 19, 2009 8:52 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Tuesday, September 9, 2014 6:08 AM
Points: 77, Visits: 169
lhowe (8/19/2009)
Darn it HansVE, you beat me to the punch! I was going to say exactly the same thing. I don't see why someone would go to all of the trouble to convert IPv4 values this way, and back again, when it's far easier just to store them as four tinyint values.

There could be a performance benefit. My own tests a while back indicated that one int comparison was equally efficient as smallint or tinyint, probably because you're using a computer with 32-bit registers. The bit datatype stunk - but this was on SQL Server 2000 or maybe even 7. Still, I prefer to put bit in tinyint.

I agree however that four octets is useful in many ways. There's nothing to stop you storing an int or binary(4) version as well.

I spent a little while devising an encoding of 6 character alphanumeric strings (09AZ) into int, for an indexed key field. I got slightly poorer performance with int than with char(6) even not counting the cost of conversion, which wasn't what I expected. I have increased my knowledge of indexing since, too (apparently non-declared-unique indexes have 8 bytes arbitrarily added to the key??) Perhaps I got it wrong.

Anyway, maybe we should be thinking about IPv6, where the address is, what, 16 bytes? Plus refinements that the Wikipedia page hasn't yet communicated to me. Binary(16) for that, I think. Although IPv4 is not going away quickly yet.
Post #773595
Posted Wednesday, August 19, 2009 8:53 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 9:29 AM
Points: 4,436, Visits: 6,337
Jean-Sebastien Carle (8/19/2009)
TheSQLGuru (8/19/2009)
could you please redo your testing using a char(17) datatype instead of a varchar(17)?

I'm not sure what would be the point since IP addresses (when used as strings) are not fixed length.


The point would be exactly what your article was about - what offers the best performance. There is a chance that fixed storage will perform better (however slightly just like parsename as opposed to substrings). It should be very easy to modify your existing test suite to try this option out. It wasn't like I was asking you to completely reinvent the wheel. :)


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #773597
Posted Wednesday, August 19, 2009 9:26 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, November 18, 2014 12:25 PM
Points: 24, Visits: 253
rja.carnegie (8/19/2009)
lhowe (8/19/2009)
Darn it HansVE, you beat me to the punch! I was going to say exactly the same thing. I don't see why someone would go to all of the trouble to convert IPv4 values this way, and back again, when it's far easier just to store them as four tinyint values.

There could be a performance benefit. My own tests a while back indicated that one int comparison was equally efficient as smallint or tinyint, probably because you're using a computer with 32-bit registers. The bit datatype stunk - but this was on SQL Server 2000 or maybe even 7. Still, I prefer to put bit in tinyint.

I agree however that four octets is useful in many ways. There's nothing to stop you storing an int or binary(4) version as well.

I spent a little while devising an encoding of 6 character alphanumeric strings (09AZ) into int, for an indexed key field. I got slightly poorer performance with int than with char(6) even not counting the cost of conversion, which wasn't what I expected. I have increased my knowledge of indexing since, too (apparently non-declared-unique indexes have 8 bytes arbitrarily added to the key??) Perhaps I got it wrong.

Anyway, maybe we should be thinking about IPv6, where the address is, what, 16 bytes? Plus refinements that the Wikipedia page hasn't yet communicated to me. Binary(16) for that, I think. Although IPv4 is not going away quickly yet.


Yes, we now use binary(16) because we may need to support IPV6 in the future. The binary datatype has the advantage of working with SUBSTRING() and many other string functions. SQL Server 2008 also now has the style parameter of 2 to support conversion to and from strings without using undocumented and/or poorly performing functions.
Post #773656
Posted Wednesday, August 19, 2009 9:31 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Thursday, November 20, 2014 8:21 AM
Points: 21, Visits: 46
I'm with HansVE and the others on the TinyInt split of the address. You can index those columns on any table that includes them and then join each of the four pieces together very efficiently. You can optionally store the address as a varchar as well if necessary. The cool thing about this is that you can easily search the addresses within just 1 of the four parts (or any combo thereof). If you are worried about space over functionality and low TCO these days, pursue an archive/purging option.

BTW, very cool example of bit shifting.
Post #773665
Posted Wednesday, August 19, 2009 9:47 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, December 22, 2010 12:12 AM
Points: 21, Visits: 81
TheSQLGuru (8/19/2009)
The point would be exactly what your article was about - what offers the best performance. There is a chance that fixed storage will perform better (however slightly just like parsename as opposed to substrings). It should be very easy to modify your existing test suite to try this option out. It wasn't like I was asking you to completely reinvent the wheel. :)

Right. However, char(15) introduces a new element, trailing whitespace. Wouldn't you consider that as a reason to negate performance benefits char(15) could bring over varchar(15)?
Post #773686
Posted Wednesday, August 19, 2009 10:07 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 9:29 AM
Points: 4,436, Visits: 6,337
Jean-Sebastien Carle (8/19/2009)
TheSQLGuru (8/19/2009)
The point would be exactly what your article was about - what offers the best performance. There is a chance that fixed storage will perform better (however slightly just like parsename as opposed to substrings). It should be very easy to modify your existing test suite to try this option out. It wasn't like I was asking you to completely reinvent the wheel. :)

Right. However, char(15) introduces a new element, trailing whitespace. Wouldn't you consider that as a reason to negate performance benefits char(15) could bring over varchar(15)?


only 1 rtrim is required:

declare @ip char(15)
set @ip = '1.1.1.1'
select '/' + PARSENAME(@IP,4) + '/'
select '/' + PARSENAME(@IP,3) + '/'
select '/' + PARSENAME(@IP,2) + '/'
select '/' + PARSENAME(@IP,1) + '/'
select '/' + RTRIM(PARSENAME(@IP,1)) + '/'



Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #773699
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse