September 21, 2004 at 4:09 am
I have a nice problem. There is a char(255) field in a 3rd-party database that contains, among other things, an IP address that I would like to extract. But the IP address does not appear in the field at any fixed location; nor is it always preceded or followed by anydistinguishing characters.
How can I extract the IP address easily?
Regular expressions is one way, but all of the implementations I've found use functions, and I am on a SQL 7 server, so I cannot set up any user-defined functions.
Any assistance would be wolcome.
-steve
September 21, 2004 at 4:54 am
Easily? Without any distinguishing charachteristics? Can't be done "easily". ![]()
Anyway, you'll have to loop through the entire string until you find the ip (or at least by best guess believe you have found it)
The thought is this - find the first period in the string, see if the preceding 3 charachters are numbers, if they are, we hope this is the first octet in the ip, so we grab it from there. If it's not a number, we go look for the next period in the string. Repeat until there are no more periods.
Since there can't be a function, you may implement it as a stored procedure instead..
-- example
declare @s-2 varchar(255), @i int, @j-2 int
set @s-2 = 'long.. . . string from 123.123.123.123 that is placed here by chance. It could very well be somewhere else as well.'
set @i = 1
set @j-2 = len(@s)
while @i <= @j-2
begin
if (select isnumeric(substring(@s, (charindex('.', @s-2, @i) - 3), 3))) = 1 -- triplet is a number
begin
select substring(@s, (charindex('.', @s-2, @i) - 3), 15) -- grab the ip string
set @i = @j-2
end
else if isnumeric(substring(@s, (charindex('.', @s-2, @i) - 3), 3)) = 0 -- not a number
begin
set @i = @i + charindex('.', @s-2, @i) -- move offset so we can find the next period
end
end
Good luck
/Kenneth
September 21, 2004 at 7:38 am
Try this for a laugh
SELECT COALESCE(
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),15),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),14),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),14),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),14),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9].[0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9][0-9].[0-9].[0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9].[0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9][0-9].[0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9].[0-9][0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9].[0-9][0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9][0-9][0-9].[0-9].[0-9].[0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),14),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9][0-9].[0-9].[0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9][0-9].[0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9].[0-9][0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9][0-9].[0-9].[0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9][0-9].[0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9].[0-9][0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9][0-9].[0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9].[0-9][0-9][0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9].[0-9][0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9][0-9].[0-9].[0-9].[0-9]%',[column]),8),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),13),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9][0-9].[0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9].[0-9][0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9][0-9].[0-9].[0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),12),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9][0-9].[0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9].[0-9][0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9][0-9].[0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9].[0-9][0-9][0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9].[0-9][0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9].[0-9][0-9].[0-9].[0-9]%',[column]),8),
SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9][0-9]%',[column]),11),
SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9][0-9].[0-9][0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9][0-9].[0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9].[0-9][0-9][0-9]%',[column]),10),
SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9].[0-9][0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9][0-9].[0-9]%',[column]),8),
SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9].[0-9][0-9][0-9]%',[column]),9),
SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9].[0-9][0-9]%',[column]),8),
SUBSTRING([column],PATINDEX('%[0-9].[0-9].[0-9].[0-9]%',[column]),7),
'')
FROM
Far away is close at hand in the images of elsewhere.
Anon.
September 22, 2004 at 1:18 pm
I agree it is a messy search for an IP string. Consider a string like this:
DECLARE @string varchar(255)
SET @string = 'Title: Lawn Mower. Cost: $125.67. IP Address: 123.12.1.16'
There's a false positive in this string. I think the pure characteristics of an IP address
are: The IP string is all numbers and periods (period: aka point aka full-stop). There are
exactly 3 periods, the length of any numeric substring is between 1 and 3, and the total length of
the IP string is between 7 and 15. In fact, the value of each numeric substring must be
between 0 and 255.
-- Here's the plan for the stored procedure:
-- 1. Find iX = location of first '.'
-- 2. Set iY = iX-1; decrement while char at iY is NUMERIC, setting iBegin = iY
-- 3. Set iX2, iX3 = Location of second and third '.'
-- 4. Set iY = iX3+1; increment while char at IY is NUMERIC, setting iEnd = iY
-- 5. You have the begin and end. Calc IPLENGTH = end - begin + 1; get the candidate IP address
-- 6. Additional checks; add these if the above returns strings that are not the IP address:
-- - All characters from IX+1 to IX2-1 are numeric
-- - All characters from IX2+1 to IX3-1 are numeric
-- - Length of string is <= 15 (eg: 123.123.123.123)
-- - Length of string is >= 7 (eg: 1.2.3.4)
-- - Etc., etc. I'd code as few edits as necessary to get a reasonable answer.
-- 7. If the found string fails the edit, then resume at the next period.
DECLARE @work varchar(255), @string varchar(255)
DECLARE @ix int, @ix2 int, @ix3 int, @iy int, @ibegin int, @iend int
DECLARE @len int, @iplength int, @ipstring, @valid int
-- 1. Find iX = Location of first '.'
SET @work = @string
SET @len = LEN(@work)
SET @ix = CHARINDEX('.',@work,1)
SET @valid = 1
WHILE @valid = 1
BEGIN
-- 2. Set iY = iX-1; decrement while char at iY is NUMERIC, setting iBegin = iY
SET @iy = @ix -1
SET @ibegin = @ix
WHILE ISNUMERIC(SUBSTRING(@work,@iy,1))
BEGIN
SET @ibegin = @iy
IF @iy = 1 BREAK
SET @iy = @iy - 1
END
-- 3. Set iX2, iX3 = Location of second and third '.'
SET @ix2 = CHARINDEX('.',@work,@ix+1)
SET @ix3 = CHARINDEX('.',@work,@ix2+1)
-- 4. Set iY = iX3+1; increment while char at IY is NUMERIC, setting iEnd = iY
SET @iy = @ix3 + 1
SET @iend = @ix3
WHILE ISNUMERIC(SUBSTRING(@work,@iy,1))
BEGIN
SET @iend = @iy
IF @iy = @len BREAK
SET @iy = @iy + 1
END
-- 5. You have the begin and end. Calc IPLENGTH = end - begin + 1; get the candidate IP address
SET @iplength = @iend - @begin + 1
SET @ipstring = SUBSTRING(@work,@ibegin,@iplength)
SET @valid = 1
-- 6. Additional checks; add these if the above returns strings that are not the IP address:
-- - All characters from IX+1 to IX2-1 are numeric
-- - All characters from IX2+1 to IX3-1 are numeric
-- - Length of string is <= 15 (eg: 123.123.123.123)
-- - Length of string is >= 7 (eg: 1.2.3.4)
-- 7. IF THE STRING FAILS ANY EDIT(S) - SET @valid = 0
-- and keep looking starting at the next '.'
IF @valid = 0
BEGIN
SET @ix = @ix2
END
END
-- Return to the caller: @ipstring
Bob Monahon
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply