Identifying repeating values in a string

  • The address field in my User table has data like:

    1905 1905 S OAK ST

    314 ROSSELL AVE

    252 252 HIGH MEADOWS ST

    1402 1402 TOWNSEND CT

    2021 MAHAN AVE

    347 347 N 7 AVE

    As you see we have repeating House #s here. I would like to write a query to:

    1.Give me addresses and its counts where the house # has been repeated.

    2.Correct the repeating house # to have a single house # in the Street Address

    Please help me out here.

  • Hi,

    Assuming that the first characters found in the address field are the house #s then you could do something like this.

    --find addresses with repeating house #s
    select *

    from

    user_table

    where

    substring(address,1,charindex(' ', address)-1) = substring(address, charindex(' ', address) + 1, (charindex(' ', address, charindex(' ', address) + 1)) - charindex(' ', address)-1)

    and

    isnumeric(substring(address,1,charindex(' ', address)-1)) = 1

     
     
    --remove repeating numbers

    update

    user_table

    set

    address = substring(address, charindex(' ', address) + 1, len(address))

    from

    user_table

    where

    substring(address,1,charindex(' ', address)-1) = substring(address, charindex(' ', address) + 1, (charindex(' ', address, charindex(' ', address) + 1)) - charindex(' ', address)-1)

    and

    isnumeric(substring(address,1,charindex(' ', address)-1)) = 1

    I imagine that's one way of doing it and there are no doubt going to be various ways of solving the same problem.  And of course, this will only work if the assumption that the field starts with the house # is correct.

    Hope that helps,

  • Hi there,

    I think this solution will satisfy you:

    First, I made a table with your records -

    /*-- -Object: table [dbo].[tHouses]    -------*/

    CREATE TABLE [dbo].[tHouses] (

     [ID] [smallint] IDENTITY (1,1) NOT NULL,

     [Street] [nvarchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL

    ) ON [PRIMARY]

    GO

    /*-- -Object: primary key [dbo].[tHouses].[PK_tHouses]    -------*/

    ALTER TABLE [dbo].[tHouses] ADD

     CONSTRAINT [PK_tHouses] PRIMARY KEY CLUSTERED

     ([ID]) ON [PRIMARY]

    GO

    SET XACT_ABORT ON

    SET ARITHABORT ON

    SET NUMERIC_ROUNDABORT OFF

    BEGIN TRANSACTION

    DECLARE @error INT

    DECLARE @ptrBinary varbinary(16)

    -- --Table: [dbo].[tHouses]

    SET IDENTITY_INSERT [dbo].[tHouses] ON

    -- --Insert

    INSERT INTO [dbo].[tHouses] ([ID], [Street]) VALUES(1, '1905 1905 S OAK ST ')

    INSERT INTO [dbo].[tHouses] ([ID], [Street]) VALUES(2, '314 ROSSELL AVE ')

    INSERT INTO [dbo].[tHouses] ([ID], [Street]) VALUES(3, '252 252 HIGH MEADOWS ST ')

    INSERT INTO [dbo].[tHouses] ([ID], [Street]) VALUES(4, '1402 1402 TOWNSEND CT ')

    INSERT INTO [dbo].[tHouses] ([ID], [Street]) VALUES(5, '2021 MAHAN AVE ')

    INSERT INTO [dbo].[tHouses] ([ID], [Street]) VALUES(6, '347 347 N 7 AVE ')

    SET IDENTITY_INSERT [dbo].[tHouses] OFF

    COMMIT TRANSACTION

     

    After that I script an sql to give you the slution to pt.1 and there you have commented the update of your records -

    declare @first_space tinyint

    declare @second_space tinyint

    declare @street_re nvarchar(50)

    declare @nr1 nvarchar(50)

    declare @nr2 nvarchar(50)

    declare @street nvarchar(50)

    declare @id smallint

    DECLARE nr_cursor CURSOR SCROLL FOR SELECT ID,Street FROM tHouses

     OPEN nr_cursor

     FETCH FIRST FROM nr_cursor INTO @id,@street

     WHILE @@FETCH_STATUS = 0

     BEGIN

      select @first_space = patindex('% %',@street)

      select @nr1 = left(@street,@first_space) -- first number

      select @street_re = right(@street,len(@street)-@first_space+1)

      select @second_space = patindex('% %',@street_re)

      select @nr2 = left(@street_re,@second_space) -- second number

      

      if (IsNumeric(@nr2)=1) and (@nr1=@nr2)

      begin

       select @id as ID,@street_re,@nr1 as FirstNumber,@nr2 as SecondNumber,@street as Street

    --   update tHouses set Street = @street_re where ID=@id

      end

      

       FETCH NEXT FROM nr_cursor INTO @id,@street

     END

    CLOSE nr_cursor

    DEALLOCATE nr_cursor

    In Theory, theory and practice are the same...In practice, they are not.
  • That looks like it will work.  Of course, you want to avoid cursors where possible

  • Yeah, I know. I have code to replace cursors with table variables, but this solution is assumed to be only for building or test environment. On product environment i have no cursors.

    In Theory, theory and practice are the same...In practice, they are not.
  • Thanks!!

    I used Karls solution and it worked well from me. I was able to solve the problem records for now. I had to fire the script on a 2.5MM table to update around 350,000 records. The results was perfect. Appreciate all you help.

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply