duplicate entries in a field

  • I have a field that holds address and city. For some reason the app populated the city twice- for example: 123 Anywhere Place, NY, NY but it should be 123 Anywhere Place, NY. The app is fixed, but now I have to fix all the records this has affected(about 2000) - How would I write an update statement that only updates the records with duplicate cities?

     

    Thanks!

    Thanks!

  • update Customer

    set address = LEFT (address, LEN(address) - CHARINDEX(',', REVERSE(address)))

  • TRY THIS

    Assumed ',' will be the separator between the duplicates and repeated city name is the last word in the Address

    Ex: Updated Name column in Tab_Test table

    UPDATE TAB_TEST

    SET NAME=reverse(substring(reverse(rtrim(name)),charindex(',',reverse(rtrim(name)))+1, len(reverse(rtrim(name))) ))

  • You can try this also

    update Tab_test

    set name=LEFT (name, LEN(rtrim(name)) - CHARINDEX(',', REVERSE(rtrim(name))))

    Added rtrim to Nagabhushanam  approach

     

  • Thanks, but both of these update statements get rid of both instances of the city. I need to retain one of the cities. Any ideas?

    Thanks!

    Thanks!

  • R u sure???

    LEFT

    (name, LEN(rtrim(name)) - CHARINDEX(',', REVERSE(rtrim(name)))) works fine for me.

     

    Nice code BTW.... I need to start using this reverse function more often...

  • I assumed that duplicate city is separated by ','

    If not, what are all the characters separating two values.

    If both are not separated by any of the characters then what is the length of city is it variable?

  • Sorry- you are correct, these statements do work. When I tested it - it showed me also the records that do not have a city whatsoever. But it does work on eliminating one instance of the city on the records with dups. Thank you so much for you help!!

    Thanks!

  • ok- I just tried it in a test table and  it doesn't work.

    This is the update statement I'm using:

    update Tab_test

    set name=LEFT (name, LEN(rtrim(name)) - CHARINDEX(',', REVERSE(rtrim(name))))

     

    Records that were 123 Anywhere Pl,NY,NY are now

    123 Anywhere Pl,NY (which is good)

    But the ones that were correct are now

    123 Anywhere Pl

     

    I'm not sure how to make this work.

     

    thanks!

    Thanks!

  • I created a function to count the commas in the address field like

    Create Function CountCommas(@str varchar(4000)) returns int as

    begin

    Declare @i int,

     @count int

    set @i=1

    set @count=0

    while @i>0

    begin

     set @i =charindex(',',@str)

     set @STR=substring(@str,@i+1,len(@str))

     set @count=@count+1

    end

    return @count-1

    end

    Now update statment will be

    update Tab_test

    set name=case dbo.countcommas(name) when 0 then name  when 1 then name else LEFT (name, LEN(rtrim(name)) - CHARINDEX(',', REVERSE(rtrim(name)))) end

    This statement also have the limitation that if Address contains more than 1 comma it fails

     

  • So, that wouldn't work either for my situation- right? I need to get rid of the second city after the second comma...but if there is only one comma, leave it be. I'm not sure if this is possible.

     

    Thanks!

    Thanks!

  • Anita,

    your application is not fixed!!!

    As I can see user enters parts of address into separate fields, and application concatenates it into one string before saving it in database.

    STOP IT!

    Fix the application and probably your database as well.

    Store information in the smallest bits are available for you.

    It's not a big deal to concatenate parts into one string when you really need one string.

    But it becomes a real headache when you need to extract part of your string for seaching, grouping, etc.

    For example, when you need all addresses from NY city, but not from NY state. And you must consider that strings "N.Y.", "New York", "New York City", "N Y" are probably the same as "NY" from business point of view. And not to mess up with "123 NY Place, Freaky Town"

    _____________
    Code for TallyGenerator

  • This is a vendor product - and they are not willing to take the city out of that field.This  The  city is also stored in it's  own separate field and believe me I've tried to get them to fix it so that the city is not stored also with the location...but no luck. For now I have to live with it and clean up the mess.

    Anyway, thank you everyone for your help, but I've figured out a way to fix it. Since there are only 6 possible cities...I ran 6 update statement that look like this and it's worked:

    Update    table

    Set location = Case when Location like

    '%,NY,NY' then LEFT (location, LEN(rtrim(location)) - CHARINDEX(',', REVERSE(rtrim(location))))

    Else location

    end

    Thanks!

  • If vendor could fix one error then they could fix and another one.

    But if you have city stored in separate column your task becomes really easy:

    Update    table

    Set location = REPLACE (Location, ',' + City + ','+ City, ','+ City)

    Check out if you need space after comma in ','

    _____________
    Code for TallyGenerator

  • Try this,

    should work with any duplicate comma separated endings

    UPDATE 

      table

    SET 

      Location=LEFT (Location, LEN(RTRIM(Location)) - CHARINDEX(',', REVERSE(RTRIM(Location))))

    WHERE

      CHARINDEX(

     SUBSTRING(

      RTRIM(Location),

      LEN(LEFT (Location, LEN(rtrim(Location)) - CHARINDEX(',', REVERSE(RTRIM(Location)))))+1,

            CHARINDEX(',', REVERSE(RTRIM(Location)))),

     LEFT(Location,LEN(RTRIM(Location)) - CHARINDEX(',', REVERSE(RTRIM(Location))))

       )>0

     

    Magnus W

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply