Problem segregating third occurence from a column

  • Hi ,

    I have a address column with data like

    5605 PRINCETON AVE STE B* COLUMBUS* GEORGIA*31904*(01/11/06 - 03/26/09)

    for each customer.

    I want to break it into

    address city state

    5605 PRINCETON AVE STE B COLUMBUS GEORGIA

    i can get first two using chardindex and substring,

    but the third occurrence is giving me real hard times.

    I will appreciate if somebody can help me with this.

    I am pasting the the query for first two occurence ,

    You can just copy and paste ,to see the results if you like

    FIRST OCCURENCE

    select substring('504 cleveland avenue * honlulu* Hawai*31904*(01/11/06 - 03/26/09)',1,charindex('*','504 cleveland avenue * honlulu* Hawai*31904*(01/11/06 - 03/26/09)',1)-1)

    SECOND OCCURENCE

    select substring('504 cleveland avenue * honlulu* Hawai*31904*(01/11/06 - 03/26/09)',charindex('*','504 cleveland avenue * honlulu* Hawai*31904*(01/11/06 - 03/26/09)',1)+1,CHARINDEX('*', SUBSTRING('504 cleveland avenue * honlulu* Hawai*31904*(01/11/06 - 03/26/09)',

    CHARINDEX('*','504 cleveland avenue * honlulu* Hawai*31904*(01/11/06 - 03/26/09)', 1) + 1,

    150)) -1 )

    THIRD OCCURENCE ???????????????????????????????????

    thanks

  • It may need some twiking, but does this help?

    USE [SandBox]

    GO

    /****** Object: UserDefinedFunction [dbo].[DelimitedSplit2] Script Date: 05/07/2009 10:03:14 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE function [dbo].[DelimitedSplit2] (

    @pString varchar(max),

    @pDelimiter varchar(2)

    )

    returns table

    as

    return

    with

    a1 as (select 1 as N union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1),

    a2 as (select

    1 as N

    from

    a1 as a

    cross join a1 as b),

    a3 as (select

    1 as N

    from

    a2 as a

    cross join a2 as b),

    a4 as (select

    1 as N

    from

    a3 as a

    cross join a2 as b),

    Tally as (select top (DATALENGTH(@pString))

    row_number() over (order by N) as N

    from

    a4),

    ItemSplit(

    ItemOrder,

    Item

    ) as (

    SELECT

    N,

    SUBSTRING(@pDelimiter + @pString + @pDelimiter,N + DATALENGTH(@pDelimiter),CHARINDEX(@pDelimiter,@pDelimiter + @pString + @pDelimiter,N + DATALENGTH(@pDelimiter)) - N - DATALENGTH(@pDelimiter))

    FROM

    Tally

    WHERE

    N < DATALENGTH(@pDelimiter + @pString + @pDelimiter)

    AND SUBSTRING(@pDelimiter + @pString + @pDelimiter,N,DATALENGTH(@pDelimiter)) = @pDelimiter --Notice how we find the delimiter

    )

    select

    row_number() over (order by ItemOrder) as ItemID,

    Item

    from

    ItemSplit

    GO

    declare @Address varchar(256);

    set @Address = '5605 PRINCETON AVE STE B* COLUMBUS* GEORGIA*31904*(01/11/06 - 03/26/09)'

    select * from dbo.DelimitedSplit2(@Address, '*')

  • I think you should use some kind of slit function.

    I use this one, but I'm sure there's a better one out there.

    CREATE FUNCTION [dbo].[fSplit]

    (

    @List VARCHAR(6000),

    @SplitOn VARCHAR(5)

    )

    RETURNS @RtnValue TABLE

    (

    ID INT identity(1,1),

    Value VARCHAR(100)

    )

    AS

    BEGIN

    WHILE (Charindex(@SplitOn,@List)>0)

    BEGIN

    INSERT INTO

    @RtnValue (value)

    SELECT

    Value = ltrim(rtrim(Substring(@List,1,Charindex(@SplitOn,@List)-1)))

    SET @List = Substring(@List,Charindex(@SplitOn,@List)+len(@SplitOn),len(@List))

    END

    INSERT INTO

    @RtnValue (Value)

    SELECT

    Value = ltrim(rtrim(@List))

    RETURN

    END

    You can use it like this:

    SELECT dbo.fSplit(MyAddressField,'*')

    FROM MyTable

    -- Gianluca Sartori

  • Gianluca Sartori (5/7/2009)


    I think you should use some kind of slit function.

    I use this one, but I'm sure there's a better one out there.

    CREATE FUNCTION [dbo].[fSplit]

    (

    @List VARCHAR(6000),

    @SplitOn VARCHAR(5)

    )

    RETURNS @RtnValue TABLE

    (

    ID INT identity(1,1),

    Value VARCHAR(100)

    )

    AS

    BEGIN

    WHILE (Charindex(@SplitOn,@List)>0)

    BEGIN

    INSERT INTO

    @RtnValue (value)

    SELECT

    Value = ltrim(rtrim(Substring(@List,1,Charindex(@SplitOn,@List)-1)))

    SET @List = Substring(@List,Charindex(@SplitOn,@List)+len(@SplitOn),len(@List))

    END

    INSERT INTO

    @RtnValue (Value)

    SELECT

    Value = ltrim(rtrim(@List))

    RETURN

    END

    You can use it like this:

    SELECT dbo.fSplit(MyAddressField,'*')

    FROM MyTable

    Look up, you'll find one without a while loop.

  • ... ooops!

    I vote for Lynn's solution! There's also an interesting thread on this topic

    http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx

    -- Gianluca Sartori

  • Thanks guys for such a quick response,

    actually the function might work,i used some myself .

    the problem is the result set has to use a group by because i need to spit the address column for each customer.

    And i want to run that on the table so that i can get the segregation for each customer in the table.

    i will appreciate it

    thanks

  • rayash16 (5/7/2009)


    Thanks guys for such a quick response,

    actually the function might work,i used some myself .

    the problem is the result set has to use a group by because i need to spit the address column for each customer.

    And i want to run that on the table so that i can get the segregation for each customer in the table.

    i will appreciate it

    thanks

    That will require additional work from you. We would need the DDL for the table(s), sample data (in a readily consumable format), expected results based on the sample data.

    You can get help with this by reading the first article I reference below in my signature block.

  • I will be more happy to provider you the DDl and the Data,

    I ahve working on this since yesterday with no results.I will appreciate if you can help.

    My table has around 2000k rows.

    Here is the DDl for the :

    USe [Tempdb]

    --===== If the test table already exists, drop it

    IF OBJECT_ID('TempDB..customer_address','U') IS NOT NULL

    DROP TABLE customer_address

    --===== Create the test table with

    CREATE TABLE customer_address

    (

    Customer_key NVARCHAR(100),

    Provider_complete_Address NVARCHAR(150)

    )

    --===== Insert the test data into the test table

    INSERT INTO customer_address

    (Customer_key,Provider_complete_Address)

    SELECT 'AALEMANSOUR, SIAMAK: 858256221','18141 BEACH BLVD STE 130* HUNTINGTN BCH* CALIFORNIA*92648*(02/02/06 - 03/11/09)' UNION ALL

    SELECT 'ABAD, ERWIN: 256585455 ','1475 BANNISTER ST* YORK* PENNSYLVANIA*17404*(01/11/06 - 06/25/08)' UNION ALL

    SELECT 'ABBOTT, ROGER: 256123456','650 S MAIN ST* RIVER FALLS* WISCONSIN*54022*(01/23/06 - 05/28/08)' UNION ALL

    SELECT 'AGUTO, FELIX: 550839635','15293 AMBERLY DR* TAMPA* FLORIDA*33647*(03/06/06 - 09/26/07)' UNION ALL

    SELECT 'AGEE, DEEP: 21541215','19 E THIRD ST* MAYSVILLE* KENTUCKY*41056*(01/03/06 - 06/10/08)' UNION ALL

    SELECT 'AGOSTONI, DAVID: 515451112','11005 S PARKER RD* PARKER* COLORADO*80134*(01/31/06 - 11/09/07)' UNION ALL

    SELECT 'AGUIRRE, RAMON: 23154512','159 S MAIN AVE* SIOUX CENTER* IOWA*51250*(01/25/06 - 09/01/06)' UNION ALL

    SELECT 'AGUTO, FELIX: 254612542','650 S MAIN ST* RIVER FALLS* WISCONSIN*54022*(01/23/06 - 05/28/08)'

    OUTPUT I Expect is

    Provider_complete_address has to be splitted into address,city and state.

    This is the way i want my output to look.

    Customer_key, address, City, State

    Thanks in advance It will be great help.

    thanks

  • Gianluca Sartori (5/7/2009)


    ... ooops!

    I vote for Lynn's solution! There's also an interesting thread on this topic

    http://www.sqlservercentral.com/Forums/Topic695508-338-1.aspx

    Uh huh... just be careful with what you see in some of the posts on that thread. There's a lot of really good information but some of the testing used data where the position of the delimiters from row to row didn't vary much and the optimizer recognized that resulting in some artificially fast times in some cases. The devil's in the data.

    The originator of that post is actually writing an article on the subject and will likely coalesce some of those findings.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here is the code I threw together:

    CREATE TABLE dbo.customer_address

    (

    Customer_key NVARCHAR(100),

    Provider_complete_Address NVARCHAR(150)

    )

    --===== Insert the test data into the test table

    INSERT INTO customer_address

    (Customer_key,Provider_complete_Address)

    SELECT N'AALEMANSOUR, SIAMAK: 858256221','18141 BEACH BLVD STE 130* HUNTINGTN BCH* CALIFORNIA*92648*(02/02/06 - 03/11/09)' UNION ALL

    SELECT N'ABAD, ERWIN: 256585455 ','1475 BANNISTER ST* YORK* PENNSYLVANIA*17404*(01/11/06 - 06/25/08)' UNION ALL

    SELECT N'ABBOTT, ROGER: 256123456','650 S MAIN ST* RIVER FALLS* WISCONSIN*54022*(01/23/06 - 05/28/08)' UNION ALL

    SELECT N'AGUTO, FELIX: 550839635','15293 AMBERLY DR* TAMPA* FLORIDA*33647*(03/06/06 - 09/26/07)' UNION ALL

    SELECT N'AGEE, DEEP: 21541215','19 E THIRD ST* MAYSVILLE* KENTUCKY*41056*(01/03/06 - 06/10/08)' UNION ALL

    SELECT N'AGOSTONI, DAVID: 515451112','11005 S PARKER RD* PARKER* COLORADO*80134*(01/31/06 - 11/09/07)' UNION ALL

    SELECT N'AGUIRRE, RAMON: 23154512','159 S MAIN AVE* SIOUX CENTER* IOWA*51250*(01/25/06 - 09/01/06)' UNION ALL

    SELECT N'AGUTO, FELIX: 254612542','650 S MAIN ST* RIVER FALLS* WISCONSIN*54022*(01/23/06 - 05/28/08)'

    select

    Customer_key,

    max(case ItemID when 1 then ltrim(rtrim(Item)) else null end) as Addr,

    max(case ItemID when 2 then ltrim(rtrim(Item)) else null end) as City,

    max(case ItemID when 3 then ltrim(rtrim(Item)) else null end) as [State]

    from

    dbo.customer_address

    cross apply dbo.DelimitedSplit2(Provider_complete_Address,'*')

    group by

    Customer_key;

    drop table dbo.customer_address;

  • Hey Lynn ,

    You my man.It worked wonders.

    THANKS A TON MAN

  • DECLARE @addr VARCHAR(1000), @delim CHAR(1), @x INT

    SET @delim = '*'

    SET @addr = '5605 PRINCETON AVE STE B* COLUMBUS* GEORGIA*31904*(01/11/06 - 03/26/09)'

    SET @x = 1

    WHILE CHARINDEX(@delim, @addr, @x)>0

    BEGIN

    SELECT SUBSTRING(@ADDR, @X, CHARINDEX(@delim, @addr, @x)-@x)

    SET @x=(CHARINDEX(@delim, @addr, @x)) + 1

    IF (CHARINDEX(@delim, @addr, @x)=0)

    SELECT SUBSTRING(@ADDR, @X, LEN(@ADDR)-@X+1)

    END

    Saurabh Dwivedy
    ___________________________________________________________

    My Blog: http://tinyurl.com/dwivedys

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537

    Be Happy!
  • saurabh

    this one is a handy script as well.for now i got throught that problem.but surely i can use it at many palces.

    thanks a lot

  • Thanks Rayash. I am sure you are aware and realize it but I will still say that my solution is not elegant: It uses a cursor based approach which is not ideal. But I just wanted to present it because it seemed like a quick solution which could come in handy on occasions.

    Saurabh Dwivedy
    ___________________________________________________________

    My Blog: http://tinyurl.com/dwivedys

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537

    Be Happy!
  • saurabh.dwivedy (5/8/2009)


    DECLARE @addr VARCHAR(1000), @delim CHAR(1), @x INT

    SET @delim = '*'

    SET @addr = '5605 PRINCETON AVE STE B* COLUMBUS* GEORGIA*31904*(01/11/06 - 03/26/09)'

    SET @x = 1

    WHILE CHARINDEX(@delim, @addr, @x)>0

    BEGIN

    SELECT SUBSTRING(@ADDR, @X, CHARINDEX(@delim, @addr, @x)-@x)

    SET @x=(CHARINDEX(@delim, @addr, @x)) + 1

    IF (CHARINDEX(@delim, @addr, @x)=0)

    SELECT SUBSTRING(@ADDR, @X, LEN(@ADDR)-@X+1)

    END

    Using a while loop may not scale as well.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply