Pipe delimited VARCHAR column

  • There is a column in an existing database from which I need to extract data. I'll call it DataCol The data is stored in a pipe delimited format in a VARCHAR(4096) column something like this (note: the "90_90,,,," is one data element):

    50|78|12|90_90,,,,|65|

    It relates to a column (let's call it PosCol) in another table which gives the 0 based position of the data. So in a given record, if PosCol has a 2 in it, it is pointing to the 12 in the above string. I have to JOIN these two tables.

    DataCol is a pretty large table, currently 2 million records, growing to maybe 8 million. I have written a procedure to break apart the pipe delimited values and store them (with a position code and other identifying columns) in separate records in another table. I use a cursor to read through the records, store the pipe delimited column in a variable and use string operations like LEFT(@DataCol,CHARINDEX('|',@DataCol)-1) to insert one record to the other table. Then I use SUBSTRING to remove the first element and do it again until the string is gone. Then I move to the next record and do it all again. Finally, I get to link my new table to the table that has PosCol in it.

    This is pretty slow. I don't believe I'll have time in the day to do the whole table when it grows. Is there some built-in command I am missing to break a pipe delimited character string into an array? Or is there some better way to handle this? I cannot change the existing database, by the way.

    Thanks!

  • using a cursor to do this will be very slow, the better way seems to be to use a tally table and split the data up with this;

    check out this excellent article on how to do this;

    http://www.sqlservercentral.com/articles/TSQL/62867/"> http://www.sqlservercentral.com/articles/TSQL/62867/

  • This looks like great info. Thanks!

  • Hello cvgalante,

    I am facing the same situation here i have a data in the column what you mentioned with pipe delimited. I want that dat ato be inserted into the columns of a table... Could you please provide me the sample code what you have done.

    Thanks,

    Maverick.

  • Maverick, I found this one example. You can ignore the fields in the SELECT statement up to the ,N. The important code for you is the part that breaks up StuGrades, which is the pipe delimited field. Hope this helps you!

    --===== Add a Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    --===== Expand the grades in GBStudents

    IF OBJECT_ID('dbo.PKgrades') IS NOT NULL

    DROP TABLE dbo.PKgrades

    --===== Fix GBStudents for cases where end character is not "|"

    UPDATE GBStudents

    SET StuGrades = RTRIM(StuGrades) + '|'

    WHERE StuGrades<>' '

    AND RIGHT(RTRIM(StuGrades),1) <> '|'

    --===== Get the fields we need and use the Tally table to break the delimiters

    SELECT STUSchoolYear AS PKGSchoolYear

    ,STUSchoolID AS PKGSchoolID

    ,STUTeacherID AS PKGTeacherID

    ,STUCycle AS PKGCycle

    ,STUSheet AS PKGSheet

    ,STUSchoolID AS PKGId

    ,STUStudentID AS PKGStudentID

    ,STULName AS PKGLName

    ,STUFName AS PKGFName

    ,STUMName AS PKGMName

    ,STUGradeLevel AS PKGGradeLevel

    ,STUSection AS PKGSection

    ,STUPeriod AS PKGPeriod

    ,N

    ,SUBSTRING('|'+StuGrades,N+1,CHARINDEX('|','|'+StuGrades,N+1)-N-1) AS PKGGrade

    ,ROW_NUMBER() OVER (ORDER BY STUSchoolYear --==== Row numbers will create the AsgId

    ,STUSchoolID

    ,STUTeacherID

    ,STUCycle

    ,STUSheet

    ,STUStudentID

    ,N) AS RowNum

    INTO PKgrades

    FROM dbo.Tally t

    CROSS JOIN GBStudents

    WHERE N < LEN('|'+StuGrades)

    AND SUBSTRING('|'+StuGrades,N,1) = '|'

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply