Counting the characters in a string before a space

  • if as OP was specific about "spaces"...heres a first cut ....may be a start??;-)

    --drop table #tempcall

    SELECT

    ID = IDENTITY(INT, 1, 1),

    RTRIM(LTRIM(REPLACE(REPLACE(REPLACE(REPLACE(SUBSTRING(calldata,0,7),'-',''),'%',''),'+',''),'/','-'))) as cDate,

    RTRIM(LTRIM(SUBSTRING(calldata,7,9))) as cStartTime,

    RTRIM(LTRIM(SUBSTRING(calldata,19,9))) as cDuration,

    RTRIM(LTRIM(SUBSTRING(calldata,29,6))) as callingparty,

    RTRIM(LTRIM(SUBSTRING(calldata,41,6))) as origCall,

    RTRIM(LTRIM(SUBSTRING(calldata,47,19))) as cDestination,

    REPLACE(RTRIM(LTRIM(SUBSTRING(calldata,34,7))),'*','') as cTimetoAnswer,

    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(RTRIM(LTRIM(SUBSTRING(calldata,41,28))),'A',''),'B',''),'I',''),'E',''),'T',''),'*','') as cDigitsDialed,

    RTRIM(LTRIM(SUBSTRING(calldata,69,6))) as calledparty,

    RTRIM(LTRIM(SUBSTRING(calldata,108,3))) as cSystemID,

    RTRIM(LTRIM(SUBSTRING(calldata,112,19))) as cANI,

    RTRIM(LTRIM(SUBSTRING(calldata,124,21))) as cDNIS,

    RTRIM(LTRIM(SUBSTRING(calldata,148,15))) as cCallID,

    RTRIM(LTRIM(SUBSTRING(calldata,162,2))) as cCallIDSeq

    into #tempcall

    FROM [dbo].[MitelCallTrace]

    WHERE len(calldata) <> 0 AND SUBSTRING(calldata, 0, 7) <> ''

    ;

    --select * from #tempcall

    WITH cdsplit as (

    SELECT ID

    , MAX(CASE WHEN ItemNumber = 1 THEN Item END) p1

    , MAX(CASE WHEN ItemNumber = 2 THEN Item END) p2

    , MAX(CASE WHEN ItemNumber = 3 THEN Item END) p3

    , MAX(CASE WHEN ItemNumber = 4 THEN Item END) p4

    FROM #tempcall

    CROSS APPLY dbo.DelimitedSplit8K(cDestination, ' ') split

    GROUP BY ID

    )

    SELECT tc.ID

    , tc.cDate

    , tc.cStartTime

    , tc.cDuration

    , tc.callingparty

    , tc.origCall

    , tc.cdestination

    , ISNULL(cdsplit.p1, '') AS cd1

    , ISNULL(cdsplit.p2, '') AS cd2

    , ISNULL(cdsplit.p3, '') AS cd3

    , ISNULL(cdsplit.p4, '') AS cd4

    , tc.cTimetoAnswer

    , tc.cDigitsDialed

    , tc.calledparty

    , tc.cSystemID

    , tc.cANI

    , tc.cDNIS

    , tc.cCallID

    , tc.cCallIDSeq

    FROM [#tempcall] AS tc

    INNER JOIN cdsplit

    ON tc.ID = cdsplit.ID

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • I will give this a try in a little while, I appreciate all that have stuck through this post. I will mark answered if this is a solution or even close.

  • Ok... first of all, I made no real attempt to figure out what the names of the fields are. Somewhere, someone has a RECORD LAYOUT of what the actual fields are and so my example is just that... an example.

    Second, someone needs to crown whomever decided that the date of the call should only have month and day. Why did they do that? To save 2 characters? I recommend that someone talk with the person that's providing this data and get that huge mistake repaired right away.

    Shifting gears, we don't actually need SSIS for this. Yes, it's one of the tools that you could use for this but I never do. I've always done these types of simple imports and parses using T-SQL and never got out of the habit.

    Shifting to yet another gear, there are a dozen different ways to import and parse this data. Since the OP appears (I could certainly be mistaken there) to be just starting out in the world of doing such imports, I'm taking the super simple mode in a fashion similar to what JLS did. Again, I didn't hazard a guess as to what the field names are. There are also some fields that I may have missed parsing on because some of what could be 2 fields looked like just one. You need the RECORD LAYOUT to tell and you'll also need the RECORD LAYOUT to figure out how Col05 should be split.

    So, here we go. Step 1 is to import the data. This could be done using a BCP format file but, again, I'm trying to keep it simple until someone actually finds the RECORD LAYOUT so we can be more definitive. My testing was done by saving the file (that the OP provided) in the C:\Temp directory on my desktop box and the BULK INSERT will import from there. Obviously you'll need to change a bit of code for that and you'll need to make the dynamic if you want to make the code handle different files, but all that is comparatively easy to do once you have the hard-coding done and tested.

    Here's a simple method to import the data...

    CREATE TABLE #MyHead (CDRData VARCHAR(300))

    BULK INSERT #MyHead

    FROM 'C:\Temp\T20140503.txt'

    WITH (

    FIELDTERMINATOR ='',

    ROWTERMINATOR =''

    )

    ;

    Since we're importing whole rows, I forced an empty FIELDTERMINATOR. Yeah, you could have used the BLOB option here. Like a said, there's a dozen differnent ways to do this. Pick one.

    Something else to notice is the double "newline" ROWTERMINATOR. That actually happens in the text file and makes it look like there's a blank line between the CDRs. Using the double newline terminator makes it so those blank lines disappear during the import.

    Next, is to do the split of the fixed-field-length data like JLS did. To reiterate, some of the fields might need additional splitting but I don't know what they are because I don't have the RECORD LAYOUT. We do know that what I'm calling Col05 is going to need some additional post-process attention as will the call date and duration fields. As soon as the OP coughs up the RECORD LAYOUT, we can do more pretty easily using DelimitedSplit8K on that particular field.

    SELECT CallDT = RTRIM(SUBSTRING(CDRData, 2,15))

    ,CallDuration = RTRIM(SUBSTRING(CDRData, 18,10))

    ,Col03 = RTRIM(SUBSTRING(CDRData, 29, 8))

    ,Col04 = RTRIM(SUBSTRING(CDRData, 37, 5))

    ,Col05 = RTRIM(SUBSTRING(CDRData, 42,26))

    ,Col06 = RTRIM(SUBSTRING(CDRData, 68, 1))

    ,Col07 = RTRIM(SUBSTRING(CDRData, 69, 1))

    ,Col08 = RTRIM(SUBSTRING(CDRData, 70,38))

    ,Col09 = RTRIM(SUBSTRING(CDRData,108, 6))

    ,Col10 = RTRIM(SUBSTRING(CDRData,114,21))

    ,Col11 = RTRIM(SUBSTRING(CDRData,135,19))

    ,Col12 = RTRIM(SUBSTRING(CDRData,154, 9))

    ,Col13 = RTRIM(SUBSTRING(CDRData,163, 2))

    ,Col14 = RTRIM(SUBSTRING(CDRData,165,99))

    FROM #MyHead

    ;

    That's pretty much it except maybe for some datatype conversions. This is also the way I'll typically start an import project because it's easy to troubleshoot. Once I have this type of import running, I'll generally convert it to use a BCP format file and a staging table that has the actual columns I need along with the correct datatypes.

    JLS may be spot on with his field names... I just didn't want to presume. He also took a shot at Col05 using DelimitedSplit8K but (IMHO) the OP needs to get the RECORD LAYOUT and the "Content Description" of what that field can actually contain before he can be certain of what to get out of that field.

    The reason why I keep going on about the RECORD LAYOUT is because what the OP provided and what JLS used doesn't appear to actually line up with the data in the text file the OP provided. He also appears to have missed the fact that the file contains at least one other field that starts at character 165.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff...+1 many times 🙂

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Tell me about it I have told them before any projects are started the data needs cleaned up, these tables don't even have Pk or fk, they b Ave me joining on tables with no relation at all and I try to tell them they are reporting incorrectly. They won't listen. You should see the customer and contact tables, there is no data validation when input to the database. It's a nightmare. ...

  • cbrammer1219 (6/14/2014)


    Tell me about it I have told them before any projects are started the data needs cleaned up, these tables don't even have Pk or fk, they b Ave me joining on tables with no relation at all and I try to tell them they are reporting incorrectly. They won't listen. You should see the customer and contact tables, there is no data validation when input to the database. It's a nightmare. ...

    The thing I learned SQL on was CDRs so I feel your pain on this. Have they provided you with a Record Layout that you could share? I think it's really important because you missed the data in the file that starts at character 165. I also think that the record layout would contain more information about what the infamous Col05 contains so that we could pull off the perfect split for you. It would also help us define what the PK needs to be so that you can check for duplicate calls, which AT&T and most of the other "Biggees" are nortorious for providing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • cbrammer1219 (6/12/2014)


    Yes, I am not looking to count the spaces, I want the count of characters before each space, I can find the spaces, but I am needing for example 777 8888 99999 ---> count of first set of chars before the space(3)....count of second bf space(4) and then (5)...final count of chars.

    Is this coming from specs or is it a observed pattern? Have the feeling you might be chasing a red herring here, any CDR format should contain these information explicitly, not implied as this.

    Understanding the logic of these record is mandatory for processing them, such as call types A B C D, Transfers, Trunks etc.

    😎

  • I don't need all of the data, I am inserting the data from my original post to create a SSRS report, the data is coming from the flatfile which I am loading into SQL 2012 table. The other fields that aren't being inserted aren't needed for my report, this is something the 3rd party system is doing. Called Mettel...

  • cbrammer1219 (6/14/2014)


    I will give this a try in a little while, I appreciate all that have stuck through this post. I will mark answered if this is a solution or even close.

    and the answer is...???

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • cbrammer1219 (6/15/2014)


    I don't need all of the data, I am inserting the data from my original post to create a SSRS report, the data is coming from the flatfile which I am loading into SQL 2012 table. The other fields that aren't being inserted aren't needed for my report, this is something the 3rd party system is doing. Called Mettel...

    Just step a way from counting characters and spaces for a moment, could you tell us what you want to report on? If it is something like incoming call/outgoing call/transferred call, originating/terminating numbers, then you are indeed chasing ghosts and red herrings in flocks as the fields being ignored would hold all the information you are after.

    And there is more, a CDR will hold multiple records for a single call, A/B record for the call initiation, another for the pick-up/answer etc. Without knowing this logic and how to interpret the different record types this data is not going to make much sense.

    😎

  • Yes that is what I am reporting on, for each extension. The count of incoming and out going call for each extension displaying customer contact and call duration.

  • Here's the file being processed.

  • cbrammer1219 (6/15/2014)


    Here's the file being processed.

    Now we are getting somewhere, looks like an utility extract from a CDR, possibly MiTel (hence the table name in the earlier post), will look into it.

    😎

  • Just curious could I create a lookup table with all the extension in it and then insert that into the CallTrace table where the extension and do the same for the Customer table, because right now there are no PK, FK, it is the worst DB structure I have ever worked with, and yet they want miracles; After this I am going to insist on taking the time to work with my peer to begin creating a datamart, any suggestion for a beginning SQL to accomplish this?

  • cbrammer1219 (6/17/2014)


    Just curious could I create a lookup table with all the extension in it and then insert that into the CallTrace table where the extension and do the same for the Customer table, because right now there are no PK, FK, it is the worst DB structure I have ever worked with, and yet they want miracles; After this I am going to insist on taking the time to work with my peer to begin creating a datamart, any suggestion for a beginning SQL to accomplish this?

    Welcome to the real world, and mind you it is not going to get better, when ever you think you've seen the worst, it gets worse. That's the business and one better get used to it.

    I'm trying to find specs on the format, MiTel isn't to common any more, will let you know.

    You shouldn't need an extension table unless you can relate/tie the extension to a business entity/value. But if so, in a data-mart/dw feeding into a cube, that would probably become a dimension.

    The sql part is tertiary, first business requirements then data models. Only after those steps one starts talking sql.

    😎

Viewing 15 posts - 46 through 60 (of 71 total)

You must be logged in to reply to this topic. Login to reply