SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Manipulating a String


Manipulating a String

Author
Message
raym
raym
SSC Veteran
SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)

Group: General Forum Members
Points: 213 Visits: 307
I'm still battling with PatIndex.I have two issues TYpe_1 is partly correct except
that if

1.
For the Column Type_1 : using the first record , I'm trying to extract the '80' as the TYPE_1
its seperated by a space. Sometimes in one space sometimes its a double space.

2.
For Column Code1 : i.e '80 98439 9991 997 -50.000' I want to extract the 98439.Sometimes this
is 4 characters long or 3 or five.

If I do it in two steps it works. How can I do it in one step and split those two field.

Output should be:
'80 98439 9991 997 -50.000' should be Type = 80 Code = 98439
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'01 759 9991 997 -0.560' should be Type = 01 Code = 759
'01 9990 9991 997 -127.910' should be Type = 01 Code = 9990


CREATE TABLE #Test (String Varchar(100),ROWID INT IDENTITY(1,1))


INSERT #test VALUES ('80 98439 9991 997 -50.000')
INSERT #test VALUES ('0000058916 00000074000039708')
INSERT #test VALUES ('80 98408 9991 997 -10.000')
INSERT #test VALUES ('80 98402 9991 997 -1.400')
INSERT #test VALUES ('01 9990 9991 997 -127.910')
INSERT #test VALUES ('01 759 9991 997 -0.560')
INSERT #test VALUES ('80 98440 9991 997 -84.330')
INSERT #test VALUES ('80 98474 9991 997 -2.000')
INSERT #test VALUES ('80 98402 9991 997 -0.280')
INSERT #test VALUES ('0000058916 00000074000039708')
INSERT #test VALUES ('80 98408 9991 997 -10.000')
INSERT #test VALUES ('80 98402 9991 997 -1.400')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('01 6060 3 997 -1000.000 CB')
INSERT #test VALUES ('METLIFE_MALL_2 LA TREE D')
INSERT #test VALUES ('80 98418 3 997 -5.000')
INSERT #test VALUES ('80 98421 3 997 -9.500')
INSERT #test VALUES ('80 98402 3 997 -2.030')
INSERT #test VALUES ('01 6060 3 997 -500.000 CB')
INSERT #test VALUES ('M.A.N_SUPERMKT KI NG WILLIAM D')
INSERT #test VALUES ('80 98418 3 997 -5.000')
INSERT #test VALUES ('80 98421 3 997 -4.750')
INSERT #test VALUES ('80 98402 3 997 -1.370')
INSERT #test VALUES ('01 6060 3 997 -100.000 CB')
INSERT #test VALUES ('M.A.N_SUPERMKT KI NG WILLIAM D')
INSERT #test VALUES ('80 98418 3 997 -5.000')
INSERT #test VALUES ('80 98421 3 997 -0.950')
INSERT #test VALUES ('80 98402 3 997 -0.830')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('01 6060 3 997 -20.000 CB')
INSERT #test VALUES ('METLIFE_MALL_2 LA TREE D')
INSERT #test VALUES ('80 98418 3 997 -5.000')
INSERT #test VALUES ('80 98421 3 997 -0.190')
INSERT #test VALUES ('80 98402 3 997 -0.730')
INSERT #test VALUES ('80 98484 9991 997 -50.000')
INSERT #test VALUES ('80 98402 9991 997 -7.000')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('0000074164 00000074000039708')
INSERT #test VALUES ('80 98408 9991 997 -10.000')
INSERT #test VALUES ('80 98402 9991 997 -1.400')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')

----Step 1
drop table ##DDD
SELECT
SUBSTRING(String, 1, CHARINDEX(' ', String) - 1) AS TYPE_1,
LTRIM(RTRIM(SUBSTRING(String,CHARINDEX(' ', String) + 1, LEN(String)))) AS Code
,MonetoryValue =
CASE
WHEN CHARINDEX('.',REVERSE(String)) = 0 THEN NULL ELSE
REVERSE(SUBSTRING(REVERSE(String), CHARINDEX('.',REVERSE(String))+1,
CHARINDEX(' ',REVERSE(String),CHARINDEX('.',REVERSE(String))) - CHARINDEX('.',REVERSE(String))))
END +
CASE
WHEN CHARINDEX('.',String) =0 THEN NULL ELSE
REPLACE(REPLACE(SUBSTRING(String,CHARINDEX('.',String)
,LEN(String) - CHARINDEX('.',String)+1),'CB',''),'FC','')
END
,String INTO ##DDD
FROM
#Test WHERE String != 'M.A.N_SUPERMKT KI NG WILLIAM D'


--Step 2
SELECT Type_1,MonetoryValue,LEFT(Code, NULLIF(CHARINDEX(' ', Code) - 1, -1)) AS Code1,Code,string FROM ##DDD



Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)

Group: General Forum Members
Points: 88838 Visits: 41134
Output should be:
'80 98439 9991 997 -50.000' should be Type = 80 Code = 98439
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'01 759 9991 997 -0.560' should be Type = 01 Code = 759
'01 9990 9991 997 -127.910' should be Type = 01 Code = 9990


Hey! Matt Miller! This looks like a great place for one of those awesome Regex things you do! Smile

RayM,

In the absense of Regex, you first need to build a Tally table to make your life easier for a lot of things. A Tally table is nothing more than a table with a single column of very well indexed sequential numbers... it's used to replace loops and the like... here's how to build one...

--===== Create and populate the Tally table on the fly
SELECT TOP 11000 --equates to more than 30 years of dates
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2

--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC



Once that's done, your problem becomes a whole lot easier and doesn't require performance robbers like the REVERSE function...
 SELECT s.RowID,
Type = MAX(CASE WHEN s.Posit =1 AND LEN(s.SplitString) =2 THEN s.SplitString ELSE NULL END),
Code = MAX(CASE WHEN s.Posit =2 AND LEN(s.SplitString)<=5 THEN s.SplitString ELSE NULL END)
FROM (--==== Derived table "s" splits all of the strings and marks the position (ordinal index) of each
SELECT RowID,
SplitString = SUBSTRING(' '+h.String+' ', t.N+1, CHARINDEX(' ', ' '+h.String+' ', t.N+1)-t.N-1),
Posit = t.N-LEN(REPLACE(LEFT(' '+h.String+' ',t.N), ' ', ''))
FROM dbo.Tally t
RIGHT OUTER JOIN --Necessary in case String is NULL
#Test h
ON SUBSTRING(' '+h.String+' ', t.N, 1) = ' '
AND t.N < LEN(' '+h.String+' ')
AND h.String NOT LIKE '%[_]%'
)s
WHERE s.Posit IN (1,2)
GROUP BY s.RowID



--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)

Group: General Forum Members
Points: 88838 Visits: 41134
Sorry... missed the "MonetaryValue" column that you included in your code but not in your problem definition... this will do it...


SELECT s.RowID,
Type = MAX(CASE WHEN s.Posit =1 AND LEN(s.SplitString) =2 THEN s.SplitString ELSE NULL END),
Code = MAX(CASE WHEN s.Posit =2 AND LEN(s.SplitString)<=6 THEN s.SplitString ELSE NULL END),
MonetaryValue = MAX(CASE WHEN s.Posit =5 THEN s.SplitString ELSE NULL END)
FROM (--==== Derived table "s" splits all of the strings and marks the position (ordinal index) of each
SELECT RowID,
SplitString = SUBSTRING(' '+h.String+' ', t.N+1, CHARINDEX(' ', ' '+h.String+' ', t.N+1)-t.N-1),
Posit = t.N-LEN(REPLACE(LEFT(' '+h.String+' ',t.N), ' ', ''))
FROM dbo.Tally t
RIGHT OUTER JOIN --Necessary in case String is NULL
#Test h
ON SUBSTRING(' '+h.String+' ', t.N, 1) = ' '
AND t.N < LEN(' '+h.String+' ')
AND h.String NOT LIKE '%[_]%'
)s
WHERE s.Posit IN (1,2,5)
GROUP BY s.RowID



--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
raym
raym
SSC Veteran
SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)SSC Veteran (213 reputation)

Group: General Forum Members
Points: 213 Visits: 307
SELECT * FROM tbPROCESSED

This is the full picture of what I'm trying to do.The problematic field is called Actual_Text which is the string that requires manipulation.
I'm deriving three fields from field Actual_Text
fields Type,Code & MonetoryValue

Field Type is the digits starting from the left and end when it encounters a space
only where there is a Valid monetoryValue

ie. '80 98437 3 997 -2.500'
in this case Type will be = 80


ie. 'PARK AVENUE MANOR Pr RAZOO 0'
in this case Type will be = null

ie. '530719000254'
in this case Type will be = null

-----------------------------Code
Field Code is the digits after the first space starting from the left and ends when it encounters the
second space ,only where there is a Valid monetoryValue

ie. '80 98437 3 997 -2.500'
Code will be = 98437


ie. 'PARK AVENUE MANOR Pr RAZOO 0'
Code will be = null

ie. '530719000254'
Code will be = null

ie. '01 6060 3 997 -20.000 CO'
Code will be = 6060


ie. '0997 2247'
Code will be = null because there is no MonetoryValue

---MonetoryValue

I can extract the monetory value only problem is ho do I remove
the alpha characters without using replace

i.e '01 6060 3 997 -20.000 CO'
'-20.000 CO' to exclude CO etc...

'01 6060 3 997 -50.000 CB'



Attachments
GenerateData_1.txt (7 views, 53.00 KB)
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)

Group: General Forum Members
Points: 88838 Visits: 41134
The code I posted extracts all three fields even for the "01" records...

You didn't even try the code, did you?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Matt Miller (4)
Matt Miller (4)
SSChampion
SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)

Group: General Forum Members
Points: 12659 Visits: 18584
Jeff Moden (11/22/2007)
Output should be:
'80 98439 9991 997 -50.000' should be Type = 80 Code = 98439
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'01 759 9991 997 -0.560' should be Type = 01 Code = 759
'01 9990 9991 997 -127.910' should be Type = 01 Code = 9990


Hey! Matt Miller! This looks like a great place for one of those awesome Regex things you do! Smile



Sorry - just realized I got "paged" to the SSC forum during the long vacation weekend....Smile

Yes - you could use a CLR Regex Function to do just that if you want to. It's actually rather cute, since it allows you to "throw out" the result if you don't have BOTH (if you so desire).

I just put this together, if you should want to go down that path.

Here's the CLR (.NET code):


Imports System
Imports System.Data
Imports System.Data.SqlClient
Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server
Imports System.Text.RegularExpressions

Partial Public Class UserDefinedFunctions
Private Const optionS As RegexOptions = RegexOptions.CultureInvariant
<Microsoft.SqlServer.Server.SqlFunction(IsDeterministic:=True, IsPrecise:=True)< _
Public Shared Function RegexMatch(ByVal input As SqlChars, ByVal pattern As SqlString) As SqlString
' Add your code here
Dim rex As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern.Value, optionS)
Return New SqlString(CType(rex.IsMatch(New String(input.Value)), String))
End Function
<Microsoft.SqlServer.Server.SqlFunction(IsDeterministic:=True, IsPrecise:=True)< _
Public Shared Function RegexMatchGroup(ByVal input As SqlChars, ByVal pattern As SqlString, byval groupnum as SqlInt32) As SqlString
Dim rex As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern.Value, optionS)

Return New SqlString(CType(rex.Matches(New String(input.Value))(0).Groups(groupnum.Value).ToString, String))
End Function
End Class



At that point, you could use something like this to pick out the pieces:

select dbo.regexmatchgroup('80 12345 p9a996458aaa','^(?<1>[0-9]{2}) (?<2>[0-9]{3,5})',2)

The last parameter could be 0, 1, or 2 (0=whole string, 1/2 = the named backreferences to 80 and 12345 respectively in my example).

It does require (at least for me) knowing how to correctly structure the grouping constructs for .NET, but this should give you a running start.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)

Group: General Forum Members
Points: 88838 Visits: 41134
There he is! And I knew he'd have a pretty good answer for this one, too!

How was the weekend, Matt?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search