Parse CC in String
Posted Friday, March 29, 2013 12:40 PM
Hi Guys,

I have a table with a VARCHAR column. The column holds free text that may include credit card numbers and also a couple of dates. I need to find all the records that may contain a CC# in this column. I was thinking of using something like

WHERE  patindex('%[0-9][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -]%',Memo) >  0



Is there a better way to look for numbers in the string that may be CC#s? It is fine if some of them turn out not to be CC#s. Maybe someone has already written a function that finds a CC# in a string. Any help is appreciated. I am not looking for a CLR function, though, as I have to do this in Query Analyzer.

Thanks,
Laura
Post #1437053
Posted Friday, March 29, 2013 12:58 PM


Didn't you post something just like this a day or two ago? I know I saw another thread trying to do this exact same thing, but I can't find it now.

Post #1437057
Posted Friday, March 29, 2013 12:59 PM


Laura, I think it would be easier to delete dashes, and maybe also spaces, from the string and then look for 16 [0-9] digits in a row in what remains (or 15 for an Amex number?).
I saw your previous post, but now that it has percolated a bit, I think you need to manipulate the comment text first to make the search easier.
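
The strip-then-search idea can be sketched as follows. This is only an illustration under stated assumptions: the table variable @Notes and its rows are made up (the card-like numbers are sample values that appear elsewhere in this thread), and the real table and Memo column would take its place.

```sql
-- Hypothetical sample data; replace @Notes with the real table.
DECLARE @Notes TABLE (ID INT, Memo VARCHAR(200));

INSERT INTO @Notes (ID, Memo) VALUES
 (1, 'Paid with 4012-8888-8888-1881 on 03/29/2013')
,(2, 'Order 12345 shipped 03/29/2013')
,(3, 'enRoute 2149 929380 07085 exp 09/13');

SELECT n.ID, n.Memo
FROM @Notes AS n
CROSS APPLY
(   -- Remove dashes and spaces so the digits of a formatted number become contiguous.
    SELECT REPLACE(REPLACE(n.Memo, '-', ''), ' ', '') AS Stripped
) AS s
-- 15 digits in a row covers 15-digit numbers and any longer run (16 digits included).
WHERE s.Stripped LIKE '%' + REPLICATE('[0-9]', 15) + '%';
```

With these rows, IDs 1 and 3 should be returned and ID 2 should not. Keep in mind that removing spaces can also glue unrelated numbers together (an invoice number next to a date written without separators, for example), so this finds candidates rather than confirmed card numbers.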


Lowell

Post #1437059
Posted Friday, March 29, 2013 1:57 PM


I saw something similar also, but I'm not sure it was from this OP. I can't find the post either.



Lynn Pettis

Post #1437085
Posted Friday, March 29, 2013 3:40 PM
That seems like a reasonable approach. Using REPLICATE() will make it easier to see how many numbers are in the pattern and/or adjust it later:


WHERE PATINDEX('%[0-9]' + REPLICATE('[0-9 -]', 12) + '%', Memo) ...
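
A quick way to see what this pattern does and does not catch is to run it against a few made-up memo values. The table variable, its contents, and the `> 0` comparison (borrowed from the original question) are assumptions for illustration only.

```sql
-- Hypothetical sample data; replace @Notes with the real table and Memo column.
DECLARE @Notes TABLE (ID INT, Memo VARCHAR(200));

INSERT INTO @Notes (ID, Memo) VALUES
 (1, 'Paid with 4012 8888 8888 1881 on 03/29/2013')   -- card-like number
,(2, 'Order 12345 shipped 03/29/2013')                -- short number and a date
,(3, 'Ref 7' + REPLICATE(' ', 12) + 'see notes');     -- one digit followed by padding

-- One digit followed by 12 characters that are each a digit, a space, or a dash.
DECLARE @Pattern VARCHAR(200) = '%[0-9]' + REPLICATE('[0-9 -]', 12) + '%';

SELECT n.ID, n.Memo
FROM @Notes AS n
WHERE PATINDEX(@Pattern, n.Memo) > 0;
```

Rows 1 and 3 should come back. Row 3 is a false positive: only the first position is required to be a digit, so a single digit followed by a run of spaces satisfies the pattern. That is acceptable if the goal is only to flag candidates for review.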





Post #1437115
Posted Saturday, March 30, 2013 11:36 AM



Do you actually want to extract the numbers (even if more than one in a row) or are you just trying to isolate the rows using the WHERE clause?


--Jeff Moden
Post #1437211
Posted Saturday, March 30, 2013 6:24 PM
I'm afraid this task isn't as easy as it may sound. The difficulty comes from the way credit card numbers are constructed. Let's assume we just want to find the most common cards: AMEX, MasterCard, Visa, Discover, Carte Blanche/Diners Club and perhaps a few others. Even among just these major issuers the rules differ. For example, AMEX numbers are 15 digits, MasterCard 16, Visa 13 or 16, Discover 16, and Carte Blanche/Diners Club 14.

Then there is the Issuer Identification Number (IIN), the leading digits of the card number, which varies in length from one digit (Visa) to as many as 6 digits (Discover). These IINs have changed over the years, so there are many different ranges of digits for just about every type of card. The ranges can be looked up and put into a table to be used for CC validation.
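
A minimal sketch of such a range lookup is shown here. It is an assumption-laden illustration: the inline list holds only a handful of the sample ranges used later in this thread (not official issuer data), it assumes StartIIN and EndIIN always have the same number of digits, and the number tested is one of the thread's sample values.

```sql
DECLARE @CCNum VARCHAR(19) = '6011618612311087';  -- sample number from this thread

SELECT r.CCType
FROM (VALUES
         ('American Express', 34, 34, 15)
        ,('American Express', 37, 37, 15)
        ,('MasterCard', 51, 55, 16)
        ,('Visa', 4, 4, 13)
        ,('Visa', 4, 4, 16)
        ,('Discover', 6011, 6011, 16)
     ) AS r (CCType, StartIIN, EndIIN, CCLen)   -- sample ranges only
WHERE LEN(@CCNum) = r.CCLen
  -- Compare as many leading digits as the range itself has.
  -- The CASE guard keeps non-digit input away from the CAST.
  AND CASE WHEN @CCNum NOT LIKE '%[^0-9]%'
           THEN CAST(LEFT(@CCNum, LEN(CAST(r.StartIIN AS VARCHAR(6)))) AS INT)
      END BETWEEN r.StartIIN AND r.EndIIN;
```

For this input the query should return a single row, Discover, because the number is 16 digits long and its first four digits are 6011.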

And there's more... the last digit of every card number is a check digit calculated by the Luhn algorithm, which is in the public domain. Virtually all cards use the Luhn algorithm for the checksum.
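
As a worked illustration of the Luhn check on its own, here is a small set-based sketch. It assumes the input contains digits only and is at most 19 characters long. The inline tally and the variable name are choices made for this example, and the number is one of the sample values from this thread.

```sql
DECLARE @CCNum VARCHAR(19) = '4012888888881881';  -- digits only, no spaces or dashes

WITH Tally(N) AS
(
    SELECT N
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),
                 (11),(12),(13),(14),(15),(16),(17),(18),(19)) AS t(N)
),
Digits AS
(
    -- Pos 1 is the rightmost digit, i.e. the check digit.
    SELECT t.N AS Pos,
           CAST(SUBSTRING(REVERSE(@CCNum), t.N, 1) AS INT) AS Digit
    FROM Tally AS t
    WHERE t.N <= LEN(@CCNum)
)
SELECT
    SUM(CASE
            WHEN Pos % 2 = 1   THEN Digit            -- odd positions from the right are used as-is
            WHEN Digit * 2 > 9 THEN Digit * 2 - 9    -- doubled; same as adding the two digits of the product
            ELSE Digit * 2
        END) % 10 AS LuhnRemainder                   -- 0 means the check digit is consistent
FROM Digits;
```

For this number the undoubled digits sum to 43 and the doubled digits contribute 47, so the total is 90 and the remainder is 0.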

So... just finding sets of 15 or 16 digits won't tell you whether a value is likely to be a credit card number at all, much less whether it might be a valid card. A number can be well-formed and have a proper checksum and still not be valid if, for example, it has been retired or was never issued. So all we can do is weed out numbers that break the rules. The rest can only be confirmed by the card issuer, and that's another can of worms.

I've dug out some functions I had for validating card numbers and made a few adaptations to make them suitable for using as an example.

Function 1 is just a version of DelimitedSplit8K that splits EVERY character. Jeff Moden deserves most of the credit.


CREATE FUNCTION [dbo].[DelimitedSplit8KByChar]
(@pString VARCHAR(8000))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows, so at most the first 100 characters are split
cteTally(N) AS (SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E2
)
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY t.N),
Item = SUBSTRING(@pString,t.N+1,1)
FROM cteTally t
WHERE NULLIF(SUBSTRING(@pString,t.N+1,1),'') IS NOT NULL

GO


Function 2 parses the CC number and compares it to a table of Issuer Identification Number ranges.
THIS IS ONLY A SAMPLE AND MUST BE UPDATED WITH OFFICIAL ISSUER DATA BEFORE USING IN PRODUCTION!!


CREATE FUNCTION [dbo].[itvfGetCCIIN]
(
@CCNum VARCHAR(50),
@CCLen INT
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
cteIINValue(ID,CCType,StartIIN,EndIIN,CCLen) AS (
SELECT ID,CCType,StartIIN,EndIIN,CCLen FROM
(VALUES
(1,'American Express',34,34,15)
,(2,'American Express',37,37,15)
,(3,'Diners Club',300,305,14)
,(4,'Carte Blanche',300,305,14)
,(5,'enRoute',2014,2014,15)
,(6,'enRoute',2149,2149,15)
,(7,'MasterCard',51,55,16)
,(8,'Visa',4,4,13)
,(9,'Visa',4,4,16)
,(10,'Discover',6011,6011,16)
,(11,'Discover',622126,622925,16)
,(12,'Discover',644,649,16)
,(13,'Discover',65,65,16)
,(14,'JCB',3528,3589,16)
) AS Data (ID,CCType,StartIIN,EndIIN,CCLen)
),
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
E8(N) AS (SELECT 1 FROM E4 a, E4 b), --10E+8 or 100,000,000 rows max
cteTally(N) AS (SELECT 0 UNION ALL
SELECT TOP (SELECT ISNULL(MAX(EndIIN),10000) FROM cteIINValue)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E8
)
SELECT
ID
,IINType
,CCIIN
,IINLen
,StartIIN
,EndIIN
,CCLen
,Prefix
FROM
(
SELECT
ROW_NUMBER() OVER (ORDER BY t.N) AS ID
,cte.ID AS IINID
,CCType AS IINType
,N AS CCIIN
,CCLen AS IINLen
,StartIIN
,EndIIN
,@CCLen AS CCLen
,(CASE
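WHEN @CCNum NOT LIKE '%[^0-9]%' --digits only; ISNUMERIC also accepts values such as '1e5' or '$1' that cannot be converted to BIGINT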
AND CONVERT(BIGINT,LEFT(@CCNum,LEN(t.N))) BETWEEN StartIIN and EndIIN
THEN CONVERT(BIGINT,LEFT(@CCNum,LEN(t.N)))
ELSE 0
END)
AS Prefix
FROM
cteIINValue cte
CROSS APPLY
cteTally t
WHERE
t.N BETWEEN StartIIN and EndIIN
) r
WHERE
1=1
AND r.CCIIN = r.Prefix
AND r.Prefix > 0
AND r.IINLen = r.CCLen

GO


Function 3 is the Luhn Algorithm for getting the checksum.


CREATE FUNCTION [dbo].[tvfLuhnValidation]
(
@CCStr VARCHAR(100)
)
RETURNS
@CheckSumValidation TABLE
(
ID INT IDENTITY(1,1) NOT NULL,
CCNum VARCHAR(20) NULL,
CkSumRemainder INT NULL,
PRIMARY KEY (ID)
)
WITH SCHEMABINDING
AS
BEGIN

DECLARE
@CCNum BIGINT
,@CheckIIN BIT
,@AllDigits BIGINT
,@ReverseDigits VARCHAR(20) --text rather than BIGINT, so a leading zero in the reversed digits is not lost
,@CheckSum BIGINT
,@CheckSumRemainder INT


SET @CCStr = REPLACE(REPLACE(@CCStr,' ',''),'-','')

IF PATINDEX('%[^0-9]%',@CCStr) > 0
BEGIN
INSERT INTO @CheckSumValidation
(CCNum,CkSumRemainder)
SELECT 0,99
END
ELSE
BEGIN

SET @CCNum = CONVERT(BIGINT,@CCStr)
SET @AllDigits = @CCNum
SET @CheckSum = CAST(RIGHT(@AllDigits,1) AS BIGINT)
SET @ReverseDigits = RIGHT(REVERSE(@AllDigits),LEN(@AllDigits)-1)


/* Get the check digit using the Luhn Algorithm */

;WITH cteCheckSum
AS
(
SELECT
(SUM(Item)+@CheckSum)%10 AS CheckSumRemainder
FROM
(
SELECT
s2.ItemNumber
,CAST(s2.Item AS BIGINT) AS Item
FROM
dbo.DelimitedSplit8KByChar(@ReverseDigits) AS s1
OUTER APPLY
dbo.DelimitedSplit8KByChar(s1.Item*2) AS s2
WHERE
s1.ItemNumber%2 <> 0
UNION ALL
SELECT
ItemNumber
,CAST(Item AS BIGINT) AS Item
FROM
dbo.DelimitedSplit8KByChar(@ReverseDigits)
WHERE
ItemNumber%2 = 0
) d
)
INSERT INTO @CheckSumValidation
(CCNum,CkSumRemainder)
SELECT
@CCNum AS CCNum
,CheckSumRemainder
FROM
cteCheckSum cs

END


RETURN

END
GO


Last but not least is some script to tie all these functions together. It's not the swiftest code because a lot is going on internally and I'm sure someone (as always!) will be able to offer improvements.


WITH cteSampleData --replace this with your real data
AS
(
SELECT * FROM
(VALUES
(1,'VISA','4012888888881881','10/14')
,(2,'MasterCard','5269924854210552','06/15')
,(3,'Voyager','869994992762272','08/14')
,(4,'VISA','4539390243132435','12/15')
,(5,'enRoute','214992938007085','09/13')
,(6,'VISA','4485983356242218','11/14')
,(7,'JCB','3088518677707770','01/14')
,(8,'VISA','4532254137583730','07/14')
,(9,'JCB','3560777438925512','12/15')
,(10,'Discover','6011618612311087','11/14')
,(11,'VISA','4417123456789113','07/15')
,(12,'Diners Club','3022329080952x','12/13')
) AS Data (ID,CCType,CCNum,CCExp)
)
--SELECT * FROM cteSampleData
SELECT
r2.ID
,(CASE
WHEN r2.CCNum IN (0,1) OR r2.CCNum IS NULL
THEN CAST(r2.OrigCCNum AS VARCHAR(50))
ELSE CAST(r2.CCNum AS VARCHAR(50))
END) AS CCNum
,r2.IINType AS ProbableCardType
,(CASE
WHEN r2.CCNum IN (0,1) OR r2.CCNum IS NULL THEN 'Invalid Number' --test the placeholder values first so a NULL or non-numeric input is never reported as OK
WHEN v.CkSumRemainder = 0 THEN 'OK'
ELSE 'Invalid CheckSum'
END) AS CardNumberStatus
FROM
(
SELECT
cte1.ID
,(CASE
WHEN cte1.CCNum IS NULL THEN 0
WHEN PATINDEX('%[^0-9]%',cte1.CCNum) > 0 THEN 1
ELSE CAST(cte1.CCNum AS BIGINT)
END)
AS CCNum
,cte1.CCNum AS OrigCCNum
,(CASE
WHEN r1.IINType IS NOT NULL THEN r1.IINType
ELSE 'Unknown'
END)
AS IINType
FROM
cteSampleData cte1
LEFT OUTER JOIN
(
SELECT DISTINCT
ROW_NUMBER() OVER (PARTITION BY r.ID ORDER BY r.ID) AS CCNumGroup
,r.ID
,r.CCNum
,iin.IINType
FROM
(
SELECT
cte.ID
,cte.CCNum
,LEN(cte.CCNum) AS CCLen
FROM
cteSampleData cte
) r
CROSS APPLY
dbo.itvfGetCCIIN(r.CCNum,r.CCLen) iin
) r1
ON cte1.ID = r1.ID
WHERE
CCNumGroup = 1 OR CCNumGroup IS NULL
) r2
CROSS APPLY
dbo.tvfLuhnValidation(r2.CCNum) AS v
ORDER BY
r2.ID




 
Post #1437258
Posted Monday, April 01, 2013 8:33 AM
Yes, Jeff, I just need to isolate those rows. I do not need to extract the numbers. I will go through everyone's responses now. Thanks a lot to everyone for taking the time to respond. Yes, somehow my earlier post was deleted. Maybe they thought I had put actual CC#s in the script I posted, or I might have done something wrong while creating the post. Thanks, Steven, for the scripts.
Post #1437462