|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Tuesday, April 16, 2013 1:54 PM
Points: 6,
Visits: 106
|
|
I am glad we are cool. 
The first code I submitted worked for me to clean up some data for my company. In my case the goal was not to identify what was a valid number following a given criteria but rather getting the numeric values out of the data where existing. That was a one off so I did not pay excessive attention to come up with the best possible code but rather to get the job done. I guess that that may be the case of many of us on our day jobs, so that is why I submitted my code in the first place 
My code to extract the human readable numeric values would be:
select isnumeric([Value]), case when isnumeric([Value]) = 1 and [Value] not like '%[a-z]%' then convert(float, convert(money, [Value])) end from ( select [Value] = '$1,234.12' union all select '£1234' union all select '£1,23.4' union all select '1.234' union all select '12,34' union all select '12c34' union all select '12e34' ) x
I have used RegEx several times to extract and clean up data for small to medium sized record sets. It has always be up to the job despite being used straight on SQL CLR. And the flexibility of RegEx is a plus no one can deny. However, in order to get the best of it as a core feature for an system I would not run it from SQL though, but rather from the .net framework using SSIS or C#.
Despite that I must confess I am a fanatic of TSQL and I find easier to do these things straight on the SQL engine. Please, send me the sample data so I can play with it a check how RegEx would perform.
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Today @ 7:53 AM
Points: 10,990,
Visits: 10,577
|
|
Usman Butt (9/17/2012)
SQL Kiwi (9/17/2012)
In any case, the assumption must be that TRY_* will only be used where necessary, and (relatively) slow as the intrinsic functions may be, it will likely out-perform any other T-SQL method, while having the advantage of being presumably very well-tested.But one thing I cannot understand is that if these functions are CLRs then why they cannot be part of earlier versions? Only TRY_PARSE is hosted by the CLR, the other two are T-SQL intrinsic functions. It would be quite unusual to add T-SQL language features to old versions of the product through a service pack. Perhaps there is a commercial element too - all part of the reasons to upgrade!
Paul White SQL Server MVP SQLblog.com @SQL_Kiwi
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Today @ 7:53 AM
Points: 10,990,
Visits: 10,577
|
|
|
|
|
|
SSC-Addicted
      
Group: General Forum Members
Last Login: Monday, April 15, 2013 11:42 PM
Points: 424,
Visits: 55
|
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 7:18 AM
Points: 33,112,
Visits: 27,038
|
|
|
|
|
|
SSC Journeyman
      
Group: General Forum Members
Last Login: Tuesday, June 04, 2013 8:08 AM
Points: 94,
Visits: 212
|
|
I've read the article and skimmed through all the comments looking if there was something that talked about how to actually do the cross apply as Jeff suggests, but I haven't seen an example of it.
I'm actually trying to implement this, but I'm just not getting it and was wondering if someone could help me out.
Here's the problem I'm trying to solve
I've got a list of product numbers that I need to validate are exclusively digits.
I created the function as specified ( renamed it to fn_isAllDigits to adhere to our naming standards and am trying to use it but I guess I just don't understand the how to properly format the query to use the cross apply.... Can someone help here's what I'm starting with:
Note: In this version of the query I'm trying to drill in and find all product numbers that are six digits long. The latest request was to include any rows that are longer than 6 digits but have leading zeroes that should also be included. eg. 00000001234546 should also be included) The simple solution seemed to be to just convert it to an int and then check the length as you see I attempted because the patindex check did not catch this: 0-3-101
SELECT DISTINCT PrimeProdNo FROM ourDatabase.schema.tbl_product tp1 WHERE (6 = LEN(RTRIM(PrimeProdNo)) OR 6 = LEN(RTRIM(CAST(PrimeProdNo AS BIGINT)))) --I just added this to AND PATINDEX('%[A-Z]%', PrimeProdNo) = 0 --this needs to be more than just a letter check AND LEFT(PrimeProdNo, 1) > 0
Side Note 2. I got this code from someone else, but I was thinking about changing that from distinct to a group by on PrimeProdNo. Anyone have any comments on which is the better method to employ?
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 7:26 AM
Points: 6,721,
Visits: 5,792
|
|
Brett Phipps (1/9/2013)
I've read the article and skimmed through all the comments looking if there was something that talked about how to actually do the cross apply as Jeff suggests, but I haven't seen an example of it. I'm actually trying to implement this, but I'm just not getting it and was wondering if someone could help me out. Here's the problem I'm trying to solve I've got a list of product numbers that I need to validate are exclusively digits. I created the function as specified ( renamed it to fn_isAllDigits to adhere to our naming standards and am trying to use it but I guess I just don't understand the how to properly format the query to use the cross apply.... Can someone help here's what I'm starting with: Note: In this version of the query I'm trying to drill in and find all product numbers that are six digits long. The latest request was to include any rows that are longer than 6 digits but have leading zeroes that should also be included. eg. 00000001234546 should also be included) The simple solution seemed to be to just convert it to an int and then check the length as you see I attempted because the patindex check did not catch this: 0-3-101 SELECT DISTINCT PrimeProdNo FROM ourDatabase.schema.tbl_product tp1 WHERE (6 = LEN(RTRIM(PrimeProdNo)) OR 6 = LEN(RTRIM(CAST(PrimeProdNo AS BIGINT)))) --I just added this to AND PATINDEX('%[A-Z]%', PrimeProdNo) = 0 --this needs to be more than just a letter check AND LEFT(PrimeProdNo, 1) > 0
Side Note 2. I got this code from someone else, but I was thinking about changing that from distinct to a group by on PrimeProdNo. Anyone have any comments on which is the better method to employ?
Brett, your best bet is to take this to the T-SQL forum. Paste in your function code (refer by link back to Jeff's article if you need to), paste in your query code, paste in table DDL and an INSERT statement with sample data. With all that information, we will be able to help you a lot better. Plus you'll get a lot more attention there with a new post than on a thread this old.
Brandie Tarvin, MCITP Database Administrator, MCDBA, MCSA
Webpage: http://www.BrandieTarvin.net LiveJournal Blog: http://brandietarvin.livejournal.com/ On LinkedIn!, Google+, and Twitter.
Freelance Writer: Shadowrun Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.
|
|
|
|
|
SSCrazy Eights
        
Group: General Forum Members
Last Login: Today @ 7:54 AM
Points: 8,957,
Visits: 8,525
|
|
Brett this is actually pretty simple. I included some sample data using your names and even included the condition of len = 6. for numbers though we don't want to use len, we want to use a range of values.
;with tbl_product(PrimeProdNo) as ( select '383745' union all --valid select '093847' union all --this is not length 6 so should not return select '1234567' union all --this is length 7 so should not return select 'asdf3e5' union all --non numerics select '12(854' union all --non numerics select '867596' union all --valid select '0-3-101' --invalid ) , tempResult as ( select * from tbl_product cross apply dbo.IsAllDigits(PrimeProdNo) )
select * from tempResult where IsAllDigits = 1 and PrimeProdNo > 99999 and PrimeProdNo <= 999999
The reason we can safely do this type of range validation now is because we have stripped out any rows that are not numeric in the cte so the implicit conversion in the where clause will be successful.
You might want to take a look at Paul White's articles on using and understanding apply.
http://www.sqlservercentral.com/articles/APPLY/69953/ http://www.sqlservercentral.com/articles/APPLY/69954/
--EDIT--
In retrospect we don't need the second cte. The non-numeric value will already be removed. The following will also work.
select * from tbl_product cross apply dbo.IsAllDigits(PrimeProdNo) where IsAllDigits = 1 and PrimeProdNo > 99999 and PrimeProdNo <= 999999
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
|
|
|
|
|
SSC Journeyman
      
Group: General Forum Members
Last Login: Tuesday, June 04, 2013 8:08 AM
Points: 94,
Visits: 212
|
|
I'm not sure if I should respond or do as Brandie suggests, but since I've gotten a response here. I'm going to go ahead and post the create for my table
CREATE TABLE [ret].[tbl_product]( [LocationNo] [decimal](7, 0) NOT NULL, [PrimeProdNo] [char](16) NOT NULL, [ProdCategory] [char](3) NOT NULL, [ProdCategoryHomeOffice] [char](3) NOT NULL, [MFA] [varchar](5) NULL, [ProdDesc] [char](30) NOT NULL, [ProdDescHomeOffice] [char](30) NOT NULL, [QuantityOnHand] [decimal](7, 0) NULL, [ProdUomSale] [char](5) NOT NULL, [ProdUomPurchase] [char](5) NOT NULL, [ProdUomA] [char](5) NOT NULL, [Density] [decimal](6, 2) NULL, [RateA] [decimal](6, 2) NULL, [LastSaleDate] [date] NULL, [LastReceiptDate] [date] NULL, [AddedDate] [date] NULL ) ON [PRIMARY]
GO
This table was created from a cobol file and loaded with data from over a hundred different stores who have each entered data in a myriad of methods. We are trying to find all the distinct differences so we can create one new file that will be pushed back out to the stores.
I've attempted using Sean's method above but when I switch to using the real table I get "Conversion failed when converting the varchar value '00161B ' to data type int.
When I use the example below it works but it doesn't allow me to filter out the products that are less than 100000. If i enable the
select * from ret.tbl_product cross apply Utilities.dbo.fn_isAllDigits(RTRIM(PrimeProdNo)) where IsAllDigits = 1 --and PrimeProdNo BETWEEN 100000 and 999999 AND (6 = LEN(RTRIM(PrimeProdNo)) --OR 6 = LEN(RTRIM(CAST(PrimeProdNo AS BIGINT))) )
If I try to enable the check for PrimeProdNo I get the following error.
Msg 245, Level 16, State 1, Line 25 Conversion failed when converting the varchar value '3EHVHD ' to data type int. Msg 248, Level 16, State 1, Line 25 The conversion of the varchar value '5510708105 ' overflowed an int column.
so I did this to test if it's an issue with the table layout and proved that it is:
IF OBJECT_ID('tempdb..#tbl_product') IS NOT NULL DROP TABLE #tbl_product
CREATE TABLE #tbl_product( [PrimeProdNo] [char](16) NOT NULL ) ON [PRIMARY]
INSERT INTO #tbl_product (PrimeProdNo) select '383745' union all --valid select '093847' union all --this is not length 6 so should not return select '1234567' union all --this is length 7 so should not return select 'asdf3e5' union all --non numerics select '12(854' union all --non numerics select '867596' union all --valid select '0-3-101' UNION ALL --invalid SELECT '0-3-101 ' UNION ALL SELECT '00161B ' SELECT * FROM #tbl_product AS tp select * from #tbl_product cross apply Utilities.dbo.fn_isAllDigits(RTRIM(PrimeProdNo)) where IsAllDigits = 1 and PrimeProdNo BETWEEN 100000 and 999999 AND (6 = LEN(RTRIM(PrimeProdNo)) --OR 6 = LEN(RTRIM(CAST(PrimeProdNo AS BIGINT))) )
Using this you will get the same conversion error I'm running into. I suspect this is because the field is defined as a char instead of varchar.
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 6:13 AM
Points: 5,296,
Visits: 7,236
|
|
Hi Brett,
The problem is that you are combining a check for all numbers and a cast to integer in a single query. The optimizer is free to carry out those opoerations in any order - and in this case, it chooses to do the converstion first, then do the user-defined function.
One way to work around this would be to use CASE, where a THEN clause should only be evaluated if the corresponding WHEN clause is true. You would get something like
select * from ret.tbl_product CROSS APPLY Utilities.dbo.fn_isAllDigits(RTRIM(PrimeProdNo)) where IsAllDigits = 1 --and PrimeProdNo BETWEEN 100000 and 999999 AND (6 = LEN(RTRIM(PrimeProdNo)) OR 6 = CASE WHEN IsAllDigits = 1 THEN LEN(RTRIM(CAST(PrimeProdNo AS BIGINT))) ELSE 0 END )
Hugo Kornelis, SQL Server MVP Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
|
|
|
|