July 23, 2008 at 1:27 pm
Does anybody have some code to build a table of strings from 1 to 10 characters in length, with no trailing or leading spaces (embedded spaces are allowed) containing all permutations of characters that could be entered by a user on a USA keyboard?
I have rewritten an internal UDF using a tally table, and need to verify that it returns the same value as the function it was written to replace. (I wanted to eliminate a while loop in the original function.) As I look at what I need, I just can't seem to get my brain on track.
😎
July 23, 2008 at 1:56 pm
Um...that's a lot of strings.
I don't have anything pre-made, but something like this is what comes to mind:
[font="Courier New"]; WITH CharSet (N, Chr)
AS (
SELECT N, CHAR(N) FROM dbo.Tally WHERE N BETWEEN 65 AND 90 --Upper case letters
UNION SELECT N, CHAR(N) FROM dbo.Tally WHERE N BETWEEN 65 AND 90 --Lower Case letters
UNION SELECT N, CHAR(N) FROM dbo.Tally WHERE N BETWEEN 48 AND 57 --Numbers
)
SELECT
T1.Chr + T2.Chr + T3.Chr
FROM
CharSet T1
CROSS JOIN CharSet T2
CROSS JOIN CharSet T3[/font]
You would have to set up the CTE with the entire character set you wanted and add the other seven cross joins.
If your SQL Server can return all of the possible permutations before it melts or the world ends, please let me know.
July 23, 2008 at 2:11 pm
LOL!! :w00t:
Thanks. Once I finish working on the second function, I'll start with your code and see what I can come up with. I appreciate the jump start!
Edit: Anyone happen to know mathematiclly how many strings we may be talking about?
😎
July 23, 2008 at 2:18 pm
Well, if you used just letters and numbers - say all lowercase you have 35 characters.
Enter 35^10 in calculator and see what you get...
July 23, 2008 at 2:34 pm
Does this number look about right: 2,758,547,353,515,625?
That's a heck of a lot of data to ensure that both functions are equivalent.
😎
July 23, 2008 at 3:31 pm
Lynn Pettis (7/23/2008)
Does anybody have some code to build a table of strings from 1 to 10 characters in length, with no trailing or leading spaces (embedded spaces are allowed) containing all permutations of characters that could be entered by a user on a USA keyboard?
Ummm, ... even assuming that we are discounting upper/lower case differences, 5810 is a very large number. :w00t:
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
July 23, 2008 at 3:33 pm
Michael Earl (7/23/2008)
Well, if you used just letters and numbers - say all lowercase you have 35 characters.
Hmmm, I'm pretty sure that would be 36.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
July 23, 2008 at 6:16 pm
Lynn Pettis (7/23/2008)
Does anybody have some code to build a table of strings from 1 to 10 characters in length, with no trailing or leading spaces (embedded spaces are allowed) containing all permutations of characters that could be entered by a user on a USA keyboard?I have rewritten an internal UDF using a tally table, and need to verify that it returns the same value as the function it was written to replace. (I wanted to eliminate a while loop in the original function.) As I look at what I need, I just can't seem to get my brain on track.
😎
You know me, Lynn... I've just gotta ask... what on Earth are you going to use this for?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 23, 2008 at 7:21 pm
Four years ago, my predecessors predecessor wrote 2 scaler UDF's, one to encrypt and one to decrypt a string using the same algorithm that the SIS system uses on-line. It is loop-based to access each character one at a time. Not to bad considering it only has to work with strings upto 10 characters long. Of course, what if things change and it jumps to 128 or 256.
The purpose of these functions is to allow us to import and export the data outside of the application. Therefore, I have to be absolutely positive that what I wrote mirrors what she wrote in all cases. It would be nice if I knew how to prove it identical mathematically, but I'm not really sure how to go about that either.
I have run tests based on data in the system, but that isn't necessarily good enough. There could be a subtle difference between them that only manifests itself in very specific cases. The only other way I know, is brute force, check everything. Of course, this may not be feasible either. I do think the more testing I do, the more confident I'l be with my rewrite.
I am also looking at eliminating a cursor in a stored proc. They loop through the data to be entered one record at a time. Not bad when you only have a few hundred rows, but certainly not scaleable and that is what I am after, especially since they didn't need to use a cursor for this update to begin with.
Anything else you would like to know? I'd be happy to tell you more if need be.
😎
July 23, 2008 at 8:33 pm
Ah... I get it now... you want to test/compare every combination of 1 to 10 characters (36 characters total) of the old UDF against the new UDF. (3611)-1 is 131,621,703,842,267,135 or about 131.6 QUADRILLION... Heh... I don't think there's enough time left in the current decade to test all of those combinations unless you happen to have a Cray in your back pocket. 😛 I'm thinking you'll need to be happy with an extreme desk check comparison of the code and just a couple of million comparisons.
I'm also nervous for you... unless someone happens to have some extreme crypto experience (I don't), I wouldn't trust the homegrown algorithm unless it were certified by an outside crypto org.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 24, 2008 at 12:01 am
A) That is why I am trying to come up with a subset that is large enough to satisfy my testing. If after a few million rows of test data are checked, and both return the same data for all attempts, good enough.
B) It isn't that strong, but until we move to something else, like LDAP, it is the best we can do for now. At least it isn't our Personnel or Finance systems.
C) That is why I didn't publish the code for all to see. With no clue of where to start, it at least makes it a little harder to crack unless you are adamant about doing so.
Plus, it was fun getting something done using the Tally table instead of a loop. It was worth the exercise for the brain (as opposed to the physical exercise last week, 16 soccer games over 5 days from U11 to U19 boys and girls).
😎
July 24, 2008 at 4:48 am
So, do you have a grip on generating the test data or do you still need a bit o' help on that?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 24, 2008 at 5:11 am
rbarryyoung - Yeah, 36.
Thanks for pointing out the fact that I forgot about the zero.
Sorry, I was mid-laugh thinking about how many combinations there would really be.
July 24, 2008 at 6:09 am
Michael Earl (7/24/2008)
rbarryyoung - Yeah, 36.Thanks for pointing out the fact that I forgot about the zero.
Don't worry about it, western civilization got along without it for several thousand years... 🙂
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
July 24, 2008 at 11:19 am
Jeff Moden (7/24/2008)
So, do you have a grip on generating the test data or do you still need a bit o' help on that?
With thanks to Michael Earl, yes I got a grip on the test data. Generated over 80 million test records and after almost 3 hours, both functions returned identical values.
I feel confident in my changes. Now, the question is how does everyone else here feel about the changes.
Thank you for the help.
😎
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply