Log in  ::  Register  ::  Not logged in

 Recent PostsRecent Posts Popular TopicsPopular Topics
 Home Search Members Calendar Who's On

 Generating Test Data: Part 1 - Generating Random Integers and Floats Rate Topic Display Mode Topic Options
Author
 Message
 Posted Monday, March 26, 2012 7:39 PM
 Hall of Fame Group: General Forum Members Last Login: Today @ 6:35 PM Points: 3,593, Visits: 5,102
 Jeff,The following thought occurred to me last night but I didn't get a chance to test it until this morning.Isn't the modulo function designed to always return a positive integer?Hence, in this part of your data generator, I don't believe you need to use ABS:`SomeRandomInteger = CHECKSUM(NEWID()) % @Range + @StartValue`I didn't try it for a million rows but the first 200 came up all as positives. My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!My thought question: Have you ever been told that your query runs too fast?My advice:INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.Need to UNPIVOT? Why not CROSS APPLY VALUES instead?Since random numbers are too important to be left to chance, let's generate some!Learn to understand recursive CTEs by example.Splitting strings based on patterns can be fast!
Post #1273128
 Posted Tuesday, March 27, 2012 3:51 AM
 Ten Centuries Group: General Forum Members Last Login: Thursday, April 17, 2014 7:44 AM Points: 1,371, Visits: 2,597
 Hi GuysBy declaring the variables as below - Will that guarantee that every number generated is Unique?`SELECT @NumberOfRows = 10000000, @StartValue = 1, @EndValue = 10000000, @Range = @EndValue - @StartValue + 1`Thanks
Post #1273329
 Posted Tuesday, March 27, 2012 6:41 AM
 SSC-Dedicated Group: General Forum Members Last Login: Today @ 6:29 PM Points: 35,977, Visits: 30,266
 derekr 43208 (3/27/2012)Hi GuysBy declaring the variables as below - Will that guarantee that every number generated is Unique?`SELECT @NumberOfRows = 10000000, @StartValue = 1, @EndValue = 10000000, @Range = @EndValue - @StartValue + 1`ThanksYes for the rownumber generator, No for the random number generator.If you want unique but randomly sorted numbers, it will involve a sort (which will be time consuming for as many rows as you've identified) using NEWID() to sort on.`WITH cteRowNumberGenerator AS(--==== Prevents sorting all the rows of the cross join. SELECT TOP (1000) --Put your desired number here UniqueNumber = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2)--==== Randomly sorts just the numbers generated from above. SELECT UniqueNumber FROM cteRowNumberGenerator ORDER BY NEWID();` --Jeff Moden"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column." "Change is inevitable. Change for the better is not." -- 04 August 2013(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013Helpful Links:How to post code problemsHow to post performance problems
Post #1273455
 Posted Tuesday, March 27, 2012 1:46 PM
 Hall of Fame Group: General Forum Members Last Login: Wednesday, April 16, 2014 1:11 PM Points: 3,081, Visits: 11,230
 dwain.c (3/26/2012)Jeff,The following thought occurred to me last night but I didn't get a chance to test it until this morning.Isn't the modulo function designed to always return a positive integer?Hence, in this part of your data generator, I don't believe you need to use ABS:`SomeRandomInteger = CHECKSUM(NEWID()) % @Range + @StartValue`I didn't try it for a million rows but the first 200 came up all as positives.It's pretty easy to show that the modulus operator will return a negative number:`select x = -1111%100`Results:`x ----------- -11 `
Post #1273872
 Posted Tuesday, March 27, 2012 2:02 PM
 Hall of Fame Group: General Forum Members Last Login: Wednesday, April 16, 2014 1:11 PM Points: 3,081, Visits: 11,230
 Jeff Moden (3/26/2012)Michael Valentine Jones (3/26/2012)Here is an alternate method that I use to generate the pseudo random numbers. The basic method is to take the right 7 bytes from the NEWID function and convert that to a BIGINT before applying the MODULUS operator. No need for the ABS function, since 7 bytes can only produce a positive BIGINT number.`if object_id('tempdb..#t','U') is not null begin drop table #t end-- Generate 20,000,000 rowsselect top 20000000 NUMBER = identity(int,1,1)into #tfrom (select top 4473 * from master.dbo.syscolumns) a cross join (select top 4473 * from master.dbo.syscolumns) b-- Show distribution of rowcount around average of 40000select a.RandomNo, Rows = count(*)from ( select RandomNo = (convert(bigint,convert(varbinary(7),newid()))%500)+1 from #t aa ) agroup by a.RandomNoorder by count(*), a.RandomNo``RandomNo Rows -------------------- ----------- 335 39455 3 39457 76 39481 426 39489 494 39535 242 39539 278 39539 490 39548 445 39553 244 39566 ......... 124 40400 228 40402 425 40410 286 40434 45 40458 463 40463 373 40531 152 40586 (500 row(s) affected)`Like I said in the article, the conversion to VARBINARY will slow things down and to no good end if you don't really need BIGINT for the random integer. If you really want BIGINT capability (and I realize that wasn't one of your goals in your example), I believe you'd also have to convert the whole NEWID() to VARBINARY.I also thought you were involved in some testing that showed the use of the square root of the final number of desired rows as a TOP for the self joined table in the Cross Join really wasn't worth it.The main point that I'm trying to make is that if it's too complicated, folks won't use it.I never was involved in any testing with cross joins using the square root of number of desired rows. I just grabbed a piece of code I had laying to be able to generate a test table, and I wasn't trying to say that was the way you should do it.Have you actually tested to confirm that the conversion to VARBINARY and then to BIGINT is slower than CHECKSUM and ABS? I haven't but I would be surprised if there is much difference.I was just showing an alternate method to get a random number. I think it is useful in cases where the range of random numbers is greater than an INT.Of course, if you want a random full BIGINT, you would do something like this, instead of stopping at 7 bytes; I just stopped at 7 bytes to eliminate the ABS function.`select convert(bigint,convert(varbinary(8),newid()))`
Post #1273886
 Posted Tuesday, March 27, 2012 3:19 PM
 SSC-Dedicated Group: General Forum Members Last Login: Today @ 6:29 PM Points: 35,977, Visits: 30,266
 Sorry. Thought it was you. Must have been Peter Larsson. Yes, we tested it and the conversion does make quite a bit of differrence. --Jeff Moden"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column." "Change is inevitable. Change for the better is not." -- 04 August 2013(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013Helpful Links:How to post code problemsHow to post performance problems
Post #1273952
 Posted Tuesday, March 27, 2012 4:15 PM
 SSC Eights! Group: General Forum Members Last Login: Thursday, April 10, 2014 3:50 PM Points: 872, Visits: 952
 Great article! Thanks.
Post #1273972
 Posted Tuesday, March 27, 2012 4:27 PM
 SSC-Insane Group: General Forum Members Last Login: Today @ 6:16 PM Points: 20,467, Visits: 14,101
 Thanks for the article Jeff. Jason AKA CirqueDeSQLeilI have given a name to my pain...MCM SQL ServerSQL RNNRPosting Performance Based Questions - Gail ShawPosting Data Etiquette - Jeff ModenHidden RBAR - Jeff ModenVLFs and the Tran Log - Kimberly Tripp
Post #1273977
 Posted Tuesday, March 27, 2012 7:02 PM
 SSC-Dedicated Group: General Forum Members Last Login: Today @ 6:29 PM Points: 35,977, Visits: 30,266
 Kangana Beri (3/27/2012)Great article! Thanks.Thanks for the read and the feedback, Kangana. I appreciate it. --Jeff Moden"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column." "Change is inevitable. Change for the better is not." -- 04 August 2013(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013Helpful Links:How to post code problemsHow to post performance problems
Post #1274001
 Posted Tuesday, March 27, 2012 7:03 PM
 SSC-Dedicated Group: General Forum Members Last Login: Today @ 6:29 PM Points: 35,977, Visits: 30,266
 SQLRNNR (3/27/2012)Thanks for the article Jeff.Thanks for the feedback, Jason. Always good to hear from you. --Jeff Moden"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column." "Change is inevitable. Change for the better is not." -- 04 August 2013(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013Helpful Links:How to post code problemsHow to post performance problems
Post #1274003

 Permissions

 Copyright © 2002-2014 Simple Talk Publishing. All Rights Reserved. Privacy Policy. Terms of Use. Report Abuse.