I appreciate anyone that will step up to the plate with an article or a script. Since you brandish the title of "MVP" in your handle and I've verified that you are a Data Platform MVP, I thought you could stand a bit of a critique.
If you hard code an @RowCount of just 100,000 in your code, it takes 48 seconds to complete. That's because of the 100,000 rows you end up inserting in a totally RBAR fashion.
The following does the functional equivalent as your code but in a Set Based fashion and takes only 2 seconds. (The reason for the funky numbers is because your @Texto string is actually only 127 characters long according to the LEN() function and not the 130 you stated.
CREATE TABLE dbo.RandomDataTable
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
,CustomerID int NOT NULL
,SalesPersonID varchar(10) NOT NULL
,Quantity smallint NOT NULL
,NumericValue numeric(18,2) NOT NULL
,Today date NOT NULL
DECLARE @RowCount INT = 100000
E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))E0(N))
,E10(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d, E1 e, E1 f, E1 g, E1 h, E1 i, E1 j)
,TallY(N) AS (SELECT TOP(@RowCount) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E10)
INSERT INTO dbo.RandomDataTable
(CustomerID, SalesPersonID, Quantity, NumericValue, Today)
SELECT CustomerID = @RowCount+1-t.N
,SalesPersonID = SUBSTRING(ca.Texto,ABS(CHECKSUM(NEWID())%126)+1,2)
,Quantity = ABS(CHECKSUM(NEWID())%1000)
,NumericValue = RAND(CHECKSUM(NEWID()))*100+5
,Today = DATEADD(dd,ABS(CHECKSUM(NEWID())%1000),GETDATE())
FROM TallY t
CROSS APPLY (SELECT '0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyzŽŸ¡ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåæçèéêëìíîïðñòóôõöùúûüýÿ')ca(Texto)
SELECT ID, CustomerID, SalesPersonID, Quantity, NumericValue, Today
How does it all work. Please see the following articles for an introduction to the "Tally Table", Itzik Ben-Gan's method to replace it with inline code, and an introduction to the concept of "pseudo cursors" (which isn't specifically mentioned in the article but that's what it is), all of which are used to replace While Loops, Recursive CTE's, and a wealth of other RBAR with something much faster and, IMHO, easier to write. The article will change your professional career if you look beyond the simple examples in the article to embrace the overall concept.
As for creating random values, here are a couple of articles on the subject to get you away from the issues of using RAND() as a random source so that you don't have to use a While Loop to generate test data.
Just a couple of other observations.
1. Consider always using the 2 part naming convention to avoid problems with the eventuality of duplicate table names being using across multiple schemas.
2. Consider right-sizing column data-types. I hate to see folks waste space by using NUMERIC(18,2) (a horrible default from front end designers) and VARCHAR(10) when the column will be filled with 99.9999% 10 character values. VARCHAR(anything) carries an extra 2 byte overhead to remember the length of the value.
3. Consider always using properly places semi-colons. Not using them has actually been deprecated for many years and the proverbial poo is going to hit the fan when they make them fully required.
4. Consider not mixing human languages in the code for things like variable names.
is pronounced "ree-bar
" and is a "Modenism
" for R
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
"If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
"Change is inevitable... change for the better is not."
When you put the right degree of spin on it, the number 3|8
is also a glyph that describes the nature of a DBAs job. 😉
How to post code problems