how to create random values for 1 million records or 10,000 records

  • CREATE TABLE dbo.RandomData (
      RowId INT IDENTITY(1,1) NOT NULL,
      firstname VARCHAR(10), lastname varchar(10),city varchar(10),state varchar(10),country varchar(30),
      age NUMERIC(5),salary numeric(10,2),vacationhrs numeric(5),phoneno numeric(7) )
    GO

    i want to add random values to these columns , no of rows should be 10,000 or 10 million ..any suggestion on this ?
    i mean i want to add dummy data of 10 million or 10000 records ?

    i just tried the below code, but i recieve error  
    An explicit value for the identity column in table 'dbo.RandomData111' can only be specified when a column list is used and IDENTITY_INSERT is ON.
    and is there anything which i can real state names like TX, UT,MA,DC etc

    SET NOCOUNT ON

    GO
    SET IDENTITY_INSERT dbo.RandomData111 ON
    GO

    CREATE TABLE dbo.RandomData111 (
      RowId INT IDENTITY(1,1) NOT NULL,
    firstname VARCHAR(10), lastname varchar(10),city varchar(10),state varchar(10),country varchar(30),
    age NUMERIC(5),salary numeric(10,2),vacationhrs numeric(5),phoneno numeric(7), SomeDateTime DATETIME )
    GO

    DECLARE @count INT
    SET @count = 1

    WHILE @count <= 10
    BEGIN
    INSERT INTO dbo.RandomData111
      SELECT  @count,
        CASE WHEN DATEPART(MILLISECOND, GETDATE()) >= 500 THEN 0 ELSE 1 END [SomeBit],
                 CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) [firstname],
                
                CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) [lastname],
                
                CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) [city],
                
                CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) [state],
                
                CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
        CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) [country],
                
                (ABS(CHECKSUM(NEWID())) % 100001) + ((ABS(CHECKSUM(NEWID())) % 100001) * 0.00001) [age],
                
                (ABS(CHECKSUM(NEWID())) % 100001) + ((ABS(CHECKSUM(NEWID())) % 100001) * 0.00001) [salary],

                (ABS(CHECKSUM(NEWID())) % 100001) + ((ABS(CHECKSUM(NEWID())) % 100001) * 0.00001) [vacationhrs],
                
        (ABS(CHECKSUM(NEWID())) % 100001) + ((ABS(CHECKSUM(NEWID())) % 100001) * 0.00001) [phoneno]
      SET @count += 1
    END

  • Ok, let's start with why this is failing. You're setting the SET INSERT IDENTITY ON before the table exists. This isn't going to work, as there's no table to chaneg the property of :). Secondly, do you need to insert into that column if you've already declared it as an IDENTITY column?

    I've changed a few of your data types, as they didn't make sense for what you were doing. You'll also need to complete the data for the states, if you don't already have a table for it.

    This is going to also be very slow using a WHILE statement. I've put this version together, which makes use the the vTally view found at the start of this article: http://www.sqlservercentral.com/articles/calendar/145206/. On our server (256GB RAM, 2 x Intel Xeon E7 @2.13Ghz) here ran in less than a second to do 10,000 rows (1,000,000 took a while longer @ 31 seconds). If you have any questions, feel free to ask:
    USE DevTestDB;
    GO

    CREATE TABLE dbo.RandomData (
      RowId int IDENTITY(1,1) NOT NULL,
      firstname varchar(10),
      lastname varchar(10),
      city varchar(10),
      state char(2), --Considering that this will only store 2 characters
      country varchar(30),
      age int,
      salary decimal(10,2),
      vacationhrs int,
      phoneno varchar(7), --considering that phone number can start with 0's, I've changed this.
      SomeDateTime datetime)
    GO

    WITH States AS --I am using a CTE with the states in. Hopefully you have a table or something you can use.
      (SELECT *
       FROM (VALUES (1,'TX','Texas'),
                     (2,'NY','New York'),
                     (3,'WA','Washington'),
                     (4,'AL','Alaska'),
                     (5,'HA','Hawaii'))  
      AS S (Num, Abbr, Name))
    , RandData AS (
      SELECT TOP 10000 --Limit to 10,000 rows. Change this as you need
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) AS Firstname,
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) AS Lastname,
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) +
            CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) + CHAR((ABS(CHECKSUM(NEWID())) % 26) + 97) AS City,
            (ABS(CHECKSUM(NEWID())) % 5) +1 AS StateID, --Change the 5 to be how many states you have in your state table
            'USA' As Country, --Seeing as we are using US states, I would assume these should all now be USA.
            (ABS(CHECKSUM(NEWID())) % 60) +18 AS Age , --Ages 18 to 77
            (ABS(CHECKSUM(NEWID())) % 120000) + 5000 AS Salary, --Salary, with 5,000 being the minimum,
            (ABS(CHECKSUM(NEWID())) % 50) + 100 AS Holiday, --50 to 149 holiday hours,
            RIGHT('0000000' + (ABS(CHECKSUM(NEWID())) % 9999999) + 1,7) AS Phone, --Phone numvbers 0000001 to 9999999,
            DATEADD(MINUTE,ABS(CHECKSUM(NEWID())) % 9999999,'01-Jan-2010') AS SomeDate--Some date time
      FROM vTally v)
    INSERT INTO RandomData(firstname, lastname, city, state, country, age, salary, vacationhrs, phoneno, SomeDateTime)
    SELECT RD.Firstname, RD.Lastname,
           RD.City, S.Abbr, RD.Country,
           RD.Age,
           RD.Salary,
           RD.Holiday,
           RD.Phone,
           RD.SomeDate
    FROM RandData RD
        LEFT JOIN States S on RD.StateID = S.Num;
    GO

    SELECT *
    FROM RandomData;

    GO
    DROP TABLE RandomData;

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Hi sir
    That was really helpful ,
    I  did ask the question accurately  earlier.
    1) I want to create a dummy data  with last name and first name ( for eg Smith T, Mark S, Thomas Edison), i mean is there any possibility i can get some random real dummy names for all those columns.
    2) get an random phone no based on state ? externallink
    any help on this one ?

  • From a SQL perspective you're going to need tables of names, and then logic to define how the phone number is generated using the state as well. SQL doesn't know what a "real name" is, the string "Thomas" is pretty much the same as "sdfdsf" in the sense that they are made up of 6 characters, so you can't say ask it to generate "real" names. You can see how I applied this a few states, so see what you can do yourself with the example I gave you.

    if you get stuck, show what you've done so far and i can point you in the right direction.

    Failing that, if you really need very "realistic" data, you'd be better off purchasing a product to create the data for you.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Maybe take a look at http://www.databasetestdata.com 
    It is free, and not limited to just a few rows (like many other demos of for-fee generators).
    Chris

  • Here's a website with a free sample data generator. It has column types to give relevant data created randomly.
    http://www.yandataellan.com/

    When I have no access, I just play around with AdventureWorks to get the data scrambled and at the appropriate size.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • You can also look here

    USA Census Data 2000

    USA Census Data 2010

    You can download surnames occurring more than 100 times with their distribution, so if you fancy it you can create a random list of surnames but with the same distribution as in the USA in 2000 or 2010.
    I have downloaded a first name one as well, but couldn't find it.

    Rodders...

  • I found that Red Gate SQL Data Generator is quite helpful at creating randomized test data.  It's not a free product, but I found it to be worth the money, at least in my use cases.
    I recommend taking advantage of their free 14-day trial offer.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply