Blog Post

Select random records using TABLESAMPLE clause

,

While I was reading Martin Catherall’s post about Selecting from a table with no rows returned I remembered TABLESAMPLE function which does similar thing.

TABLESAMPLE function works together with SELECT statemenet to show sample of rows from respective table. You can use this function to “randomize” standard SELECT TOP xx FROM some_table.

Usage is following:

SELECT * FROM <table_name>  TABLESAMPLE (10 PERCENT)

… this will return approximately 10% procent of rows from <table_name>.

It’s quite ellegant solution for SELECTing news or random records from table for any purpose. How resulting data are randomized in the background? SQL Server sort of randomly picks physical data page and return everything on the page. This means that it is not guaranteed that TABLESAMPLE returns always same number of rows. From table which has 100 000 records, TABLESAMPLE (10 PERCENT) may return 9943 records for the first call and 10 232 for second call. This is because combination of two facts: TABLESAMPLE picks whole content of randomly chosen data page and each data page can carry different number of rows. This behavior might suits you or not. You choose.

You can also combine TABLESAMPLE with TOP clause:

SELECT TOP 10 * FROM <table_name> TABLESAMPLE (10 PERCENT)

… but in that case, this solution for selecting exact count of random records is better:

SELECT TOP 10 * FROM <table_name> ORDER BY NEWID() 

It has one “but” – you must use TABLESAMPLE only with tables, not views or table UDF. It make sense, it just comes from its internal working.

One more thing. You can use table sample also like this:

SELECT * FROM <table_name>  TABLESAMPLE (1000 ROWS)

It returns “approximately” 1000 rows from <table_name>. But anyway, ROWS clause is internally converted to percentage of rows based on available statistics.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating