Recent PostsRecent Posts Popular TopicsPopular Topics
 Home Search Members Calendar Who's On

 TABLESAMPLE Rate Topic Display Mode Topic Options
Author
 Message
 Posted Monday, June 30, 2008 9:29 PM
 SSCrazy Group: General Forum Members Last Login: Today @ 9:02 AM Points: 2,678, Visits: 2,361
Post #526304
 Posted Tuesday, July 01, 2008 1:03 AM
 SSC Rookie Group: General Forum Members Last Login: Sunday, May 10, 2009 10:10 AM Points: 33, Visits: 30
 The tablesample returns a random number every time as the seed used to generate the random number varies marginally everytime, eventhough we say tablesample(10 percent) it will reutrn +1000 or -1000 rows approximatly,example : 1. select * from results TABLESAMPLE SYSTEM(100 ROWS) 2. select * from results TABLESAMPLE SYSTEM(10 PERCENT)The number of rows will vary everytime. To get a constant number of rows everytime we can use the keyword REPEATABLE.
Post #526346
 Posted Tuesday, July 01, 2008 1:05 AM
 SSC Rookie Group: General Forum Members Last Login: Sunday, May 10, 2009 10:10 AM Points: 33, Visits: 30
 The tablesample returns a random number every time as the seed used to generate the random number varies marginally everytime, eventhough we say tablesample(10 percent) it will reutrn +1000 or -1000 rows approximatly,example : 1. select * from results TABLESAMPLE SYSTEM(100 ROWS) 2. select * from results TABLESAMPLE SYSTEM(10 PERCENT)The number of rows will vary everytime. To get a constant number of rows everytime we can use the keyword REPEATABLE.select * from results TABLESAMPLE SYSTEM(10 PERCENT) REPEATABLE(1000).Given the same seed, you will get the same rows back. One thing to note here: this is not like therepeatable read isolation level. If another user makes changes to the data in the table, you will not get back the exact same rows. It is only true for a given “version” of the table.
Post #526347
 Posted Tuesday, July 01, 2008 2:06 AM
 SSCrazy Group: General Forum Members Last Login: Today @ 9:02 AM Points: 2,678, Visits: 2,361
 just to clarify, it's not that a random number of rows is returned, a random value is generated for each page of the table, and then this is used with the percentage to generate the sampleFrom http://technet.microsoft.com/en-us/library/ms189108.aspxTABLESAMPLE SYSTEM returns an approximate percentage of rows and generates a random value for each physical 8-KB page in the table. Based on the random value for a page and the percentage specified in the query, a page is either included in the sample or excluded. Each page that is included returns all rows in the sample result set. For example, if you specify TABLESAMPLE SYSTEM 10 PERCENT, SQL Server returns all the rows on approximately 10 percent of the specified data pages of the table. If the rows are evenly distributed on the pages of the table, and if there is a sufficient number of pages in the table, the number of rows returned should approximate the sample size that is requested. However, because the random value that is generated for each page is independent of the values that are generated for any other page, a larger, or smaller, percentage of pages than have been requested might be returned.
Post #526383
 Posted Tuesday, July 01, 2008 2:48 AM
 Old Hand Group: General Forum Members Last Login: Wednesday, September 24, 2008 1:27 PM Points: 350, Visits: 229
 I got this right, but the explanation, to me, doesn't really sound right:"TABLESAMPLE returns an approximate percentage of rows, even if a number of rows is specified. This is used to get a sample of data from large rows and does not guarentee a number of rows or a random sample"From http://technet.microsoft.com/en-us/library/ms189108.aspx, it says TABLESAMPLE returns a sample number of rows from the result set, so if 10 percent is used, then 10 percent is returned. Quote from the link says:The TABLESAMPLE clause limits the number of rows returned from a table in the FROM clause to a sample number or PERCENT of rows.However, if SYSTEM is used, then if 10 percent is specified, then around 10 percent of the result set is returned. Quote from the link says:TABLESAMPLE SYSTEM returns an approximate percentage of rows and generates a random value for each physical 8-KB page in the table.....For example, if you specify TABLESAMPLE SYSTEM 10 PERCENT, SQL Server returns all the rows on approximately 10 percent of the specified data pages of the table.....However, because the random value that is generated for each page is independent of the values that are generated for any other page, a larger, or smaller, percentage of pages than have been requested might be returned.Am I reading that right?
Post #526410
 Posted Tuesday, July 01, 2008 3:17 AM
 SSCrazy Group: General Forum Members Last Login: Today @ 9:02 AM Points: 2,678, Visits: 2,361
 It doesn't matter whether you specify SYSTEM or notFrom the same linkSYSTEM specifies an ANSI SQL implementation-dependent sampling method. Specifying SYSTEM is optional, but this option is the only sampling method available in SQL Server 2005 and is applied by default.
Post #526421
 Posted Tuesday, July 01, 2008 4:13 AM
 Old Hand Group: General Forum Members Last Login: Wednesday, September 24, 2008 1:27 PM Points: 350, Visits: 229
 kevriley (7/1/2008)It doesn't matter whether you specify SYSTEM or notFrom the same linkSYSTEM specifies an ANSI SQL implementation-dependent sampling method. Specifying SYSTEM is optional, but this option is the only sampling method available in SQL Server 2005 and is applied by default. Oh right ok, so it doesn't matter then if you specify SYSTEM or not.
Post #526446
 Posted Tuesday, July 01, 2008 10:43 AM
 SSC-Dedicated Group: Administrators Last Login: Yesterday @ 5:42 PM Points: 32,275, Visits: 14,529
 It's likely you would get 10% back, but there's no guarantee. Follow me on Twitter: @way0utwestForum Etiquette: How to post data/code on a forum to get the best help
Post #526787
 Posted Tuesday, July 01, 2008 11:54 AM
 Hall of Fame Group: General Forum Members Last Login: Friday, December 06, 2013 11:21 AM Points: 3,924, Visits: 1,572
 Steve Jones - Editor (7/1/2008)It's likely you would get 10% back, but there's no guarantee.I tested it more than 10 times and it never returned 10% back approx. It mostly got 476 rows effected or 1454 or 1904 rows.But, I doubt, does any body uses this in day to day programming? If no, what's the actual use of this function -- "tablesample system". SQL DBA.
Post #526840
 Posted Tuesday, July 01, 2008 3:49 PM
 SSC-Enthusiastic Group: General Forum Members Last Login: Wednesday, December 04, 2013 4:24 AM Points: 191, Visits: 183
 It's used whenever you need to quickly get a representative sample of data from a large data set for testing purposes. It's especially useful when you're developing data warehouses as you can develop and test quickly without having to process a very large database every time you make an amendment to a report or cube.
Post #526939

 Permissions