September 7, 2007 at 8:03 am
All,
I just came across a reference to TableSample and I wanted to verify something.
Correct me if I'm wrong, but is the only real difference between TableSample and Top X Percent the fact that TableSample is more random and will pick from anywhere in the table regardless of order whereas Top X Percent will only get the top records based on order?
Did that question even make sense?
I'm coming up with interesting results with TableSample. Ran it twice, once with and once without the Percent keyword just to see if it would take a literal number. Just as the book says, it returned a different record count each time (12525 rows the first time, 11832 rows the second time) regardless of the keyword. So, it definitely treats a literal number as a percentage... FUN...
September 7, 2007 at 8:17 am
If you specify TABLESAMPLE(50000 ROWS) it will first convert this to a percentage based on the estimate of the table size, and then sample the table. The tablesample is also getting back records on a per page basis, so much depends on how your pages are organised. If you ahve a varchar(6000) column, and sometimes it is filled so much that only one row fits a page, sometimes it is null, and you can have hundreds of rows on a database page, the result of table sample, in terms of number of rows will be much more random. Top is much more exact. (I also making the assumption that you are interested in selecting from a single table only).
Regards,
Andras
September 7, 2007 at 8:25 am
Actually, the reason I tested TableSample without the Percent was to see if it did rows and it definitely does not. I don't even think it converts to a percentage so much as assumes that any number entered is a percentage (I used 10).
In fact, as I was writing the above line, I tested TableSample(50000) and got the following error:
Msg 476, Level 15, State 0, Line 2
Invalid PERCENT tablesample size "50000.000000" for table "MyTable". The PERCENT tablesample size must be between 0 and 100.
Which just goes to prove there is no "conversion". It appears that the PERCENT keyword is simply optional.
So, in an effort to understand how this wonderful little feature works, I'm wondering, does it read all it's samples from the same page or does it actually pick different records from different pages? I'm assuming the same page since the PK identity field in random sample I got earlier was all sequential...
Other thoughts?
September 7, 2007 at 8:38 am
It seems to read full pages, so it includes everything on those pages. I've created a small table with 10000 rows, all of them tiny (so about max 600 can fit a page), TABLESAMPLE with a small percentage was picking individual pages, and returning the exact number of records on those pages.
Andras
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply