• Alright, life got busy, so it took me much longer to post the demo than I'd wanted.

    At any rate, here's a demo of how the process works. As long as your test system has at least 4 GB of RAM, you'll be able to run it as is.

    Hopefully actually seeing the code will make it clear what I was failing to explain earlier 🙂

    In one query window, paste and step through this:

    --First set max server memory to 2000 MB.
    --('max server memory' is an advanced option, so enable advanced options first if needed.)
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'max server memory', 2000;
    RECONFIGURE;
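    --(An optional check I'm adding here, not part of the original steps:
    --confirm the new setting is in effect.)
    SELECT name, value_in_use
    FROM sys.configurations
    WHERE name = 'max server memory (MB)';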

    --Now create the table we will use to overrun
    --the buffer pool. I'm using nchar(3000) since
    --that forces the table to use one page for each
    --row, which makes sizing the table easier.
    CREATE TABLE BufferTest
    (SomeData nchar(3000));

    --Now I populate this with 2048 MB of data, which
    --will overrun the buffer pool since SQL Server
    --won't even be able to use all of the 2000 MB max server
    --memory for the buffer pool.
    --(The CTE has 8 rows, so the six-way self cross join below
    --yields 8^6 = 262,144 rows = 262,144 pages = 2048 MB.)
    WITH CTE (SomeData)
    AS
    (
    SELECT 'a'
    UNION ALL
    SELECT 'a'
    UNION ALL
    SELECT 'a'
    UNION ALL
    SELECT 'a'
    UNION ALL
    SELECT 'a'
    UNION ALL
    SELECT 'a'
    UNION ALL
    SELECT 'a'
    UNION ALL
    SELECT 'a'
    )
    INSERT INTO BufferTest
    SELECT CTE1.SomeData
    FROM CTE AS CTE1
    CROSS JOIN CTE AS CTE2
    CROSS JOIN CTE AS CTE3
    CROSS JOIN CTE AS CTE4
    CROSS JOIN CTE AS CTE5
    CROSS JOIN CTE AS CTE6;

    --Confirm that the row count matches the number
    --of data pages for the table, and that this
    --is 2048 MB.
    SELECT p.rows,
           au.data_pages,
           au.data_pages/128 AS DataMB
    FROM sys.tables AS t
    INNER JOIN sys.indexes AS i ON t.object_id=i.object_id
    INNER JOIN sys.partitions AS p ON i.object_id=p.object_id AND i.index_id=p.index_id
    INNER JOIN sys.allocation_units AS au ON au.container_id=p.partition_id
    WHERE t.name='BufferTest';

    --Now let's make sure that the buffer pool
    --is indeed holding less than 2000 MB. On
    --my test system, it's holding 1709 MB.
    --That's on 2012, where max server memory
    --covers more than just the buffer pool. On
    --pre-2012 instances it'll be closer to 2000,
    --but should still be less.
    SELECT COUNT(*)/128 AS BufferPoolMB
    FROM master.sys.dm_os_buffer_descriptors;
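    --(A side query I'm adding, not part of the original demo: on 2012+ you can
    --see where the rest of max server memory goes by summing the memory clerks.
    --Pre-2012 the columns are single_pages_kb and multi_pages_kb rather than pages_kb.)
    SELECT TOP (10) type, SUM(pages_kb)/1024 AS ClerkMB
    FROM sys.dm_os_memory_clerks
    GROUP BY type
    ORDER BY ClerkMB DESC;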

    --Now we'll start a process to continually
    --scan the table. Since the table is bigger
    --than what the buffer pool can hold, this
    --will generate quite a bit of IO.
    DECLARE @SomeData nchar(3000)
    WHILE 1=1
    BEGIN
        SELECT @SomeData=SomeData
        FROM BufferTest
    END

    --Here open perfmon and watch read bytes/sec
    --for the drive with this DB's data file.
    --On my test system this generates about
    --100 MB/sec of reads after it has churned
    --through the buffer pool.
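    --(If perfmon isn't handy, here's a rough T-SQL alternative I'm adding; it
    --isn't part of the original demo. Run it from a third session, since this
    --one is busy looping: sample the cumulative read bytes for this database's
    --files a few seconds apart and difference the two samples.)
    SELECT DB_NAME(database_id) AS DatabaseName,
           file_id,
           num_of_bytes_read
    FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL);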

    --Now we can switch to our other query window,
    --leaving the above process running.

    Now in another query window, paste and step through this:

    --This creates a table that will store the identifying information
    --for each page that we record as having been in the buffer pool.
    --A side note about this:
    --I don't create any indexes on this table, as the most efficient
    --query plan for the query that uses it just ends up being a hash
    --join using all the columns as a key. A merge join would actually be
    --more efficient if both inputs were already sorted, but since the other
    --"table" is sys.dm_os_buffer_descriptors, we can never achieve that.
    CREATE TABLE [dbo].[BufferTracking](
        [database_id] [int] NULL,
        [file_id] [int] NULL,
        [page_id] [int] NULL
    ) ON [PRIMARY]
    GO

    --This checks the buffer pool for any pages we have not yet
    --recorded as having been in the buffer pool. Ordinarily
    --I would run this as a scheduled job that runs during
    --business hours and check the results after a day or two
    --(there's a rough sketch of that after the loop below),
    --but here I'm just going to run it in a loop using a
    --WAITFOR DELAY until it doesn't find any new pages
    --in the buffer pool. On my test system, that takes about 90 seconds.
    DECLARE @Rows int=1
    WHILE @Rows>0
    BEGIN
        INSERT INTO BufferTracking (database_id,file_id,page_id)
        SELECT
            BUF.database_id,
            BUF.file_id,
            BUF.page_id
        FROM master.sys.dm_os_buffer_descriptors BUF
        WHERE NOT EXISTS (
            SELECT 1
            FROM BufferTracking BT
            WHERE
                BUF.database_id=BT.database_id AND
                BUF.file_id=BT.file_id AND
                BUF.page_id=BT.page_id)
        OPTION (MAXDOP 1)

        SET @Rows=@@ROWCOUNT

        WAITFOR DELAY '00:00:05'
    END
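    --(A hedged sketch of the scheduled-job variant mentioned above. It's my
    --addition rather than something you need for this demo, so it's commented
    --out. It assumes the INSERT in the loop has been wrapped in a procedure
    --named dbo.RecordBufferPages in a database named BufferDB, both hypothetical.)
    /*
    EXEC msdb.dbo.sp_add_job @job_name = N'Track buffer pool pages';
    EXEC msdb.dbo.sp_add_jobstep @job_name = N'Track buffer pool pages',
        @step_name = N'Record new pages',
        @subsystem = N'TSQL',
        @database_name = N'BufferDB',
        @command = N'EXEC dbo.RecordBufferPages;';
    EXEC msdb.dbo.sp_add_jobschedule @job_name = N'Track buffer pool pages',
        @name = N'Business hours, every 5 minutes',
        @freq_type = 4,                --daily
        @freq_interval = 1,
        @freq_subday_type = 4,         --units of minutes
        @freq_subday_interval = 5,     --every 5 minutes
        @active_start_time = 90000,    --09:00:00
        @active_end_time = 170000;     --17:00:00
    EXEC msdb.dbo.sp_add_jobserver @job_name = N'Track buffer pool pages';
    */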

    --Now let's see how many pages we saw pass
    --through the buffer pool, compare that to how much
    --the buffer pool can currently hold, convert that
    --to MB, and add that much to max server memory.
    --On my system, that comes out to 399 MB.
    DECLARE @CurrentMB int
    DECLARE @NeededMB int

    SELECT @CurrentMB=COUNT(*)/128
    FROM master.sys.dm_os_buffer_descriptors

    SELECT @NeededMB=COUNT(*)/128
    FROM BufferTracking

    SELECT @NeededMB-@CurrentMB AS AdditionalMB
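    --(A self-contained variant I'm adding, not part of the original process:
    --derive the suggested new max server memory directly, reading the current
    --setting from sys.configurations rather than remembering it.)
    DECLARE @CurrentMaxMB int, @BufferPoolMB int, @TrackedMB int

    SELECT @CurrentMaxMB=CONVERT(int, value_in_use)
    FROM sys.configurations
    WHERE name='max server memory (MB)'

    SELECT @BufferPoolMB=COUNT(*)/128
    FROM master.sys.dm_os_buffer_descriptors

    SELECT @TrackedMB=COUNT(*)/128
    FROM BufferTracking

    SELECT @CurrentMaxMB + (@TrackedMB-@BufferPoolMB) AS SuggestedMaxServerMemoryMB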

    --I'll reconfigure max server memory to be its current
    --value plus the amount indicated by that query.
    --In this case, that would be 2000+399=2399 MB.
    --I'll round to 2400 MB.
    EXEC sp_configure 'max server memory', 2400;
    RECONFIGURE;

    --Within a few seconds, the physical IO will drop
    --to nothing. Success!

    --On my 2008 R2 test instance, the result was even
    --more dramatic. The test still generated about 100
    --MB/sec in IO, but only 73 MB was indicated by the
    --process, and after adding that 73 MB, physical IO
    --dropped from 100 MB/sec to 0.
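    --(Before cleaning up, one more query I'm adding that isn't part of the
    --original process: break the tracked pages down by database to see where
    --the working set actually lives. In a real multi-database workload this
    --shows which databases are driving the buffer pool churn.)
    SELECT DB_NAME(database_id) AS DatabaseName,
           COUNT(*)/128 AS TrackedMB
    FROM BufferTracking
    GROUP BY database_id
    ORDER BY TrackedMB DESC;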

    --Now we can stop the read-generating process in the
    --other window and then clean up our tables.
    DROP TABLE BufferTest
    DROP TABLE BufferTracking

    Now, this case was obviously a very simple one, and since only one table was involved we could have guessed a good value just based on that table's size.

    The nice thing is that this process is just as accurate in more complicated real-world situations where many tables are frequently accessed, often only partially, and so on.

    One caveat is that the time taken by the insert increases significantly with the size of the buffer pool, and 512 GB is the largest system I've used this on, so you're right on the edge of my experience. On that system, each run of the INSERT to populate the tracking table took about 5 minutes.
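    If you want to gauge that cost before turning this loose on a big instance, one way (my addition, not part of the original process) is to time a single tracking pass by hand: wrap one run of the INSERT in SET STATISTICS TIME to see both the CPU time and the elapsed time it needs.

    SET STATISTICS TIME ON;

    --One pass of the tracking insert, same shape as the loop above.
    INSERT INTO BufferTracking (database_id,file_id,page_id)
    SELECT BUF.database_id, BUF.file_id, BUF.page_id
    FROM master.sys.dm_os_buffer_descriptors BUF
    WHERE NOT EXISTS (
        SELECT 1
        FROM BufferTracking BT
        WHERE BUF.database_id=BT.database_id AND
              BUF.file_id=BT.file_id AND
              BUF.page_id=BT.page_id)
    OPTION (MAXDOP 1);

    SET STATISTICS TIME OFF;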

    Relatedly, while this process is extremely accurate, as with so many things, we don't get that for free. The time used by the process is almost exclusively CPU time, so ideally there would be enough CPU headroom on a system to run this with minimal impact.

    Because of that, my use of this process is usually limited to cases where one or more of the following are true:

    1) The client cannot add memory without ironclad justification (e.g., the environment is virtual, adding significant memory to the instance would require physically adding memory to the hosts, and that maintenance window is hard to come by)

    2) The amount of memory on the server is low enough that the process has almost zero impact (for most servers that tipping point is somewhere around 32 GB; beyond that it starts taking some time to run)

    3) The server has ample CPU headroom to accommodate the process

    For the one case where I used it on an instance with 512 GB, the cost of having the process run was offset by the cost of the alternative, which was going to be either maintenance to add memory to the underlying hosts so significant memory could be added to the instance, or costly upgrades to the storage layer (and yes, it had a happy ending; as in the demo, it turned out a fairly small bump in max server memory was sufficient to reduce IO significantly).

    I know this was a long post, but hopefully this clears up the confusion my previous explanations couldn't.

    Cheers!