What happens when the number of dirty pages grows beyond the number of pages in the buffer cache?

  • In this topic http://www.sqlservercentral.com/Forums/Topic1576549-3411-1.aspx you can see that I wrote some custom baseline performance tests.

    The first test just inserts 5,000,000 rows into one table using a loop, and every 10,000 rows I track lead time, page life expectancy, buffer cache hit ratio, the number of I/Os on the data and log files, etc.

    Now I have added a transaction around the loop.
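
    Simplified, the two variants of the test look roughly like this (just a sketch; dbo.TestTable and its columns are placeholders, and all the counter tracking is left out):

        -- dbo.TestTable (Id INT, Payload VARCHAR(2000)) is a placeholder test table

        -- Variant 1: the original loop; every INSERT is its own implicit (autocommit) transaction
        DECLARE @i INT = 1;
        WHILE @i <= 5000000
        BEGIN
            INSERT INTO dbo.TestTable (Id, Payload)
            VALUES (@i, REPLICATE('x', 2000));   -- a couple of KB per row
            SET @i += 1;
        END;

        -- Variant 2: the same loop wrapped in one explicit transaction
        BEGIN TRANSACTION;

        DECLARE @j INT = 1;
        WHILE @j <= 5000000
        BEGIN
            INSERT INTO dbo.TestTable (Id, Payload)
            VALUES (@j, REPLICATE('x', 2000));
            SET @j += 1;
        END;

        COMMIT TRANSACTION;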

    The consequence is that the lead time to insert the 5,000,000 rows decreases by a factor of 5.

    I guess that's because writing to the data file is postponed until all 5,000,000 rows have been written to the transaction log file.

    I also see that the log file grows much larger than during the previous tests, in which I did not use one big transaction (actually I used 5,000,000 implicit transactions).

    Now I wonder what happens internally. I insert about 13 GB of data, but the machine has only 6 GB of RAM and max server memory is set to 5 GB.

    I presume all rows are written to the log file and stored in memory as dirty pages. At a certain moment the buffer cache is full of dirty pages, but I guess SQL Server cannot write those pages to the data files since they are all part of a transaction that has not been committed yet. So my question is: what happens to the records still to be inserted once the buffer cache is filled with uncommitted dirty pages?

    Will SQL Server stop storing new records in RAM?

    Will it remove dirty pages from RAM to store the new dirty pages?

    Will it write the new records in tempdb?

    Will SQL Server use the OS paging file to store the data?

    Will SQL Server store the data in the data file with a flag indicating that the pages are not committed yet?

    I'm sure there is no real problem, since all 5,000,000 rows are in the transaction log file, which can be used to flush all data to the data file once the transaction is committed. But I wonder what SQL Server will be doing when the buffer cache is filled with dirty pages that have not been committed yet and therefore cannot be subject to the regular checkpoint/lazy writer system.
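
    (As an aside, for anyone who wants to watch the dirty pages pile up: a query along these lines against sys.dm_os_buffer_descriptors shows the dirty page count per database; just a rough sketch:)

        -- Dirty vs. total pages per database in the buffer pool
        -- (is_modified = 1 marks a page as dirty)
        SELECT  DB_NAME(database_id)                              AS database_name,
                SUM(CASE WHEN is_modified = 1 THEN 1 ELSE 0 END)  AS dirty_pages,
                COUNT(*)                                          AS total_pages,
                COUNT(*) * 8 / 1024                               AS buffer_mb
        FROM    sys.dm_os_buffer_descriptors
        GROUP BY database_id
        ORDER BY dirty_pages DESC;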

    Thx

    Kind regards

    Peter.

  • This may help: http://www.sqlskills.com/blogs/paul/how-do-checkpoints-work-and-what-gets-logged/

    Notably:

    All dirty data file pages for the database are written to disk (all pages that have changed in memory since they were read from disk or since the last checkpoint), regardless of the state of the transaction that made the change.

  • Thanks a lot for this link!

    The article contains some technical aspects I have never looked at so far, so I'll have to invest some time to completely understand the process, but it seems like a very good start to get there. 🙂

    Thx a lot,

    Peter.

  • peter.roothans (3/3/2015)


    I guess SQL Server can not write those pages to the data files since they are all part of a transaction that has not been committed yet.

    You guess wrong.

    SQL has no problems with writing pages affected by uncommitted transactions back to disk. Both checkpoint and lazy writer will do just that.
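
    Easy enough to see for yourself with something like this (just a sketch; dbo.SomeTable and SomeColumn are placeholders for any small test table):

        -- dbo.SomeTable / SomeColumn are placeholders for any small test table
        BEGIN TRANSACTION;

        UPDATE dbo.SomeTable SET SomeColumn = SomeColumn + 1;   -- dirty some pages, don't commit

        SELECT COUNT(*) AS dirty_pages                          -- a non-zero number of dirty pages
        FROM   sys.dm_os_buffer_descriptors
        WHERE  database_id = DB_ID() AND is_modified = 1;

        CHECKPOINT;   -- flushes those pages to the data file even though the transaction is open

        SELECT COUNT(*) AS dirty_pages_after_checkpoint         -- (close to) zero
        FROM   sys.dm_os_buffer_descriptors
        WHERE  database_id = DB_ID() AND is_modified = 1;

        ROLLBACK TRANSACTION;   -- the log is used to undo the changes already written to disk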

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hello Gail,

    Thanks for clarifying that. Now I wonder whether you can have two versions of the same page in a data file. In my insert test that will not be the case, since there is no original page in the mdf/ndf, but if I were to update 5,000,000 rows in one transaction, then I presume I would have a page in the mdf that contains the data as it was before the start of the transaction (a kind of snapshot) and a (dirty) page (or two, in case of a page split) in the mdf containing the data as it will be after the commit. If this guess is correct, I assume that after the commit or rollback the checkpoint mechanism decides which pages to keep (the original pages in case of a rollback) and which pages to remove (the original pages in case of a commit)??

    PS1:

    It turns out (on our hardware configuration) that using one big explicit transaction instead of 5,000,000 small implicit transactions improves the lead time of the test from about 40 minutes to 10 minutes. So, I guess, the I/O pattern on the data file(s) changes when using one big transaction? Perhaps I get fewer but larger I/Os, which might explain the performance impact?

    PS2:

    One thing I certainly see is that the log file grows to 17 GB when using one transaction, while it does not really grow when using 5,000,000 implicit transactions. That seems logical to me, since SQL Server must be able to roll back in case one of the inserts fails.
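
    (For reference, the log growth can be checked with something simple along these lines:)

        -- Log size and the percentage of the log in use, for every database
        DBCC SQLPERF (LOGSPACE);

        -- Data and log file sizes of the current database, in MB (size is in 8 KB pages)
        SELECT name, type_desc, size * 8 / 1024 AS size_mb
        FROM   sys.database_files;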

    Thanks a lot,

    Peter.

  • peter.roothans (3/3/2015)


    Hello Gail,

    Thanks for clarifying that. Now I wonder whether you can have two versions of the same page in a datafile?

    No. A page's page number * 8 KB is its offset within the file (page 100, for example, is always at byte offset 100 * 8,192 = 819,200 in its data file). Hence there's no way you could possibly have two of the same page, as they'd have to occupy the same location in the file.

    I presume I will have a page in the mdf that contains the data as it was before the start of the transaction (a kind of snapshot) and I will have a (dirty) page (or two in case of page split) ... in the mdf as the data will be after commit.

    No, you won't.

    Think for a moment about whether what you've suggested is at all practical. It would be hellishly complex, a pain to keep track of, and it would require the database to be twice or more the size it is now.

    If this guess is correct, I assume, after commit or rollback the checkpoint mechanism will decide which pages to keep (the original pages in case of rollback) and which page to remove (the original page in case of commit)??

    No, because there's only one copy of a page, so there's nothing to 'remove'.

    Checkpoint writes dirty pages to disk and, in simple recovery, runs log truncation. That's all. Lazy writer does a bit more as it also manages clock hands for caches, but in terms of IO it just writes dirty pages to disk in response to memory pressure.

    PS1:

    It turns out (on our hardware configuration) using one big explicit transaction instead of 5.000.000 small implicit transaction improves the lead time of the test from about 40minutes to 10minutes. So, I guess, the IO pattern on the data file(s) changes when using one big transaction? Perhaps I get less but larger IO's, which might explain the performance impact?

    No. Neither checkpoint nor lazy writer cares much about transactions, so neither option will affect writes to the data file.

    Changing from lots of little transactions to one large transaction will make the log writes more efficient though.
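
    If you want to see that difference, compare the write statistics of the log file before and after each run; something along these lines (sketch):

        -- Cumulative write statistics for the current database's log file
        -- (capture before and after each test run and compare the deltas;
        --  file_id 2 is assumed to be the only log file)
        SELECT  num_of_writes,
                num_of_bytes_written,
                num_of_bytes_written / NULLIF(num_of_writes, 0) AS avg_bytes_per_write
        FROM    sys.dm_io_virtual_file_stats(DB_ID(), 2);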

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hello Gail,

    Thank you for the extensive answer.

    It makes sense to me that 5,000,000 transactions in the log file instead of 1 transaction (for inserting 5,000,000 rows) would have an impact on the processing/management of the transaction log. I understand now that this might explain the different lead times. Great!!

    Regarding the dirty pages, I'm still confused. I understand that there can only be one version of a page in the data file and that it would be very difficult to manage multiple versions.

    I'll try to explain why I'm still confused. Consider a max memory setting of 800 KB, meaning you can only have 100 pages in the buffer cache. I write an update statement that updates 300 pages in one transaction. What happens now? The first 100 pages are loaded into RAM, the update is logged in the transaction log file, and the 100 pages in RAM become dirty. Now we have 100 pages in RAM that are different from the corresponding 100 pages on disk and are therefore called dirty. Correct? Now SQL Server has to load the next 100 pages into RAM to update pages 100 to 200. To do this, SQL Server has to free space in the buffer cache, but all pages in the cache are dirty AND not committed yet. You say the dirty pages will not be written as a second version ... but I would think SQL Server must store both versions of the page somewhere: the dirty version to enable the commit and the original version in case a rollback is required??? I'm probably making a logical mistake somewhere in my reasoning, but I don't see it at this moment.

    Kind regards,

    Peter.

  • Hello Gail,

    I could not get my mind off this topic. After my last reply, I kept thinking about the statement "checkpoints can write uncommitted dirty pages to disk" ... and, by doing so, overwriting the data as it was before the update, which is also the data the page should contain in case of a rollback.

    I've read your answer again and I think I understand now how it works ...

    Checkpoint or lazy writer flushes the dirty pages from the buffer cache to disk, even if the transaction that changed the pages has not been committed yet. I thought this would be a problem, but actually it's not, since the locks are held until the transaction is committed or rolled back. Other users will not be able to read that data (when using the read committed isolation level) until the transaction completes and the locks are released.

    So indeed there is only one version of the page left, the uncommitted version, but that's not a real problem. In case of a rollback, the original page content can be recovered by reading the page back into the buffer cache and rolling back the changes based on the information in the transaction log file. In my example (updating 2,400 KB with max memory at 800 KB) this might carry a performance penalty, since a lot of physical I/O has to be done to put the data back into its original state, but that is acceptable since we can consider a commit regular behaviour and a rollback exceptional behaviour. Correct?

    For a moment I thought the uncommitted dirty pages written to disk would need to be flagged as uncommitted, but actually that's not necessary either, since it's the lock manager that gives the required information to other sessions. So, when another session (using read committed) is able to execute a physical read of a page, that means the changes on that page have been committed (or rolled back) ... and when another session (using the "dangerous" read uncommitted isolation level) reads the page, it doesn't matter whether the page is in a state that can still be rolled back.
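
    So, in other words, something like this (a sketch with a placeholder table, run from two query windows):

        -- Session 1: modify data but leave the transaction open;
        -- CHECKPOINT may already have written the dirty pages to the data file
        -- (dbo.SomeTable / SomeColumn / Id are placeholders)
        BEGIN TRANSACTION;
        UPDATE dbo.SomeTable SET SomeColumn = SomeColumn + 1 WHERE Id = 1;
        CHECKPOINT;
        -- ... transaction stays open ...

        -- Session 2, default (locking) read committed: this SELECT is blocked
        -- until session 1 commits or rolls back
        SELECT SomeColumn FROM dbo.SomeTable WHERE Id = 1;

        -- Session 2, read uncommitted: returns the uncommitted value straight away
        SELECT SomeColumn FROM dbo.SomeTable WITH (NOLOCK) WHERE Id = 1;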

    Please let me know if I made wrong assumptions again in this hypothesis.

    Kind regards,

    Peter.

  • Sounds like a pretty good summary of the various situations to me.

  • Thanks a lot. Comprehension is vital for me to draw the correct conclusions after running my baseline tests on different setups. I conclude that SQL Server is cool in the way it handles transactions 🙂

    Kind regards

    Peter.
