DBCC Internals

, 2007-06-05


- needs consistent view of the db. Needs to do this without locking. In

SQL Server 2000 log analysis is used. Almost everything is logged and

once the database is read, the log is read and inside checkdb, recovery

is performed to reconcile the differences. If there was a lot of

activity, this could take a long time. While the data portion is

multi-threaded, the t-log analysis is single threaded. There are

problems because not everything is logged. So it guesses in some cases.

Could cause false positives.

In SQL Server 2005, this was changed and they use COWs (copy on write).

An internal snapshot is generated and used for a consistent view. Still

with a high load, there are problems (just like a snapshot). Tempdb is

not used (nor can you create a snapshot on tempdb).

- no online checks on tempdb. recovery cannot be run and a snapshot cannot be created.

- only works if you have databases on NTFS.

- internal snapshot is an alternate file stream on each db file, but it will not show up in explorer.

- any space used in the alternate stream is returned to the drive after checkdb.

- If you run checkdb on a snapshot (if you're low on space), no alternate stream/snapshot is created.

- What does it do?

  - primitive checks on critical tables (5 tables)

    - works by getting to the leaf level of each of

these tables. Read every page, no cycles in the linkages, page audit of

every page.

    - if there's a  problem it's done.

  - allocation checks (= checkalloc) - checks consistency of

allocation maps. Ensures that the maps are accurately stored. IAM, GAM,

etc. ensure no two tables have the same map set, ensure that the GAM

and the IAM match up correctly. Check PFS pages - is page allocated, is

allocation page, etc.

  - so far this is fast

  - If running repair, it's run on allocation problems.

  - logical system table chceks and repairs

    - eq of checktable on these. Works in parallel

     - check indexes are correct, b-trees are correct, page audit, etc.

     - If page checksums are turned on, this is checked.

  - logical checks of user tables (and repairs)

    - this runs in parallel as well

  - service broker checks

  - metadata checks (check catalog, wasn't included in ss2k)

  - indexed views and XML index checks (and repairs). Determines

if the content of the indexed view matches the definition of the view.

Can determine if there are extra views or missing rows. Makes new index

to check. Similar thing with XML.

You can check percent complete column to figure out which stage and progress in the DMV (dm_exec_requests).


- only first two checks

- then reads every page to do checksum checks and page audits.

Leaf Index Checks

- Page audit (buffer pool does torn page or checksum checks). Can eat a checksum failure and continue on. Reports the issues.

  - look at page header, check that data is what is expected for that page.

- Record audit - gross audit of structure of all records. Are offsets correctly set.

- B-tree check

  - for index and leaf pages.

  - check linkages

  - are key ranges correct?

  - partitioning - is the record correct for the partitioning function.

  - key ordering - records on the page are not stored in order.

They are stored where they fit, so there's a key ordering store for the

records on a page.

- per record checks

  - complex columns - checks the pointer to the values. ensures that what's stored in the row is what's stored elsewhere.

  - computed columns - make sure persisted values are correct according to the formula.

  - Data purity - not all types have range checking. Datetime has

minutes after midnight. Must be < 1440, but the stored value could

be higher in theory. So checks are performed to be sure the stored

value is within the range.

  - 1 to 1 mapping - ensures that a nonclustered index maps to one base table row.

- Page counts

  - counts in header are correct.

No blocking of the table on SS2k and beyond.

How often do you run this? It depends.

checks could approach n<sup>2</sup> complexity. Fastest way

is N * log(n). Rather than following links, as pages are read, various

facts are generated about each page.

 - A (actual) - we read A

 - A (sinbling) b - who does A point to

 - P (parent) A - when A is read.

 Where it's an end page, nulls are generated, For parent pages, a

psuedo-parent is generated. A, B, O, are really the page IDs, index

IDs, etc. these are sorted and then checked. Every node must have 3

facts (can be null) or there's a missing link. If there are too many

items, like 2 parents, then there's a linkage issue. This is done so

pages can be read in allocation order, which is the fastest way to read

them. Other checks use a similar method with different facts.

 Index matching. It's getting complex, but the idea is that a

bitmap is generated. As the table (heap or clustered index), is read, a

hash is done and a bit is flipped. As the index is read, it's hashsed

and it should map up to the same value and the bit is flipped back. So

as long as a bit is flipped twice, it's good. so it digs through and

can reuser the bitmap for multiple tables. As long as everything flips

twice, it's ok. If at the end they do find a set bit  a "deep

drive" is performed, which is very time consuming to find the

corruption. This first check is 30% of the resources of the deep dive

and is fast. So if checkdb starts to take a long time, it's bad.

 Before SP2, you couldn't tell a deep-dive was being run. Post SP2, an error message is now logged.


- purpose is to make the db consistent. NOT protect data. Data is not

necessarily deleted, but the repairs are efficient, fast repairs and

data can be lost.

- mor eintrusive repairs ar edone first. Like allocation map for an

index repaired before a missing row. The more drastic ones often fix

less drastic ones.

- Not everything is repaired, or known. You have to check.

- Not feasible to make it online. Too complex.

CHECKDB doesn't run faster after index rebuilds.

If you run CHECKDB and find errors, does it run again with the repair

option? Yes, it runs twice. Because something could have changed.

Space required ro run checkdb.

 - facts require more space, so tempdb space is needed to store

facts. Run with_estimate_only, to get an idea of the space for facts.

 can internally do single index row inserts and deletes.

 If a value in a row is corrupt, the row can be deleted.

 If a pointer is corrupt, the entire page can be deleted.

 to be safe, you can start a transaction, run repair, and if data is deleted, you can rollback if needed.

 checkdb does not respect key relationships (PK, FK). It is

structurally complete, but you could have orphaned rows. Need to make


 Some things are unrepairable

  - system table clustered index problems

  - PFS pages

  - Data purity errors.

  Beware of third party kernel filters not working with CHECKDB that doesn't handle alternate streams.

  Not recommended to have anti-virus on SQL Server because it can interfere with IOs





Related content

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.


1,567 reads

Networking - Part 4

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I'd like to talk about social networking. We'll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let...


1,530 reads

Speaking at Community Events - More Thoughts

Last week I posted Speaking at Community Events - Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I've got a few more thoughts on the topic this week, and I look forward to your comments.


360 reads