Blog Post

Primary Key Discovery


As a part of my DB documentation process for a large database, I needed a method to quickly show me all of the Primary Keys in the database.  Not only did I want the PKs, but I wanted to know the DataType for each of the Primary Keys.

I began by fiddling with some queries first and then by searching on Google for something that might be fitting.  I found a couple of queries (one which was very similar to what I already had).  The queries used different views and thus sparked a question for me – which method was better?

So let’s start with the base query.

From this query, I can get the Column Name, Constraint Name (PK) and the Table to which it belongs.

Execution Plan for that Query:

With the Following IO Stats:

And here is the alternative base query:

The Execution Plan is much larger, and thus I have added the XML Plan for download via - Alt BaseQuery ExecutionPlan.

And the IO stats:

From both queries, we can see that they are returning the same data, same number of rows, and for this set particular run – both run in under 1 second (109 ms for Query 1 and 156 ms for Query 2).  For this base query both can be deemed acceptable.  However, I want to emphasize that Query 2 took nearly 50% longer to process in this test.

Now that we have a baseline, I need to be able to retrieve the data type for each of the columns in each primary key.  This was a part of the initial self-prescribed requirements.  Adding in the additional components for each query is a little different.  Once added in, I will be doing the final comparison in a single batch to illustrate a little more poignantly the differences in processing between these queries.

Query 1 needs the following additions (lines 3, and 9-13)

Query 2 needs the following changes (Lines 18, and 22-26)

When I run these in tandem, the query plan for Query 1 changes somewhat, and the Plan for Query 2 a little more.  Comparing the two, I see that Query 2 has a cost of 72% for the batch, and Query 1 has the remaining 28% for the batch.

Here is the XML Plan for both for download: Comparison_Plans.

Now the IO Stats:

Notice the one Table that is substantially and surprisingly different between the two sets.  In Query 1, sysiscols has 2119 scans and 4243 logical reads.  Both are substantially higher than Query 2 at 229 Scans and 459 logical reads.  Despite this information, we can find that Query 1 executes (during these tests) better than Query2.  The result of the testing could change depending on information used in testing.  If I were to just consider the IO stats, I may well opt for Query 2.  However, when I consider all factors (used in these tests), my results tell me to use Query1.

Download Scripts For this Article here.


You rated this post out of 5. Change rating




You rated this post out of 5. Change rating