SQLServerCentral Article

Design Oversight - Preliminary Review

,

Design Oversight - Preliminary Review

We all know what the ideal application design environment is for building a

database back-end: an experienced DBA takes inputs from end users and developers

and creates the database design in order to support the application being

developed.  Stored procedures, tables, views, indexes, the works, are all

developed by DBAs working closely with the programmers writing the

application.  Database best practices are the rule, not the

exception. 

But in reality, we don't get the opportunity to do application design like

this very often.  The ideal is quickly shoved aside and often developers

are doing their own design and building their own stored procedures and making

decisions on constraints and indexing in order to get an application out the

door the day before yesterday.  Meanwhile, the DBA is busy maintaining the

last few applications that went in and ensuring that the database servers are

performing as they should.  Every so often, the DBA has the time to take a

close look at what the developers are doing.  Usually, this is right before

some milestone when the next piece of the application is supposed to be done and

a demo to management is expected.  The DBA looks at the number of tables

and realizes, "I don't even know where to begin!"  Sound

familiar?

The problem of where to begin is a common one.  If we were to try and

run an sp_help against every table, we might be at it forever.  And

that's not even looking at the stored procedures.  So we need to do some

sort of preliminary review.  Basically, we need some tools to be our

"smoke detector" to spot a potential fire.   Here are some

things we can look at:

  • Tables without indexing
  • Tables without primary keys (a subset of the previous one)
  • Tables without foreign keys
  • Stored procedures using cursors

Keep in mind, this list is just a smoke detector type of test.  Its

purpose is to give some warning if something is suspicious, just like the smoke

detector in the hall goes off when we forget to turn on the fan over the

stove.  There isn't necessarily a problem, but we are alerted that there

might be.   And our smoke detector isn't going to be able to alert us

when the banks of the river overflow and our house floods.  So it's not

going to flag everything.  But it does give us that crucial "where to

start." 

Here's a warning: the queries I'll present use the system tables (SQL Server

versions 7.0 and 2000). 

Microsoft warns that it has the right to make changes to the system tables

without notice.  Microsoft recommends using INFORMATION_SCHEMA views

instead, but they don't provide all the information we need or they are unwieldy

for our particular purposes.  As a result, we need to

go to the system tables.  Should Microsoft make a change in a

Service Pack or next release, the queries may have to be modified accordingly.  

Tables Without Indexing

Tables without indexing are a concern for obvious reasons.  One of the

things to remember is that primary keys are instituted with indexes, so if a

table doesn't have an index, it doesn't have a primary key.  What we're

concerned with here, however, is more than just tables without primary

keys.  Tables without any indexes at all are generally warning signs that

the developers don't understand the purpose of indexes and thus have not

implemented them to aid performance.  There are exceptions to this

rule.  For example, if a table exists as a heap table simply for the

purpose of reporting or the like, we may choose not to place an index on the

table (including a primary key, simply because an index is created) in order to

allow writes to that table to be as quick as possible.  

Now in the script to follow, there is a derived table from the sysindexes

table.  Basically, we want to create a derived table that doesn't include

an indid of either 0 or 255.  For indid = 0 refers to the table

itself.  An indid = 255 refers to a text, ntext, or image column for a

particular table.  We're looking for clustered and non-clustered indexes,

so we need to look at indid between 1 and 254, 1 being a clustered index and

> 1 being a non-clustered index.  I've taken the liberty of adding an

ORDER BY simply for aesthetic reasons:

/* Script to check for tables with no indexes. */

SELECT USER_NAME(so.uid) Owner, so.name [Table]

FROM sysobjects so

LEFT JOIN (SELECT id FROM sysindexes 

WHERE indid BETWEEN 1 AND 254) si

ON so.id = si.id

WHERE so.type = 'U'

AND si.id IS NULL

ORDER BY Owner, [Table]

Tables Without Primary Keys

Since primary keys are established using indexes, if we look for tables

without primary keys, we're actually looking at a subset of tables without

indexes.  However, it is often beneficial to lock in and point out

specifically what tables don't have primary keys, especially when talking with

developers.  Chances are if we say, "These are the tables without

primary keys" we'll get a better understanding and response than if we say,

"These are the tables without indexes," simply because a developer may

not realize the connection between primary keys and indexes.  So while the

previous query returns the same information and more as the following query, I

provide it for the sake of isolating those tables without primary keys.

We'll be using the sysconstraints table here, because that lets us look

at the status column.  Now while the status column is int, we'll need to

perform a bitwise operation to get at the primary key (we'll do the same for

foreign keys in a bit). If we take a look at Books Online for sysconstraints, we'll

see the following information for the status field:

ValueMeaning
1Primary Key
2Unique Key
3Foreign Key
4CHECK constraint
5DEFAULT constraint
16Column-level constraint
32Table-level constraint

We'll have to use the bitwise and (&) operator, performing an & 7 which will only return

1 for a

given status field if the first bit is populated and the second and third bits

aren't, indicating we've got a primary

key on that table.  The reason we need to do an & 7 is pretty straight

forward if we look at the numbers in binary notation: 001 = 1 and that's a

primary key,  011 = 3 and that's a foreign key, 101 = 5 and that's a

DEFAULT constraint.  In all three cases we're seeing the first bit

populated.  If we do an & 7, and the second or third bits are

populated, we'll either get back 3 or 5, as the case may be.  If only the

first bit is populated, then we get back a 1.  And that's what we want.

For this particular query, I'm ensuring dtproperties

does not appear, which is considered a user table, even though it's for storing

database diagrams.  If you have replication already established (though

this isn't likely in development), you can filter out those tables as well,

because they are marked as user tables.

/* Script to check for tables with no primary keys. */

SELECT USER_NAME(so.uid) Owner, so.name [Table]

FROM sysobjects so

LEFT JOIN (SELECT id FROM sysconstraints

WHERE status & 7 = 1) sc

ON so.id = sc.id

WHERE so.type = 'U'

AND sc.id IS NULL

AND so.name <> 'dtproperties'

ORDER BY Owner, [Table]

Tables Without Foreign Keys

Tables without foreign keys can be an indicator that the developers aren't

taking a close look at referential integrity.  In most any database there

will be tables without foreign keys (for example, lookup tables), but these

would normally be very few in number unless because of business rules we're

having to resort to other methods (such as triggers) to enforce integrity.  If we're talking lookup tables, we

can have some sort of naming convention which identifies a lookup table clearly

as such and we can filter out those.  In any case, we'll be hitting the

sysconstraints table again, except this time we'll be looking for the value 3, which means we'll be doing an & operation using

3.  Since 3 in binary is 011 and 5 is 101, we can simply & 3 here,

since after 5 we jump to 16 (10000) and then 32 (100000).  The & 7 will

work, too, but & 3 is sufficient.

/* Script to check for tables with no foreign keys. */

SELECT USER_NAME(so.uid) Owner, so.name [Table]

FROM sysobjects so

LEFT JOIN (SELECT id FROM sysconstraints

WHERE status & 3 = 3) sc

ON so.id = sc.id

WHERE so.type = 'U'

AND sc.id IS NULL

AND so.name <> 'dtproperties'

ORDER BY Owner, [Table]

Stored Procedures Using Cursors

Programmers tend to like cursors, because cursors fit with the procedural

paradigm of most development languages.  It can often be a great learning

curve to get out of thinking along procedural lines and into a set-based

paradigm.  We know that SQL Server has support for cursors, but these are

best regarded as a last resort.  There will be cases where cursors are the

only option.  But if we pull a list of stored procedures using cursors, we

can look through them to verify that what the developer is trying to do can't be

done any other way.  For this effort, we'll have to hit against the

syscomments table, and we'll have to use the LIKE operator.  So this

particular query can consume some resources.  It should be noted that

the way this query is written, we'll also capture any triggers using cursors as

well, in case our developers are using them.

If developers have used the

WITH ENCRYPTION, it won't do any good looking at the syscomments table for that

particular stored procedure.  So we'll filter out stored procedures that

use such.  Also, since there could be multiple rows in the syscomments for

a given stored procedure definition, we'll use a select distinct since we're

interested in the owner and the name the stored procedure and only need to see

that once.  So of all the

"tools" presented, this query is certainly the most resource intensive.

/* script to check for stored procedures using cursors */

SELECT DISTINCT USER_NAME(so.uid) Owner, so.name [Stored Procedure]

FROM sysobjects so JOIN

(SELECT id FROM syscomments

WHERE encrypted = 0

AND [text] LIKE '%cursor%') sc ON sc.id = so.id

ORDER BY Owner, [Stored Procedure]

 Concluding Remarks

As I stated in the beginning, these queries point out potential problem areas

within the database portion of an application design.  They won't spot

everything. They may flag things that are all right.  However, they do

give us a starting point if we've not had the optimal amount of oversight for an

application development project.  By the way, they are also good if we're

looking at how a third party app implements a database.  We may not have

any control over the app  (and if they've chosen the encrypt their stored

procedures the last tool won't do us any good), but at least we'll be

forewarned.   Hopefully these queries will prove useful in your set of

tools.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating