SQL newbie looking for Indexing advise

  • Greetings SCC members! Allow me to preface by stating I am VERY new to db design and tuning. I am a ColdFusion developer and recently I've been asked to design a complete database from scratch. I've built the necessary tables and populated some test data. There are roughly 20 tables in the design with very strict PK/FK referential integrity between them. Every table has a Primary Key with a Clustered Index on the Primary Key. Potentially, there will be 100's of thousands of records in the main party_relationship table. My question, is how do I determine which column(s) should be indexed?

    One of the more heavily used queries is attached (it's a doosie!).

    P.S. - This is my first time posting on SCC forums. If I left something out, please let me know.

  • Good grief! That is a doosie 🙂

    Anyhow, I would suggest that you index the columns used in your join conditions and those columns used in your where clause.

    If you have anything other than SQL Express you could run the query through the database query tuning advisor - it will recommend required indices. You can let it know the expected number of rows, etc so that your (possibly) small amount of data during development isn't what's used for estimating work done in a production environment.

    In any edition of MS SQL 2005 you can run the query in the sql server management studio with the graphical execution plan displayed. You want to examine that plan and ensure you eliminate table scans or clustered index scans (although in the latter case if you were after a range of records such as dates and you had clustered on that then a scan may be appropriate). There are a few good articles on http://www.sql-server-performance.com about this

  • Not only is it a "doosie", but it's darned near a duplicate...

    http://www.sqlservercentral.com/Forums/Topic435994-361-1.aspx

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • 1) Jeff is right - please avoid posting to multiple forums on essentially the same topic.

    2) Creating an index strategy for an entire database application cannot be done via a few posts on a forum. Nor can it be done (correctly) without production-level amounts of data. You need to know your data, data value distributions, data access patterns, level of service/response needs, hardware limitations, etc, etc. If you have need of a performant system, you would do best to hire a professional to help you with the design, perhaps some code (if you chose to use stored procedures for data access) and tuning. As a newbie I am sorry to say it but for anything other than a truly simple system you will wind up with a mess and have MUCH hastle later on to get things fixed after the system has been live for some period. It is truly impossible for a non-trained/experienced developer to create a well-designed database application. I suggest some heavy study, coupled with some mentoring, to get you quickly up to speed.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • In addition to above advice:

    - Index primary keys and foreign keys.

    - Do not assume the primary key needs to be clustered. Look at the queries that you use and see how many require a range, usually data related queries. If you have lots of those, make that index clustered. Watch out for hot spots as you insert, so be aware of how much change there is.

    - Every database I've been asked to design says design for 100,000+ rows. Most never get there. Don't panic.

    - Look at your WHERE clauses, maybe see which queries are most frequently run, and then index those.

    - Considering covering queries if you have the same columns returned a lot (and without other columns).

    - Read lots, keep track of metrics and make changes here and there, not a bunch at once. If you don't know if before and after are different and things "seem faster", then you haven't approached this well. You aren't learning and you might be masking other issues.

    - Ask questions, but ask them one at a time. Focus on one area, one table, one set of queries, etc. This is a constant, ongoing process.

  • I'll go a little further to help practicing on indexes... here's how to build a million row table of test data... the details are in the comments...

    --===== Create and populate a 1,000,000 row test table.

    -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers

    -- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers

    -- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings

    -- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers

    -- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times

    -- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'

    -- for all rows.

    -- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)

    -- Jeff Moden

    SELECT TOP 1000000

    RowNum = IDENTITY(INT,1,1),

    SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,

    SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65),

    SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),

    SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),

    SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),

    SomeHex12 = RIGHT(NEWID(),12)

    INTO dbo.JBMTest

    FROM Master.dbo.SysColumns t1,

    Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

    --===== A table is not properly formed unless a Primary Key has been assigned

    ALTER TABLE dbo.JBMTest

    ADD PRIMARY KEY CLUSTERED (RowNum)

    Takes about a minute and a half to run the first time. Subsequent test runs take less than a minute because of caching and development of statistics.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • On another note....

    Sure you should try to determine what indexes are needed up front however, index tuning is an on going task for the skilled DBA and Architect.

    Dont feel that you have to nail this right out of the gates.

    Over time, you'll likely see patterns develop that you hadnt intended and you'll find that your preliminary guesses were a bit off anyways.

    Certainly you'll need to index join columns.

    Certainly you'll need "A" clustered index (Not necessarily on the PKey)

    As you continue to monitor you'll likely find it necessary to add some indexes or perhaps even drop some.

    hope this helps.

    Greg Jackson

    Gregory A Jackson MBA, CSM

  • you might also want to reviit your data structures, I generally take a pretty dim view of massive numbers of joins in queries, you just know as data grows this will be a problem and the likely to participate in a blocking scenario somewhere down the line, especially as data volumes grow. I'd probably suggest that the database is over normalised ( no I haven't examined the query in depth )

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply