Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

SQL Server Best Practices Expand / Collapse
Author
Message
Posted Friday, September 21, 2012 3:15 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, November 18, 2014 12:45 PM
Points: 327, Visits: 315
Hi All,

Let me start by saying that I know that I'm opening up a can of worms here. My company is trying to come up with a reasonable list of best practices related to SQL development in a multi-tier environment. I know there are probably hundreds, possibly thousands, of things that could be in this list, but we're not looking for a Magnum Opus of SQL Best practices. This is going to be given to .Net Software Engineers who are also responsible for varying degrees of SQL development, not DBAs. With that in mind, I have a tentative list and I would appreciate some input on things that are missing from the list or things that you think should be taken off the list.

Each of the items below is just a heading that will be fleshed out with more details before it is handed out. I also know that many of the items have exceptions, some of which will also be covered in the details. The idea is to get these principles in peoples heads so that they think about it and have a good reason before breaking them.

With that said, here's the list

[Edited based on comments]

General
1. Be set-minded, not row-minded, when designing anything in SQL
2. Keep SQL transactions as short as possible
3. Use the right isolation level, and know why you're using it.
4. Avoid the use of dynamic SQL where possible
5. Use sp_executesql instead EXEC('string') whenever writing dynamic SQL
6. When commenting SQL code, use block comments instead of line comments
7. Write SQL keywords, commands, and functions in upper case
8. USE TRY...CATCH instead of @@ERROR for error handling
9. Make error reporting, containment, and recovery part of your design

Table Design
1. Use the smallest data type appropriate to the data you're storing
2. Keep tables narrow
3. Do not blindly normalize everything Those who know why don't need this, those that don't are likely to abuse it.
4. Use VARCHAR(MAX) and NVARCHAR(MAX) instead of TEXT or NTEXT to store large textual data

Indexes
1. Add a clustered index to each table
2. Keep indexes narrow, especially the clustered index
3. Make sure that the columns in the clustered index aren't updated
4. Make clustered indexes monotonically increasing whenever possible
5. Design the column order of indexes around how it will be used
6. Make and use covering indexes where possible
7. Make clustered indexes unique
8. Do not make new or expand existing indexes to speed up a single query without considering the performance impact on the system as a whole

Query Design
1. Explicitly define the columns you want from a SELECT statement instead of using SELECT *.
2. Explicitly define column names in your insert statements
3. Make sure that your JOIN and WHERE clauses are SARGABLE
4. Prefer JOINS and derived tables to correlated sub-queries
5. Avoid the use of negative logic wherever possible (NOT EXISTS, NOT IN ...)
6. Avoid the use of LIKE in a JOIN clause
7. Use Query hints sparingly
7. Do not use Query hints without prior approval
8. Do not use scalar valued functions SELECT statements
9. Do not prefix search arguments in a LIKE clause with wildcards
9. Avoid prefixing search arguments in a LIKE clause with wildcards where possible
10. Do not use DISTINCT to clean up poorly written queries
11. Know the SQL windowing functions and how to use them

Procedure Design
1. Do not use both DML and DDL on temp tables within a stored procedure
1. Do not mix DDL within the DML of a stored procedure
2. Use SET NOCOUNT ON at the beginning of procedures
3. Use table variable for small data sets and temp tables for larger ones
3. Prefer temp tables over table variable unless there is a clear performance or functional advantage
4. Avoid the use of cursors and procedural code
Post #1363015
Posted Friday, September 21, 2012 3:58 PM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Yesterday @ 1:12 AM
Points: 132, Visits: 934
1. Add a clustered index to each table
2. Keep indexes narrow, especially the clustered index
3. Make sure that the columns in the clustered index aren't updated
4. Make clustered indexes monotonically increasing whenever possible
5. Design the column order of indexes around how it will be used
6. Make and use covering indexes where possible


>4. Make clustered indexes monotonically increasing whenever possible
Thought I would double check, I'm assuming this clustered index will be used by the application to query data in DB?
Or is the clustered index column added in part to make a record have a unique key(per recommended practices)?

If it's the second one (auto-increment) I believe we could improve utilization of clustered index by:

Picking a unique column (e.g. SSN though not 100% unique)
Picking potentially even a DATE column, if this table might be used to query a lot of date ranges. This could save some IO.

Though as always "it depends"



















--------------------------------------------------
...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully
Post #1363024
Posted Friday, September 21, 2012 9:55 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Thursday, November 13, 2014 1:08 PM
Points: 990, Visits: 2,224
Recurs1on (9/21/2012)

Query Design ...
8. Do not use scalar valued functions SELECT statements

I don't think I understand this. Can you expand?


Recurs1on (9/21/2012)

Query Design ...
9. Do not prefix search arguments in a LIKE clause with wildcards

This is not always avoidable. Probably should be phrased like "Try not to..."


Recurs1on (9/21/2012)

Procedure Design
1. Do not use both DML and DDL on temp tables within a stored procedure

Really? I can't do this:
 
CREATE TABLE #Temp(...)
INSERT INTO #Temp ...


Post #1363063
Posted Friday, September 21, 2012 10:50 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 1:53 AM
Points: 35,540, Visits: 32,123
I'd remove the following because anytime you can get a Developer to comment code, you're well ahead of the game.
6. When commenting SQL code, use block comments instead of line comments


The following isn't a bad suggestion but, unless you ccome up with a reassonable standard snippet of code for use in the CATCH block, you're going to have an awful lot of errors that no one can determine the origin of nor even what the error actually is.
8 USE TRY...CATCH instead of @@ERROR for error handling


This is a great idea but has somme nasty caveates depending on which ORM you may be using. For example, if you create a VARCHAR(10) column and pass filters to queries through Linq2Sql, Linq2Sql will convert the filter to NVARCHAR(10) in the code it executes. Since NVARCHAR has a high data precedence than VARCHAR, the column being filtered will first convert to NVARCHAR which destroys any hope of getting an INDEX SEEK out of the code. Depending on how they might be used, I also have a personal dislike for BIT columns. For example, it's sometime much more expedient to SUM a flag column to get a count of rows where the flag is set and that's just not possible with the BIT datatype. I've also see certain types of columns that really should be TINYINT for future expansion (think "bullet proofing" code) that some might not consider at design time.
1. Use the smallest data type appropriate to the data you're storing


The following two suggestions seem to form a paradox. I'll also just about guarantee that Item 3 will be severely abused.
2. Keep tables narrow
3. Do not blindly normalize everything



They fixed it in 2012 but I recommend that you recommend that MAX datatypes should be avoided unless they are absolutely necessary. I know that goes along with using the smallest datatype possible but using MAX in anything less than 2012 has some real index maintenance issues with rebuilding indexes in an ONLINE fashion.

I'd specifically specify that you should make the clustered index UNIQUE along with the following because it has some very real performance implications.
4. Make clustered indexes monotonically increasing whenever possible



You might want to specify that you shouldn't blindly make indexes because they can really bog down inserts, updates, and deletes.

The following should probably be changed to "DO NOt USE QUERY OR TABLE HINTS WITHOUT PRIOR APPROVAL FROM THE DBA FOR EACH USE!".
7. Use Query hints sparingly


The following is a pretty bad misconception. It truly depends on what is being done for performance purposes. Both temp Tables and Table Variables start out in memory and both overflow to disk if they get too big. Table Variables are also evaluated as having just one row unless you do a recompile after they're loaded and they do not use statistics at all. Further, sectional troubleshooting where non-persistent data tables exist is a real bitch in SSMS. Consdier rewording this to say something like "Use Temp Tables in favor of Table Variables except for when the use of Table Variables provides a substantial increase in performance or the code doesn't work because of the Temp Tables when being called from external sources."
3. Use table variable for small data sets and temp tables for larger ones


The following is a bit confusing. I believe what you mean to say is to not intermingle DDL within the DML of a stored procedure.
1. Do not use both DML and DDL on temp tables within a stored procedure
















--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1363067
Posted Friday, September 21, 2012 10:51 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 1:53 AM
Points: 35,540, Visits: 32,123
Just my 2 cents on this...

Jeff Moden (9/21/2012)
I'd remove the following because anytime you can get a Developer to comment code, you're well ahead of the game.[quote]6. When commenting SQL code, use block comments instead of line comments


The following isn't a bad suggestion but, unless you ccome up with a reassonable standard snippet of code for use in the CATCH block, you're going to have an awful lot of errors that no one can determine the origin of nor even what the error actually is.
8 USE TRY...CATCH instead of @@ERROR for error handling


This is a great idea but has somme nasty caveates depending on which ORM you may be using. For example, if you create a VARCHAR(10) column and pass filters to queries through Linq2Sql, Linq2Sql will convert the filter to NVARCHAR(10) in the code it executes. Since NVARCHAR has a high data precedence than VARCHAR, the column being filtered will first convert to NVARCHAR which destroys any hope of getting an INDEX SEEK out of the code. Depending on how they might be used, I also have a personal dislike for BIT columns. For example, it's sometime much more expedient to SUM a flag column to get a count of rows where the flag is set and that's just not possible with the BIT datatype. I've also see certain types of columns that really should be TINYINT for future expansion (think "bullet proofing" code) that some might not consider at design time.
1. Use the smallest data type appropriate to the data you're storing


The following two suggestions seem to form a paradox. I'll also just about guarantee that Item 3 will be severely abused.
2. Keep tables narrow
3. Do not blindly normalize everything



They fixed it in 2012 but I recommend that you recommend that MAX datatypes should be avoided unless they are absolutely necessary. I know that goes along with using the smallest datatype possible but using MAX in anything less than 2012 has some real index maintenance issues with rebuilding indexes in an ONLINE fashion.

I'd specifically specify that you should make the clustered index UNIQUE along with the following because it has some very real performance implications.
4. Make clustered indexes monotonically increasing whenever possible



You might want to specify that you shouldn't blindly make indexes because they can really bog down inserts, updates, and deletes.

The following should probably be changed to "DO NOt USE QUERY OR TABLE HINTS WITHOUT PRIOR APPROVAL FROM THE DBA FOR EACH USE!".
7. Use Query hints sparingly


The following is a pretty bad misconception. It truly depends on what is being done for performance purposes. Both temp Tables and Table Variables start out in memory and both overflow to disk if they get too big. Table Variables are also evaluated as having just one row unless you do a recompile after they're loaded and they do not use statistics at all. Further, sectional troubleshooting where non-persistent data tables exist is a real bitch in SSMS. Consdier rewording this to say something like "Use Temp Tables in favor of Table Variables except for when the use of Table Variables provides a substantial increase in performance or the code doesn't work because of the Temp Tables when being called from external sources."
3. Use table variable for small data sets and temp tables for larger ones


The following is a bit confusing. I believe what you mean to say is to not intermingle DDL within the DML of a stored procedure.
1. Do not use both DML and DDL on temp tables within a stored procedure


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1363068
Posted Saturday, September 22, 2012 12:23 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Friday, October 31, 2014 12:07 PM
Points: 1,880, Visits: 3,463
5. Avoid the use of negative logic wherever possible (NOT EXISTS, NOT IN ...)

Why? What would you use instead when you need this form negative logic? When an inexperienced SQL developer sees this line he/she could end up using a LEFT JOIN negative_table with WHERE negative_table .column IS NULL combined with a GROUP BY or DISTINCT which will break one of your other best practices and would usually perform worse.

I would rather explain the difference between NOT EXISTS and NOT IN when the column in NOT IN is nullable.

Gail Shaw has some great articles (as usual) on this subject.

http://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/
http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/
Post #1363075
Posted Saturday, September 22, 2012 9:27 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, November 18, 2014 12:45 PM
Points: 327, Visits: 315
Thanks for the comments

rVadim (9/21/2012)
Recurs1on (9/21/2012)

Query Design ...
8. Do not use scalar valued functions SELECT statements

I don't think I understand this. Can you expand?

This should probably be restated. What we're after here is keeping people from using scalar valued functions in queries where possible because of the performance hit it takes. We've had a lot of code written that just throws them in there without considering the huge performance hit you can take with the RBAR that is often associated with them.



Recurs1on (9/21/2012)

Procedure Design
1. Do not use both DML and DDL on temp tables within a stored procedure

Really? I can't do this:
 
CREATE TABLE #Temp(...)
INSERT INTO #Temp ...


As someone said later on, this needs to be restated as "Don't intermingle DDL within your DML" so that we don't get recompiles all the time
Post #1363115
Posted Saturday, September 22, 2012 9:56 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, November 18, 2014 12:45 PM
Points: 327, Visits: 315
Jeff Moden (9/21/2012)


The following isn't a bad suggestion but, unless you ccome up with a reassonable standard snippet of code for use in the CATCH block, you're going to have an awful lot of errors that no one can determine the origin of nor even what the error actually is.
8 USE TRY...CATCH instead of @@ERROR for error handling



We do have one, and we're pretty good about using it. The other thing I forgot to mention is that this document will also be required reading for all new hires, so it's mostly in there for that purpose.


This is a great idea but has somme nasty caveates depending on which ORM you may be using. For example, if you create a VARCHAR(10) column and pass filters to queries through Linq2Sql, Linq2Sql will convert the filter to NVARCHAR(10) in the code it executes. Since NVARCHAR has a high data precedence than VARCHAR, the column being filtered will first convert to NVARCHAR which destroys any hope of getting an INDEX SEEK out of the code. Depending on how they might be used, I also have a personal dislike for BIT columns. For example, it's sometime much more expedient to SUM a flag column to get a count of rows where the flag is set and that's just not possible with the BIT datatype. I've also see certain types of columns that really should be TINYINT for future expansion (think "bullet proofing" code) that some might not consider at design time.
1. Use the smallest data type appropriate to the data you're storing


I didn't know that. We don't use an ORM, but that's some good advice to keep in mind if we ever decide to start. One of our biggest problems right now is the business has mandated we keep data "forever". We all know that eventually, that's not going to work, but right now, the difference between a tinyint and a smallint on some tables is 100's of MBS.

The following two suggestions seem to form a paradox. I'll also just about guarantee that Item 3 will be severely abused.
2. Keep tables narrow
3. Do not blindly normalize everything



They are kind of a paradox. There's often tension between normalizing data and performance. What I'm trying to get at here is that both need to be considered. Can you suggest a better way to express these competing goals?


They fixed it in 2012 but I recommend that you recommend that MAX datatypes should be avoided unless they are absolutely necessary. I know that goes along with using the smallest datatype possible but using MAX in anything less than 2012 has some real index maintenance issues with rebuilding indexes in an ONLINE fashion.


We can't rebuild indexes online because we don't have enterprise SQL. Do you think this is still a problem?


I'd specifically specify that you should make the clustered index UNIQUE along with the following because it has some very real performance implications.
4. Make clustered indexes monotonically increasing whenever possible




Good point.


You might want to specify that you shouldn't blindly make indexes because they can really bog down inserts, updates, and deletes.

Also a good point. We have had problems with this on a few tables before.


The following should probably be changed to "DO NOt USE QUERY OR TABLE HINTS WITHOUT PRIOR APPROVAL FROM THE DBA FOR EACH USE!".
7. Use Query hints sparingly


I like yours better


The following is a pretty bad misconception. It truly depends on what is being done for performance purposes. Both temp Tables and Table Variables start out in memory and both overflow to disk if they get too big. Table Variables are also evaluated as having just one row unless you do a recompile after they're loaded and they do not use statistics at all. Further, sectional troubleshooting where non-persistent data tables exist is a real bitch in SSMS. Consdier rewording this to say something like "Use Temp Tables in favor of Table Variables except for when the use of Table Variables provides a substantial increase in performance or the code doesn't work because of the Temp Tables when being called from external sources."
3. Use table variable for small data sets and temp tables for larger ones


I've read your's and others posts on table variable vs. temp table and that's exactly I was going after. What about
"Prefer temp tables over table variables unless there is a clear performance advantage."
and we'll flesh out the details in the body of the text that will go with each item?


The following is a bit confusing. I believe what you mean to say is to not intermingle DDL within the DML of a stored procedure.
1. Do not use both DML and DDL on temp tables within a stored procedure

Yup, that's what I was after.
Post #1363120
Posted Saturday, September 22, 2012 10:01 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Friday, October 31, 2014 12:07 PM
Points: 1,880, Visits: 3,463
Recurs1on (9/22/2012)We've had a lot of code written that just throws them in there without considering the huge performance hit you can take with the RBAR that is often associated with them.

Instead of just saying that scalar functions usually should be avoided, show them why. A screenshot from Profiler with thousands of call to the scalar function for the query it is used in is worth a thousand words.
Post #1363121
Posted Saturday, September 22, 2012 10:06 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, November 18, 2014 12:45 PM
Points: 327, Visits: 315
Nils Gustav Stråbø (9/22/2012)
5. Avoid the use of negative logic wherever possible (NOT EXISTS, NOT IN ...)

Why? What would you use instead when you need this form negative logic? When an inexperienced SQL developer sees this line he/she could end up using a LEFT JOIN negative_table with WHERE negative_table .column IS NULL combined with a GROUP BY or DISTINCT which will break one of your other best practices and would usually perform worse.

I would rather explain the difference between NOT EXISTS and NOT IN when the column in NOT IN is nullable.



Thanks for the articles. I hadn't even thought of that one, but I will add an item for this. The reason this item was here originally was because unless I'm mistaken, negative logic forces a table scan. Am I wrong on that one?
Post #1363122
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse