The Most Common Query Blunders...

Question

The Most Common Query Blunders...

Viewing 15 posts - 46 through 60 (of 73 total)

You must be logged in to reply to this topic. Login to reply

Alan Burstein SSC Guru Points: 61136 More actions · Answer 1

The most common/easiest to fix mistakes:

Someone beat me to it but NOLOCK table hints are the #1 mistake I have seen. I would go so far as to say it's become an epidemic in the industry. Real DBAs know better but a majority of "DBA"s think that throwing NOLOCK hints everywhere is a best practice.

Not enough I/O: tempdb, mdfs, ndfs, ldfs, OS, SQL Install spread across too few disks.

databases set to full recovery in places where transaction log backups are not happening. Common in many dev environments.

Nullable columns that always contain data (or where a blank or -1 would do).

No index on Foreign keys.

RBAR and hidden RBAR (not always as simple to fix/replace as what I've already mentioned but it's falls in the category of "most common query blunders". )

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Eirikur Eiriksson SSC Guru Points: 182882 More actions · Answer 2

Alan.B (12/28/2015)
The most common/easiest to fix mistakes:
Someone beat me to it but NOLOCK table hints are the #1 mistake I have seen. I would go so far as to say it's become an epidemic in the industry. Real DBAs know better but a majority of "DBA"s think that throwing NOLOCK hints everywhere is a best practice.
Not enough I/O: tempdb, mdfs, ndfs, ldfs, OS, SQL Install spread across too few disks.
databases set to full recovery in places where transaction log backups are not happening. Common in many dev environments.
Nullable columns that always contain data (or where a blank or -1 would do).
No index on Foreign keys.
RBAR and hidden RBAR (not always as simple to fix/replace as what I've already mentioned but it's falls in the category of "most common query blunders". )

Not to mention nullable columns with default value constraint, does my head in and so does a non-unique index on unique column / column combination, why on earth? Another problem that often comes with the missing foreign key problem are disabled foreign keys, how about just rolling out the red carpet for the congestion, blocking and deadlocking instead.

😎

Alan Burstein SSC Guru Points: 61136 More actions · Answer 3

Eirikur Eiriksson (12/28/2015)
Alan.B (12/28/2015)
The most common/easiest to fix mistakes:
Someone beat me to it but NOLOCK table hints are the #1 mistake I have seen. I would go so far as to say it's become an epidemic in the industry. Real DBAs know better but a majority of "DBA"s think that throwing NOLOCK hints everywhere is a best practice.
Not enough I/O: tempdb, mdfs, ndfs, ldfs, OS, SQL Install spread across too few disks.
databases set to full recovery in places where transaction log backups are not happening. Common in many dev environments.
Nullable columns that always contain data (or where a blank or -1 would do).
No index on Foreign keys.
RBAR and hidden RBAR (not always as simple to fix/replace as what I've already mentioned but it's falls in the category of "most common query blunders". )
Not to mention nullable columns with default value constraint, does my head in and so does a non-unique index on unique column / column combination, why on earth? Another problem that often comes with the missing foreign key problem are disabled foreign keys, how about just rolling out the red carpet for the congestion, blocking and deadlocking instead.
😎

Yep, yep. Or that non-clustered index without any include columns, especially when you would only need one or two to avoid a key lookup.

I'm cleaning up some questionable code at the moment and another biggie is unnecessary ORDER BY clauses (regardless of what indexes are available). Or GROUP BY SomeFunction(blah blah)... or GROUP BY case statements... I'm looking at some of that right now.:hehe:

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Jeff Moden SSC Guru Points: 1003873 More actions · Answer 4

Alan.B (12/28/2015)
Someone beat me to it but NOLOCK table hints are the #1 mistake I have seen.

Heh... apparently, thousands of people agree with you. Instead, they use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED to be sure they don't miss anything. ;-):-P:-D At least it's easy to get rid of when they do that.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

MMartin1 One Orange Chip Points: 27779 More actions · Answer 5

Jeff Moden (12/28/2015)
Alan.B (12/28/2015)
Someone beat me to it but NOLOCK table hints are the #1 mistake I have seen.
Heh... apparently, thousands of people agree with you. Instead, they use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED to be sure they don't miss anything. ;-):-P:-D At least it's easy to get rid of when they do that.

That would be my favorite. I have seen experienced developers fall to this one.

Just to mention things others have probably have already:

* functions in a where clause

* functions in a join condition

* conversions in a where clause

* not using temp tables when the base table is a heap (especially a large one)

* left joining on a table only to filter in the where clause based on a column in the right table (makes it a inner join)

* delete from a large table (do in chunks or consider truncate)

* triggers everywhere for every individual column

* top n rows without an order by (results not guaranteed to be consistent)

* overuse of dynamic sql

* code to insert into a logging table blocks reached, to help capture where an error occurs in a SP (horrendous on performance and there are better ways)

I am sure there is more out there. As Jeff mentioned I like the idea of comments. I think it is better friend than being obsessive compulsive with indents (which do help, but pale in comparison to what comments can tell you).

----------------------------------------------------

MMartin1 One Orange Chip Points: 27779 More actions · Answer 6

Jeff Moden (12/25/2015)
I believe my favorite non-SARGable problems are due to the inappropriate use of "OR" and things like the following...
WHERE ISNULL(SomeNumericColumn,0) > 0
...
WHERE ISNULL(SomeCharacterColumn,' ') > ' '
... which, of course, is due to a simple lack of knowledge concerning NULLs and the fact that front-end code typically doesn't follow the same rules for NULLs as the soon-to-be-not-optional default rules that SQL Server follows.

Ive done that second one above in yesterdays. ISNULL(someCharacterColumn,'')>''. Heh, the results were correct. I guess you mean that

WHERE someCharacterColumn >''

is more sargable.

----------------------------------------------------

Jeff Moden SSC Guru Points: 1003873 More actions · Answer 7

MMartin1 (12/30/2015)
Jeff Moden (12/25/2015)
I believe my favorite non-SARGable problems are due to the inappropriate use of "OR" and things like the following...
WHERE ISNULL(SomeNumericColumn,0) > 0
...
WHERE ISNULL(SomeCharacterColumn,' ') > ' '
... which, of course, is due to a simple lack of knowledge concerning NULLs and the fact that front-end code typically doesn't follow the same rules for NULLs as the soon-to-be-not-optional default rules that SQL Server follows.
Ive done that second one above in yesterdays. ISNULL(someCharacterColumn,'')>''. Heh, the results were correct. I guess you mean that
WHERE someCharacterColumn >''
is more sargable.

Yes... that and the fact that NULLs aren't comparable to anything unless you've changed a setting that shouldn't have been changed (and has also been deprecated, BTW). Both find everything that is NOT NULL and NOT BLANK but, you're correct, the SomeCharacterColumn > '' is SARGable and the one with the ISNULL function is not (there is a "halo" exception... if the column has been defined as NOT NULL, the ISNULL is ignored). The other two WHERE clauses are not.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jason A. Long SSC-Insane Points: 23780 More actions · Answer 8

Jeff Moden (12/30/2015)
(there is a "halo" exception... if the column has been defined as NOT NULL, the ISNULL is ignored). The other two WHERE clauses are not.

Interesting... I didn't know that exception existed... Yet another difference between ISNULL and COALESCE.

Eric M Russell SSC Guru Points: 125523 More actions · Answer 9

Somewhat related to SQL Server, when writing remote queries or ETL packages against Oracle, try to remember that an empty string in PL/SQL is equivalent to NULL. At least understand that much before diving into the pool or you'll end up banging your head.

For example in Oracle:

('' IS NULL) True

('' = '') False

('A' > '') False

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Jeff Moden SSC Guru Points: 1003873 More actions · Answer 10

Heh... without going into too many details, I've found that one of the most common query blunders is the nearly religious belief in rote, so called "best practices" and some pretty ridiculous recommendations that are frequently recommended as if they were. Some examples...

"Never use Cursors. Use Temp Tables and While Loops, instead."

"Never use While Loops. Use recursive CTE's, instead." (Writing another article on that little fallacy as we speak).

"Never use Cursors or While Loops."

"To be SET BASED, always write your code as a single query."

"Never use Temp Tables. Use CTEs, Derived Tables, and Views, instead."

"Never use xp_CmdShell. It's a security risk." (one of my favorites)

"Always write portable/ANSI/ISO code." (Close cousin of "Never use proprietary features because they destroy portability." (my second favorite)

"Avoid stored procedures. They're not portable."

"Never put business logic in a table or anywhere else in SQL. It doesn't belong there."

"Never us Foreign Keys. They slow down development and it's a form of business logic that doesn't belong in a database."

"Always put the PK/Clustered Index on an IDENTITY column."

"Never put the PK/Clustered Index on an IDENTITY column."

"Never use IDENTITY columns."

"Never use GUIDs."

"It's ok to use sequential GUIDs for a Clustered Index."

"It's ok to write code using <insert some slow method here> because it will only be used for a small number of rows."

"It must be true because it's written in this book by ..."

"Save time. Use better hardware."

"Upgrading your hardware will make your code run faster."

"Spindles don't matter on a SAN".

"Disk level fragmentation doesn't matter on a SAN".

"Put your database on SSDs. All of your queries will run much faster."

"Never do any type of formatting in T-SQL. "

"Never do ETL in T-SQL. It wasn't built for that. Use SSIS (or some other tool) instead."

"Use PowerShell to manage and execute all of your backups on all of your servers."

"If you can't figure something out, it can't be done using SET BASED code."

"Never do direct date/time math."

"Never use partitioned views on the Enterprise Edition."

"All triggers are evil. Avoid them at all costs."

"You should always design the clustered index based on how most queries will use it."

"Partitioning improves query performance."

"You shouldn't use UPDATE for joined updates. Beginners might do it wrong."

"You shouldn't use "0" as a datetime. Beginners might not know what it means."

The list goes on but here's my all time favorite...

"Just because you can do something in T-SQL, doesn't mean you should."

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Ed Wagner SSC Guru Points: 287024 More actions · Answer 11

Jeff Moden (1/1/2016)
Heh... without going into too many details, I've found that one of the most common query blunders is the nearly religious belief in rote, so called "best practices" and some pretty ridiculous recommendations that are frequently recommended as if they were. Some examples...
"Never use Cursors. Use Temp Tables and While Loops, instead."
"Never use While Loops. Use recursive CTE's, instead." (Writing another article on that little fallacy as we speak).
"Never use Cursors or While Loops."
"To be SET BASED, always write your code as a single query."
"Never use Temp Tables. Use CTEs, Derived Tables, and Views, instead."
"Never use xp_CmdShell. It's a security risk." (one of my favorites)
"Always write portable/ANSI/ISO code." (Close cousin of "Never use proprietary features because they destroy portability." (my second favorite)
"Avoid stored procedures. They're not portable."
"Never put business logic in a table or anywhere else in SQL. It doesn't belong there."
"Never us Foreign Keys. They slow down development and it's a form of business logic that doesn't belong in a database."
"Always put the PK/Clustered Index on an IDENTITY column."
"Never put the PK/Clustered Index on an IDENTITY column."
"Never use IDENTITY columns."
"Never use GUIDs."
"It's ok to use sequential GUIDs for a Clustered Index."
"It's ok to write code using <insert some slow method here> because it will only be used for a small number of rows."
"It must be true because it's written in this book by ..."
"Save time. Use better hardware."
"Upgrading your hardware will make your code run faster."
"Spindles don't matter on a SAN".
"Disk level fragmentation doesn't matter on a SAN".
"Put your database on SSDs. All of your queries will run much faster."
"Never do any type of formatting in T-SQL. "
"Never do ETL in T-SQL. It wasn't built for that. Use SSIS (or some other tool) instead."
"Use PowerShell to manage and execute all of your backups on all of your servers."
"If you can't figure something out, it can't be done using SET BASED code."
"Never do direct date/time math."
"Never use partitioned views on the Enterprise Edition."
"All triggers are evil. Avoid them at all costs."
"You should always design the clustered index based on how most queries will use it."
"Partitioning improves query performance."
"You shouldn't use UPDATE for joined updates. Beginners might do it wrong."
"You shouldn't use "0" as a datetime. Beginners might not know what it means."
The list goes on but here's my all time favorite...
"Just because you can do something in T-SQL, doesn't mean you should."

So many of these contain "always", "never" and "can't". My ears perk up any time I see those and there's usually an exception. "It depends" comes up so much because it's so very true.

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions

Eirikur Eiriksson SSC Guru Points: 182882 More actions · Answer 12

Jeff Moden (1/1/2016)
Heh... without going into too many details, I've found that one of the most common query blunders is the nearly religious belief in rote, so called "best practices" and some pretty ridiculous recommendations that are frequently recommended as if they were. Some examples...

A personal favorite "It says so in BOL"

😎

Ed Wagner SSC Guru Points: 287024 More actions · Answer 13

Eirikur Eiriksson (1/2/2016)
Jeff Moden (1/1/2016)
Heh... without going into too many details, I've found that one of the most common query blunders is the nearly religious belief in rote, so called "best practices" and some pretty ridiculous recommendations that are frequently recommended as if they were. Some examples...
A personal favorite "It says so in BOL"
😎

Especially when it doesn't really say so. 😉

Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions

Alex Gay SSCrazy Points: 2211 More actions · Answer 14

Hi Jason,

One of the things I see regularly (and used to do myself until I realised how expensive it was) is the use of non-deterministic functions in a query. The most obvious of these is GetDate(). It might be quick to run, but if it only takes 1 ms but you are retrieving 50K rows, you have added 50 seconds to your query. Most of the time people only want the date part anyway, so unless you are running something regularly at 23:58 why query it for every row returned. Store it once in a variable, at the top of the procedure and use that instead. As there are normally a few functions wrapping the GetDate() call, you usually increase your time saving.

The other big problem I see is people selecting more columns than they need, usually with a SELECT *, and then throwing away most of the data later. Small data sets are quicker. This is usually caused by the developer not really understanding the data or the question, and then not going back and tidying up when they have a satisfactory answer.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 15

Alex Gay (1/4/2016)
The most obvious of these is GetDate(). It might be quick to run, but if it only takes 1 ms but you are retrieving 50K rows, you have added 50 seconds to your query. Most of the time people only want the date part anyway, so unless you are running something regularly at 23:58 why query it for every row returned. Store it once in a variable, at the top of the procedure and use that instead.

Most system functions, even getdate() are evaluated once and the value used in all rows. The exception to that is NEWID(), which is evaluated on all rows.

You can see this by using GETDATE() or its datetime2 version SYSDATETIME() in a large resultset. The times on each row will be identical. If it was evaluated on every row, rows processed later would have later timestamps to rows processed earlier

To test: (I used AdventureWorks because it's handy. SalesOrderDetail has 1.9 million rows in my copy of AW)

DECLARE @t TABLE (SomeID INT, SomeDate DATETIME2)

INSERT INTO @t (SomeID, SomeDate)

SELECT sod.SalesOrderID, SYSDATETIME()

FROM Sales.SalesOrderDetail AS sod;

GO 10

DECLARE @t TABLE (SomeID INT, SomeDate DATETIME2)

DECLARE @SomeDate DATETIME2 = SYSDATETIME();

INSERT INTO @t (SomeID, SomeDate)

SELECT sod.SalesOrderID, @SomeDate

FROM Sales.SalesOrderDetail AS sod;

GO 10

The inserts into the table variable are to remove any effect from transmitting and displaying the data (which SSMS is really slow at)

CPU and duration as caught by an Extended Events session

When averaged...

Function in the query

CPU: 519.5ms

Duration 523.6 ms

Function assigned to variable, variable used in query

CPU: 525.6

Duration: 523.9

User-defined scalar functions are another matter. They really are evaluated on every row and they can have a huge effect. On 600 000 rows I saw a difference between 300ms and 3 seconds.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass