Tips to optimize your SQL statements

  • Nadrek (7/30/2010)


    ...

    When writing new SQL, why not take a little more time and do it well, instead of randomly? For existing bad SQL, most often I see that a few minutes rewriting is extremely beneficial, and has no bad side effects.

    ...

    That may be fine if you're the one in control of the code being written. However, if you're like the majority of DBA's in the workplace, quite often that is not the case. You'll have to deal with development team and/or external vendors. Maybe even get managers approval for changing work priorities and any additional charges.

    Case in point, a vendor application we have running had a report that took 10-15 minutes to run. The end-user spoke directly to the vendor who quoted 4 hours of work to fix the report. This work was done and implemented and the report took just a few minutes. In the next invoicing cycle the extra 4 hour charge was rejected as it hadn't been approved. Naturally the vendor wasn't pleased about this and they reversed their change and left the user with the slow report.

    At this stage the user decided to speak with me and within 5 minutes I had identified a lack of indexes (there were none on the reporting tables). After creating some indexes the same report took 2-3 minutes to run. Later that week, when I had some free time, I decrypted the vendors procedure, pulled apart the mess and got the report to run in less than 30 secs. Naturally being vendor code I couldn't deploy the changes I'd made, but given I have a good relationship with them, I sent the changes through their developer.

    Three months later when we performed an upgrade on the application, the report that was running in 2-3 minutes started to run in less than 30 sec.

    Nadrek (7/30/2010)


    ...

    Essentially: Why add cost in one place to reduce cost in another before you attempt to reduce cost without adding any?

    Easy answer, add the minimal (yes minimal, despite your assertions) cost in the short term (ie: getting the end-user a report they require), then see about change approval, testing, etc... of re-written code.

    --------------------
    Colt 45 - the original point and click interface

  • Thanks for the article.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • http://blogs.msdn.com/b/sqlazure/archive/2010/07/27/10043069.aspx

    Just a note, non-clustered indexes (what you make when you do a covered index) reduce the performance of your writes. This is because on insertion or updates, the indexes need to be updated. So you need to balance your covered index creation with the ratio of reads and writes to your database. Databases with a disproportion amount of reads to writes gain more performance from covered indexes.

    Indexes DO affect your write performance. It is not theoretical. You have to seek a balance between reducing I/O by adding indexes and reducing insert efficiency by adding indexes. If you add a new covering index for every bad query it will have significant effects. If you can improve the query without changing the indexes, it is a win/win. In my shop the DBAs monitor poor-performing queries and make recommendations to the devs for re-writes and/or indexes. Some common problems in queries that can be fixed in the code: non-SARGable clauses, writing a where clause that tests where a or b instead of writing two selects with simple where clauses that are unioned together, cursor usage, not specifying a known column in the where clause when that would let you take advantage of a clustered index.

  • Thanks for the info

  • wbrianwhite (7/31/2010)


    http://blogs.msdn.com/b/sqlazure/archive/2010/07/27/10043069.aspx

    Just a note, non-clustered indexes (what you make when you do a covered index) reduce the performance of your writes. This is because on insertion or updates, the indexes need to be updated. So you need to balance your covered index creation with the ratio of reads and writes to your database. Databases with a disproportion amount of reads to writes gain more performance from covered indexes.

    Indexes DO affect your write performance. It is not theoretical. You have to seek a balance between reducing I/O by adding indexes and reducing insert efficiency by adding indexes. If you add a new covering index for every bad query it will have significant effects. If you can improve the query without changing the indexes, it is a win/win. In my shop the DBAs monitor poor-performing queries and make recommendations to the devs for re-writes and/or indexes. Some common problems in queries that can be fixed in the code: non-SARGable clauses, writing a where clause that tests where a or b instead of writing two selects with simple where clauses that are unioned together, cursor usage, not specifying a known column in the where clause when that would let you take advantage of a clustered index.

    As previously stated, the cost of adding and updating the indexes is actually a lot smaller than it's made out to be. This is the same sort of thing spouted by people that say you should always rebuild the indexes when generally updating statistics is all thats required.

    More often than not, the problem with a poorly performing query are the cases you've specified, but what if you're dealing with a vendor application where code changes could take months, if they ever happen at all?

    Better to wear the impact of applying an index rather than leaving the query to perform poorly.

    --------------------
    Colt 45 - the original point and click interface

  • Nice article, thx !! Althought all tips are pretty well known, it's nice have them at one place and check if one didn't forget anything:)

  • philcart (8/3/2010)


    As previously stated, the cost of adding and updating the indexes is actually a lot smaller than it's made out to be. This is the same sort of thing spouted by people that say you should always rebuild the indexes when generally updating statistics is all thats required.

    More often than not, the problem with a poorly performing query are the cases you've specified, but what if you're dealing with a vendor application where code changes could take months, if they ever happen at all?

    Better to wear the impact of applying an index rather than leaving the query to perform poorly.

    I agree that it's all a matter of the precise situation one is in at the time, and the likely future.

    I would argue that all other things being equal, tune the query before adding an index. When all other things are not equal (lousy vendor SQL), make the best compromise within your constraints as possible, whether that be adding indexes or buying SSD's or moving tables around filegroups or changing configuration options or whatever else is within your power.

    For indexes, I still stand by the premise that in at least some situations, the additional index maintenance, fill factor tuning, drive space required, and backup/restore time can be significant. I would agree that in most cases the added time for updates and inserts is minimal; but a large index created with the default fillfactor of 100 on key columns that are often changed or added to essentially randomly (date of birth, SSN, address and name fields are good examples here) can grow very quickly due to fragmentation; that's one cost to be aware of - either the fragmentation and space taken up, or the time the regular maintenance of the index takes, or the time taken to correctly set the fillfactor, or more likely some combination of the those.

    In all cases indexes take space - if you have a bad vendor database with lousy SQL and a poor design where everything is a CHAR(too many) and there are millions of rows, then indexes on that will take a fair bit of space, too.

  • Interesting that you think Developers should be the people responsible for performing the very useful tips you recommend. Everywhere I've worked it has always fallen to DBAs to check execution plans, create suitable indexes, review stored procs etc.

    I think there is a divide between Developers and DBAs which both sides need to work to close; we can both learn from each other.

    Regards

    Lempster

  • Database Developers and those developing on databases, should know enough to code efficiently without having it "...always fall to the DBA."

    If you're in the position of being the sole database person fighting against a slew of developers, then educate them. It takes next to no time for developers to look at an execution plan and flag and possibly correct the obvious issues.

    One developer I worked with actually found it easier to work through the execution plan than try to create the best query from scratch. From his point of view, the execution plan explained the query procedurally, just like his .Net code.

    --------------------
    Colt 45 - the original point and click interface

  • It takes next to no time for developers to look at an execution plan and flag and possibly correct the obvious issues.

    I don't disagree with the general sentiment that developers should be encouraged and educated to think about execution plans, indexes, SARGable clauses etc., but I do take issue with your statement above; they would need to know how to read an execution plan and look for areas of improvement and that's a skill that takes time to accquire.

    It can be very difficult to initiate a change in the division of tasks between developers and DBAs if working practices have 'always been done this way'. I'm not saying it's right, it's just the way it is sometimes!

    Regards

    Lempster

  • So I read the article and ran the query for missing indexes ordered by performance impact.

    The first 4 missing indexes were for the same table. Take a look at the attachment. The recommendation is for 4 amazing similar and mostly overlapping indexes. In this case I would be inclined to create an index on IWNROV, IWSVCD, IWNBFY, and include columns IWCNBR, IWCEK6, IWSPY8, IWDRAT, IWAKBN, IWBICH

    I would appreciate comments on the appropriateness of my approach.

    Thanks

  • 5) Check for any table or Index scans. Using the execution plan, it's easy to identify if the SQL statement is performing any table or index scans. In the majority of cases (assuming that statistics are up to date), this indicates that an index is missing. These three DMVs might prove useful to identify missing indexes:

    This should be reworded to Check for any table or Clustered Index Scans. Regular non-clustered index scans can happen for many other reasons, including pulling a wide enough range of data from the table that an index scan is warranted, somebody using a non-deterministic function in a where clause, etc.

    Creating a new index in an attempt to cover an index that an existing index scan is likely serving won't necessarily generate an index seek afterwards.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Good article. I would only mention that in order to fix ordering issues you don't necessarily need to change the clustered index. ALL indexes are sorted, so a correct index may remove the sort and changing the query may remove the sort operation as well. For instance if you have an index on LastName, FirstName but then you sort on FirstName, LastName, the index may still be used but you'd have to have the sort operation in the query, where an index on FirstName, LastName would likely serve the results up in ordered fashion so the sort operation would be unnecessary.

  • Daniel Bowlin (2/10/2012)


    So I read the article and ran the query for missing indexes ordered by performance impact.

    The first 4 missing indexes were for the same table. Take a look at the attachment. The recommendation is for 4 amazing similar and mostly overlapping indexes. In this case I would be inclined to create an index on IWNROV, IWSVCD, IWNBFY, and include columns IWCNBR, IWCEK6, IWSPY8, IWDRAT, IWAKBN, IWBICH

    I would appreciate comments on the appropriateness of my approach.

    Thanks

    First, I'd recommend creating a separate thread for this conversation as I foresee it being possible lengthy.

    Second, before I'd add an index I'd evaluate the existing indexes to see if there is one I could alter to meet this need. I've been planning on working on some code that does a comparison between the missing index information and existing indexes. I've seen many times where there is an existing index with a different order or just missing one of the columns. I'd also evaluate the queries that are causing the missing index recommendations. It may be that you don't really need all the included columns for the majority of the columns. While eliminating bookmark lookups is often a quick performance enhancer, it may not always be the best thing.

  • 1) Check if you're missing any required table joins. This can easily happen and the output will be a Cartesian join ...

    2) Check if you're missing any required WHERE clause. A missing WHERE clause will return more data than needed ...

    It's interesting that your first two reccomendations would be to confirm that the query has the required JOIN and WHERE clause criteria. This type of thing is so obvious that most performance tutorials jump right into analyzing wait states, query execution plans, and indexing. In other words, often times we start with the assumption that the SQL is logically correct, and our optimization is focussed just on returning the same resultset in a more efficient way.

    However, having worked for a company that completed mergers with several other companies in the industry, a big part of my job has been refactoring legacy databases. I've seen a lot of SQL code from different teams; and much of it poorly written. The scenario, where a developer basically wraps a DISTINCT clause around a SELECT statement performing what amount to a cartesian join; I see that a lot.

    Another scenario is where the developer includes un-needed table joins and returns additional columns that are never referenced by the application or reporting tool. I've even seen procedures that query an interim result into a temp table, which is then never used. This is often the result of a developer wrting a new stored procedure by copy/pasting from an existing stored procedure, and then failing to remove those table joins, columns, and code that arn't needed.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Viewing 15 posts - 16 through 30 (of 41 total)

You must be logged in to reply to this topic. Login to reply