Home Forums SQL Server 2008 SQL Server Newbies Join performance gets much worse after combining two selects that are each fairly fast RE: Join performance gets much worse after combining two selects that are each fairly fast

  • tmoleary (9/7/2012)


    MainDocumentAndAddendaActualPlan.sqlplan is from the first query

    MainDocumentAndDiagnosesActualPlan.sqlplan is from the second query

    MainDocumentAddendaAndDiagnosesEstimatedPlan.sqlplan is from the third query

    Starting with these, the first problem is the 286 billion row estimate going into the select at the tail of the combined query.

    It's relatively alright coming off the ICD table scan (Heap table) to include the vDiagnosis, Visit_IDs, and PatientMap_2_patients tables, but then it self destructs including information fron TIUDocument_8925 and VHAEnterpriseStdTitle_Filter. It's full level statement optimization, so it's possible the statistics are just horribly stale here. I'd still triple check your joins, and take a good look at ICD to determine if some indexes will make sense there, particularly a sound clustered index.

    MainDocumentAndAddendaIntoTempTableActualPlan.sqlplan is from saving the first query results in a temp table

    MainDocumentAndDiagnosesIntoTempTableActualPlan.sqlplan is from saving the second query results in a temp table

    These should just be repetitions from the first set but using an INSERT instead of a select at the tail. They aren't, but nothing in them during a fast-glance review showed that there was any significant concerns. If anything, they looked a little more efficient, even if there are more components.

    However:

    MainDocumentAddendaAndDiagnosesFromTempTablesActualPlan.sqlplan is like the third query, but it uses the temp tables. That query looks like this:

    SELECT doc_ad.TIUDocumentSID, doc_ad.ScrSSN, doc_ad.TIUDocumentSID, doc_dx.ICDCode

    FROM #DocAndAdID AS doc_ad

    LEFT JOIN #DocAndDxID AS doc_dx

    ON doc_ad.TIUDocumentSID = doc_dx.TIUDocumentSID

    ORDER BY doc_ad.TIUDocumentSID, doc_ad.TIUAddendumSID

    Nice, clean, and orderly. You could even possibly align the index on one of the #tmps to the rest of the data's indexes and only need a single #tmp insert, lowering the # of statements in the process. How long did the multiple temp table build take to run end to end? Btw, if you include an index on both temps for TIUDocumentSID you'll get yourself a merge join, and if you include TIUAddendumSID in the index for #DocAndAdIDs you should be able to get it pre-sorted before this query runs.

    I would also recommend clustered indexing ICD on ICDSID for cleaner queries. That heap can't be helping matters here. At the very least, a non-clustered index on ICDSID including ICDCode as a leaf level inclusion would allow for that index to be used instead of the full table.

    Also, when you get some maintenance time, get a statistics update with fullscan going, and while you're at it check to make sure your defraggers are running as well (just always worth a look). My guess for this is query complexity + stale statistics are the primary cause of the issue, compounded by a heap in the middle. Temp table divide and conquer is not bad practice however so I would certainly approach it from that perspective unless you aren't getting good time runs from the practice.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA