Help with slow query

  • I have a query that is taking 1 minute to complete. I would like to imporve its performance. I have tried creating a CTE from the same query but I am getting similar results.

    The query is:

    Select S.IDNo, S.PickDt, E.FirstName, E.LastName, P.ProductDesc, H.HouseID, C.CustomerName, E.EmpIDNo

    From GMFPick S

    Inner Join GMFEmployee E On S.EmployeeIDNo = E.IDno

    Inner Join GMFProduct P on S.ProductIDNo = P.IDNo

    Inner Join GMFHouse H On S.HouseIDNo = H.IDNo

    Inner Join GMFCustomer C On S.CustomerIDNo = C.IDNo

    Where S.IDNo >= 600 and S.IDNo <= 1200

    Order by S.IDNo

    The CTE is:

    with PickCTE (IDNo, PickDt, EmployeeIDNo,ProductIDNo,HouseIDNo,CustomerIDNo)

    as

    (

    Select IDNo, PickDt,EmployeeIDNo,ProductIDNo,HouseIDNo,CustomerIDNo

    From GMFPick

    Where IDNo >= 600 and IDNo <= 1200

    )

    Select pickcte.IDNo, E.FirstName,E.Lastname,E.EmpIDNo, P.ProductDesc, H.HouseID, C.CustomerName from PickCTE

    Inner Join GMFEmployee E On PickCTE.EmployeeIDNo = E.IDno

    Inner Join GMFProduct P on PickCTE.ProductIDNo = P.IDNo

    Inner Join GMFHouse H On PickCTE.HouseIDNo = H.IDNo

    Inner Join GMFCustomer C On PickCTE.CustomerIDNo = C.IDNo

    The number of records per table are:

    GMFPick 1,513,953

    GMFEmployee 51

    GMFProduct 10

    GMFHouse 29

    GMFCustomer 4

    The tables contain the following indexes:

    GMFPick

    1. NonClustered - PickDt(Asc datetiem), IDNo(bigint)

    2. NonClustered - EmployeeID(Bigint), ProductIDNo(bigint)

    3. NonClustered - ProductIDNo(bigint)

    4. NonClustered - CropIDNo(bigint)

    5. NonClustered - HouseIDNo(bigint)

    6. NonClustered - CustomerIDNo(bigint)

    7. NonClustered - IDNo(bigint)

    GMFEmployee

    1. Clustered - IDNo(bigint)

    GMFProduct

    1. Clustered - IDNo(bigint)

    GMFHouse

    1. Clustered - IDNo(Bigint)

    GMFCustomer

    1. Clustered - IDNo(bigint)

    Normally, I would supply the code to create the tables and populate them with sample data. However, I am not sure how meaningful that would be in this instance since I would not supply 1.5 million sample records. But if the consensus is that I need to supply some records, just let me know and I will get it done.

    Also, I have attached Estimated execution plans for both the query and the CTE.

    Would someone please assit me in obtaining better performance for this query? And just as important, would be to explain to me where I have gone wrong?

    Thanks,

    Pat

    PS I may be slow in responding. It is 10 AM outside of Philadelphia and the eye of hurricane Sandy is due to pass overhead in about 10 hours. I expect to be without power or internet for some time (possibley 1 to 2 days) as the day progresses.

  • I suppose the first question is what state are the Indexes in, do you run regular Db maintentance jobs on the indexes?

    If they are highly fragmented, then you could get very poor query result time, check them with DBCC SHOWCONTIG statement for each table/index if theres a low Scan density then a reindex might help increase performance.

    The RID lookup to GMFPick isnt helping either so you might want to look at including the EmployeeIdNo,ProductIdNo, HouseIdNo, CustomerIdNo on the Nonclustered index (7 in your list), on the GMFPick Table.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • I have 2 suggestions for you (none of them include an index because I'm no expert on them).

    1. Include the actual execution plan files (.sqlplan) that contain more information than the screenshots of the estimated execution plans show.

    2. Instead of using a CTE, use a temp table (#Table). This might improve the performance of your query.

    I hope you stay fine with the arrival of Sandy.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • For that specific query your index design doesn't make much sense. Also I wouldn't create indexes on tables with only a few rows. You may want to drop your indexes and create a new one something like

    CREATE NONCLUSTERED INDEX ix_GMFPick_IdNo ON GMFPick (IDNo) INCLUDE (EmployeeIDNo,PickDt,ProductIDNo,CustomerIDNo,HouseIDNo)

  • Jason,

    I have executed Reindex on all the indexes. Currently, fragmentation is < .02 or = 0 on every index.

    I included the fields suggested in the non clustered Row Index lookup (RID). This did not affect performace.

    Luis,

    I am a VB programmer who dabbles in SQL when I need to. I know I can get better performance by returning the 600 records from GMFPick and doing the subsequent lookups. But I am trying to understand what I am going wrong here with SQL.

    I have attached the Actual SQLPlan for the CTE.

    Please let me know if other information would be helpful.

  • AFAIK, and someone can correct me if I'm wrong, SQL Server has some problems optimizing plans for CTEs and table variables. Using a temp table might improve the performance, but it's not a sure thing. That's why you need to test.

    I'll try to find a reference to support what I'm saying here, but I'm a little busy, so it won't be immediate.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I'd not worry about using a CTE as in this case it really doesnt add anything to the query performance.

    Thanks for the plan, Looking at the plan the index the engine is choosing is 6 in your list (IX_GMFPick_5) which is the one based on CustomerIdNo, this wasnt clear in the original and so I assumed it was using the IDNo index so adding the include columns to the IdNo index probably wouldnt have helped, but adding them to the CustomerIdNo index should.

    (I take it you are doing this on a development box not a production box.)

    PS: hope you dont get hit as hard as everyone forecasting.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Jason,

    Thanks for your reply. I do not understand "the index the engine is choosing is 6 in your list (IX_GMFPick_5)". I see several indexes and a Heap. How/why did you identify _5 as "THE" index? Or, how about a source where I could read about how to interpret SQL Execution plans?

    You were 100% correct. I deleted the included fields from the one index. I added the same fields to _5 index, restarted the SQL engine and the query ran in 4 seconds. That is more inline with what I expected. And yes my development box.

    New actual execution plan is attached.

    I do appreciate you and everyone else putting the time and effort into this.

    I don't see a way to mark this topic as complete.

    thanks,

    pat

    PS. I am in good shape, on a hill, good drainage, new house (15 yrs) and a new roof (lost in a hurricane 8 years ago. Defective bldg materials so the replacement was free). But please keep the thousands in my area who are not so fortunate in your thoughts.

  • You didnt really need to restart the SQL Engine.

    There are a few good resources abuiot, I have the MS press book :MS SQL Server Performance Tuning and optimisation, its a good starting point, another is Itzik Ben-Gans Inside MS SQL Server 2008, T-SQL querying, which gives a good over view of query tuning and how indexes work in Chapter 4.

    There are plenty more but I've not read them.

    If you hover over the Index Seek on the GMFPick.IX_GMFPick_5 icon on the plan, it shows you the columns in the Seek predicates list at the bottom, in this case CustomerIdNo.

    As for the performance increase most of this will be down to the removal of the RID lookup which would have been very costly, especially with it being a heap lookup.

    You might benefit from a clustered index on the IDNo on GMFPIck , but that could change the whole plan can leave you with other problems.

    PS: Glad to hear you're safe (ish), I think a lot of people on this side (European) of the atlantic remember seeing the pictures after Katrina and wouldnt wish a similar event on anyone else.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • This is probably not helping to solve your problem with performance in mind, but WHY do you use BIGINTs as IDs on ALL your tables? I mean you have a table of 4 values and you used a BIGINT as a DataType for it.

    BIGINT uses double the amount of storage than normal INT data types.

    Change all your BIGINTs to INTs (even that 1,5 Million rows... It's not big enough to use BIGINTs.

    I think there should be a performance increase using only INTs as a datatype, because it will make your indexes smaller, it will make your database much smaller and all this means you use less pages to query from.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply