Query Optimization using Indexes

  • Hi,

    I am working with sql server 2000 and 2005,could any one suggest me how to optimize a select query which is also using aggregate function on some of its columns and performing join on two tables.

    I have tried the following alternatives,

    In Attachments

    Orders table.JPG and Orderlines.jpg are images of table design.

    table's Scripts.doc is table's Script.

    and Join Query.doc is Join Query which is needed to be optimize.

    Step -1. I have created indexes on the tables which are using the column order same as in where clause(named IX_Prodline).but in this case RID/KEY lookup is taking primary key index and consuming most of the cost of the QUERY EXECTION.

    IX_Prodline.doc in below attachments is script for Index.

    and Join_PK_ExecutionPlan.sqlplan is Execution Plan after Step-1.

    Step-2. To avoid RID/KEY lookup,I have used INCLUDE Index(named IX_P),So that all the non primary keys are also included in index.In this case Sort Operator is consuming too much time and again orders table took primark Key index.

    IX_P_INCUDEINDEX.doc in below attachments is script for Incude Index Used.

    and Join_INCUDE_ExecutionPlan.sqlplan is Execution Plan after Step-2.

    And I have also try all the above steps,In Query without join, In that case In step-2 Sort Operator is taking too much time to execute, Most of the CPU Cost is consumed By sort operator,So as a result Execution Time is not reduced even for Selecting data from single table.

    For removing sort operator i have also tried order by with the same query,but it not works.

    Execution plan for only select query on orderlines table is shown in

    WithoutJoin.sqlplan attachment

  • Bearing in mind that you are aggregating quite a few columns from a million or so rows down to 140,000, this query will always take some time to run. However, I reckon you will get a significant performance lift by preaggregating the order lines table before joining back to the orders table. Check the following script. You will almost certainly require a second aggregation in the main query. Try it as it is before playing with index hints. I'd also suggest that you optimise the order lines (sub)query alone before adding the orders table to the mix.

    SELECT

    [Customer Name] = o.custname,

    [Stage]= o.stagedesc,

    [Ship To Name]= o.shiptonm,

    --[Inv Amt] = cast(sum(o.totinvamt) as decimal(19,2)),

    [Req Ship Date] = convert(varchar, o.reqshipdt , 101),

    [Promise Date]= isnull(convert(varchar, o.promisedt, 101),''),

    [State]= o.[state],

    [City]= o.city,

    [Zipcode]= o.zipcode ,

    ol.[Order#],

    ol.[Cust#],

    ol.[OE Type],

    ol.[Trans],

    ol.[Whse],

    ol.[Ship To ],

    ol.[Qty Ordered],

    ol.[Qty Shipped],

    ol.[Line Amount],

    ol.[Ordered Amt],

    ol.[Shipped Amt],

    ol.[Avg Cost],

    ol.[STD Cost],

    ol.[Addon Cost],

    ol.[Lnd Cost],

    ol.[Profit],

    ol.[Profit%],

    ol.[Enter Date],

    ol.[Oper ID],

    ol.[SalesRep In],

    ol.[SalesRep Out],

    ol.[TakenBy],

    ol.[ProdLine]

    FROM orders o

    INNER JOIN (

    SELECT

    [Order#]= ordernumber,

    [Cust#]= custno,

    [OE Type]= DBO.GETOETYPE(oetype),

    [Trans]= transtype,

    [Whse]= whse,

    [Ship To ]= isnull(convert(varchar,shipto,101),''),

    [Qty Ordered]= sum(qtyord),

    [Qty Shipped]= sum(qtyshp),

    [Line Amount]= cast(sum(lineamt) as decimal(19,2)),

    [Ordered Amt]= cast(sum(ordamt) as decimal(19,2)),

    [Shipped Amt]= cast(sum(shpamt) as decimal(19,2)),

    [Avg Cost]= CAST(sum(avgcost) as money),

    [STD Cost]= CAST(sum(stdcost) as money),

    [Addon Cost]= CAST(sum(addcost) as money),

    [Lnd Cost]= CAST(sum(landcost) as money),

    [Profit]= cast((sum(lineamt) - sum(landcost)) as decimal(19,2)),

    [Profit%]= cast (

    (select case

    when sum(lineamt) = 0 and sum(landcost) <> 0 then -100

    when sum(landcost) = 0 then NULL

    else (1 - sum(landcost)/sum(lineamt)) * 100 end)

    as decimal(19,2)),

    [Enter Date]= convert(varchar, enterdt, 101),

    [Oper ID]= operid,

    [SalesRep In]= slsrepin,

    [SalesRep Out]= slsrepout,

    [TakenBy]= takenby,

    [ProdLine]= prodline

    FROM orderlines

    WHERE cono = '10'

    and enterdt >= '1/1/2008 12:00:00 AM'

    and enterdt < '10/25/2008'

    and whse in ('100','101','102','103','104','105','106','107','108','120','121','122','151')

    and prodline in ('####','100','101','3M','80NIPA','80NIPN','80TOE','80UNI','AGRIFI','BF','BRNIP','BRPIPE','BULBS','BWF','CALS','COEXPI','COMCOU','CONPIP','COPFIT','CORFIT',

    'CORPIP','CV/UCV','DRFIT','DRFITS','DRITUB','DRPIPE','ECOBV','ENCLOS','FERT','FIXT','FLAG','FLG','FLOCON','GALFIT','GALNIP','GALPIP','GASFIT','HONWEL','HYDSAF','IMPBRS',

    'INSFIT','JCM','KCKSC','LP','LVWIRE','MCWIR','MCWIRC','OLYM','PARTS','PTS','PULBOX','PVCPIP','RBMAXI','RBPUMP','REBAR','RECPIP','RMCENT','RTPIPE','S40FIT','S80FIT','SATASS'

    ,'SCWIRE','SEED','SENN','SLIFIX','SOCK','SOD','STAR','STDASS','STPWIR','TIMER','TRAN','UVRFIT','UVRPIP','VITPRO','WATBOX')

    and oetype = 'R'

    and transtype not in ('qu','RA','BR','CR','ST')

    GROUP BY ordernumber, oetype, prodline, custno, transtype, whse, shipto, enterdt, operid, slsrepin, slsrepout, takenby

    ) ol ON ol.ordernumber = o.ordernumber

    -- aggregate here also if required

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Your query worked great when I tried with distinct.

    But still its taking 83 Seconds to execute.

    Do you have any other idea to make it more optimize.

    Thanks in advance

  • adlakha.22 (10/14/2011)


    Your query worked great when I tried with distinct.

    But still its taking 83 Seconds to execute.

    Do you have any other idea to make it more optimize.

    Thanks in advance

    Post the new actual execution plan.

  • Ninja's_RGR'us (10/14/2011)


    adlakha.22 (10/14/2011)


    Your query worked great when I tried with distinct.

    But still its taking 83 Seconds to execute.

    Do you have any other idea to make it more optimize.

    Thanks in advance

    Post the new actual execution plan.

    There may also be some mileage left in the aggregation. Try this:

    SELECT

    [Order#] = ordernumber,

    [Cust#] = MAX(custno),

    [OE Type] = DBO.GETOETYPE(MAX(oetype)),

    [Trans] = MAX(transtype),

    [Whse] = MAX(whse),

    [Ship To ] = isnull(convert(varchar,MAX(shipto),101),''),

    [Qty Ordered]= sum(qtyord),

    [Qty Shipped]= sum(qtyshp),

    [Line Amount]= cast(sum(lineamt) as decimal(19,2)),

    [Ordered Amt]= cast(sum(ordamt) as decimal(19,2)),

    [Shipped Amt]= cast(sum(shpamt) as decimal(19,2)),

    [Avg Cost] = CAST(sum(avgcost) as money),

    [STD Cost] = CAST(sum(stdcost) as money),

    [Addon Cost]= CAST(sum(addcost) as money),

    [Lnd Cost] = CAST(sum(landcost) as money),

    [Profit] = cast((sum(lineamt) - sum(landcost)) as decimal(19,2)),

    [Profit%] = cast (

    (select case

    when sum(lineamt) = 0 and sum(landcost) <> 0 then -100

    when sum(landcost) = 0 then NULL

    else (1 - sum(landcost)/sum(lineamt)) * 100 end)

    as decimal(19,2)),

    [Enter Date]= convert(varchar, MAX(enterdt), 101),

    [Oper ID] = MAX(operid),

    [SalesRep In]= MAX(slsrepin),

    [SalesRep Out]= MAX(slsrepout),

    [TakenBy] = MAX(takenby),

    [ProdLine] = MAX(prodline)

    FROM orderlines

    WHERE cono = '10'

    and enterdt >= '1/1/2008 12:00:00 AM'

    and enterdt < '10/25/2008'

    and whse in ('100','101','102','103','104','105','106','107','108','120','121','122','151')

    and prodline in ('####','100','101','3M','80NIPA','80NIPN','80TOE','80UNI','AGRIFI','BF','BRNIP','BRPIPE','BULBS','BWF','CALS','COEXPI','COMCOU','CONPIP','COPFIT','CORFIT',

    'CORPIP','CV/UCV','DRFIT','DRFITS','DRITUB','DRPIPE','ECOBV','ENCLOS','FERT','FIXT','FLAG','FLG','FLOCON','GALFIT','GALNIP','GALPIP','GASFIT','HONWEL','HYDSAF','IMPBRS',

    'INSFIT','JCM','KCKSC','LP','LVWIRE','MCWIR','MCWIRC','OLYM','PARTS','PTS','PULBOX','PVCPIP','RBMAXI','RBPUMP','REBAR','RECPIP','RMCENT','RTPIPE','S40FIT','S80FIT','SATASS'

    ,'SCWIRE','SEED','SENN','SLIFIX','SOCK','SOD','STAR','STDASS','STPWIR','TIMER','TRAN','UVRFIT','UVRPIP','VITPRO','WATBOX')

    and oetype = 'R'

    and transtype not in ('qu','RA','BR','CR','ST')

    GROUP BY ordernumber --, oetype, prodline, custno, transtype, whse, shipto, enterdt, operid, slsrepin, slsrepout, takenby

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Here it is Execution Plan after applying Chris 1st alternative.

  • @chris-2 After applying ur second solution,it seems query is not genrating the right results.

    and also took the same time to execute and returning less numbers of rows.

  • No problem, it's worth testing to see the results. Try trimming the columns from the GROUP BY one at a time to establish which if any can be dispensed with by using MAX() around the column in the output list, whilst retaining the original rowcount.

    You need an index which covers most or all of the columns in your WHERE clause;

    [cono],[oetype],[transtype],[whse],[enterdt]

    Also, with a long list matching to prodline, you may get some benefit from setting the list up as a temp table.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • @chris-2 I did't get u on this

    "to establish which if any can be dispensed with by using MAX() around the column in the output list, whilst retaining the original rowcount."

    Could u Please Explain this.

    Thx in advance

  • adlakha.22 (10/17/2011)


    @Chris I did't get u on this

    "to establish which if any can be dispensed with by using MAX() around the column in the output list, whilst retaining the original rowcount."

    Could u Please Explain this.

    Thx in advance

    Sure. The GROUP BY aggregating the order lines is quite extensive, 12 columns:

    group by ol.ordernumber, ol.oetype, ol.prodline, ol.custno, ol.transtype, ol.whse, ol.shipto, ol.enterdt, ol.operid, ol.slsrepin, ol.slsrepout, ol.takenby

    There's a good chance that many of these columns aren't influencing the level of aggregation at all, they're in the GROUP BY list so that they can appear in the output list. Find out what columns you really need to aggregate by i.e. match the business case.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thx for reply chris

    Now Query is Grouped By, 5 columns and it took 60 sec to execute complete join Query.

    As u suggest i have created index with columns in where clause(Named IX_Prodline) but now the same Query taking

    168 sec to execute .Bcz the same reason I have discussed in my 1st post

    "Step -1. I have created indexes on the tables which are using the column order same as in where clause(named IX_Prodline).but in this case RID/KEY lookup is taking primary key index and consuming most of the cost of the QUERY EXECTION."

  • adlakha.22 (10/17/2011)


    Thx for reply chris

    Now Query is Grouped By, 5 columns and it took 60 sec to execute complete join Query.

    As u suggest i have created index with columns in where clause(Named IX_Prodline) but now the same Query taking

    168 sec to execute .Bcz the same reason I have discussed in my 1st post

    "Step -1. I have created indexes on the tables which are using the column order same as in where clause(named IX_Prodline).but in this case RID/KEY lookup is taking primary key index and consuming most of the cost of the QUERY EXECTION."

    Let's try working with the order lines subquery for a while.

    How long does it take to run?

    Can you post the actual plan?

    It's quite likely that it could be speeded up by streaming the results of the order lines subquery into a temp table then indexing the temp table on order header, before joining to the orders table.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Order lines SubQuery is Executed in 40 sec without using Index,

    and after using index it took 134 sec.

    Here it is Execution Plan For Both.

    * By mistake I have attached join Qry,Please findout the Sub Query from attached file.:(

  • Take the index hint out of the query. Index hints are very rarely used because the optimiser will almost always choose the best index for the job. In this case, the index "Ix_Prodline" isn't the best choice and forcing SQL Server to use it reduces performance.

    Create a new index as follows:

    CREATE NONCLUSTERED INDEX [IX_cono_oetype_transtype_whse_enterdt]

    ON [dbo].[orderlines] ([cono],[oetype],[transtype],[whse],[enterdt])

    Run the query again and post the actual plan.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Now Query along with index ,is executed in 56 sec.

    Please find the actual exection paln and Index used is:

    CREATE NONCLUSTERED INDEX [IX_New] ON [dbo].[orderlines]

    (

    [Cono] ASC,

    [oetype] ASC,

    [transtype] ASC,

    [Whse] ASC,

    [enterdt] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

Viewing 15 posts - 1 through 15 (of 110 total)

You must be logged in to reply to this topic. Login to reply