Query Performance Improvement

  • Hi All,
    I'm very new to PI and sorry if this is a silly question, but was wondering if I can reduce the logical read/ improve the query performance slightly.

    The query is ; Select col1,Col2 from tbl which has 373,059 rows in it and  now takes 3 seconds to run.. This is the base query from where I built other stuff and thought it would be a good start. 
    Note: A non clustered index is created on columns col1,Col2, covering Col3,Col4,Col5
           
    Scan count 1, logical reads 5296, physical reads 0, read-ahead reads 0, lob logical             reads 0, lob physical reads 0, lob read-ahead reads 0.

    Can anyone please advise.

    Kind regards
    thenewBee

  • thenewbee - Wednesday, October 3, 2018 10:49 AM

    Hi All,
    I'm very new to PI and sorry if this is a silly question, but was wondering if I can reduce the logical read/ improve the query performance slightly.

    The query is ; Select col1,Col2 from tbl which has 373,059 rows in it and  now takes 3 seconds to run.. This is the base query from where I built other stuff and thought it would be a good start. 
    Note: A non clustered index is created on columns col1,Col2, covering Col3,Col4,Col5
           
    Scan count 1, logical reads 5296, physical reads 0, read-ahead reads 0, lob logical             reads 0, lob physical reads 0, lob read-ahead reads 0.

    Can anyone please advise.

    Kind regards
    thenewBee

    Not sure what you're trying to do, you haven't given much information, but
    CREATE CLUSTERED INDEX IX_TBL_1 ON tbl1(col1,col2)
    might give a bit better performance.

  • This query is too simple to improve the performance on.  You have none of the major factors that contribute to poor performance, so there is no way to reduce or remove those factors.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Hi Jonathan,

    Not sure I’m greedy but was trying to reduce the execution time from 3 secs to 1 sec. to reduce the logical reads from 5296

    Regards

    theNewbee

  • thenewbee - Wednesday, October 3, 2018 10:49 AM

    Hi All,
    I'm very new to PI and sorry if this is a silly question, but was wondering if I can reduce the logical read/ improve the query performance slightly.

    The query is ; Select col1,Col2 from tbl which has 373,059 rows in it and  now takes 3 seconds to run.. This is the base query from where I built other stuff and thought it would be a good start. 
    Note: A non clustered index is created on columns col1,Col2, covering Col3,Col4,Col5
           
    Scan count 1, logical reads 5296, physical reads 0, read-ahead reads 0, lob logical             reads 0, lob physical reads 0, lob read-ahead reads 0.

    Can anyone please advise.

    Kind regards
    thenewBee

    It's not possible to give great advice here without seeing actual queries and execution plans (note the link in my subject line: Best practices for getting help on SQLServerCentral.

    What I can say from looking at what you posted - you are getting a nonclustered index scan, what you want is an index seek... unless you really need all those rows.  Does your query join to other tables? Is there a WHERE clause? Remember that the purpose of indexes is to reduce the need to perform sorts and make it possible to seek for records instead of scanning entire data structures. The index you posted won't/isn't do/doing either. If you are not doing any filtering or joining and, instead are just doing a SELECT col1, col2 FROM myTable - then an index won't do much for you. Let's say, however, you were filtering; for example - say you had a column named isActive and your query looked like this:

    SELECT t.col1, t.col2
    FROM table1 AS t
    WHERE t.isActive = 1;

    An index to support this query would be:
    CREATE NONCLUSTERED INDEX nc_table1_isActive ON table1(isActive)
    INCLUDE (col1, col2);

    Then the optimizer can do a seek, therefore processing less rows (thus reducing I/O and improving performance). The INCLUDE(col1,col2) makes it possible for the index to handle the query without also accessing the clustered index. The nonclustered index you posted would only be helpful if you are sorting or grouping on col1 and col2 (in that order). For example: your index would help with a query that looked like this:

    SELECT t.col1, t.col2, SUM(t.col3), MAX(t.col4), MIN(t.col5)
    FROM  table1 AS t
    GROUP BY t.col1, t.col2;

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Jonathan AC Roberts - Wednesday, October 3, 2018 11:06 AM

    Not sure what you're trying to do, you haven't given much information, but
    CREATE CLUSTERED INDEX IX_TBL_1 ON tbl1(col1,col2)
    might give a bit better performance.

    How might this improve performance?

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Hi Drew,

    Can you advise what changes have to be made to improve performance

    Kind Regards

    thenewbee

  • thenewbee - Wednesday, October 3, 2018 11:55 AM

    Hi Drew,Can you advise what changes have to be made to improve performance Kind Regards thenewbee

    What part of "none" don't you understand?

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • The only thing I see that could help would be a separate index on just:
    (col1, col2)
    If possible, use compression on the index, at least row but page if you have the CPU available and the compression savings estimate shows that it would save a meaningful amount of space.

    If those are the only two columns you're using in the query, that would be the shortest possible index to have to read, and thus require the fewest logical I/Os.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Alan.B - Wednesday, October 3, 2018 11:52 AM

    thenewbee - Wednesday, October 3, 2018 10:49 AM

    Hi All,
    I'm very new to PI and sorry if this is a silly question, but was wondering if I can reduce the logical read/ improve the query performance slightly.

    The query is ; Select col1,Col2 from tbl which has 373,059 rows in it and  now takes 3 seconds to run.. This is the base query from where I built other stuff and thought it would be a good start. 
    Note: A non clustered index is created on columns col1,Col2, covering Col3,Col4,Col5
           
    Scan count 1, logical reads 5296, physical reads 0, read-ahead reads 0, lob logical             reads 0, lob physical reads 0, lob read-ahead reads 0.

    Can anyone please advise.

    Kind regards
    thenewBee

    It's not possible to give great advice here without seeing actual queries and execution plans (note the link in my subject line: Best practices for getting help on SQLServerCentral.

    What I can say from looking at what you posted - you are getting a nonclustered index scan, what you want is an index seek... unless you really need all those rows.  Does your query join to other tables? Is there a WHERE clause? Remember that the purpose of indexes is to reduce the need to perform sorts and make it possible to seek for records instead of scanning entire data structures. The index you posted won't/isn't do/doing either. If you are not doing any filtering or joining and, instead are just doing a SELECT col1, col2 FROM myTable - then an index won't do much for you. Let's say, however, you were filtering; for example - say you had a column named isActive and your query looked like this:

    SELECT t.col1, t.col2
    FROM table1 AS t
    WHERE t.isActive = 1;

    An index to support this query would be:
    CREATE NONCLUSTERED INDEX nc_table1_isActive ON table1(isActive)
    INCLUDE (col1, col2);

    Then the optimizer can do a seek, therefore processing less rows (thus reducing I/O and improving performance). The INCLUDE(col1,col2) makes it possible for the index to handle the query without also accessing the clustered index. The nonclustered index you posted would only be helpful if you are sorting or grouping on col1 and col2 (in that order). For example: your index would help with a query that looked like this:

    SELECT t.col1, t.col2, SUM(t.col3), MAX(t.col4), MIN(t.col5)
    FROM  table1 AS t
    GROUP BY t.col1, t.col2;

    Wow, that was good for a start... 
    What I learned from your post is ...  Non clustered index has to be created on columns used for JOIN and WHERE and sort columns can be mentioned in the INCLUDE clause

    Below I have pasted the full query and I would get all of your suggestions for an improvement. The data was huge and hence I created a subset of it.

    -- DROP if TABLE EXISTS

    IF OBJECT_ID('tbl_bridge', 'U') IS NOT NULL
    DROP TABLE dbo.tbl_bridge

    --CREATE TABLE [tbl_bridge]
    CREATE TABLE [dbo].[tbl_bridge](
        [SCHEME_CODE] [int] NULL,
        [PLAN_ID] [int] NULL,
        [STAT_KEY] [nvarchar](255) NOT NULL
    CONSTRAINT [PK_TBL_BRIDGE] PRIMARY KEY CLUSTERED
    (
        [STAT_KEY] ASC
    )
    )

    -- Create Indexes
    IF EXISTS(SELECT * FROM sys.indexes WHERE object_id = object_id('tbl_bridge') AND NAME ='IDX_JB_PLAN_ID')
    DROP INDEX [IDX_JB_PLAN_ID] ON [tbl_bridge]

    CREATE NONCLUSTERED INDEX [IDX_JB_PLAN_ID] ON [dbo].[tbl_bridge]
    (
        [PLAN_ID] ASC
    )
    INCLUDE (     [SCHEME_CODE],
        [STAT_KEY])
    GO

    IF EXISTS(SELECT * FROM sys.indexes WHERE object_id = object_id('tbl_bridge') AND NAME ='IDX_JB_SCHEME_CODE')
    DROP INDEX [IDX_JB_PLAN_ID] ON [tbl_bridge]

    CREATE NONCLUSTERED INDEX [IDX_JB_SCHEME_CODE] ON [dbo].[tbl_bridge]
    (
        [SCHEME_CODE] ASC
    )
    INCLUDE (     [PLAN_ID],
        [STAT_KEY])
    GO

    IF EXISTS(SELECT * FROM sys.indexes WHERE object_id = object_id('tbl_bridge') AND NAME ='IDX_JB_STAT_KEY')
    DROP INDEX [IDX_JB_PLAN_ID] ON [tbl_bridge]

    CREATE NONCLUSTERED INDEX [IDX_JB_STAT_KEY] ON [dbo].[tbl_bridge]
    (
        [STAT_KEY] ASC
    )
    INCLUDE (     [SCHEME_CODE],
        [PLAN_ID])
    GO

    -- INSERT DATA

    INSERT INTO tbl_bridge (SCHEME_CODE,PLAN_ID,STAT_KEY)
    SELECT '4339','4260973','NS4260973' UNION ALL
    SELECT '626','4917629','NS4917629' UNION ALL
    SELECT '1037','3902909','NS3902909' UNION ALL
    SELECT '6003','5414982','NS5414982' UNION ALL
    SELECT '54764','3903795','NS3903795' UNION ALL
    SELECT '7565','4662354','NS4662354' UNION ALL
    SELECT '55976','6205794','NS6205794' UNION ALL
    SELECT '12292','4412016','NS4412016' UNION ALL
    SELECT '56362','4444741','NS4444741' UNION ALL
    SELECT '1037','5286303','NS5286303'

    -------------------------------------------

    -- DROP TABLE tbl_Posting_Details IF EXISTS
    IF OBJECT_ID('tbl_Posting_Details', 'U') IS NOT NULL
    DROP TABLE tbl_Posting_Details

    --CREATE TABLE [tbl_Posting_Details]
    CREATE TABLE [dbo].[tbl_Posting_Details](
        [POSTING_ID] [nvarchar](255) NOT NULL,
        [StatementKey] [nvarchar](255) NULL,
        [MoneyType] [nvarchar](255) NULL,
        [Date] [datetime] NULL,
        [Amount] [money] NULL,
    CONSTRAINT [PK_tbl_Posting_Details] PRIMARY KEY CLUSTERED
    (
        [POSTING_ID] ASC
    )
    )

    GO
    IF EXISTS(SELECT * FROM sys.indexes WHERE object_id = object_id('tbl_Posting_Details') AND NAME ='[IX_stat]')
    DROP INDEX [IX_stat] ON tbl_Posting_Details

    GO

    CREATE NONCLUSTERED INDEX [IX_stat] ON [dbo].[tbl_Posting_Details]
    (
        [Date] ASC
    )
    INCLUDE (     [StatementKey],
        [POSTING_ID],
        [Amount])

    --
    insert into tbl_Posting_Details
    SELECT '179939624.0','NS4260973','Resubmission','May 3 2007 12:00AM','2803.05' UNION ALL
    SELECT '179939627.0','NS4260973','Resubmission','May 3 2007 12:00AM','-2803.05' UNION ALL
    SELECT '179992192.0','NS4917629','Resubmission','May 4 2007 12:00AM','723.16' UNION ALL
    SELECT '179992195.0','NS4917629','Resubmission','May 4 2007 12:00AM','-723.16' UNION ALL
    SELECT '180150448.0','NS3902909','Resubmission','May 8 2007 12:00AM','709.70' UNION ALL
    SELECT '180150451.0','NS3902909','Resubmission','May 8 2007 12:00AM','-709.70' UNION ALL
    SELECT '185037079.0','NS5414982','Resubmission','Jun 6 2007 12:00AM','467.50' UNION ALL
    SELECT '185037082.0','NS5414982','Resubmission','Jun 6 2007 12:00AM','-467.50' UNION ALL
    SELECT '185037111.0','NS3903795','Resubmission','Jun 6 2007 12:00AM','274.36' UNION ALL
    SELECT '185037114.0','NS3903795','Resubmission','Jun 6 2007 12:00AM','-274.36' UNION ALL
    SELECT '185037519.0','NS4662354','Resubmission','Jun 6 2007 12:00AM','782.24' UNION ALL
    SELECT '185037522.0','NS4662354','Resubmission','Jun 6 2007 12:00AM','-782.24' UNION ALL
    SELECT '185049674.0','NS6205794','Resubmission','Jun 7 2007 12:00AM','1737.24' UNION ALL
    SELECT '185049677.0','NS6205794','Resubmission','Jun 7 2007 12:00AM','-1737.24' UNION ALL
    SELECT '185049923.0','NS4412016','Resubmission','Jun 7 2007 12:00AM','582.75' UNION ALL
    SELECT '185049926.0','NS4412016','Resubmission','Jun 7 2007 12:00AM','-582.75' UNION ALL
    SELECT '186161272.0','NS4444741','Resubmission','Jun 19 2007 12:00AM','605.13' UNION ALL
    SELECT '186161275.0','NS4444741','Resubmission','Jun 19 2007 12:00AM','-605.13' UNION ALL
    SELECT '190886999.0','NS5414982','Resubmission','Jul 4 2007 12:00AM','683.58' UNION ALL
    SELECT '190887002.0','NS5414982','Resubmission','Jul 4 2007 12:00AM','-683.58' UNION ALL
    SELECT '190887700.0','NS4662354','Resubmission','Jul 4 2007 12:00AM','921.50' UNION ALL
    SELECT '190887703.0','NS4662354','Resubmission','Jul 4 2007 12:00AM','-921.50' UNION ALL
    SELECT '192220525.0','NS4444741','Resubmission','Jul 18 2007 12:00AM','605.13' UNION ALL
    SELECT '192220528.0','NS4444741','Resubmission','Jul 18 2007 12:00AM','-605.13' UNION ALL
    SELECT '197718772.0','NS5286303','Resubmission','Aug 7 2007 12:00AM','709.71' UNION ALL
    SELECT '197718775.0','NS5286303','Resubmission','Aug 7 2007 12:00AM','-709.71' UNION ALL
    SELECT '197719710.0','NS6205794','Resubmission','Aug 7 2007 12:00AM','446.86' UNION ALL
    SELECT '197719713.0','NS6205794','Resubmission','Aug 7 2007 12:00AM','-446.86' UNION ALL
    SELECT '203294789.0','NS4260973','Resubmission','Sep 5 2007 12:00AM','988.91' UNION ALL
    SELECT '203294792.0','NS4260973','Resubmission','Sep 5 2007 12:00AM','-988.91' UNION ALL
    SELECT '212389910.0','NS6205794','Resubmission','Nov 6 2007 12:00AM','396.58' UNION ALL
    SELECT '212389913.0','NS6205794','Resubmission','Nov 6 2007 12:00AM','-396.58' UNION ALL
    SELECT '217697969.0','NS6205794','Resubmission','Dec 11 2007 12:00AM','396.58' UNION ALL
    SELECT '217697972.0','NS6205794','Resubmission','Dec 11 2007 12:00AM','-396.58' UNION ALL
    SELECT '224615066.0','NS6205794','Resubmission','Jan 7 2008 12:00AM','793.16' UNION ALL
    SELECT '224615069.0','NS6205794','Resubmission','Jan 7 2008 12:00AM','-793.16' UNION ALL
    SELECT '229966567.0','NS6205794','Resubmission','Feb 6 2008 12:00AM','793.16' UNION ALL
    SELECT '229966570.0','NS6205794','Resubmission','Feb 6 2008 12:00AM','-793.16'

    -- And the QUERY

    SELECT PK_StatmentKey,Posting_Dt.PostingID_Trimmed,Posting_Dt.Amount,Bridge.SCHEME_CODE
    FROM TBL_bridge Bridge
    INNER JOIN
    (        -- Find the Unique amount for each posting. There can be different postings eg 123.1,123.2,123.4 -
            SELECT CASE WHEN PATINDEX('%.%',tbl_Posting_Details.[POSTING_ID]) >1
                        Then SUBSTRING(tbl_Posting_Details.[POSTING_ID],0,PATINDEX('%.%',tbl_Posting_Details.[POSTING_ID]))
                        ELSE tbl_Posting_Details.[POSTING_ID]
            END AS PostingID_Trimmed
            ,StatementKey AS PK_StatmentKey
            ,[Date]
            ,MIN(Amount) AS Amount
            FROM tbl_Posting_Details
            GROUP BY
            CASE WHEN PATINDEX('%.%',tbl_Posting_Details.[POSTING_ID]) >1 Then SUBSTRING(tbl_Posting_Details.[POSTING_ID],0,PATINDEX('%.%',tbl_Posting_Details.[POSTING_ID])) ELSE tbl_Posting_Details.[POSTING_ID] END
            ,StatementKey,[Date]
            ) Posting_Dt
    ON Posting_Dt.PK_StatmentKey = bridge.STAT_KEY

    And All I want to achieve is avoid the Index scans and better ways to do

    Thanks in Advance
    theNewBee

  • thenewbee - Friday, October 5, 2018 10:28 AM

    SELECT PK_StatmentKey,Posting_Dt.PostingID_Trimmed,Posting_Dt.Amount,Bridge.SCHEME_CODE
    FROM TBL_bridge Bridge
    INNER JOIN
    (        -- Find the Unique amount for each posting. There can be different postings eg 123.1,123.2,123.4 -
            SELECT CASE WHEN PATINDEX('%.%',tbl_Posting_Details.[POSTING_ID]) >1
                        Then SUBSTRING(tbl_Posting_Details.[POSTING_ID],0,PATINDEX('%.%',tbl_Posting_Details.[POSTING_ID]))
                        ELSE tbl_Posting_Details.[POSTING_ID]
            END AS PostingID_Trimmed
            ,StatementKey AS PK_StatmentKey
            ,[Date]
            ,MIN(Amount) AS Amount
            FROM tbl_Posting_Details
            GROUP BY
            CASE WHEN PATINDEX('%.%',tbl_Posting_Details.[POSTING_ID]) >1 Then SUBSTRING(tbl_Posting_Details.[POSTING_ID],0,PATINDEX('%.%',tbl_Posting_Details.[POSTING_ID])) ELSE tbl_Posting_Details.[POSTING_ID] END
            ,StatementKey,[Date]
            ) Posting_Dt
    ON Posting_Dt.PK_StatmentKey = bridge.STAT_KEY

    And All I want to achieve is avoid the Index scans and better ways to do

    Thanks in Advance
    theNewBee

    This query has no where clause - either on the bridge table or the posting_dt derived table.  There is no way you can get anything other than a scan since the query requires reading every row in the table/index from both tables.

    The only non-clustered index I see that would help would be an index on tbl_Posting_Details(StatementKey) which should be created to support the join anyways.  None of the other non-clustered indexes will be utilized for this query.

    Now - if you were filtering either of these by some value then you could create non-clustered indexes on those columns and possibly create some covering indexes for this specific query.  It might even help to change the clustered index - depending on whether or not those columns are almost always utilized for the queries...but for this query there is really nothing else that I can see other than indexes to support the joins.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply