which is the better way to improve performance

  • Hi all,

    In my database, there is transaction table with two million records inserting every day.

    So currently I am taking backup of that table every day and deleting those records to reduce number of records for imporving performance.

    Is there anything that I can do apart from this.?

    Is table partitioning helpful in this case?

    ResultID bigint (PK)

    ScheduleUserID bigint

    TopicID bigint

    QuestionID bigint

    MaxScore numeric

    Score numeric

    Attempted bit

    HintsUsed tinyint

    AnswerCorrect bit

    QuestionSerialNumberint

    ManuallyEvaluated bit

    EvaluatedDate datetime

    MailSentAfterOfflineEvaluation bit

  • Do you have indexes to support the frequently run queries? Can the queries use those indexes?

    2 million rows isn't that much. SQL can handle that with ease. If you're having performance problems, can you be more specific about what the problem is.

    Partitioning may help, or it may not. Depends on the problems you're having and the specifics of your system.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail,

    There are couple of indexes which will boost query performance.

    But daily there will be upto two million records insertion. This will become huge in couple of weeks.

  • I've worked with unpartitioned tables with over 150 million rows in. That's equivalent to 2 and a half months of your data. It's manageable and a few million is not beyond SQL's capabilities.

    What help are you looking for? There's not enough information here to make recommendations regarding partitioning. There's not enough info to help you with performance problems.

    What's the required retention period of this data?

    Where are you encountering performance problems? Insert, querying or deleting?

    What are the common queries against the table (and which, if any, are slow)

    What does the data look like?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail,

    Ours is a online assessment application where students will take tests everyday.

    Suppose there are 100 questions in one test, then 100 records will get inserted to the mentioned table.

    Table structure

    ------------------

    ResultID-bigint (pk)

    ScheduleUserID-bigint (nonclustered index)

    TopicID-bigint

    QuestionID-bigint (nonclustered index)

    MaxScore-numeric

    Score-numeric

    Attempted-bit

    HintsUsed-tinyint

    AnswerCorrect-bit

    QuestionSerialNumber-int

    ManuallyEvaluated-bit

    EvaluatedDate-datetime

    MailSentAfterOfflineEvaluation-bit

    If 100000 students take the test, there will be 100000 X 100 = 1,00,00,000 rows will get inserted at the starting of test and 50% of rows will be updated once test is submitted. This updation is for the score that the student has taken for each question.

    After this student can view his/her report by selecting the data from "Results" table.

    This is an itterative process for 25 to 50 tests.

    Student has option to check the result of any of his/her test.

    And test will be conducted for every week. The concurrency level is 1500 users.

    1. Problem is while udpating, its taking lot of time

    2. Common queries are update and select

    3. Can i have a denormalized table for viewing previous results?

    Please let me know anything else requried here.

  • One more thing I would like to add...

    There will be one new test for one week.

    Tests will be conducted throughout the week.

  • You still haven't mentioned the retention period. How long does the data have to be kept? Is older data changed, or is it read-only?

    You say updates are slow. Have you checked the execution plans? Are they using indexes?

    Is there blocking? Are there waits and if so on what resource?

    Are the selects slow? If so, what does the exec plan look like?

    Yes, I'm sure it's possible to have a denormalised table for historical results. Maybe a job once a week to update one will work?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Retention period will be 2 months.

    Selection of records is little bit slower, but this can be overcome by having one denormalized table.

    and here is the execution plan

  • just my 2ct:

    - your rows indeed aren't that wide, so this number of rows should be manageble.

    - does your table have a clustering index (I guess it does, if you created the pk just using the defaults)

    - My guess, you only indexed the FKs for this table and off course the (identity or so) PK.

    (why isn't TopicID indexed)

    - Just a shot in the dark : you need an index on EvaluatedDate to determine "previous"

    - As Gail stated, we need "typical" (update) queries to assist on these performance troubles.

    - other factors that will influence performance :

    -- how does your application connect (isolation level !)

    -- Are you avoiding cursors ( if not, what kind of cursors are being used)

    -- are the bit-columns used for filtering ?

    -- what's of box is this hosted on (#cores / disks / ram)

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • another guess:

    Wat's the relation for ScheduleUserID to the actual user and the actual test (s)he is conducting ?

    Can a person take the same test more than once ?

    You update query is missing the "conducting test" filter if these are not 1 to 1 !

    Provide as much filer info as you can to assist your query !

    e.g.

    - topicid

    - quesionid

    - EvaluatedDate (IX !)

    UPDATE R

    SET score = T.Score

    , attempted = 1

    , AnswerCorrect = T.Answercorrect

    FROM results R with ( nolock )

    INNER JOIN (

    select ( CASE WHEN QA.Choice = URT.UserResponse

    AND QA.IsCorrect = 1 THEN QA.Score

    ELSE QA.NegativeScore

    END ) Score

    , ( CASE WHEN QA.Choice = URT.UserResponse

    AND QA.IsCorrect = 1 THEN 1

    ELSE 0

    END ) AnswerCorrect

    , Q.Questionid

    , URT.ScheduleUserID

    , URT.UserResponse

    from [UserResponse_Temp] URT with ( nolock )

    inner join Question Q with ( nolock )

    on URT.QuestionGUID = Q.QuestionGUID

    INNER JOIN QuestionAnswers_temp QA with ( nolock )

    ON QA.QuestionID = Q.QuestionId

    where QA.Choice = URT.UserResponse

    and URT.ScheduleUserID = @ScheduleUserID

    /* here and extra index may help out ! */

    and [UserResponse_Temp].EvaluatedDate >= dateadd(d,datediff(d, 0,getdate())-1 ,0) -- -1 added to avoid datechange issues

    ) T

    ON R.scheduleuserid = T.ScheduleUserID

    and R.questionid = T.questionid

    AND R.scheduleuserid = @ScheduleUserID

    /* an extra index may help out ! */

    and R.EvaluatedDate = T.EvaluatedDate

    /* an extra index may help out ! */

    where R.EvaluatedDate >= dateadd(d,datediff(d, 0,getdate()) -1 ,0) -- -1 added to avoid datechange issues

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • That exec plan looks pretty optimal. Not much that can be done other than widening indexes to get rid of the lookups.

    How long is that taking?

    Is there blocking?

    What are the IO and Time statistics for it?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi,

    This is the update statement

    ----------------------

    UPDATE results SET score = T.Score,

    attempted = 1,

    AnswerCorrect = T.Answercorrect

    FROM results R with(nolock)

    INNER JOIN

    (

    select

    (CASE WHEN QA.Choice = URT.UserResponse AND QA.IsCorrect=1 THEN QA.Score ELSE QA.NegativeScore END) Score,

    (CASE WHEN QA.Choice = URT.UserResponse AND QA.IsCorrect=1 THEN 1 ELSE 0 END) AnswerCorrect,

    Q.Questionid,URT.ScheduleUserID,URT.UserResponse from

    [UserResponse_Temp] URT with(nolock)

    inner join Question Q with(nolock) on URT.QuestionGUID=Q.QuestionGUID

    INNER JOIN QuestionAnswers_temp QA with(nolock) ON QA.QuestionID=Q.QuestionId

    where URT.ScheduleUserID = @ScheduleUserID AND QA.Choice = URT.UserResponse

    )

    T ON R.scheduleuserid = T.ScheduleUserID AND

    R.scheduleuserid=@ScheduleUserID

    and R.questionid = T.questionid

    --

    Table is having clustered index on primary key

    As you said we can have index on Topic ID, but EvaluatedDate is not currently being used.

    1. Isolation level - Read Commited

    2. There are no cursors

    3. Bit columns are used for filtering

    4. The db is deployed on amazon cloud server. ( 15 GB RAM and 4 Core processor)

    And this is the select statement

    -----------------

    SELECT DISTINCT T.SectionID,T.SectionOrder,TB.BlockName,

    (SELECT COUNT(QuestionId) FROM Results WHERE QuestionId IN(SELECT QuestionId FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration

    WHERE SectionId=TB.BlockId))AND ScheduleUserId=@ScheduleUserId)AS QuestionCount,

    (SELECT COUNT(QuestionId) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId

    FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration

    WHERE SectionId=TB.BlockId))AND Attempted=1) AS SectionNoQuestionsAttempted,

    (SELECT COUNT(QuestionId) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId

    FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration WHERE

    SectionId=TB.BlockId)) AND Attempted=0 ) AS SectionNoQuestionsNotAttempted,

    (SELECT COUNT(QuestionId) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId

    FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration

    WHERE SectionId=TB.BlockId)) AND AnswerCorrect=1 ) AS SectionCorrectAnswered,

    (SELECT SUM(MaxScore) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId

    FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration

    WHERE SectionId=TB.BlockId))) AS SectiontotalMarks,

    (SELECT SUM(Score) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId

    FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration

    WHERE SectionId=TB.BlockId))) AS MarksScored

    FROM tblassessmentconfiguration T

    INNER JOIN tblAssessmentBlock TB ON T.SectionID=TB.BlockID

    WHERE T.AssessmentID=@AssessmentId

    GROUP BY SectionId,SectionOrder,BlockName,TB.BlockId

    ORDER BY SectionOrder

  • this is the time and io statistics

    Table 'ScheduleUser'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table 'ScheduleDetails'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table '#3A4CA8FD'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentConfiguration'. Scan count 1, logical reads 1814, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentQuestion'. Scan count 18, logical reads 5418, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Results'. Scan count 18, logical reads 3447, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentBlock'. Scan count 0, logical reads 150, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 546 ms, elapsed time = 559 ms.

    Table '#3A4CA8FD'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table '#3B40CD36'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentBlock'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table '#3B40CD36'. Scan count 0, logical reads 75, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'QuestionAnswers'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Question'. Scan count 0, logical reads 163, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentQuestion'. Scan count 75, logical reads 150, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentBlock'. Scan count 0, logical reads 150, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentConfiguration'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 4 ms.

    Table '#3B40CD36'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 8 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 562 ms, elapsed time = 571 ms.

  • manohar (8/31/2009)


    this is the time and io statistics

    Table 'ScheduleUser'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table 'ScheduleDetails'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    Table '#3A4CA8FD'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentConfiguration'. Scan count 1, logical reads 1814, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentQuestion'. Scan count 18, logical reads 5418, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Results'. Scan count 18, logical reads 3447, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'tblAssessmentBlock'. Scan count 0, logical reads 150, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 546 ms, elapsed time = 559 ms.

    Well this looks like the problematic part, and this does not match the execution plan that you posted earlier. In fact, none of these execution characteristics appear to match the execution plan that you posted earlier (which has tables UserResponse_Temp, Results, QuestionAnswersTemp and Question)

    Can you post the execution plan that matches these execution characteristics?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • can this be your Select query ... a little bit rewritten to avoid the row level count/sum selects ?

    SELECT T.SectionID

    , T.SectionOrder

    , TB.BlockName

    , COUNT( distinct TheTests.QuestionId) as QuestionCount

    , sum(TheTests.SectionNoQuestionsAttempted) as SectionNoQuestionsAttempted

    , sum(TheTests.SectionNoQuestionsNotAttempted) as SectionNoQuestionsNotAttempted

    , sum(TheTests.SectionCorrectAnswered) as SectionCorrectAnswered

    , SUM(TheTests.MaxScore) as SectiontotalMarks

    , SUM(TheTests.Score) as MarksScored

    , TheTests.ScheduleuserId

    FROM tblassessmentconfiguration T

    INNER JOIN tblAssessmentBlock TB

    ON T.SectionID = TB.BlockID

    /* can be altered to INNER JOIN if you don't want scheduleuserid that didn't conduct tests as nulls */

    left join (

    SELECT CI.SectionId, R.ScheduleuserId, R.QuestionId, R.MaxScore, R.Score

    , sum( case Attempted when 1 then 1 else 0 end )SectionNoQuestionsAttempted

    , sum( case Attempted when 0 then 1 else 0 end ) as SectionNoQuestionsNotAttempted

    , sum( case AnswerCorrect when 1 then 1 else 0 end ) AS SectionCorrectAnswered

    FROM Results R

    inner join TblAssessmentQuestion TAQ

    on TQA.QuestionId = R.QuestionId

    inner join ConfigurationId CI

    on CI.ConfigurationId = TAQ.ConfigurationId

    WHERE ScheduleuserId = @ScheduleuserId

    group by CI.SectionId, R.ScheduleuserId, R.QuestionId, R.Score ) TheTests

    on TheTests.SectionId = TB.BlockId

    WHERE T.AssessmentID = @AssessmentId

    GROUP BY SectionId

    , SectionOrder

    , BlockName

    , TB.BlockId

    ORDER BY SectionOrder

    If it is... I think it is more readable and may outperform your initial one 😉

    Then again... I may be wrong :Whistling:

    edited:

    and maybe it would even be better to inclode TB.BlockId in your result just in case you get duplicate BlockNames ! You may even get some advantage with selecting min(BlockName) in stead of adding it to the group by.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply