Query optimisation

Question

Query optimisation

richardmgreen1

SSChampion

Points: 10399
More actions
May 19, 2020 at 2:29 pm

#3752893
Hi all

I've got a query that has to update a table containing several million records.
I've attached the create statements for the table and the stored procedure in question.

Because of the nature of the query, we have to run it against every record (which I think is the bit that has slowed everything down).

Can anyone see any improvements that can be made?
I've checked the execution plan and it's not recommending any indexes.

Any help anyone can give would be greatly appreciated.

We're running SQL2016 Developer and this is to be deployed to SQL2016 Enterprise when it's ready.
::edit::
I've had to rename the files with a .txt extension (from the original .sql extension) as they wouldn't upload.

TIA
Richard
- This topic was modified 5 years, 2 months ago by richardmgreen1. Reason: Issues with file uploads
Attachments:
You must be logged in to view attached files.

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

ScottPletcher SSC Guru Points: 101192 More actions · Answer 1

I think you do need at least a couple of indexes. Also, you should capture both pat and ref flags in the same pass of the table and update them both in one UPDATE statement. I don't have time to post more details on how to do that now, but I will do so later.

The temp table should be clustered on the CodedResponseKey, not a separate identity column (there is no commandment that all tables must have an identity column, let alone cluster on it!). That could allow a more efficient join. Like this:

CREATE TABLE #pat_flags

(

,CodedResponseKey BIGINT NOT NULL PRIMARY KEY

,IsFirstRecordedCodePatient INT NOT NULL

,IsFirstRecordedTemplatePatient INT NOT NULL

,IsLatestRecordedCodePatient INT NOT NULL

,IsLatestRecordedTemplatePatient INT NOT NULL

);

CREATE UNIQUE NONCLUSTERED INDEX Fact_Community_Coded_Responses__IX_CTV3CodeKey 
    ON dbo.Fact_Community_Coded_Responses ( CTV3CodeKey, CurrentPatientKey, CareContactDateKey, CareContactTimeKey, pkCodedResponseKey )
    INCLUDE ( ReferralKey )
    WITH ( DATA_COMPRESSION = ROW, FILLFACTOR = 98 /*change fillfactor if/as needed*/, SORT_IN_TEMPDB = ON )
    ON [PRIMARY] /*change filegroup if/as needed*/


CREATE UNIQUE NONCLUSTERED INDEX Fact_Community_Coded_Responses__IX_CommunityTemplateKey
    ON dbo.Fact_Community_Coded_Responses ( CommunityTemplateKey, CurrentPatientKey, CareContactDateKey, CareContactTimeKey, pkCodedResponseKey )
    INCLUDE ( ReferralKey )
    WITH ( DATA_COMPRESSION = ROW, FILLFACTOR = 98 /*change fillfactor if/as needed*/, SORT_IN_TEMPDB = ON )
    ON [PRIMARY] /*change filegroup if/as needed*/

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jonathan AC Roberts SSCoach Points: 18407 More actions · Answer 2

It would be helpful if you knew which particular statements within the stored procedure are the long running ones. The first thing to do is identify which statement(s) are taking the most time.

Also, you are updating a table from a LEFT JOIN. This makes no sense to me. Why are you using a LEFT JOIN not an INNER JOIN in the following two statements?

UPDATE fccr
   SET IsFirstRecordedCodePatient      = f.IsFirstRecordedCodePatient,
       IsFirstRecordedTemplatePatient  = f.IsFirstRecordedTemplatePatient,
       IsLatestRecordedCodePatient     = f.IsLatestRecordedCodePatient,
       IsLatestRecordedTemplatePatient = f.IsLatestRecordedTemplatePatient
  FROM #pat_flags f
  LEFT JOIN dbo.Fact_Community_Coded_Responses fccr
         ON fccr.pkCodedResponseKey = f.CodedResponseKey
 WHERE f.ID BETWEEN @ID AND @ID + 999999;


UPDATE fccr
   SET IsFirstRecordedCodeReferral      = f.IsFirstRecordedCodeReferral,
       IsFirstRecordedTemplateReferral  = f.IsFirstRecordedTemplateReferral,
       IsLatestRecordedCodeReferral     = f.IsLatestRecordedCodeReferral,
       IsLatestRecordedTemplateReferral = f.IsLatestRecordedTemplateReferral
  FROM #ref_flags f
  LEFT JOIN dbo.Fact_Community_Coded_Responses fccr
         ON fccr.pkCodedResponseKey = f.CodedResponseKey
 WHERE f.ID BETWEEN @ID AND @ID + 999999;

I would create indexes on the two temporary tables in the stored procedure immediately after you have populated them:

CREATE INDEX IX_#pat_flags_1 ON #pat_flags
(
    ID,
    CodedResponseKey
) 
INCLUDE 
(
    IsFirstRecordedCodePatient,
    IsFirstRecordedTemplatePatient,
    IsLatestRecordedCodePatient,
    IsLatestRecordedTemplatePatient
);

CREATE INDEX IX_#ref_flags_1 ON #ref_flags
(
    ID,
    CodedResponseKey
) 
INCLUDE 
(
    IsFirstRecordedCodeReferral,
    IsFirstRecordedTemplateReferral,
    IsLatestRecordedCodeReferral,
    IsLatestRecordedTemplateReferral
);

I'm not 100% about the order of ID, CodedResponseKey so it would be worth experimenting with it in the opposite order, i.e. CodedResponseKey, ID in the leading part of the indexes.

There is no need to have DISTINCT in statements like: WHERE colX IN (SELECT DISTINCT myCol...

as SQL server will work out you don't need duplicated values.

richardmgreen1 SSChampion Points: 10399 More actions · Answer 3

Thanks both

It looks like it's the ROW_NUMBER() section that's taking the time to work out.

Is there any sort of alternative to that bit?

I need to get the records in order and then put a marker against the first and last ones.

richardmgreen1 SSChampion Points: 10399 More actions · Answer 4

I've tried recoding the query with *some* success (it's now down to 42 minutes on a single run from an hour).

Can someone have anther look and check I've got the indexes on the right things please?

I'm also open to other suggestions for speeding it up.

The new code is now just a SELECT rather than an update (but I can soon change that).

TIA

Attachments:

You must be logged in to view attached files.

ScottPletcher SSC Guru Points: 101192 More actions · Answer 5

richardmgreen1 wrote:

Thanks both
It looks like it's the ROW_NUMBER() section that's taking the time to work out.
Is there any sort of alternative to that bit?
I need to get the records in order and then put a marker against the first and last ones.

I don't know if you have alternatives, I don't know your data that well. But I suggested you needed the indexes to address the ROW_NUMBER issue. The indexes will (should) avoid the sort for two of the ROW_NUMBERs and reduce the overhead of the other two sorts. Sorting is a very expensive operation.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Query optimisation

Attachments:

Attachments:

Cookies on SQLServerCentral