Understanding execution plan query

Question

Understanding execution plan query

as_1234

SSCrazy

Points: 2875
More actions
October 19, 2020 at 5:03 pm

#3798924
All,
Firstly to put this into context - I'm trying to improve my knowledge on reading execution plans, there isn't a particular business issue that I need to solve.
Secondly I think I've provided all the information required. Apologises if I've missed anything?
DDL:
```
/****** Object:  Table [dbo].[Posts]    Script Date: 19/10/2020 14:25:24 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[Posts](
[Id] [int] IDENTITY(1,1) NOT NULL,
[AcceptedAnswerId] [int] NULL,
[AnswerCount] [int] NULL,
[Body] [nvarchar](max) NOT NULL,
[ClosedDate] [datetime] NULL,
[CommentCount] [int] NULL,
[CommunityOwnedDate] [datetime] NULL,
[CreationDate] [datetime] NOT NULL,
[FavoriteCount] [int] NULL,
[LastActivityDate] [datetime] NOT NULL,
[LastEditDate] [datetime] NULL,
[LastEditorDisplayName] [nvarchar](40) NULL,
[LastEditorUserId] [int] NULL,
[OwnerUserId] [int] NULL,
[ParentId] [int] NULL,
[PostTypeId] [int] NOT NULL,
[Score] [int] NOT NULL,
[Tags] [nvarchar](150) NULL,
[Title] [nvarchar](250) NULL,
[ViewCount] [int] NOT NULL,
[CreationYear] [int] NULL,
[CreationMonth] [int] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO


/****** Object:  Index [idxPostCDID]    Script Date: 19/10/2020 14:26:03 ******/
CREATE UNIQUE CLUSTERED INDEX [idxPostCDID] ON [dbo].[Posts]
(
[CreationDate] DESC,
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
USE [StackOverflow]
GO


/****** Object:  Index [idxPostID]    Script Date: 19/10/2020 14:26:15 ******/
CREATE UNIQUE NONCLUSTERED INDEX [idxPostID] ON [dbo].[Posts]
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
```
My reading of the attached execution plan is as follows:
1. Node id 4 scans the idxPostCDID clustered index and outputs the OwnerUserID
2. Node id 9 scans the same index with the predicate 'ownerID is null'.
3. The outputs from these nodes get refined by other operators then and then joined in node ID 1.
4. Node ID 4 and 9 use 47% of the query each.
I would appreciate any advice/help on the following:
The index is far too large to fit into memory. Will the system ensure that when a page is loaded into memory it is used by both node 4 and 9 before being released from memory or could it load a page, remove it from the buffer cache and then have to load it again? Could it go a step further by reading a page from the buffer cache and using it for both nodes?
The reason for my question is as follows:
Initially I was surprised to see two operators reading through the same index. I think the optimiser is trying to take advantage of parallelism? However the I/O system is slow (it's just a training setup) so reading the pages into memory twice would be far slower than any advantage gained by parallelism. Statistics are up to date and I noticed it shows 'optimisation level: full' for the plan.
If it does co-ordinate the pages then is the plan working out the percentage for nodes 4 and 9 as follows?:
Reading the index will take 94% of the plan, it's used by two operators so show it as 47% each.
I appreciate that a post execution plan (or to use a, perhaps, newer phrasing - execution plan with run time metrics) is better for performance tuning, I'm more focusing on understanding plans rather than performance tuning in this instance.
Thanks
Editor 2 - Adding a line here
Editor: Testing the edits
- This topic was modified 4 years, 9 months ago by as_1234. Reason: Small clarification and renaming the attachment to .txt so it wil upload
- This topic was modified 4 years, 9 months ago by Steve Jones - SSC Editor. Reason: testing
- This topic was modified 4 years, 9 months ago by Steve Jones - SSC Editor. Reason: Including a line
Attachments:
You must be logged in to view attached files.

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

ScottPletcher SSC Guru Points: 101192 More actions · Answer 1

When using NOT IN, you should make sure a NULL value does not appear in the results. For example, like below. Try that first and see if it cleans up the query plan.

select id from users

where id not in (

select posts.OwnerUserId from posts where posts.OwnerUserId is not null )

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

as_1234 SSCrazy Points: 2875 More actions · Answer 2

Thank you for your help. That did improve the plan.

I would also be interested to know if my reading of the original plan was correct.

ScottPletcher SSC Guru Points: 101192 More actions · Answer 3

Yes, I believe you reading of the original plan was correct.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Understanding execution plan query

Attachments:

Cookies on SQLServerCentral