SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


ORMs and Indexes with Included Columns


ORMs and Indexes with Included Columns

Author
Message
LightningFingers
LightningFingers
SSC Rookie
SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)

Group: General Forum Members
Points: 46 Visits: 65
I'm using LINQ to SQL, but I think this question may apply to any ORM tool.

L2S entities are mapped from each database table one-to-one. When you retrieve an entity (i.e. Customer) you get the data from each column for the requested row and depending on your DataLoadOptions, you may also retrieve related entities (i.e. Addresses, Orders, etc.).

So let's say I have a Customer table like this:


TABLE Customer (
[Id] [int] IDENTITY(1,1) NOT NULL,
[StatusId] [int] NOT NULL,
[CustomerNumber] [nvarchar](50) NOT NULL,
[FirstName] [nvarchar](30) NOT NULL,
[LastName] [nvarchar](40) NOT NULL,
[EmailAddress] [nvarchar](100) NULL,
)



Now lets say I have a non-clustered index on EmailAddress with included columns: FirstName and LastName.

When L2S gets a Customer (or Customers) by EmailAddress it looks at this index, but since the entity returned includes the data from each column (not just those in the included columns) how does it retrieve the StatusId and CustomerNumber columns. Should I just put all of the columns in the included columns? Notice that the above is just an example and most of my entities are much more complex than this, so have all of the indexes include all columns seems like a poor decision.

Thanks for any help.

Edit: Maybe this would be better stated as pure SQL. I'll be honest when I say that I don't understand indexes as thoroughly as I'd like to. But after re-reading my post I think it could be stated simply as, "What's the difference between the following queries (with respect to performance and the above index)"


-- Index seems tuned for this query
SELECT FirstName, LastName
FROM Customer
WHERE EmailAddress = ?




-- L2S will actually do something like this
SELECT Id, StatusId, CustomerNumber, FirstName, LastName, EmailAddress
FROM Customer
WHERE EmailAddress = ?


GilaMonster
GilaMonster
SSC Guru
SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)

Group: General Forum Members
Points: 87727 Visits: 45272
mikesigsworth (12/1/2010)
Should I just put all of the columns in the included columns?


No, absolutely not. By doing that you will be duplicating the table. Think of the impact on database size, backup size, backup duration, reindex time, etc.

There might be some specialised cases where this is a good idea, but they're edge cases (and probably data warehouse type systems)

Edit: Maybe this would be better stated as pure SQL. I'll be honest when I say that I don't understand indexes as thoroughly as I'd like to. But after re-reading my post I think it could be stated simply as, "What's the difference between the following queries (with respect to performance and the above index)"


-- Index seems tuned for this query
SELECT FirstName, LastName
FROM Customer
WHERE EmailAddress = ?




-- L2S will actually do something like this
SELECT Id, StatusId, CustomerNumber, FirstName, LastName, EmailAddress
FROM Customer
WHERE EmailAddress = ?



For the first the index is covering, for the second it's not. Depending how many rows will be returned, SQL may or may not use the index. If email address is unique (or nearly so), the index will still be used.

As for indexes...

http://www.sqlservercentral.com/articles/Indexing/68439/ (3 parts)
http://www.sqlservercentral.com/articles/Indexing/68439/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


LightningFingers
LightningFingers
SSC Rookie
SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)

Group: General Forum Members
Points: 46 Visits: 65
Thanks so much for the link. That was a great introduction to indexes, something I sorely needed!

I do have a couple questions however. The 3rd part on non-clustered indexes and covering says:

...the index must be either covering or return sufficiently small number of rows that the required lookups to the clustered index/heap (to retrieve the remainder of the columns) is not considered too expensive by the optimiser. If the required lookups are considered too expensive then the index will not be used... If a nonclustered index is to be used to evaluate a query it needs to be covering or sufficiently selective that the costs of the lookups aren’t deemed to be too high.


The problem with our datalayer is that LINQ to SQL queries typically look like this:

public Entity GetById(int id) {
...
Entity e =
(from x in context.SomeTable
where x.Id == id
select x).FirstOrDefault();

return e;
}



The main problem with this is the "select x" line is essentially converted to SELECT *. Our app doesn't use any sort of LINQ projections, for reasons I don't want to get into, other than it was a bad choice made too long ago. The above query would just use the clustered index though. I'm more focused on creating non-clustered indexes right now to fix performance issues. The problem is, even in "find" queries we don't use any form of projection. For example,


public List<Entity> FindByName(string name) {
...
List<Entity> listOfE =
(from x in context.SomeTable
where x.Name == name
select x).ToList();

return listOfE;
}



So from what I understand, if I want a covering index for this query it either has to include all of the columns in SomeTable to avoid a lookup, or I need to keep the # of included columns minimal so that the cost of the lookup isn't considered too expensive and the index isn't used at all?

I think what's really confusing me is the included columns. In my situation, should I bother with them at all? And on a query like the "find" one above, the lookups are done per row found?

Thanks again for all your help.
GilaMonster
GilaMonster
SSC Guru
SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)

Group: General Forum Members
Points: 87727 Visits: 45272
So from what I understand, if I want a covering index for this query it either has to include all of the columns in SomeTable to avoid a lookup, or I need to keep the # of included columns minimal so that the cost of the lookup isn't considered too expensive and the index isn't used at all?


Including some columns is as good as including none. The number of columns is irrelevant. Fro SQL, looking up 1 column is like looking up 20. It's the number of rows that it must look up that's important

You're said nothing about the selectivity of those queries. Quoting the portion of the article that you quoted (emphasis mine)
...the index must be either covering or return sufficiently small number of rows that the required lookups to the clustered index/heap (to retrieve the remainder of the columns) is not considered too expensive by the optimiser.


So... How many rows are those returning? 1? 0.001% of the table? 50% of the table?

And on a query like the "find" one above, the lookups are done per row found?


Lookups are always done per row.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


LightningFingers
LightningFingers
SSC Rookie
SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)

Group: General Forum Members
Points: 46 Visits: 65
Well the find query that I am currently optimizing is paged so it returns at most 20 rows on a 2.5million row table. So I guess by using paging I can reduce the result set to the required size in order to use the index.

I didn't really get what "selectivity" meant. I do now.

You are a king among men. Thanks a ton!
GilaMonster
GilaMonster
SSC Guru
SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)

Group: General Forum Members
Points: 87727 Visits: 45272
mikesigsworth (12/2/2010)
Well the find query that I am currently optimizing is paged so it returns at most 20 rows on a 2.5million row table. So I guess by using paging I can reduce the result set to the required size in order to use the index.


That's well within the range where the lookups will be done.

The other point is, have you tested and does SQL use the index? If you haven't tested, do so. You cannot put indexes onto tables based on what you think will happen. You have to test and make sure that what you expect to happen really will.

You are a king among men.


Err, I'll take that as figurative, not literal ;-)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


LightningFingers
LightningFingers
SSC Rookie
SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)SSC Rookie (46 reputation)

Group: General Forum Members
Points: 46 Visits: 65
I agree. You should never assume that any optimization has actually optimized anything until you test it. Testing L2S is a pain in the *** because you basically need to use profiler to pull out the generated SQL and then run it manually to see the execution plan. Which I will do, it's just not fun.

A true "King Among Men" is defined as being absolutely awesome in everything done.

So no, I'm not about to swear fealty. But you rock!
charlesmartinez276
charlesmartinez276
Grasshopper
Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)

Group: General Forum Members
Points: 18 Visits: 3
I would like to use ORM with Bonfire, but the module cannot find my DM class. Any help appreciated. w00t

Daytona Beach
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86574 Visits: 41098
LightningFingers (12/2/2010)
I agree. You should never assume that any optimization has actually optimized anything until you test it. Testing L2S is a pain in the *** because you basically need to use profiler to pull out the generated SQL and then run it manually to see the execution plan. Which I will do, it's just not fun.

A true "King Among Men" is defined as being absolutely awesome in everything done.

So no, I'm not about to swear fealty. But you rock!


Heh... despite her avatar, GilaMonster is Gail Shaw. She's not only a lady, but a very smart lady that has tested out on and earned the MCM in SQL Server.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86574 Visits: 41098
LightningFingers (12/1/2010)
I'm using LINQ to SQL, but I think this question may apply to any ORM tool.


Then, you have a much more insideous problem to watch for. If you do nothing to prevent it, LINQ to SQL has been known to pass all character based parameters as NVARCHAR. If you play those against a table that has VARCHAR columns, then the entire column will be implicitly converted to NVARCHAR which also makes it impossible to do an index SEEK.

It's also tough for auditing. It's also been known not to pass certain parameters if the value it's passing during an update is the same as the current value especially during the same session. That means you can get a NULL in your INSERTED "Modified By" columns in an audit trigger.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search