Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««1234»»»

clustered index Expand / Collapse
Author
Message
Posted Tuesday, April 15, 2008 9:45 AM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Tuesday, August 13, 2013 11:43 AM
Points: 242, Visits: 433
This site (link) states that DB2 only guarantees that a clustered index is initially clustered, clustering is not maintained. While this saves time on inserts (no page splits), I'm glad SQL Server allows index fragmentation instead.

Additionally, the MDC is primarily useful for low cardinality indices, which are not considered good candidates for a normal clustered index.


I agree that there are situations where this option would be useful. Do you know if there's a request ticket on Connect for this feature?



Puto me cogitare, ergo puto me esse.
I think that I think, therefore I think that I am.
Post #485130
Posted Tuesday, April 15, 2008 3:14 PM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, January 30, 2014 10:08 PM
Points: 1,038, Visits: 444
The suggestion from another person about using indexed views for the purpose is probably the best. It's a separate copy of the data (with extras thrown in if needed) that's maintained automatically for you. You should ensure that the appropriate SET options (such as ANSI defaults, arithabort, etc) are enabled on your server.

You could also create some non-clustered covering indices on your table although the size of an index in SQL Server is limited, but in 2005 you can "incldue" columns outside of the index key.



Post #485317
Posted Wednesday, April 16, 2008 3:07 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, July 14, 2014 10:33 PM
Points: 85, Visits: 260
It would help to know what he thought multiple clustered indexes would help with ...

I never would have guessed that the primary key did not effect the row order in data pages in a non clustered table.
Up until this point I thought a index defrag on a non-clustered table reorged the data pages by primary key order. Stuck in the monotonically increasing primary key paradigm I guess. When would you want a primary key rather than a uniqueness constraint on a non clustered table ?

I am curious as to the difference between DB2 as SQL servers index implementation.

As the existing index pages fill and we get page splits are we saying DB2 and SQL server manage this differently ? I would have expected a new page to be allocated to the index linked between the two entries that previously bounded the new insert and the higher level index pages be updated recursively. Obviously I am missing something.
Post #485484
Posted Wednesday, April 16, 2008 3:33 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 3:37 PM
Points: 40,210, Visits: 36,619
Primary key never affects the row order in data pages. It's the clustered index that does that. Now, by default the primary key is a clustered index, but that is not required.

Why would you not want a primary key on a table with no clustered index?

What SQL does when a page fills and needs to be split is to allocate a new page somewhere in the data file. Where is not important. It then adjusts the next and previous pointers of the original page so that the new one is linked in in the correct logical order. The physical order of the pages may well not correspond to the ogical order. That's fragmentation

Let's say we have a table with 5 pages, with a clustered index (hence row order matters). (Under 8 pages, so mixed extents

1st page - pageID 250, previous page pointer = null, next page pointer = 251
2nd page - pageID 251, previous page pointer = 250, next page pointer = 252
3rd page - pageID 252, previous page pointer = 251, next page pointer = 264
4th page - pageID 264, previous page pointer = 252, next page pointer = 275
5th page - pageID 275, previous page pointer = 264, next page pointer = null.

Now, page 3 is full and needs, for whatever reason, to be split. SQL allocates a new page. Say everything below page 300 is used, so the new page is 301. Now the table looks like this.

1st page - pageID 250, previous page pointer = null, next page pointer = 251
2nd page - pageID 251, previous page pointer = 250, next page pointer = 252
3rd page - pageID 252, previous page pointer = 251, next page pointer = 301
4th page - pageID 301 (new), previous page pointer = 252, next page pointer = 264
5th page - pageID 264, previous page pointer = 301, next page pointer = 275
6th page - pageID 275, previous page pointer = 264, next page pointer = null.

Only the pages before and after the new page have to change. It would take waaay too long to adjust all the pages in the table (imagine a table with a few hundred thousand pages)

When you have a heap (no cluster) the order of rows has no meaning, and new rows are just added to the last page of the table.

Does that help?



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #485499
Posted Wednesday, April 16, 2008 3:37 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Wednesday, October 22, 2014 4:18 AM
Points: 2,886, Visits: 3,256
The approach to clustering is different between SQL Server and DB2.

As most SQL people know, SQL Server will always physically insert a new row in the correct place in a cluster index. This is because the bottom-level leaf pages of a SQL Server cluster index are also the data pages.

DB2 has 2 types of cluster index, both of which work differently to cluster indexes in SQL Server.

The original DB2 cluster index (written in the 1970s) most closely compares to a SQL non-clustered index on a table with no cluster index. In other words, the bottom level of the cluster index just has RID pointers into the table. When a row is inserted DB2 will place it in the first available slot, with the index updated as normal. DB2 keeps a statistic called clustered% to show how close the actual physical order of rows is to the clustered index definition. Most DB2 DBAs would rebuild a cluster index when the cluster% drops below 95%. When a DB2 cluster index is rebuilt, the table rows are sorted into the sequence of the cluster index. Immediately after a DB2 cluster index is rebuilt, both the index and the data are in the same physical sequence. There are advantages and disadvantages of the DB2 approach compared to the SQL Server approach.

DB2 Multiple Dimension Clustering (MDC) indexes work in a different way. See my previous post in this thread for an introduction to the MDC concept, which came into DB2 about 2001. When a new row is inserted in a table with a MDC index, it is stored in the correct extent for the key entry. In this way, a MDC index is self-maintaining, and a MDC index rebuild is only necessary when it is desired to reclaim space.


Original author: SQL Server FineBuild 1-click install and best practice configuration of SQL Server 2014, 2012, 2008 R2, 2008 and 2005. 18 October 2014: now over 31,000 downloads.
Disclaimer: All information provided is a personal opinion that may not match reality.
Concept: "Pizza Apartheid" - the discrimination that separates those who earn enough in one day to buy a pizza if they want one, from those who can not.
Post #485500
Posted Wednesday, April 16, 2008 4:38 AM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, February 27, 2014 3:44 AM
Points: 179, Visits: 331
The point in question is that how many copies of data you want to keep...if it is one....you can store it in only one order physically.....now by definition a clustered index is associated with the physical order.....the index illustrated by EdVassie is merely a clustered index with two dimensions...
as Last Name, then First name in the case of an address book....wherein for his case it is type of object, then objectid or something.....these are basics friends.....

well....logically or functionally you can have n number of indexes at place for your database, but physically it depends upon how many copies of data you want to keep....

For more complex ways of data storage and retrieval, check out cubes in SQL Server Analysis Services.


It is good to argue, but it is useless if you don't accept.


Chandrachurh Ghosh
DBA – MS SQL Server
Ericsson India Global Services Limited
Quality is not an act, it is a habit.
Post #485534
Posted Wednesday, April 16, 2008 4:41 AM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, February 27, 2014 3:44 AM
Points: 179, Visits: 331
we cannot have any non-clustered index if we don't have a clustered index on a table....also a primary key is by default clustered indexed and by definition unique.....

well...from your discussion I got that you need to retrieve data as fast as a clustered index but in different ways.....MS is clever enough to let you features to handle situations like this.......if all the columns required in a query is present in a non-clustered index the SQL server fetches the values from the NC index pool itself. It does not require to go to the main table. Also, we can include columns in NC indexes which also has the same effect. NC indexes are more versatile than indexed views.

Hope this helps.....


Chandrachurh Ghosh
DBA – MS SQL Server
Ericsson India Global Services Limited
Quality is not an act, it is a habit.
Post #485536
Posted Wednesday, April 16, 2008 5:37 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 3:37 PM
Points: 40,210, Visits: 36,619
chandrachurhghosh (4/16/2008)
we cannot have any non-clustered index if we don't have a clustered index on a table


That's not the case. There's no requirement in SQL server for a table to have a clustered index. Nonclustered indexes can be created on a heap just as they can on a cluster.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #485561
Posted Wednesday, April 16, 2008 8:20 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, July 14, 2014 10:33 PM
Points: 85, Visits: 260
GilaMonster (4/16/2008)
Primary key never affects the row order in data pages.

*Snip*

Does that help?


A lot thanks I assumed that was what primary key implied.

I understand the index page id doesn't need to be sequential ( on index defrag it will end up being though ? ) and it is better to link in a new page than shuffle all entries down their corresponding pages. I also understand that if the index pages are mostly cached not physically retrivied whether it is page id 301 or 62 doesn't have much effect on the performance retrieval using that index.

I was confused about srienstr's post and the different B-Tree implmentation in DB2 and SQL server

This site (link) states that DB2 only guarantees that a clustered index is initially clustered, clustering is not maintained. While this saves time on inserts (no page splits), I'm glad SQL Server allows index fragmentation instead.


Why wouldn't the index fragment in exactly the same manner over time for DB2 Type-1 and 2 indexes or is he talking specifically about MDC ?

Probably worth drawing a distinction between the clustering of the index only pages inside a SQL server clustered or non-clustered index and the clustering of the data pages as being two different things.
Post #486069
Posted Wednesday, April 16, 2008 10:57 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Friday, October 18, 2013 4:06 PM
Points: 471, Visits: 589
in 2005, you don't need an indexed view to get what amounts to a second clustered index on a table - you can get the same effect with included columns (new feature in 2005).

create a non-clustered index that also includes every non-key column in the table. This will be a copy of the table that's physically stored in key order.

not sure if anyone mentioned this about included columns already - didn't see it anyway.


---------------------------------------
elsasoft.org
Post #486109
« Prev Topic | Next Topic »

Add to briefcase ««1234»»»

Permissions Expand / Collapse