SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Help resolving duplicates with reindexing logic


Help resolving duplicates with reindexing logic

Author
Message
Adam Bean
Adam Bean
SSCertifiable
SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)

Group: General Forum Members
Points: 6646 Visits: 2222
Hey all,

Recently discovered my reindex routine is causing dupes. I was able to track it down to when/if an index has multiple allocation unit types. For example my particular index I'm playing with has both an IN_ROW_DATA and LOB_DATA.

I've been reading some random posts here and elsewhere as well as reviewing others scripts and the common theme I'm finding is that most simply limit their query to say WHERE alloc_unit_type_desc = 'IN_ROW_DATA'. I'm just curious if this is the proper thing to do and why? This post and this one do help; however I am simply not yet sure I fully understand.

For my particular index that I'm focusing on, as stated I have two different alloc_unit_type's; however they're both index_level 0 with two different index_depths. The LOB_DATA is at depth 1 and IN_ROW_DATA at depth 3. Is there logic to determine at what depth or level to focus on, or do you simply focus on IN_ROW_DATA only?

What happens if you have an index (I can't find any yet) that all the fragmentation exists outside of an IN_ROW_DATA? I read here that LOB allocations will never get fragmented; thus you need not worry.

So I know I can fix it; however I'd really like to better understand this first.

Thanks

---
SQLSlayer
Making SQL do what we want it to do.
Matt Crowley
Matt Crowley
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1125 Visits: 469
LOB_DATA is not normally re-indexed, as it it can not really be indexed (except by full text indexing). Are these maybe included columns?
GilaMonster
GilaMonster
SSC Guru
SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)SSC Guru (225K reputation)

Group: General Forum Members
Points: 225946 Visits: 46322
LOB_DATA is your varchar(max), nvarchar(max), varbinary(max), XML and a few other data types. They have special storage.

Typically you want to filter for 'in row data' when reindexing because reindexing affects only the in row data pages (reorg can compress lob data too, but that's not fragmentation in the traditional sense). If you don't, you can end up rebuilding the same index up to 3 times, which is a waste of time.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Adam Bean
Adam Bean
SSCertifiable
SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)

Group: General Forum Members
Points: 6646 Visits: 2222
Hi Matt, thanks for the prompt response!

As I'm looking at my environment at large, I'm finding multiple incidents where I've been doubling down on the work to do because of this logic.

This particular index, it's a clustered index, on one nvarchar(32) column. So this one in particular, how/why does it have a LOB_DATA?

Either way, I'm veering off topic vs. the overall inquiry of attempting to understand how to focus a reindex routine. Is there a general rule/reasoning to what to focus on and why? Based on most existing routines out there that I've reviewed, most are simply using a where clause for in_row_data.

---
SQLSlayer
Making SQL do what we want it to do.
Adam Bean
Adam Bean
SSCertifiable
SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)

Group: General Forum Members
Points: 6646 Visits: 2222
GilaMonster (1/21/2014)
LOB_DATA is your varchar(max), nvarchar(max), varbinary(max), XML and a few other data types. They have special storage.

Typically you want to filter for 'in row data' when reindexing because reindexing affects only the in row data pages (reorg can compress lob data too, but that's not fragmentation in the traditional sense). If you don't, you can end up rebuilding the same index up to 3 times, which is a waste of time.


Exactly which is what I was doing! Yeah, I realize LOB_DATA is all the lob data types (I exclude those based on sys.columns); however in this particular case as shown above ... it's not using a LOB data type, so now I'm more confused.

Regardless, you're stating though to truly only focus on IN_ROW_DATA then?

---
SQLSlayer
Making SQL do what we want it to do.
Lynn Pettis
Lynn Pettis
SSC Guru
SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)

Group: General Forum Members
Points: 95437 Visits: 38968
Can you post the links you have that relate to reindexing causing duplicate records? I have never experienced this issue.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Matt Crowley
Matt Crowley
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1125 Visits: 469
At a leaf level, a clustered index is the table. If you have a primary key defined on a column defined as nvarchar(32), and your table has a LOB column in it, you will probably see LOB storage at the leaf level of the output of sys.dm_db_index_physical_stats (I don't have an example handy, so some testing will need to be done to confirm).
Adam Bean
Adam Bean
SSCertifiable
SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)

Group: General Forum Members
Points: 6646 Visits: 2222
Lynn Pettis (1/21/2014)
Can you post the links you have that relate to reindexing causing duplicate records? I have never experienced this issue.


Hey Lynn, it's not the act of reindexing that's causing dupes, it's the logic to query the data from dm_db_index_physical_stats.

---
SQLSlayer
Making SQL do what we want it to do.
Adam Bean
Adam Bean
SSCertifiable
SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)SSCertifiable (6.6K reputation)

Group: General Forum Members
Points: 6646 Visits: 2222
Matt Crowley (1/21/2014)
At a leaf level, a clustered index is the table. If you have a primary key defined on a column defined as nvarchar(32), and your table has a LOB column in it, you will probably see LOB storage at the leaf level of the output of sys.dm_db_index_physical_stats (I don't have an example handy, so some testing will need to be done to confirm).


Ah, ok, that makes sense (duh). You are correct, this particular table does have an ntext column, so that would make sense why I have a LOB_DATA then. So again in this particular case (I have well over 50,000 more incidents globally), it would make sense to only focus on IN_ROW_DATA.

I'm simply trying to understand what if any ramifications exist by limiting myself to IN_ROW_DATA only across the board. Keeping in mind, I have to manage over 5,000 databases ... so I try to handle all scenario's properly. I just worry that by changing my ~7 year old logic that I am going to begin missing indexes out there.

---
SQLSlayer
Making SQL do what we want it to do.
Lynn Pettis
Lynn Pettis
SSC Guru
SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)SSC Guru (95K reputation)

Group: General Forum Members
Points: 95437 Visits: 38968
Adam Bean (1/21/2014)
Lynn Pettis (1/21/2014)
Can you post the links you have that relate to reindexing causing duplicate records? I have never experienced this issue.


Hey Lynn, it's not the act of reindexing that's causing dupes, it's the logic to query the data from dm_db_index_physical_stats.


That makes more sense. Your original post made it seem like your reindexing was causing duplicate records to be created.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search