Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Help resolving duplicates with reindexing logic Expand / Collapse
Author
Message
Posted Tuesday, January 21, 2014 8:52 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, June 16, 2014 6:53 AM
Points: 1,145, Visits: 1,932
Hey all,

Recently discovered my reindex routine is causing dupes. I was able to track it down to when/if an index has multiple allocation unit types. For example my particular index I'm playing with has both an IN_ROW_DATA and LOB_DATA.

I've been reading some random posts here and elsewhere as well as reviewing others scripts and the common theme I'm finding is that most simply limit their query to say WHERE alloc_unit_type_desc = 'IN_ROW_DATA'. I'm just curious if this is the proper thing to do and why? This post and this one do help; however I am simply not yet sure I fully understand.

For my particular index that I'm focusing on, as stated I have two different alloc_unit_type's; however they're both index_level 0 with two different index_depths. The LOB_DATA is at depth 1 and IN_ROW_DATA at depth 3. Is there logic to determine at what depth or level to focus on, or do you simply focus on IN_ROW_DATA only?

What happens if you have an index (I can't find any yet) that all the fragmentation exists outside of an IN_ROW_DATA? I read here that LOB allocations will never get fragmented; thus you need not worry.

So I know I can fix it; however I'd really like to better understand this first.

Thanks


---
SQLSlayer
Making SQL do what we want it to do.

Post #1533167
Posted Tuesday, January 21, 2014 8:55 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Today @ 1:29 PM
Points: 236, Visits: 429
LOB_DATA is not normally re-indexed, as it it can not really be indexed (except by full text indexing). Are these maybe included columns?
Post #1533172
Posted Tuesday, January 21, 2014 8:57 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 2:44 PM
Points: 42,443, Visits: 35,498
LOB_DATA is your varchar(max), nvarchar(max), varbinary(max), XML and a few other data types. They have special storage.

Typically you want to filter for 'in row data' when reindexing because reindexing affects only the in row data pages (reorg can compress lob data too, but that's not fragmentation in the traditional sense). If you don't, you can end up rebuilding the same index up to 3 times, which is a waste of time.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1533177
Posted Tuesday, January 21, 2014 9:01 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, June 16, 2014 6:53 AM
Points: 1,145, Visits: 1,932
Hi Matt, thanks for the prompt response!

As I'm looking at my environment at large, I'm finding multiple incidents where I've been doubling down on the work to do because of this logic.

This particular index, it's a clustered index, on one nvarchar(32) column. So this one in particular, how/why does it have a LOB_DATA?

Either way, I'm veering off topic vs. the overall inquiry of attempting to understand how to focus a reindex routine. Is there a general rule/reasoning to what to focus on and why? Based on most existing routines out there that I've reviewed, most are simply using a where clause for in_row_data.


---
SQLSlayer
Making SQL do what we want it to do.

Post #1533187
Posted Tuesday, January 21, 2014 9:02 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, June 16, 2014 6:53 AM
Points: 1,145, Visits: 1,932
GilaMonster (1/21/2014)
LOB_DATA is your varchar(max), nvarchar(max), varbinary(max), XML and a few other data types. They have special storage.

Typically you want to filter for 'in row data' when reindexing because reindexing affects only the in row data pages (reorg can compress lob data too, but that's not fragmentation in the traditional sense). If you don't, you can end up rebuilding the same index up to 3 times, which is a waste of time.


Exactly which is what I was doing! Yeah, I realize LOB_DATA is all the lob data types (I exclude those based on sys.columns); however in this particular case as shown above ... it's not using a LOB data type, so now I'm more confused.

Regardless, you're stating though to truly only focus on IN_ROW_DATA then?


---
SQLSlayer
Making SQL do what we want it to do.

Post #1533189
Posted Tuesday, January 21, 2014 9:06 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 1:59 PM
Points: 23,002, Visits: 31,490
Can you post the links you have that relate to reindexing causing duplicate records? I have never experienced this issue.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1533193
Posted Tuesday, January 21, 2014 9:09 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Today @ 1:29 PM
Points: 236, Visits: 429
At a leaf level, a clustered index is the table. If you have a primary key defined on a column defined as nvarchar(32), and your table has a LOB column in it, you will probably see LOB storage at the leaf level of the output of sys.dm_db_index_physical_stats (I don't have an example handy, so some testing will need to be done to confirm).
Post #1533198
Posted Tuesday, January 21, 2014 9:09 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, June 16, 2014 6:53 AM
Points: 1,145, Visits: 1,932
Lynn Pettis (1/21/2014)
Can you post the links you have that relate to reindexing causing duplicate records? I have never experienced this issue.


Hey Lynn, it's not the act of reindexing that's causing dupes, it's the logic to query the data from dm_db_index_physical_stats.


---
SQLSlayer
Making SQL do what we want it to do.

Post #1533199
Posted Tuesday, January 21, 2014 9:12 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, June 16, 2014 6:53 AM
Points: 1,145, Visits: 1,932
Matt Crowley (1/21/2014)
At a leaf level, a clustered index is the table. If you have a primary key defined on a column defined as nvarchar(32), and your table has a LOB column in it, you will probably see LOB storage at the leaf level of the output of sys.dm_db_index_physical_stats (I don't have an example handy, so some testing will need to be done to confirm).


Ah, ok, that makes sense (duh). You are correct, this particular table does have an ntext column, so that would make sense why I have a LOB_DATA then. So again in this particular case (I have well over 50,000 more incidents globally), it would make sense to only focus on IN_ROW_DATA.

I'm simply trying to understand what if any ramifications exist by limiting myself to IN_ROW_DATA only across the board. Keeping in mind, I have to manage over 5,000 databases ... so I try to handle all scenario's properly. I just worry that by changing my ~7 year old logic that I am going to begin missing indexes out there.


---
SQLSlayer
Making SQL do what we want it to do.

Post #1533204
Posted Tuesday, January 21, 2014 9:15 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 1:59 PM
Points: 23,002, Visits: 31,490
Adam Bean (1/21/2014)
Lynn Pettis (1/21/2014)
Can you post the links you have that relate to reindexing causing duplicate records? I have never experienced this issue.


Hey Lynn, it's not the act of reindexing that's causing dupes, it's the logic to query the data from dm_db_index_physical_stats.


That makes more sense. Your original post made it seem like your reindexing was causing duplicate records to be created.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1533208
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse