Mr or Mrs. 500
Good article.
In problem resolution I often find techniques that work on small databases do not scale that well to VLDBs.
Could another solution have been to simply update (on the principal) the rows that were corrupted? Update them in a manner that doesn't really change the data, and then update again to remove the change. Alternatively, just script out the data in the row, delete it, and re-insert it.

SSCommitted
Hmm - there are some nasty potential problems with doing what you describe and I would not recommend it in production:
1) What if the clustered index being rebuilt is very large? How do you cope with the resulting potential backlog of transactions on the principal, and the probable large REDO queue on the mirror? What about the transaction log growth on the principal from having to cope with the fully-logged index rebuild? What about the knock-on effect on log shipping, transactional replication, etc.?
2) What if the I/O subsystem on the (new) mirror is damaged and the rebuild cannot be replayed? What do you suggest as the way forward if the mirror stops with a failure during replay of one of the log records from the index rebuild?
And apart from that, you don't go into details of how to make sure the problem won't happen again (i.e. root cause analysis of the original failure).
Depending on database size and network bandwidth, my recommendation may be to break the mirroring partnership, do root-cause analysis to make sure the I/O subsystem on the old principal is sound, and then re-initialize the partnership.
It's a neat idea that you're proposing, but you need to think through all the consequences and potentialities for VLDBs and for further failures before recommending to others.
Thanks
Paul Randal CEO, SQLskills.com: Check out SQLskills online training! Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

SSC-Dedicated
Paul Randal (5/9/2010) 2) what if the I/O subsystem on the (new) mirror is damaged and the rebuild cannot be replayed? What do you suggest as the way forward if the mirror stops with a failure during replay of one of the log records from the index rebuild?
Would that send the mirror suspect?
Gail Shaw Microsoft Certified Master: SQL Server 2008, MVP SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
We walk in the dark places no others will enter We stand on the bridge and no one may pass

Ten Centuries
Paul Randal (5/9/2010)
Hmm - there are some nasty potential problems with doing what you describe and I would not recommend it in production:
1) What if the clustered index being rebuilt is very large? How do you cope with the resulting potential backlog of transactions on the principal, and the probable large REDO queue on the mirror? What about the transaction log growth on the principal from having to cope with the fully-logged index rebuild? What about the knock-on effect on log shipping, transactional replication, etc.?
2) What if the I/O subsystem on the (new) mirror is damaged and the rebuild cannot be replayed? What do you suggest as the way forward if the mirror stops with a failure during replay of one of the log records from the index rebuild?
And apart from that, you don't go into details of how to make sure the problem won't happen again (i.e. root cause analysis of the original failure).
Depending on database size and network bandwidth, my recommendation may be to break the mirroring partnership, do root-cause analysis to make sure the I/O subsystem on the old principal is sound, and then re-initialize the partnership.
It's a neat idea that you're proposing, but you need to think through all the consequences and potentialities for VLDBs and for further failures before recommending to others.
Thanks
If you have limited bandwidth and a VLDB, then running a backup over the WAN and restoring it along with all the transaction logs would be very time-consuming. It would also be dangerous from a DR perspective, since you won't have a copy of the data in a DR location during the process.
There are risks with this solution, but in a lot of environments they are probably worth it compared to breaking your DR process, reinitializing it, and working around business hours for the backup/restore process.
On some of our larger tables in the 200 million row range, if we had to do this we would drop all indexes, rebuild the clustered index, and then rebuild the other indexes. It would take 10-20 minutes per index, maybe 60 for the clustered index.
https://plus.google.com/100125998302068852885/posts?hl=en http://twitter.com/alent1234 x-box live gamertag: i am null http://live.xbox.com/en-US/MyXbox/Profile

SSCommitted
@Gail: No.
@Alex: Yes, which is why I said it would depend on database size and network bandwidth. My point was that the potential risks need to be understood before doing this. And you're missing the point about dropping and rebuilding indexes with synchronous mirroring running: essentially all the new indexes would be sent across the wire to the mirror. That may be just as much data, and much slower than reinitializing from a backup.
Paul Randal, CEO, SQLskills.com

SSCrazy
I agree that I should have included a disclaimer about using this method. However, my intent was to share my experience and a unique solution to a corruption issue. The intent was not to present a cure-all solution.
The root cause analysis was done by the storage team along with our storage vendors. Since this is outside my expertise, I can only give the 30,000-foot view. It basically boiled down to a disk failure that was not handled correctly by the SAN, causing the disk controller to freeze.
Rebuilding the 190GB clustered index on a single table was a faster solution (2 hrs) in this particular case than rebuilding mirroring for a 3TB database. As Paul indicated, there was a large redo queue on the mirror side. This solution was tested in our QA system to estimate timing and was implemented during a quiet system time.
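For a rough sense of scale, the figures above can be turned into a back-of-the-envelope comparison. This is only a sketch: it assumes, purely for illustration, that reinitializing mirroring would move data at the same effective rate the index rebuild achieved, which the thread does not state. In practice a backup, WAN copy, and restore could be much faster or slower. The function names are hypothetical.

```python
# Back-of-the-envelope comparison using only the sizes and timing quoted
# in the post (190 GB rebuilt in 2 hours; 3 TB database).
# ASSUMPTION (not from the thread): reinitializing mirroring would proceed
# at the same effective rate as the index rebuild did.

GB_PER_TB = 1024


def effective_rate_gb_per_hour(size_gb: float, hours: float) -> float:
    """Average amount of data processed per hour."""
    return size_gb / hours


def hours_at_rate(size_gb: float, rate_gb_per_hour: float) -> float:
    """Time needed to process size_gb at a given average rate."""
    return size_gb / rate_gb_per_hour


rebuild_rate = effective_rate_gb_per_hour(190, 2)
reinit_hours = hours_at_rate(3 * GB_PER_TB, rebuild_rate)

print(f"Index rebuild rate: {rebuild_rate:.0f} GB/hour")
print(f"3 TB at that rate:  {reinit_hours:.1f} hours")
```

Under that assumption the full database is about 16 times the size of the rebuilt index, so the gap is roughly 2 hours versus more than a day. The real numbers depend on network bandwidth and the I/O subsystem, as discussed earlier in the thread.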