Is SAN admin correct? Windows 500 GB limitation on cluster failover?


NJDave
Hello

I was recently pulled into an ongoing project that is running into problems. The old project manager let me know that there were problems failing over more than 500 GB and the new project manager has it listed as a "Windows problem". His boss is calling it a "SQL Server problem". I heard from the old project manager that there are SAN/Hitachi replicator issues. So the storage team is looking to blame SQL somehow.

Is it true that a failover cluster has problems at the SQL or Windows level when doing a failover cluster for a database > 500 GB? How about 1TB or 10TB?

Does anyone have a good article on this? It's hard to find documentation for something that is possibly not true.

Any help is appreciated.

Thanks
Dave
GilaMonster
I've done a failover cluster with an instance that had one 600 GB database and one 1 TB database. No storage problems (well, other than a badly designed, very slow SAN).

The size of the database isn't a factor in a cluster failover, the storage is shared, the ownership of the storage switches from one server to the other, there's nothing copied.

Could you describe the problems more?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


NJDave
I think it is mostly a communication issue - I'm attending a meeting this afternoon and heard different reports between the old and new project managers.

They don't have the cluster built yet. They are still planning, but I think the SAN people had problems with their Hitachi replicator and are trying to shift the focus to SQL.

Or, it could have been misinterpreted by the new pm.

The statement you wrote below is what I will be armed with. I needed verification from people with more experience, thanks.

"The size of the database isn't a factor in a cluster failover, the storage is shared, the ownership of the storage switches from one server to the other, there's nothing copied."
GilaMonster
Hitachi Replicator? Are you doing a geo-dispersed cluster with non-shared storage or something? Or SAN replication to a failover SAN/DR site?

RP_DBA
It's been a while but I think we had a similar issue. Might look into this:

from dbforums.com: SQL Service starts before your SAN is available.

We had this problem, too. According to our SAN vendor, the fix is as follows:
Open iSCSI Initiator
Click the 'Bound Volumes/Devices' tab
Click 'Bind All'
Click OK.

This will force the iSCSI Initiator to mount all the volumes before it relinquishes control to other processes, such as SQL Server.


_____________________________________________________________________
- Nate

@nate_hughes
NJDave
The plan is for SAN replication to a failover SAN/DR site.

But they don't have the SAN or the power at the failover site anyway.

For now, they would set up a local cluster and the database could grow up to 10 TB.

The databases are up and running now on a single physical server: SQL 2008 R2 Enterprise.

Thanks
Dave
Kendra Little
I agree with Gail: you can cluster very large databases (VLDBs) very successfully, and Windows Failover Clustering can be great at providing high availability for these databases.

It's also possible to have a lot of issues with Windows Failover Clustering if you don't follow best practices, take shortcuts, or have issues in your network environment or storage subsystem. That's true no matter how large your database is. There are settings and configuration issues in SQL Server which can make failover slow at times, too-- it does get pretty complex.

So, in short, I would suggest:
* Make sure the high availability (same datacenter) and disaster recovery (remote datacenter) requirements are defined appropriately
* Define a build and migration plan in stages, with good rollback and a testing plan
* Implement everything incrementally (which it sounds like you're on track to do, since you're talking about getting HA set up in the local datacenter before moving on)

For the SAN replication itself, much depends on the version of the hardware, the type of replication (sync or async), the communication path between the datacenters, etc. It can be great, or it can cause a lot of problems depending on implementation.
Sailorking
SQL has zero problems failing over VLDBs and I can attest to that, since I have 2-node clusters that fail over 1,380 databases ranging from 100GB to 1.5TB without a problem. Just like everyone else has said, this isn't a SQL thing since files are not copied; the volumes are shared.

As for the SAN I can see this being a problem with their replication service over WAN but again it wouldn't be a file size problem but more of a LUN size problem since the replication on the SAN is done at the bit level.

We use EqualLogic and Dell switches and we replicate 9TB without a hitch, so I'm guessing they're talking about geo-replication over WAN or something.

my 3 cents

-King
NJDave
During the meetings it went from a SQL problem to a Windows problem to a 10 TB LUN problem. It seems political to me: it was easier to tell the audience that there was a Windows or SQL issue than to say he made a 10 TB LUN.

The SAN admin is pushing to have around 20 databases spread across five 2 TB LUNs rather than one 10 TB LUN. It seems things weren't planned properly between the project manager, vendor, and the SAN admin. This project started without me over 2 years ago; I'm just coming into it now and learning quickly.
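
To picture what that layout could look like, here is a rough Python sketch that spreads databases over fixed-size LUNs, largest database first. The database names and sizes are invented for illustration (nothing in this thread gives the real inventory), and the greedy rule is just one simple way to balance free space, not what the SAN admin actually plans to do.

```python
def place_databases(db_sizes_gb, lun_count, lun_capacity_gb):
    """Greedy placement: take databases largest first and put each one
    on the LUN that currently has the most free space."""
    free = [lun_capacity_gb] * lun_count
    layout = [[] for _ in range(lun_count)]
    ordered = sorted(db_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        target = max(range(lun_count), key=lambda i: free[i])
        if size > free[target]:
            raise ValueError(f"{name} ({size} GB) does not fit on any LUN")
        layout[target].append(name)
        free[target] -= size
    return layout, free


# Hypothetical inventory: 20 databases between 100 GB and 480 GB.
example_dbs = {f"db{i:02d}": 100 + 20 * i for i in range(20)}

# Five LUNs of 2 TB each (2048 GB), as in the proposed layout.
layout, free = place_databases(example_dbs, lun_count=5, lun_capacity_gb=2048)

for lun_number, (names, free_gb) in enumerate(zip(layout, free), start=1):
    print(f"LUN {lun_number}: {len(names)} databases, {free_gb} GB free")
```

Note that a database that could grow past 2 TB on its own would not fit this scheme without splitting its files across LUNs, which is worth raising in the planning meetings.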

Thanks for your help
Dave
sql-lover
NJDave (3/26/2013)
Hello

I was recently pulled into an ongoing project that is running into problems. The old project manager let me know that there were problems failing over more than 500 GB and the new project manager has it listed as a "Windows problem". His boss is calling it a "SQL Server problem". I heard from the old project manager that there are SAN/Hitachi replicator issues. So the storage team is looking to blame SQL somehow.

Is it true that a failover cluster has problems at the SQL or Windows level when doing a failover cluster for a database > 500 GB? How about 1TB or 10TB?

Does anyone have a good article on this? It's hard to find documentation for something that is possibly not true.

Any help is appreciated.

Thanks
Dave


That sounds familiar to me, lol ... I mean, pointing fingers that way. Do you work for the famous company that makes printers and PCs? DO NOT REPLY! lol ...

MS-SQL 2008 and above (I do not remember about SQL 2005) do not have such a limitation. You can put 32k databases on an instance if you want, but the problem is how much RAM they need in order to run properly.

Also, I faced an issue where the SAN was able to allocate up to 500GB max only. I do not remember the specifics, but it was a SAN hardware limitation. So managing the databases was a little bit tricky as we were forced to use that Data LUN only.

Now, I am also familiar with Veritas Cluster (not SQL failover). Because of the SAN-to-SAN replication across regions (one was in Texas, the other one in GA I think), we limited the amount of data that we put there. But that's because of the huge amount of data that has to be moved in case of a crash. However, we were able to fail over using Veritas in a matter of minutes, which is actually amazingly good for mission-critical databases.

Bottom line, the most recent SQL versions do not have such a limitation, but the SAN and replication may affect that.
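
To give a feel for why the amount of replicated data matters over a WAN, here is a back-of-the-envelope Python sketch. The link speed and the efficiency factor are assumptions picked for illustration, not measurements from any environment mentioned in this thread, and real replication products usually ship only changed blocks after the initial sync.

```python
def transfer_hours(data_tb, link_mbps, efficiency=0.7):
    """Estimate the hours needed to push data_tb (decimal terabytes) over a
    link of link_mbps megabits per second. The efficiency argument is an
    assumed fraction of the nominal bandwidth that is actually usable."""
    bits = data_tb * 1e12 * 8
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return bits / usable_bits_per_second / 3600


# Hypothetical example: a full copy of 10 TB over a 1 Gbps link.
hours = transfer_hours(data_tb=10, link_mbps=1000)
print(f"Full copy of 10 TB: about {hours:.1f} hours")
```

A local cluster failover involves none of this, because ownership of the shared storage moves and nothing is copied.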