SQLServerCentral is supported by Red Gate Software Ltd.
scalable hardware solution for 10TB now to 100TB in 3 years
Posted Monday, December 20, 2010 2:03 PM
Grasshopper


Group: General Forum Members
Last Login: Thursday, October 20, 2011 4:32 AM
Points: 10, Visits: 132
Hello experts,

We are looking for a server solution for at least the next three years. We collected 6TB in the last three years. We are using two direct-attached disk boxes with 25 disks each, connected to a dual-socket quad-core server, currently in RAID 5 due to space requirements. Storage and performance needs are growing, with an estimated target capacity of 50TB to 100TB in the next three years. Any query should take the same time or less than now, using the complete history of collected data.
We are skeptical about a SAN solution using a NetApp filer, since we don't believe a SAN can deliver the same performance as direct-attached storage. We also fear that a NetApp SAN system could be 5 to 10 times more expensive. What is your opinion and experience with SAN performance? How well do SANs scale with respect to performance and capacity? SAN backup is extremely expensive due to strange internal political contracts (IMHO). Expert opinions wanted!
Post #1037393
Posted Wednesday, December 22, 2010 8:55 AM
Ten Centuries


Group: General Forum Members
Last Login: Thursday, April 03, 2014 12:46 PM
Points: 1,413, Visits: 4,531
Where I work we're on HP, and you should be able to do this with ProLiant servers. Get a DL380 with 2-3 P812 RAID controllers. For the storage, get an MSA60 with 2TB SATA drives. It holds 12 drives, so that's 24TB raw storage, 22TB after RAID 5, and I'm not sure how much after formatting. This is based on one RAID 5 array per MSA.

Each P812 has 2 external connectors, and you can cascade something like 8 MSAs per connector.

Your big headache is going to be backup. For something like this you will need LTO-5 tapes or a D2D solution with a lot of storage. I would make sure that your DB server and backup servers are 10 gigabit and 6G SATA/SAS. A ProLiant DL380 G5 will be too slow due to its older I/O channel.

Check out the ProLiant DL380 G7. The G8 will be out April to June of 2011.
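A quick sanity check of the capacity figures above (one RAID 5 array per 12-bay MSA60 with 2TB drives, as in the post; the rest is simple arithmetic):

```python
# Rough capacity estimate for cascaded MSA60 enclosures
# (assumption: one RAID 5 array per 12-bay enclosure, 2 TB drives).
def raid5_usable_tb(drives: int, drive_tb: float) -> float:
    """RAID 5 stores (n - 1) drives' worth of data; one drive's worth is parity."""
    return (drives - 1) * drive_tb

per_enclosure = raid5_usable_tb(drives=12, drive_tb=2.0)  # 22.0 TB usable
per_connector = 8 * per_enclosure                          # 8 cascaded MSAs per connector
print(per_enclosure, per_connector)                        # 22.0 176.0
```

Real usable space will be somewhat lower after formatting, hot spares, and filesystem overhead.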


https://plus.google.com/100125998302068852885/posts?hl=en
http://twitter.com/alent1234
x-box live gamertag: i am null
http://live.xbox.com/en-US/MyXbox/Profile
Post #1038321
Posted Wednesday, December 22, 2010 11:47 AM
Grasshopper


Group: General Forum Members
Last Login: Thursday, October 20, 2011 4:32 AM
Points: 10, Visits: 132
Hello,

Thank you for your ideas.

A few additional questions: for better IOPS we are thinking about using more drives of 1TB (or less) each. That gives us 11TB per RAID 5 array, and 8 x 11 = 88TB for a single server machine, which is near our capacity target. Backup will be done with a second, identical server and storage system.

Now the next question: given this machine with 88TB, how can we further scale capacity and performance when this system is no longer enough? That will be about three years from now.
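For what it's worth, the 88TB works out like this (assuming 8 enclosures of 12 x 1TB drives, one RAID 5 array per enclosure):

```python
# Capacity and spindle count for the proposed configuration
# (assumptions: 8 enclosures, 12 x 1 TB drives each, one RAID 5 per enclosure).
enclosures, drives_per, drive_tb = 8, 12, 1.0
usable_tb = enclosures * (drives_per - 1) * drive_tb  # (n - 1) data drives per array
spindles = enclosures * drives_per
print(usable_tb, spindles)  # 88.0 96
```

More, smaller drives means more spindles behind the same capacity, which is where the IOPS gain comes from.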

Post #1038425
Posted Wednesday, December 22, 2010 2:28 PM
Ten Centuries


Group: General Forum Members
Last Login: Thursday, April 03, 2014 12:46 PM
Points: 1,413, Visits: 4,531
They will probably have higher-capacity hard drives by then that you can swap into your existing RAID 5 array.

Just make sure you're running Windows 2008 R2.


Post #1038510
Posted Thursday, December 23, 2010 3:21 AM
Grasshopper


Group: General Forum Members
Last Login: Thursday, October 20, 2011 4:32 AM
Points: 10, Visits: 132
OK, the disks will have more capacity, but the speed probably won't be enough: all queries will take twice the time for twice the data. Everybody says the number of drives (spindles) limits the IOPS. We need a solution that can grow in capacity AND PERFORMANCE. We would also need twice the CPU performance, so a new server would be necessary. It may be necessary to store 500TB in less than three years. How could we do that? One server won't have enough performance.
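To put numbers on the spindle argument (the ~100 MB/s sequential throughput per disk is an illustrative assumption, not a figure from the thread): at a fixed per-disk speed, a full scan only stays at constant time if the spindle count grows with the data.

```python
# Full-scan time when aggregate throughput scales with spindle count
# (assumption: ~100 MB/s sequential per disk, purely illustrative).
def scan_hours(data_tb: float, spindles: int, mb_per_s_per_disk: float = 100.0) -> float:
    throughput_mb_s = spindles * mb_per_s_per_disk
    return (data_tb * 1_000_000) / throughput_mb_s / 3600

print(round(scan_hours(10, 50), 2))    # 0.56 h today
print(round(scan_hours(100, 50), 2))   # 5.56 h: 10x the data, same disks -> 10x the time
print(round(scan_hours(100, 500), 2))  # 0.56 h: spindles scaled with the data
```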
Post #1038676
Posted Wednesday, December 29, 2010 8:07 PM
Old Hand


Group: General Forum Members
Last Login: Monday, April 14, 2014 7:46 PM
Points: 381, Visits: 535
1. If you want manageable, extensible storage, then this is what SANs are for. The one to avoid like the plague is NAS, as it forces all your I/O through the network cards.
2. If you want the data to remain readable and writable at the same speed, then you need to start considering a different architecture, because RAID 5 slows down for writes as you add more spindles.

Suggestion:
Since you are posting on a SQL Server forum, I'm going to assume there is a SQL Server database on top of this storage.

Once you have separated log files onto different physical spindles, you may want to consider a mix of RAID 5 and RAID 1+0. If you can define a primary key that separates the data by age, you can use a sliding window partition scheme (see the SQL Server Central article on the subject for an explanation): place historical data on filegroups that sit on RAID 5, and current data on a filegroup on the RAID 1+0 array. When the time comes to move a filegroup from current to historical, you can shut down SQL Server and relocate the file during a scheduled maintenance window.
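As a rough, language-neutral sketch of the sliding-window idea (the monthly granularity, window size, and tier names here are made up for illustration; in SQL Server this is done with partition functions/schemes and filegroup placement, not application code):

```python
# Toy model of age-based placement: the newest partitions live on fast
# RAID 1+0 storage, older ones are demoted to cheaper RAID 5 storage.
from collections import deque

CURRENT_WINDOW = 3  # number of most-recent monthly partitions kept on RAID 1+0
partitions = deque()  # oldest ... newest, each entry: (month, tier)

def add_month(month: str) -> None:
    """Add the new month to the fast tier; demote anything past the window."""
    partitions.append((month, "RAID10"))
    while sum(1 for _, tier in partitions if tier == "RAID10") > CURRENT_WINDOW:
        for i, (m, tier) in enumerate(partitions):
            if tier == "RAID10":
                partitions[i] = (m, "RAID5")  # demote the oldest fast partition
                break

for m in ["2010-09", "2010-10", "2010-11", "2010-12", "2011-01"]:
    add_month(m)
print(list(partitions))
# [('2010-09', 'RAID5'), ('2010-10', 'RAID5'), ('2010-11', 'RAID10'),
#  ('2010-12', 'RAID10'), ('2011-01', 'RAID10')]
```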

HP/Compaq also make quite a nice solution called the StorageWorks 4000, which is highly extensible. I'm sure IBM has something similar, and EMC/CLARiiON will have a solution too.
Post #1040683
Posted Wednesday, December 29, 2010 9:00 PM


SSC-Dedicated


Group: Administrators
Last Login: Yesterday @ 4:31 PM
Points: 32,780, Visits: 14,941
I tend to agree with Toby. RAID 5 has a write penalty, and to get large storage you need to move to a SAN, both for IOPS performance and for capacity.
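The write penalty can be put in numbers: a small random write on RAID 5 costs four backend I/Os (read data, read parity, write data, write parity) versus two on RAID 1+0. A rough effective-IOPS estimate (the ~150 IOPS per disk and the 30% write mix are illustrative assumptions):

```python
# Effective random IOPS for an array, given the RAID write penalty
# (assumption: ~150 IOPS per spinning disk, purely illustrative).
def effective_iops(disks: int, disk_iops: float, write_frac: float, penalty: int) -> float:
    raw = disks * disk_iops
    # each logical write turns into `penalty` backend I/Os
    return raw / ((1 - write_frac) + write_frac * penalty)

print(effective_iops(24, 150.0, write_frac=0.3, penalty=4))  # RAID 5:   ~1895
print(effective_iops(24, 150.0, write_frac=0.3, penalty=2))  # RAID 1+0: ~2769
```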

Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1040690
Posted Friday, December 31, 2010 10:50 PM
Grasshopper


Group: General Forum Members
Last Login: Thursday, October 20, 2011 4:32 AM
Points: 10, Visits: 132
Thank you for your helpful suggestions. I understand that a SAN like StorageWorks could be a solution. I'm not a SAN expert, but I doubt that such a system would deliver the performance we are looking for: the FC network between the SAN storage and the servers would be a performance bottleneck. We want to scan the *complete data in less than 1 hour*, which is possible with our direct-attached storage now. We get approximately 1 GByte per second now with a dual quad-core CPU server. In the future we will have 10 to 100 times more data, so we would need a network delivering 10 to 100 GBytes per second, and 20 to 200 quad-core CPUs to match. Could you give me an example of how this can be done? Even if the SAN can deliver this, we would need a cluster of 10 to 100 servers for the required number of CPUs.
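A back-of-the-envelope version of the server count, using the 1 GByte/s-per-server figure from the post (this counts only scan throughput and assumes it scales linearly across servers; as the post notes, CPU may be the binding constraint):

```python
# Servers needed to scan the full data set in one hour at ~1 GB/s per server
# (assumption: aggregate throughput scales linearly with server count).
import math

def servers_needed(data_tb: float, hours: float = 1.0, gb_per_s_per_server: float = 1.0) -> int:
    required_gb_s = (data_tb * 1000) / (hours * 3600)
    return math.ceil(required_gb_s / gb_per_s_per_server)

print(servers_needed(100))  # 100 TB in 1 h -> 28 servers at 1 GB/s each
print(servers_needed(500))  # 500 TB in 1 h -> 139 servers
```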
Post #1041505
Posted Monday, January 03, 2011 3:05 PM
Old Hand


Group: General Forum Members
Last Login: Monday, April 14, 2014 7:46 PM
Points: 381, Visits: 535
We want to scan the *complete data in less than 1 hour* - which is possible with our direct attached storage now


I question this, but will bow to your experience. Ultimately *everything* that can be achieved with direct-attached storage can be achieved with a SAN, and more. A SAN is more expensive than direct-attached storage, but it is significantly more flexible and capable.

In the future we will have 10 to 100 times more data, so we would need a network for 10 GBytes per second to 100GBytes per second and we would need 20 to 200 quadcore CPUs for that.


At this point my recommendation becomes really simple. If you are going to buy that much hardware, get the pre-sales storage architects from HP/IBM/EMC into your office and explain that you are about to spend a LOT of money with them. When they have finished drooling, sit them down and explain the problem. Get them to provide a solution with a written performance guarantee.

Could you give me an example how this can be done? Even if the SAN can do this, we would need a cluster of 10 to 100 servers for the needed number of CPUs.


Not without spending a lot more time on this, and I would still recommend getting one of the SAN vendors in.
Post #1042084
Posted Monday, January 03, 2011 7:32 PM


SSC-Dedicated


Group: General Forum Members
Last Login: Yesterday @ 11:47 PM
Points: 35,959, Visits: 30,252
mlbauer (12/23/2010)
All queries will take twice the time for twice the data.


Not really true, and "It Depends" prevails a whole lot in that area. Partitioned tables can really help, as can properly written code and proper indexing, along with a nice solid database design and effective, regular maintenance of the database.


--Jeff Moden
RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1042123