5000+ stores connecting - What type of architecture?

Question

5000+ stores connecting - What type of architecture?

ImYourDBA

SSCertifiable

Points: 5654
More actions
February 6, 2013 at 4:21 pm

#273921

We have a very large client that will have 5000+ stores that will be connecting in on a web app and I'm looking for the best architectural design to make everything run super efficiently. Putting aside how it's coded, etc. what are some recommendations for best performance when considering the high number of incoming connections?
The application is 80% reads and about 20% writes/updates. We batch insert new data in at night or through the day at regular intervals.
We have thought about possibly doing a partitioned approach (10 servers) using a load balancer to point certain store number ranges to certain servers (ie - at the network level (reading packet to get first digit of the store number or something like that). Or we could think about letting all connections come in to one server for reads and another for writes and then doing merge or peer to peer replication between them. Or do we just keep everything in one really beefy server and use snapshot isolation?
Any recommendations?
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

Steven Willis SSCrazy Eights Points: 9893 More actions · Answer 1

SQLJocky (2/6/2013)
We have a very large client that will have 5000+ stores that will be connecting in on a web app and I'm looking for the best architectural design to make everything run super efficiently. Putting aside how it's coded, etc. what are some recommendations for best performance when considering the high number of incoming connections?
The application is 80% reads and about 20% writes/updates. We batch insert new data in at night or through the day at regular intervals.
We have thought about possibly doing a partitioned approach (10 servers) using a load balancer to point certain store number ranges to certain servers (ie - at the network level (reading packet to get first digit of the store number or something like that). Or we could think about letting all connections come in to one server for reads and another for writes and then doing merge or peer to peer replication between them. Or do we just keep everything in one really beefy server and use snapshot isolation?
Any recommendations?

I'm not a hardware guy but I managed an ecommerce project a few years ago for a store doing $50 million in annual sales and thousands of online orders per day. We estimated that the store would lose $10,000 per hour if the system was down during prime sales hours. Now I realize this is slightly different than a B2B environment but the cost of downtime is probably on a similar scale.

That client--like most clients--had a hard time seeing the potential dangers of failing to prepare for disaster. When a clumsy backhoe operator somewhere down the road from the co-lo facility dug up the city's main internet trunk line 2/3rds of the city was without internet access for 8+ hours in the middle of a work week. That was $100,000 in revenue the client lost. The backup plan finally assembled cost half that much.

The original architecture with load balanced production servers, an equal number of load-balanced backup servers updated via transactional replication, separate development servers including a dedicated pre-production staging server, backup plans, and backups to the backups made everyone feel safe. Until the "Day of the Backhoe."

After that they found the money to create a parallel geo-located system in another state. The next time we had a major failure (there's always a next time), we had everything rolled over and running off the out-of-state server location in less than 10 minutes. I know this doesn't answer your question specifically but I wanted to point out the major danger of putting so much firepower in one place. Once you do get a good environment setup you should consider strongly setting up a disaster recovery site somewhere far away. Actually, it should all be part of your original plan and tightly integrated right from the start. There are many good co-lo facilities around the country that will monitor such a site for you like the one we use: Peak10. I'm sure there are thousands of companies in the northeast right now who wish they'd had a disaster recovery site that was out of Hurricane Sandy's path.

ImYourDBA SSCertifiable Points: 5654 More actions · Answer 2

Thanks for the response. Yes, the company has a DR site already setup with VMWare infrastructure and are planning for everything. We just need to determine what the best production environment would look like first. Something highly scalable.

Looks like we're going to do a two phased approach now. We're going to build out for approximately 1000+ stores now and then scale up to the 5000+ in the next few years. So I'm seriously thinking that the right approach is to do a multi-node "partitioned writes" environment and use peer-to-peer replication so that as we need to add more nodes, we can do so with ease (in a manner of speaking). In this case we would have the application layer partition the stores to write against one server and read from another and replicate between them. Then I might have a third server that I can database mirror to just for offloading the reporting process.

Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/

Steven Willis SSCrazy Eights Points: 9893 More actions · Answer 3

SQLJocky (2/6/2013)
Thanks for the response. Yes, the company has a DR site already setup with VMWare infrastructure and are planning for everything. We just need to determine what the best production environment would look like first. Something highly scalable.
Looks like we're going to do a two phased approach now. We're going to build out for approximately 1000+ stores now and then scale up to the 5000+ in the next few years. So I'm seriously thinking that the right approach is to do a multi-node "partitioned writes" environment and use peer-to-peer replication so that as we need to add more nodes, we can do so with ease (in a manner of speaking). In this case we would have the application layer partition the stores to write against one server and read from another and replicate between them. Then I might have a third server that I can database mirror to just for offloading the reporting process.

Keep us informed of how this goes for you. It looks like I might have a new client with appx 1000 franchises that wants to set up B2B with locally-branded sites for each franchise as well as online retail. Life stays interesting! 🙂