
Guest Editorial: That ain't a Database, it's a Spreadsheet
Posted Thursday, December 18, 2008 8:21 AM


Ten Centuries


Group: General Forum Members
Last Login: Monday, May 12, 2014 1:27 PM
Points: 1,386, Visits: 824
While I agree with blandry that having a solid idea of data throughput is nearly impossible, I don't think that invalidates Phil's point. I think what Shaun was talking about is the perfect example:

Shaun McGuile (12/18/2008)
In my case the 3 million rows was full population of the database for six months' worth of usage.

I had specific fixed limits in the system I was testing and actual data would be at most 2 million rows, given the nature of the application. So doing all the load testing with 4-6 times the allowed data volumes was a good test.

Hey, that's the beauty of Red Gate SQL Data Generator though. ;)


He had an inkling of what production numbers would look like, but the more important thing was performing the test itself. Even if you can't know how much data is going to be in the database (and you have a data generator...), filling it up with a ludicrous number of rows and doing performance tests is still valid.
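
For anyone without a data generator to hand, here is a minimal T-SQL sketch of the "ludicrous number of rows" approach; the dbo.Orders table, its columns, and the row counts are invented purely for illustration:

-- Sketch: bulk up a hypothetical Orders table to ~4 million rows for load testing.
SET NOCOUNT ON;

WITH Digits AS (
    SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
), Numbers AS (
    -- Cross-joining seven copies of Digits yields 10 million row numbers.
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM Digits a CROSS JOIN Digits b CROSS JOIN Digits c CROSS JOIN Digits d
         CROSS JOIN Digits e CROSS JOIN Digits f CROSS JOIN Digits g
)
INSERT INTO dbo.Orders (CustomerID, OrderDate, Amount)
SELECT
    n % 50000 + 1,                                  -- spread over 50,000 fake customers
    DATEADD(MINUTE, -(n % 262800), GETDATE()),      -- orders spread across ~6 months
    CAST((n % 10000) / 100.0 AS DECIMAL(10, 2))     -- arbitrary amounts
FROM Numbers
WHERE n <= 4000000;                                 -- stop at 4 million rows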
Post #622137
Posted Thursday, December 18, 2008 8:32 AM


Mr or Mrs. 500


Group: General Forum Members
Last Login: Today @ 4:03 AM
Points: 587, Visits: 2,525
When I worked as a technical architect, I would never dream of letting a project go into development unless we had carefully specified the loading (e.g. transactions per second) and the data metrics (e.g. no. of records) that the system had to deal with. This makes sense because then the developers can do their unit testing to those figures (with allowance for the inherent optimism of humanity), and the testers can test to those figures. Then the project can be signed off as being fit for production service.

If you don't have those figures, how then can you DBAs specify the hardware, or produce and cost out the Service Level Agreements? How can you developers do unit testing? In my example, the company hadn't really thought it through, but we very quickly had all those figures calculated and the hardware in place. I think that my response to those who don't have a proper specification of the system is just to express huge waves of sympathy. I feel your pain. You're more firefighters than developers or DBAs.
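
To make that concrete with a back-of-the-envelope example (every figure here is invented): a peak load of 20 transactions per second across an 8-hour business day is roughly 576,000 rows a day, or about 144 million rows over a 250-day working year; at around 200 bytes per row that is roughly 29 GB of raw data a year before indexes. Those are figures a developer can unit test against and a DBA can size hardware for.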



Best wishes,

Phil Factor
Simple Talk
Post #622154
Posted Thursday, December 18, 2008 8:40 AM
SSC Eights!


Group: General Forum Members
Last Login: Wednesday, May 9, 2012 10:26 AM
Points: 891, Visits: 1,958
Sorry, BLandry, but I have to respectfully disagree. I think you can do reasonable estimates on important systems as to how big they can get. Calculating table and index size is almost trivial, but everything depends on the situation. If the company is a brand-new startup, then doing accurate estimates will be difficult, especially if the company takes off and ramps up at an unanticipated and explosive rate. But that reinforces Phil's point: test under a heavy (or monstrous) load; if it performs acceptably there, it will certainly perform well under a light load, and you can be confident of acceptable performance when things get busy.
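
To illustrate the "almost trivial" part, a rough T-SQL sketch of a back-of-the-envelope size estimate (the row count and bytes-per-row are invented; Books Online has the exact page-level rules, and sp_spaceused reports the real figures for an existing table):

-- Sketch: estimate the data size of a hypothetical 2-million-row table.
DECLARE @Rows        BIGINT = 2000000,
        @BytesPerRow INT    = 350;

-- 8,096 usable bytes per 8 KB data page; +2 bytes per row for the slot array entry.
DECLARE @RowsPerPage INT    = 8096 / (@BytesPerRow + 2);
DECLARE @Pages       BIGINT = (@Rows + @RowsPerPage - 1) / @RowsPerPage;

SELECT @Pages              AS EstimatedDataPages,
       @Pages * 8 / 1024.0 AS EstimatedDataMB;   -- 8 KB per page

-- For an existing table, the real figures:
EXEC sp_spaceused N'dbo.Orders';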

It's definitely harder to accurately forecast size and load for new projects, much easier for redevelopments and conversions.
Post #622172
Posted Thursday, December 18, 2008 8:56 AM


SSC-Dedicated


Group: Administrators
Last Login: Yesterday @ 8:00 PM
Points: 33,204, Visits: 15,354
Phil,

Thanks for the break and guest editorial. I had a good time skiing with the kids yesterday.

I agree that you want to have an idea of scale and build development systems to that level. There are times you have an inkling of scale, and you should test to that level.

For most of my career, as I've built systems, we had no idea of the load. So we've not been sure where to test, though we've always tried to account for scale and load early on. You don't want to get wrapped up in it, but take a good guess and test beyond that guess. Then plan for things to get busier, or spend a few minutes making sure you haven't prevented things from evolving.

To some extent, experience teaches you how to build small systems that can grow. Or it should.







Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #622190
Posted Thursday, December 18, 2008 9:45 AM


Ten Centuries


Group: General Forum Members
Last Login: Tuesday, February 18, 2014 7:14 AM
Points: 1,344, Visits: 1,983
Phil Factor (12/18/2008)
When I worked as a technical architect, I would never dream of letting a project go into development unless we had carefully specified the loading (e.g. transactions per second) and the data metrics (e.g. no. of records) that the system had to deal with.
You've obviously worked in companies with very much more static businesses than I have, then!

Although, in an ideal world, it would be great to have the numbers for the exact number of records, transactions, queries, etc., in my experience the best you get is usually very rough and likely to change depending on business performance. The best I've usually managed is to get the most up-to-date estimates and predictions you can, then assume that they're going to be off by at least a factor of two. If the system can cope with that, there's a chance the production system will be OK for a while!

On another point, I would agree with the idea of multiple levels of testing during development. When doing early testing, use small samples of data so you get fast turnaround; once things are believed to be OK, step up to realistic levels so you can see what it's really going to be like; finally, before release, try it with what you consider ridiculously high levels to see how it's likely to degrade. Chances are the real world will turn out somewhere between your realistic and ridiculous estimates.
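
As a sketch of how that might look in practice (the level names and row counts are invented), the same data-load script can be driven off a single setting:

-- Sketch: one knob for small / realistic / ridiculous test volumes.
DECLARE @TestLevel  VARCHAR(10) = 'realistic';
DECLARE @TargetRows BIGINT =
    CASE @TestLevel
        WHEN 'small'      THEN 10000      -- fast turnaround during early development
        WHEN 'realistic'  THEN 2000000    -- best current estimate of production volume
        WHEN 'ridiculous' THEN 10000000   -- see how gracefully it degrades
    END;

-- ...then feed @TargetRows into whatever data generation you use,
-- e.g. as the upper bound on a numbers-table INSERT.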

Of course, this also relates to developers always wanting the latest/fastest hardware when they should have some of the oldest/slowest...

If they can produce acceptable performance on that old junk...


Derek
Post #622246
Posted Thursday, December 18, 2008 10:34 AM


SSCrazy


Group: General Forum Members
Last Login: Friday, August 29, 2014 9:39 AM
Points: 2,917, Visits: 468
Nice editorial today, Phil. We recently experienced that very scenario, where the design did not scale up very well. When we were finally called into action to investigate, our reaction was "...surprised that this ever worked on any scale...". Needless to say, a little extra unit testing would have been nice.

And the system is (probably) going to be stressed more in the next few months; hopefully someone finds the time to do some testing before they flip another switch.


----------------------
http://thomaslarock.com
Post #622305
Posted Thursday, December 18, 2008 1:37 PM
Ten Centuries


Group: General Forum Members
Last Login: Friday, August 10, 2012 6:08 PM
Points: 1,156, Visits: 801
The statement has been made, in essence, that one cannot assume to know the real size of an intended data system during dev and test. I must disagree. The business of the client almost always defines the needs, whether those in charge recognize those needs or not. Again, the business, not the representative or owner of the business, defines the needs. In both examples given in the previous post, one could have managed a small development spike (borrowing from Extreme lingo here) to assess an average of at least a percentage of the share involved.

For example: I was involved in a 6-month contract to manage the entire data flow, from man-hours, to production costs, to delivery bill-of-lading documents, for a specific buy-out and deployment of product to over 300 new locations. The company had no idea even if it would be a profitable venture, let alone how they would manage production output. So we took a sampling from about 10% of the locations, obtained data for those, assessed the cost of each basic service and the associated man-hours for each service, and were able to report the expected value, cost, and man-hours of the entire job, as well as the expected size of the data project itself. The data fed the database and controlled the actual output flow of the work, and gave the client specific estimates of the man-power required to do the job.

One can and must be able to estimate the general size of the project. One may not be able to put a real count on the number of rows or the relative disk size. However, one with enough experience should very well know the sheer scope of the information, given the client's business size and expected growth. Test beyond that margin. When in doubt, assume it will be larger than anticipated.

So, instead of hard count estimates, I tend to lean on about five categorical sizes: small, medium, large, very large, and horrendously large. This consideration is given to every table in the test system. Some tables are lookups and as such remain small even in systems that are overall horrendously large. Some tables, such as transactional history on frequently changed data, are expected to have extremely high row counts but be relatively narrow, so query time is affected but inserts are not (especially frequent in many-to-many relationships). And then there are some tables that are so wide that one even considers breaking them into sections.

The point is, there is no reason to believe one cannot make at least a very good educated decision about the expected size of the database, both in row count and physical disk space, and add a little (or a lot) extra just to make sure the ironworks float. Know your client. Know the industry. Know generics about business data in general. If one does not know these, one should not be leading a project.

The show stopper, more often than not, is that clients tend to err on the side of too skimpy, especially when it comes to hardware, and likewise when it comes to good DBA and admin management. Many clients tend not to want to spend anything until it is failing. A little too often I have seen the lofty goals of a client get mangled by the infrastructure they did not think was critical, thinking they'd just hire someone when they got into a jam. It takes years off their growth margins and adds lost opportunity costs.

As an aside, in the example I gave above, that client had actually lost a major contract because their internal data guys could not give them the answers that would prove the job manageable and cost-worthy, nor describe the structure of the management flow of data. They simply could not see outside the box of their current billing and production systems. A year and a half later, another similar job came up, and one of the owners decided to get help. Within two weeks we gave them what they needed to make the necessary decisions and get the job done. My contract was immediately extended from a 2-week assessment to the full 6 months of deployment to manage the data flow. Trucks were loaded and delivered by the new system's lading orders. They made millions. We made a year's salary in 6 months. It was win-win... their internal IT people were never happy about it, but could say nothing.
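
For what it's worth, the 10% sampling extrapolation is simple enough to sketch in T-SQL; the table, columns, and counts below are invented for illustration:

-- Sketch: scale whole-job estimates up from a sample of locations.
DECLARE @SampledLocations INT = 30,      -- locations actually measured
        @TotalLocations   INT = 300;     -- locations in the full deployment

SELECT
    SUM(ManHours)    * @TotalLocations / @SampledLocations AS EstimatedManHours,
    SUM(ServiceCost) * @TotalLocations / @SampledLocations AS EstimatedCost,
    COUNT_BIG(*)     * @TotalLocations / @SampledLocations AS EstimatedRows
FROM dbo.LocationSample;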
Post #622443
Posted Friday, December 19, 2008 5:03 AM


Ten Centuries


Group: General Forum Members
Last Login: Tuesday, February 18, 2014 7:14 AM
Points: 1,344, Visits: 1,983
I didn't intend to say that you can't get a good estimate of the size of the data/transactions/etc for any projects. Just that, all too often, the estimates will be low.

I've encountered situations where it was categorically stated during design that something was out of scope and then, a few months after go-live, management decided to bring it into scope. Also, I've had cases where a detailed analysis and projection of volumes based on existing and anticipated business is done and then, after implementation, unexpected new business turns up which throws out all the projections.

Basic rule seems to be to assume that all volumes are significantly underestimated!


Derek
Post #622778
Posted Friday, December 19, 2008 6:15 AM


Mr or Mrs. 500


Group: General Forum Members
Last Login: Today @ 4:03 AM
Points: 587, Visits: 2,525
Yes. All volumetrics are an educated guess. Well, they were when I did them. Multiply by the contingency factor and you have a size to use all the way from unit testing through to release!

Bless you all for some really fascinating contributions to the discussion. It is a shame we can't pass round a virtual bottle of 'seasonal' port and toast our virtual friendships.



Best wishes,

Phil Factor
Simple Talk
Post #622841
Posted Friday, December 19, 2008 6:38 AM


Mr or Mrs. 500


Group: General Forum Members
Last Login: Thursday, October 4, 2012 9:20 AM
Points: 583, Visits: 1,060
Merry Christmas to you Phil... can I say that?

--Shaun


Hiding under a desk from SSIS Implementation Work
Post #622863