SQL Server, PostgresSQL and Fish Curry

An interview with Adam Machanic, discussing hot new features of SQL 2005, stored procedures, fish curry and more

An interview with Adam Machanic

At Tech Ed 2006 in Boston I cornered Adam for 45 mins and talked to him about SQL Server 2005, stored procedures, beer music and various other things. I finally got round to transcribing the interview. Enjoy!
-Tony Davis, Simple-Talk editor.

Adam is an independent database software consultant, based in Boston, Massachusetts. He has implemented SQL Server solutions for a variety of high-availability OLTP and large-scale data warehouse applications, and also specializes in .NET data access layer performance optimization. He is a SQL Server MVP, is co-author of Pro SQL Server 2005 and is currently in the process of solo-authoring “Expert SQL Server 2005”. He regularly speaks at user groups and community events.

Small talk

[TD] Tell me a little bit about what you’re up to right now, the sort of projects you’re working on at moment…

[AM] I work mainly with start ups and ISVs…most businesses I work with are smaller companies who bring me in at beginning of projects to help with development work, data architecture etc. I just started up a consulting firm called Data Manipulation Group, Inc. It’s just me at the moment but if I can find some people in the area (Boston) who are qualified they are welcome to email me and I’d love to talk to them.

[TD] It does sound like a busy time…because you’re also writing a book at the moment, right?

[AM] Yeah, it’s been (laughs wryly)…a tough process. Hopefully it will be out in November 2006 but progress has been a little slower than I anticipated. I’m fighting hard to maintain a very high quality level but I find I need a full day or two days solid focus on it otherwise it’s very hard to make good progress. And of course finding those solid blocks of time is incredibly difficult. It’s very different from writing an article…I can bang out a 4-5 page article in 2 hours, but the book…a page every two hours…if I’m lucky.

How 2005 changes things for developers and DBAs

[TD] I know you’ve already working quite extensively with 2005, and obviously playing with it quite a lot while writing the book…what’s your overall impression…what’s better/worse than in 2000? Have they made the DBMS easier to use, harder to use?

[AM] In my opinion SQL Server always has been and is still a very easy database to install and manage and, in fact, admin continues to get easier. Paul Flessner talked recently about the self-tuning, self administering database and I think they have taken it quite a way in that direction already. For small-to-medium databases…you really don’t need to do that much with it.

[TD] But surely 2005 has added considerable complexity…there are so many elements to it now…

[AM] It’s certainly true that you can’t really be a “SQL Server 2005 expert” in the same way you could be a 2000 expert. With 2000 it was possible to know pretty much everything there was to know about the database. With 2005 you can spend all your time just being a Service Broker expert or an Analysis Services expert. I know SSIS experts who don’t touch the relational engine…ever….and they make plenty of money. Same with Analysis Services and Reporting Services.

[TD] So how is all this affecting the “DBA-Developer divide”? Before I started doing SQL Server, I did quite a lot of work with Oracle so I guess that’s my reference point…in Oracle this divide was strong, with the DBA very much the “gatekeeper” and final arbiter of what could and couldn’t go on in the database…and developers often feeling they needed to subvert the DBA in order to get things done. I’ve never sensed that in SQL Server but wondered whether the complexity of 2005 would move things in this direction…

[AM] In my limited experience with Oracle…and I have to stress that it is limited…I found that managing Oracle is a lot harder that managing SQL Server. I worked in one company where we had two SQL Server DBA/developers and we were managing over 100 servers. We also had three Oracle servers and we had three Oracle DBAs…and these DBAs were constantly complaining at how underwater they were and had no time for anything…and there were two of us managing 40 times more servers without too much trouble.

Even installing Oracle seems difficult. I tried to install it and I had to get a DBA to come and help me for 4 hours to get it to install properly. Installing SQL Server is a 5 min job.

[TD] Oracle has never had a great rep for its usability…though that’s something that is changing with 10g. But even with SQL Server… 2005 is such a leap in complexity …surely the days of the one-man SQL Server shop, the guy who could do anything and everything, are coming to an end?

[AM] For small to medium systems, if you’ve done at least the basics right, then SQL Server will just run. A lot of the shops I go into run into problems because they didn’t even have a DBA…and they had grown beyond the point where they could cope with that. Eventually you do, but for small shops you still don’t really need a DBA and I think that’s the allure of SQL Server in many ways.

[TD] So what is “getting the basics right”?

[AM] It really is down to basic database design a lot of the time. You would not believe the number of shops I’ve gone into that didn’t have primary keys…they didn’t define a single PK, so there are no indexes in the database at all…no constraints on any of the data…

[TD] !!!????

[AM] It’s more common than you might think…it’s OK for me because I just come in, create a PK and the app is running 20x faster, and I’m the hero…but all I’ve done is what they should have done in the first place.

The inevitable stored procedure debate

[TD] In terms of application design, I know that you’re a big fan of using stored procedures. You’ve just written an article for Simple-Talk on the big stored procedure debate and it caused quite a bit of controversy and debate. Why do you think this topic so polarizing?

[AM] I think a lot of people are swayed by anecdotal evidence rather than actual experience. They heard someone had a bad experience with technique “x” and then they just brand “x” as “bad technology”. Also, there’s a bit of fanaticism around …if author “y” says stored procedures are bad then they must be bad–even if you don’t really understand the arguments. I happen to like stored procedures….probably because I’ve spent a so much of my time fixing problems with incorrect use of ad-hoc SQL!

[TD] There seem to be so many different strands to it…people talk about tying yourself in to SQL Server and having to make big changes to move to another database but I find that a tired argument…but I mean how many times do you actually need to migrate to another database?

[AM] The migration argument is very tired but not because it doesn’t happen a lot…it does…but because nobody using Oracle or DB2 or MySQL or any other platform is only going to use the standard SQL part of that platform. Everyone uses the vendor extensions anyway! And that’s because you have to. If you don’t you’re not fully utilizing the DBMS and your app will simply not perform as well

And even the standard SQL is not consistent from platform to platform so whatever it is you’re doing, you’ll need to make changes when you migrate. For example SQL Server fully “entry level” compliant to SQL-92 and supports bits and pieces of SQL-99…extensions such as recursive CTEs…and you can bet that Oracle will be different and DB2 different again.

[TD] I’ve always bought in to the idea that if you have logic protecting the data then the best place for it is in the database where it can’t be bypassed.

[AM] My rule is database as final arbiter of all matters data. Let the business tier deal with the business questions but if it’s a data question, put it in the database. That’s why I get into another set of arguments with people about business logic in database…I always ask them…well, what is business logic? They mumble a bit and generally have trouble answering.

That’s why I have a very tight definition of business logic that I live with. I partition things into data logic, business logic and application logic. Data logic is anything to do with data consistency, verifiability and integrity…the core data rules.

Above that we have the business logic. What do we do with the data, how do we manipulate the data to get the business answers that we require. This probably doesn’t belong in a stored procedure as often as it belongs somewhere else–i.e., in a business logic layer.

Then above that we have application rules…what color do I paint the UI? Where should this button show up? And so on. These are clearly UI questions and you want those as far out of the database as possible.

If you layer your application in this way you end up with solid data core that is verifiable and then you create multiple business layers over the data layer…all talking to the same data layer and following the same rules but answering different sets of questions independently of each other. Then if you like you can add multiple UI layers over those business layers. By properly layering you can eliminate dependencies and make yourself really flexible…but…that’s a long way off for some people.

What’s hot in 2005?

[TD] Before I came to Tech Ed, I spent some time on newsgroups and forums trying to seek out a common thread to the interest in SQL Server 2005 and came to conclusion that…there wasn’t one….there is just so much that is new that there seems to be no single issue that is more prominent than any other. Do you agree?

[AM] I spend a lot of time on newsgroups and forums. Most of last year things were really boring…I was answering the same questions over and over. The most common questions on forums were things like: how do I do dynamic parameters for a query? How do I concatenate string at a row/set-based level? I still see those questions…but suddenly a whole host of new questions are appearing.

[TD] So have you detected any particular “hot” areas?

[AM] The traffic on forums is kind of bursty. For example, I can log on to the CLR forum and find 20 questions posted in one day, then the next 3 weeks there will be none. Same goes for service broker. I never worked out how users all seem to arrive on the same topic simultaneously and then disappear again!

Initially there was a lot of interest in SQL CLR but now that has died out – I’m now seeing very few questions on that.

[TD] Why do you think that is?

[AM] I think people are still a little scared of the technology, but more than that… there really aren’t that many use-cases for it. A lot of the interest came from people who hadn’t really researched it. I’ve actually tried to use CLR on a couple of projects. I had a couple of scenarios where I thought it would be great but it turned out to have a few problems.

[TD] Performance-related?

[AM] Well there are cases where CLR will definitely out-perform T-SQL but you don’t tend to run into these cases that often. One example is heavy math — but not too many apps try to do heavy math in the DB…it’s done in the business tier so it becomes a non-issue.

Using CLR functions with regular expression is another popular sample that you see in books and online…but it turns out that if you filter data based on a regular expression from CLR UDF the optimizer can’t do anything with it and you end up with a table scan and horrible performance. This is a classic instance where you can have functionality or you can have performance – and for database apps, it’s performance every time.

Then there’s the 8000 byte limit on UDAs and UDTs…a lot of people are frustrated by that. It’s V.1 and it will be really interesting to see how it progresses but right now not that many people can use it.

[TD] If CLR, SSAS, SSIS, etc are the bells and whistles, what is the best really core new functionality that they have put in the database…what should a programmer be looking to exploit in their code right now?

[AM] For me the coolest thing is the new exception handling. I already did three migration projects this year and for the first one I really wasn’t up to speed on it, in the second I started to realize how useful it was and now I use it ALL the time.

[TD] What’s the primary difference?

[AM] Instead of just letting the error go up to the application you catch it in the database. If you care about the error you can log it right there. You don’t have to bubble it up to the app and tell the app to go back to the DB and log it. I usually catch the exception add a marker to it that says “I caught an exception here”, logging it and then throwing an exception back to the app. So the app knows an error occurs but I’ve already intercepted it, logged it, and maybe taken some corrective action. It’s amazing how many places you can fit it in to make your code more readable and flow a lot better, by handling exceptions in the data layer instead of the app.

That’s my number 1, but Service Broker is close behind. I’m working on an app now that has a lot of offline, disconnected asynch processes. They had written four different windows apps that would sit there and ping the DB from time to time waiting for a “start doing some work” message. We converted it all to SB and it all works amazingly well.

The database engine

[TD] Do you think the database engine is well instrumented? If you have made a mistake in your design that means your database is performing poorly, is it easy to find out exactly where the problem is? Again, taking Oracle as my reference point… if it is anything, the Oracle database is very well instrumented…you can get very granular, wait-based information out of the engine…you can find out exactly how long each small part of the whole operation took. That sort of capability wasn’t there in 2000 and I’m interested in how 2005 has changed that…if at all.

[AM] Yeah…I don’t think SQL Server is quite there yet. The new DMVs…Dynamic Management Views…are taking it a lot further and Profiler 2005 has added a whole host of new events, so you can get a lot more performance data out of the engine, but you still can’t get really granular information and it’s hard get it in real time.

[TD] OK, so you don’t often run into a performance issue and think…I can’t find where the problem is!

[AM] Yes that happens all the time! (laughs). In my experience, blocking issues are very common, as are IO issues.

[TD] Tell us a bit about the load tool you’ve developed to investigate execution times, IO stats and so on.

[AM] It’s currently a very simple load tool that I developed for readers of my book, to help people collect basic timing and IO statistics for themselves. It let’s you put in a batch of SQL or calls to stored procedures and call them in a loop. There is also a multi-threaded mode that let’s you have up to 200 simultaneous virtual users. Of course, when you’re testing SQL you don’t want same data to be queried over and over…because Server will just cache it. Performance will be great but guess what…you need to test the IOs. So the tool lets you substitute in values for parameters in the SQL. It runs a query and returns the values, then you can map those values into the queries that you are actually testing, so you can test a range of values instead of one value at a time.

[TD] Sounds very cool. Will future versions go any further then that? For example, provide stats on parsing and latching etc so you can gauge a solution’s likely scalability as well as performance?

[AM] At the moment, my tool just does very simple reporting on logical IOs – logical reads, which includes both physical and cache reads. It also gives CPU timing stats and total time reporting. It pulls these out using the simple SQL SET STATISTICS IO ON/TIME ON, and with the result of the query SQL Server sends back the stats, so this makes it very easy.

Going a step further I would have to latch into the tracing API and trace in the background while running a load and then correlate based on process IDs. But if I put that much work into it, it will move from the free product range to the paid product range…so I apologize to my readers in advance if that happens–but I plan to always maintain a free version, either way.

Is there life after SQL Server?

[TD] A bit of an off-the wall question …but if you suddenly found yourself working on Linux, which database would you use?

[AM] Easy question. Easy question. PostgresSQL. I’ve been studying it as much as possible in my spare time…I’ve installed it (on Windows as it happens). It’s an awesome DBMS and has some really cool features

[TD] You surely can’t mean that (mock horror)… it does some things better than SQL Server?

[AM] Oh yes. For example, it has a lot of index types that SQL Server doesn’t support. It also has this really interesting API that lets a user create their own index types…it’s an open index API. I’m not yet advanced enough to use it but you feed it an algorithm (e.g. tree algorithm) and it will index your data based on that. Very cool.

[TD] Do you ever see yourself making the switch?

[AM] I’m not tied to Windows/SQL Server. Well…I know it really well right now and it would be painful to switch but I’d be open to the idea. But right now I’m very happy in this space. I really enjoy it…it’s a great community to be in. You get a real sense for that…especially at conferences like PASS, where you really feel that you are part of something.

Winding down

[TD] So where are the best online resources? Which are the handful of places that you visit every day?

[AM] You mean apart from Simple-Talk.com of course (laughs)

[TD] Simple-Talk you say. Interesting. How do you spell that…S.I.M.P.L.E….(laughs)

[AM] Microsoft seem to be putting a lot of resources into making the MSDN forums (forums.microsoft.com/msdn) very high quality – they have a lot of really good MS guys answering questions there and I tend to visit them most days. I also hang around a bit on sqljunkies.com and answer questions on the forum. In all honesty I don’t actually surf round that much…I tend to stick to the sites I know.

[TD] What about outside of SQL Server. Any beer related sites? I know you’re a keen home-brewer…

[AM] I used to go to brewrats.org, but one of the best one is beeradvocate.com. Actually, though, a lot of the non-SQL Server stuff I read is food-related. One of the best sites out there is actually run by Kalen Delaney’s daughter -travelerslunchbox.com – where she posts recipes but also the background story to it, like where she found the recipe, plus some gorgeous photos of the food. It’s a really great site.

[…big cheer in background as Italy score against Ghana in the World Cup…]

[TD] So you are a keen cook?

[AM] Yes, mainly Asian food…Japanese, Indian…lots of spices!

[TD] I always wished I could cook a really good curry.

[AM] I actually created a great fish curry…I’ll send you the recipe and you can check it out.

[TD] Cheers Adam

[Ed Note: Adam did send me the recipe…I will try it out and report back soon in my blog]