SQL2005 vs Teradata

  • Hi ...

    The organization I am employed at is currently considering moving our datawarehouse environment from SQL2005 to Teradata. It seems that some executives are convinced that Teradata is going to solve all our problems:w00t:. I am not so sure.

    Is there anyone who has had experience with Teradata in this sort of scenario?

    Thanks in advance,

  • It's good. I wouldn't consider moving to it without really good reasons though. You need to really do the analysis of what your current problems are, what the solutions would be in AS 2005, and then examine the TeraData solutions.

    My only experience with it was that the total cost of ownership was very high. The initial setup and licensing fees are just the beginning.

    At the end of the day, most executives are focused on the bottom line. Data is important, and getting it in a timely fashion and having it reliable is worth a lot, but they will probably be happy with the lest expensive solution. Suggest a full CBA for both directions (a lot of work for you probably) that examines sticking with MS technology (remembering that AS 2008 and Office 2008 give you some great new tools) or making a changeover to TeraData.

    You may be surprised at the outcome, but I would expect TeraData to be much more expensive short and long-term.

  • I agree with Michael - Teradata is a great data warehouse platform, but it's also expensive as all get out from software licenses to employee costs. You can reasonably expect to pay a heck of a lot more for a Teradata DBA than a SQL Server DBA. Don't get me wrong either, those Teradata DBA's that I've worked with knew their stuff.

  • I don't always think it is just a matter of costs. If moving to Teradata solves the problems you are having, it could pay itself back in the long run. I worked with Teradata a long time ago as a programmer and found it to be very powerful especially when working with very large amounts of data (in terabytes, hence the name teradata).

  • I've had some experience connecting to a client's Teradata system to run queries and moving data back and forth to my SQL Server systems. I didn't like the client tools, and I despised the drivers provided for setting up a Teradata linked server. The only efficient way to get data in and out of Teradata was to use their FastImport and FastExport tools which look like they were written in the 70's. To be fair, my access was as a mere user over a slow VPN, and I know nothing about the skill level of the people setting up the databases. If I was a DBA at the main location surrounded by all that hardware (and directly connected to it) I would undoubtedly have a better experience.

    This may be a gross oversimplification, but it seems to me that Teradata's big advantage is that tables are automatically partitioned across all nodes and are operated on in parallel. In SQL Server you have to manually partition tables, which requires some thought and effort. In practice I found that the Teradata tables I wanted to use were not partitioned by the columns I wanted to join on, so it wasted a lot of time spooling the results from all nodes so it could redistribute them partitioned by the field required by the next step in the execution plan. I had to rewrite and break up large queries to figure out what kind of temp tables I needed (and what to partition by) to get an efficient process. Almost all of my T-SQL optimization techniques were useless. The automatic partitioning is worthless if it makes the wrong choices, so if you have to analyze the tables and figure out the best partitioning scheme in advance I don't see that this is any different from defining the partitioning under SQL Server Enterprise. On the other hand, I don't know how to compare the hardware for a truly huge data warehouse. You can keep adding Teradata nodes until you have a massive system with hundreds of nodes operating in parallel. I have no idea what an equivalent SQL Server system would look like, or what the design and operating challenges would be.

    A Teradata system is immensly powerful, and immensly expensive. All your SQL Server DBAs and developers will require significant training to adapt, not just to learn Teradata idiosyncracies but to unlearn some SQL Server habits that can cause problems. The company would be well advised to hire at least a few people that already know Teradata to help train the others, and it wouldn't be surprising to find out that some of your best SQL Server talent might jump ship to sell their existing skills rather than start over.

    A valid comparison is not between what Teradata might do for you and what SQL Server is currently doing, but what SQL Server might do if they invested all that Teradata money into improving your SQL Server hardware and personnel. I would be annoyed at upper management suggesting a different approach if they're not buying requested hardware and training. How many of your DBAs went to the PASS conference, for instance?

  • What the problems you're experiencing? That might be the first step in getting advice. I think both platforms perform well. It's more a matter of the application working well with the platform.

  • The problems I had would probably not be an issue if I worked primarily with Teradata, or at least had access to a DBA that knew what they were talking about. We were dealing with a client system and couldn't talk directly to their IT people, we had to go through our account managers and several layers of marketing nimrods at the client to get anything done. With linked servers we had trouble getting the OLEDB provider to work and the ODBC driver was pitiful. Not only that, but the same ODBC driver could load data into Access several times faster than it would work with SQL Server. The alternative to linked servers was Teradata client tools that could do batch uploads & downloads, but they were really annoying to setup and modify.

    Most of the pain and frustration was due to how we had to use the Teradata system, and has nothing to do with Teradata vs. SQL Server. We were limited to a single slow VPN connection, and were not able to talk directly to their IT people. You can try to teach yourself syntax from the online manual, but that doesn't help when you need to deal with changing IP addresses and firewall issues. Also, I didn't work with Teradata often enough to remember all the syntax details. Once we had a data transfer project set up it would just run as a scheduled job, it might be months before we had to change anything or start a new project. And finally, its a big letdown to go from Master of All I Survey (in the server room, at least) to a lowly user limited to read access on views.

    The point I wanted to make about Teradata was not that it is inferior in any way to SQL Server, but switching to it is very expensive and will cause training and possibly turnover headaches. To compare the two systems I think it is only fair to consider what kind of performance you'd get from SQL Server if you spent the same money on bigger hardware and training.

    Finally, I object to the switch being suggested by upper management instead of the DBAs and developers who are familiar with the current system and probably have a better idea what it would take to improve it. Is this a smart CIO who knows what they're talking about, or some PHB who just read an article in a trade journal? Several years ago I worked for an internet startup where we build a working system on SQL Server and ASP. Then the CFO convinced the boss that he'd make more millions in the IPO if we switched to Java and Oracle. Did they retrain the developers that created the system, or did they lay them all off and outsource the job to a consulting company owned by the CFO's cousin's brother-in-law? I think you can guess the answer from my cynicism.

  • Hi all...

    Thanks for the replies. The comments are really insightful.

    The problems in our MIS environment are numerous. They range from political to technical. I personally do not have control over the datawarehouse itself. My area is confined to reporting based on the existing data and creation of new data structures for various projects. The problems put forward by the "powers that be" are:

    1) Slow data load times from legacy systems (DB2)

    2) Long processing times for generation of reports

    3) Need to integrate data from additional legacy systems into the the DW. (they say that SQL will not be able to handle the large volume. :P)

    Personally I agree with what Scott Coleman said about looking at investing the money into improving the current SQL2005 infrastructure and personnel, rather than spending on a totally new system.

    In my opinion the problems with our warehouse are:

    1) Incorrect DW design/structure

    2) Incorrect h/w setup

    3) People making critical design decisions do not have proper SQL skills:w00t:

    4) Incorrect ETL process being used

    Frankly, I do not think that we are using our current systems to their full potential. Further Teradata comes as a complete solution (h/w and s/w) with direct attached storage whereas our current installation uses a SAN which is shared with other systems/servers. So Teradata may improve certain query processing times but it does not solve the issue of bad design. In truth, I do not believe that a true comparison has been made between our current SQL2005 and Teradata.

    I look forward to your comments.

  • I echo what Scott said.

    I don't believe the client tools for Teradata are as good as for SQL Server. The only real tool available is the equivalent of Query Analyser (called Queryman)which is fine if you don't use Enterprise Manager at all (I use EM a lot and missed it when using Teradata).

    There is no real equivalent of DTS that I know of (I've used Teradata but admit I an no expert) and scheduling jobs is a pain - we had to create a text based script and upload it to a unix server and schedule it as a cron job (similar to Windows AT command).

    Error trapping and handling only seemed available in scheduled jobs - I couldn't seem to trap @@ERROR in the Query Analyser equivalent. Also, I couldn't define any variables so if I wanted to have a date parameter I would have to insert the date into a volatile (temp) table and use it in a join. Not a show stopper but a different way of thinking.

    Don't let them beat you up about size. Tesco (the larget supermarket chain in the UK) use SQL Server for their customer loyalty program - millions of customers, hundreds of millions (or billions) of rows, terabytes of data. SQL Server can handle big volumes of data if it is set up and coded properly. Badly designed code will always run slowly.

    Another gripe is the way we were allowed to use Teradata - no functions, no procedures just plain SQL. A bit of a step backwards compared to what I am used to with SQL Server. Also, Teradata is a company wide warehouse with thousands of users and the DBAs implemented a query manager (called TASM) - anything that the query plan said would take more than 20 minutes (or whatever) got put into the slow queue where only one query at a time ran - this was a real pain as our scripts would have mutliple steps and the total estimated execution time would exceed the limit and get put into the slow queue.

    If you are using SQL Server, you are probably using reporting services. I don't believe that there is anything similar in Teradata. To produce reports, people would either copy and paste the results from a query into Excel (a manual task) or use Cognos. Don't forget that the primary purpose of a warehouse is to get data out to decision makers and that the reporting will need to be considered as well.

    I prefer SQL Server (in case you were in any doubt) because I feel much more in control of my own environment and I have used it a lot more than Teradata. Migrating to Teradata will cost a lot of money (hardware, software and people time to re-engineer everything). If you migrate to Teradata, the warehouse will probably run faster in the same way a V8 Mustang is faster than Ford Focus - it will be a much bigger spec machine and you are not comparing like with like.

    I would suggest that you persuade management to spend a little time and money trying prove that you can solve some of the problems (such as load times from the source systems) and that the problems are not due to SQL Server not being up to the job.

    Jez

  • All great points and feedback. If I could add one more wrinkle:

    Since you posted this into the BI forum and since we're talking about data warehousing, you really should consider using Analysis Services as part of your reporting / BI strategy. The speed, optimizations, flexibility, etc. will be dramatic. But I'll add the following caveats:

    - It won't fix data warehousing design issues. Indeed, once you get up to speed, it may exasperate your frustrations with inconsistencies in the current design.

    - You'll need to do the education, particularly around MDX. It's close enough to SQL to drive you batty for a while.

    - RS works with AS, but it's not designed for business intelligence. It (and tools like Excel) will get you started, but you'll be wanting more.

    AS is included with your licence (make sure you're running Enterprize) so that'll save on software, but you'll need a little more disk for the cubes (in my experience, budget 20% + 20% + 10% of the database size). It'll pay off in the long run.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply