Programming below SQL

  • Dear Forum,

    Just like other high-level languages, SQL trades performance and low-level access for useful programming features and ease of use. In the old days, when performance became an issue, it was always an option to drop below the high-level language (i.e. use assembler) to gain maximum performance at the price of more development effort.

    I have never seen anything like this for SQL. Has anybody seen useful SQL Server tools or applications (including data access) that use low-level programming below SQL? Maybe direct access to the .MDF / .LDF files?

    Kay

  • Sorry, SQL is not designed to give you near-machine access, as it is a proprietary system. Various methods are provided for talking to the objects it exposes, but it all depends on what you are actually after. Anything below SQL itself would fall outside the definition of SQL as a language and as a method of accessing relational databases.

  • Hi Kay,

    An interesting idea :-). Wasn't SQL invented for the purpose of hiding data retrieval details from the end users/programmers? One of the most important SQL features was its declarativeness. In many articles I found the explanation that a declarative language means the programmer should only say WHAT he needs to obtain as a final result, and not HOW it should be executed. These days we all know that this is not exactly true, especially when you need to squeeze the last CPU clock cycle out of a DB server or execute a demanding SQL statement. Thus, the declarativeness of SQL seems to me more like an illusion that should encourage non-IT people to start developing database applications (and mess something up at the end :-)). I have accepted the fact that it DOES make a difference how you write the SQL, although some platforms have good optimizers that derive the optimal execution plan.

    Now, about your question: there are some log mining/exploring tools that recognize the transaction log file format and display it and/or allow object-level recovery. Some of them have actually been advertised on SQL Server Central. I think the MDF/NDF file formats are well known or can be cracked easily enough in time. Anyway, these tools are intended to help with database recovery, not to improve performance. If there were a way to bypass the whole DB storage/indexing/whatever engine and access the data files directly, wouldn't we be going back to the old days, when the client application was well aware of (and tightly dependent on) the underlying data storage and structure?
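    As an aside, the engine itself lets you peek at the log from the inside. Here is a small example using the undocumented fn_dblog function; it is unsupported, the column set can change between versions, and on SQL Server 2000 the spelling is ::fn_dblog, so treat it as exploration only:

        -- Undocumented and unsupported: browse the active transaction log.
        -- NULL, NULL means "from the start to the end of the log".
        SELECT TOP 10
               [Current LSN],
               Operation,
               Context,
               [Transaction ID]
        FROM   fn_dblog(NULL, NULL);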

    And finally, my opinion: to get the best performance, a developer/DB admin should:

    - ensure optimal database design (both physical and logical),

    - work with representative test data (both in size and in distribution),

    - write smart SQL statements, and

    - as a LAST RESORT, use platform-specific hints to force the optimizer to use the optimal execution path, but only if it is absolutely necessary.
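    Just to illustrate that last point, here is what such platform-specific hints can look like on SQL Server (the table and index names are invented for the example):

        -- Table-level hint: force a particular index.
        SELECT o.OrderID, o.OrderDate
        FROM   dbo.Orders AS o WITH (INDEX (IX_Orders_OrderDate))
        WHERE  o.OrderDate >= '20060101';

        -- Query-level hint: force the join strategy for the whole statement.
        SELECT c.CustomerID, o.OrderID
        FROM   dbo.Customers AS c
        JOIN   dbo.Orders    AS o ON o.CustomerID = c.CustomerID
        OPTION (HASH JOIN);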

    Regards,

    Goce.

  • If you know in advance absolutely every way in which your data will ever be used, then of course some kind of dedicated data manager will outperform SQL Server. But do you REALLY want to go back to ISAM files (or worse!) and sacrifice all the delights of a proper RDBMS just for a few milliseconds?

    In some applications that require ultra-fast data management, the answer is yes. An example that springs to mind is massively multiplayer online games, an arena in which the performance demands dictate that the on-the-fly data management is done with structures that are little more than flat files (offline and aggregation tasks are done with friendlier tools). However, for typical 'business' applications it would be hard to justify throwing away the power of the relational model.

    If you are up against a performance wall and it seems to be SQL Server that is holding you back, I would recommend something like:

    get all the data for this run from SQL Server onto a client; transform it into a performance-oriented data structure; do the performance-intensive thing; transform the results into a relational structure; store the results in SQL Server (a rough sketch follows below)

    rather than

    move the entire data-handling system from SQL Server to performance-oriented data files; bang your head repeatedly against the desk when you have to do all your JOINs and ORDER BYs and WHEREs and indexing yourself...
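    To make the first approach concrete, here is a rough sketch of its shape. In practice steps 1 and 3 would bracket client-side code working on its own data structures; all table and column names here are invented, and the GROUP BY merely stands in for the real performance-intensive work:

        -- 1. Pull the working set once.
        SELECT OrderID, CustomerID, Amount
        INTO   #WorkingSet                 -- stand-in for the client-side copy
        FROM   dbo.Orders
        WHERE  RunDate = '20060315';

        -- 2. Do the performance-intensive thing (placeholder aggregate).
        SELECT CustomerID, SUM(Amount) AS Total
        INTO   #Results
        FROM   #WorkingSet
        GROUP BY CustomerID;

        -- 3. Write the results back in one set-based statement.
        INSERT dbo.RunResults (CustomerID, Total)
        SELECT CustomerID, Total
        FROM   #Results;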


  • One might argue that this is a violation of Codd's rules 5, 8, 9, 10, 11 and 12. You can read about them for example here:

    http://www.sqlservercentral.com/columnists/fkalis/coddsrules.asp

    To me it seems strange that people focus that much on performance in a relational database system. The main reason people use such a system is reliability and flexibility. If you only focus on speed, network or hierarchical database systems like IMS (when will this finally rest in peace?) should provide many times faster access.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/

  • Frank

    I wish you could relay your message to some of my clients: NOT to focus so much on performance....


  • If performance is an issue, rework the design or buy a bigger box.

  • I cannot see where having machine-code access would have helped me in any job I've worked in. In every situation, the worst offenders as far as speed and efficiency go occurred because of:

    1) Bad database design (poor schema, under- or over-normalization)

    2) A lack of index tuning

    3) Poor front-end / middle-tier code

    I had a friend who complained about how much faster his database software ran after he switched from SQL Server to SQLite. He had a terrible database programmer before, and nobody had ever tried making the database design more efficient. Go after the obvious problems first before trying your hand at opening up the patient's skull and attempting neurosurgery.

  • I wish you could relay your message to some of my clients: NOT to focus so much on performance....

    I know, I know...

    Performance definitely is important, but it is NOT everything. I am aware, though, that there is a discrepancy between theory and practice. Everybody seems to expect a performance boost when they move to a full-blown RDBMS. I don't know the reason for this, but I suspect clever salespeople spread the notion, and the people who buy software usually don't have the experience to judge it.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/

  • Direct access to MDF/LDF files while the server is running? Are you thinking about updates, or just read-only? And after you've figured out how to hook into all the locking, caching, and logging functionality of SQL Server, do you imagine you might have something that would actually be faster and would not corrupt the databases? You realize that your code will forever be blamed for any future software issue with the server (probably both SQL and the OS), so you're relying on Microsoft starting to release completely bug-free code? I'm having a very difficult time suppressing some very sarcastic and/or insulting responses, but please reassure us that we've talked you out of it.

    The easiest way to improve performance is to analyze your design and make sure it is optimal for your application. Maybe you need to add or change indexes, denormalize some tables to reduce join I/O, distribute large tables across multiple filegroups, etc.
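    For example (all object names here are invented), the index and filegroup work might look like this:

        -- A covering index for a hot query (INCLUDE needs SQL Server 2005+).
        CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
            ON dbo.Orders (CustomerID)
            INCLUDE (OrderDate, Amount);

        -- Spread I/O by placing a large index on a second filegroup.
        ALTER DATABASE Sales ADD FILEGROUP FG2;
        ALTER DATABASE Sales ADD FILE
            (NAME = Sales2, FILENAME = 'D:\Data\Sales2.ndf')
            TO FILEGROUP FG2;

        CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
            ON dbo.Orders (OrderDate)
            ON FG2;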

    Are you mixing OLTP and DSS activity in the same database? Maybe you should be replicating data from active transactional databases to heavily indexed, read-mostly DSS databases on another server (or servers).

    Maybe you should look at hardware: more RAM, more disks, faster processors, more servers. Whatever the hardware costs, it will be infinitely cheaper than the software effort you're talking about. Do you know what your bottleneck is? Have you mastered Perfmon and SQL Profiler? I have a Dell server with attached storage (a 14-disk SCSI PowerVault) that runs like crap. (We have another almost-identical system, same PowerVault, that runs great.) The disk queue length, which should be in single digits, regularly runs into the hundreds or thousands. I know it's a problem, but I don't know exactly what to do about it. If I knew a fraction of what I would need to know to start writing low-level code for SQL Server, I'd know twenty different ways to configure it better. My company doesn't want to pay me to spend time figuring it out, so they're buying me a SAN.
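    If you are on SQL Server 2005 or later, the I/O stalls are also visible from inside the server, which can help pin down which files are hurting (on SQL Server 2000 the rough equivalent is fn_virtualfilestats):

        -- Which database files spend the most time waiting on I/O?
        SELECT DB_NAME(database_id) AS database_name,
               file_id,
               num_of_reads,
               io_stall_read_ms,
               num_of_writes,
               io_stall_write_ms
        FROM   sys.dm_io_virtual_file_stats(NULL, NULL)  -- NULLs = all DBs/files
        ORDER BY io_stall_read_ms + io_stall_write_ms DESC;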

    If you are sure the database is designed correctly, maintenance plans are minimizing fragmentation, and you've got all the RAM and disks you can use and/or afford, start looking at your data layer for coding opportunities.  Can you add some caching to your access methods?  Store some lookup tables locally, or in memory on the client?  If you're old enough to talk about the old days, you can probably do something useful with local flat files (DBF for instance) for lookups and just write complete transactions back to the server.

    If you want to write code that executes inside SQL Server without interfering with it, have you looked at extended stored procedures?
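    Roughly like this (the procedure and DLL names are made up; the DLL would contain your compiled code, and note that extended procedures run inside the server process, so a bug in your DLL can bring the whole server down):

        -- Register an extended stored procedure exported by a custom DLL.
        USE master;
        EXEC sp_addextendedproc 'xp_myproc', 'myproc.dll';

        -- Call it like any other stored procedure.
        EXEC master..xp_myproc;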

    Some of these suggestions may be things you've already thought of, and there are undoubtedly a lot of other ideas I didn't think of that are as good or better, but my point is there is a huge list of things I would try before I considered writing code to get inside of SQL Server.  I know that if I did, the first server crash or data corruption would be my last day at this company.

  • I forgot to mention the issue of having to rewrite all your code for each new release of SQL Server, maybe even for each service pack.

