Help me: Many Tables to join.

  • I have around 10 tables which need INNER JOINs to fetch data.

    Q: I came across the claim that "joining that many tables may degrade performance". Is it true?

    If yes, may I know why it is a bad practice?

    And please tell me what alternatives to use instead.

    I always appreciate your help.

  • The answer is already in the quote you used:

    "joining such many tables may degrade the performance"

    ...it all depends.

    Do all the tables have indexes that are well maintained and cover the joins you are using?

    Are you doing any aggregations?

    What volume of data is in the tables?

    Do you perceive there is a problem with the query, and have you looked at the execution plan, IO stats, etc.?
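
    If you haven't yet, a quick way to gather that evidence is with the session statistics settings (plain T-SQL, nothing specific to your schema):

    -- Turn on per-query I/O and timing statistics for this session.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- In SSMS, also enable "Include Actual Execution Plan" (Ctrl+M)
    -- before running the query you want to investigate.
    -- ...run the suspect query here...

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;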


    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • The reason for my question is that this SP is listed under 'Memory/CPU intensive Queries' by our performance tool.

    * Index: all tables have a clustered index.

    * Covered: only 50% of the columns are covered.

    * Volume: one table has 6 million rows, three tables have 0.3 million (3 lakh) rows each, and the others have fewer than 100 rows (master tables).

    Its typical structure is:

    select
        Column1,
        Column2,
        Column3,
        Column4,
        Column5,
        Column6,
        Column7,
        Column8,
        Column9,
        FUNCTION_CALL(Column9) As Column10
    FROM
        Table1
        INNER JOIN Table2 ON ....
        INNER JOIN Table3 ON ....
        INNER JOIN Table4 ON ....
        ..
        ..
        INNER JOIN Table10 ON ....

    Could this FUNCTION_CALL in the SELECT list be the problem? The function does a UNION on two tables with about 10 rows each.
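
    The function is shaped roughly like this (table and column names are changed; this is only its shape, not the real code):

    -- Rough shape of FUNCTION_CALL (hypothetical names): a scalar UDF
    -- that unions two small lookup tables and filters on the input value.
    CREATE FUNCTION dbo.FUNCTION_CALL (@Column9 INT)
    RETURNS VARCHAR(50)
    AS
    BEGIN
        DECLARE @Result VARCHAR(50);

        SELECT @Result = u.SomeValue
        FROM (SELECT LookupKey, SomeValue FROM dbo.SmallTableA
              UNION
              SELECT LookupKey, SomeValue FROM dbo.SmallTableB) AS u
        WHERE u.LookupKey = @Column9;

        RETURN @Result;
    END;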

  • Please post the actual execution plan.

  • A scalar UDF accessing data in the select clause of a query? Yeah, it's almost certainly a problem. Scalar UDFs execute row by row, and they don't show up in execution plans or IO statistics.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • In addition, that function acts as a correlated subquery; it is executed for each record in the result set. It may be just a union of two tables with 10 records each, but Column9 acts as a filter inside the function, so that union is evaluated once per output row.

    If that union of the two tables were persisted in a temp table, table variable, or CTE within the code, and that table were then used in the query instead of the function, it could improve your performance as well as reduce your resource consumption.
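
    A minimal sketch of that refactor, reusing the hypothetical names from the function sketch above:

    -- Persist the small union once, instead of evaluating it per output row.
    SELECT LookupKey, SomeValue
    INTO #UnionedLookup
    FROM (SELECT LookupKey, SomeValue FROM dbo.SmallTableA
          UNION
          SELECT LookupKey, SomeValue FROM dbo.SmallTableB) AS u;

    -- Join it in place of the scalar UDF call. LEFT JOIN preserves rows
    -- with no match, mirroring the NULL the scalar UDF would return.
    SELECT
        t1.Column1,
        t1.Column9,
        ul.SomeValue AS Column10
    FROM Table1 AS t1
        LEFT JOIN #UnionedLookup AS ul ON ul.LookupKey = t1.Column9;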

  • I agree with Gail. Try commenting out that UDF to see if it improves performance.

    I see UDFs every day with heavy calculations and loops that cause performance degradation.
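
    A quick A/B test, assuming you can run the SELECT ad hoc with timing statistics on (simplified to one table here):

    -- Run the query twice: once as-is, once with the UDF column commented
    -- out. A large gap in CPU/elapsed time points straight at the function.
    SET STATISTICS TIME ON;

    SELECT
        Column1,
        Column9
        -- , FUNCTION_CALL(Column9) AS Column10  -- commented out for the test
    FROM Table1;

    SET STATISTICS TIME OFF;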

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • GilaMonster (1/21/2012)


    A scalar UDF accessing data in the select clause of a query? Yeah, it's almost certainly a problem. Scalar UDFs execute row by row, and they don't show up in execution plans or IO statistics.

    +2. That's your most likely culprit for any serious performance issues.

    Not using a WHERE clause on a 3 million row table (×100×100×100... for the rest of the tables) is also going to cause tremendous pain. I'm assuming there is one, because I can't see any good reason to dump 3 trillion rows out to an end user and you've just shortened your example. However, if there isn't a WHERE clause on this puppy... yeaaaaahhhh... that's gonna hurt.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Yes, Evil Kraig F,

    It has a WHERE clause; I just shortened it.

  • Is there any chance you could post the actual execution plan, as asked before?

    Otherwise all we can do is guess and provide some rather vague recommendations...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • But, on the question of "Too Many Tables to Join"? No, I don't think 10 is too many. I wouldn't even worry about "too many" until I was looking at 15-20. I've seen an 86-table join run very fast (compile time was 3 minutes, but it ran fast). It's not that X number of joins is bad and Y number of joins is good. It's the code and structure around those tables, not simply the number of them.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Sorry LutzM,

    I don't have permission to access the PRODUCTION environment.

    Also, I don't know how to paste the execution plan here.

    I regret I couldn't give you all the necessary info in a single shot.

    Please bear with me.

  • http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • After you remove the UDF (which is MANDATORY!!!!), you may still have a classic performance tuning problem.

    a) Are all necessary indexes in place?

    b) The problem with increasing numbers of joined tables is that even with very good (i.e. freshly updated) statistics in place, you can still be at the mercy of small value-skew issues that lead to HUGE differences between estimated and actual row counts. Should that situation be in play, you will likely see a big win from disassembling the query into one or more intermediate temporary tables, as sketched below. That allows the optimizer to get optimal plans for each sub-part of the whole. I have seen 4-5 orders of magnitude performance increases from such refactors.
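
    A rough shape of that decomposition (all table, column, and filter names here are placeholders):

    -- Step 1: resolve the most selective joins/filters first and persist
    -- the intermediate result. SQL Server builds real statistics on #stage,
    -- so estimates for the remaining joins are far more accurate.
    SELECT t1.KeyCol, t1.Column1, t2.Column2
    INTO #stage
    FROM Table1 AS t1
        INNER JOIN Table2 AS t2 ON t2.KeyCol = t1.KeyCol
    WHERE t1.SomeFilter = 'some value';

    -- Often worth indexing the temp table for the follow-up joins.
    CREATE CLUSTERED INDEX cx_stage ON #stage (KeyCol);

    -- Step 2: join the staged rows to the remaining tables.
    SELECT s.Column1, s.Column2, t3.Column3
    FROM #stage AS s
        INNER JOIN Table3 AS t3 ON t3.KeyCol = s.KeyCol;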

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Plan attached.
