Query Performance with multiple joins

  • I have a query with multiple inner joins and left joins (about 40). When I run the query it takes about 20 seconds to return 3000 records.

    When I get the records without the joins it takes 1 second. When I run the query with all the joins but returning columns from the main table only it also takes 1 second.

    How can I improve the query? Any tricks with joins? or is this a performance cost I have to pay for using joins?

  • No way to tell from the information provided. Do you need all the joins? How many rows in the main table? It's possible you could carve the query up into smaller chunks and get much better performance. IE. If there are 200,000 rows in the "main" table and you know you only need 3000 of them and that query is nearly instant, consider running that first, storing the results and all fields from that table that you need in a temp table and then using the temp table in the query with all the other joins.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Other questions:

    How do you get the rows without the joins?

    How big are the tables you are joining to?

    Are the joined tables indexed?

    What does the query execution plan look like?

    Joins can be very fast and efficient, or very slow, depending on the size of the tables, the indexes in place to support the join, whether or not you are trying to join based on functions or substrings of a column, etc. Lotsa stuff. Do NOT just say to yourself "All JOINs are slow." There's a whole lot more to performance tuning than you suspect.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The tables I'm joining with contain additional information that need to be displayed. The total number records in the the main table is 150,000. I tried, as you suggested, querying the main table first for the records needed from the main table. It still takes a long time to join the result "tmp" table to all the other tables.

  • When I get the records only from the main table, I don't have all the information I need to display.

    The tables I'm joining to are not as big as the main table (between 100 to 8000 records).

    The joined tables are indexed. Primary key is what I join on. Not all tables have index for the column displayed.

    The execution plan shows that each join takes about 2-3% of the time (except for one that takes 7%).

  • Can you post the query and the execution plan, and if possible sample data & structure.

    It's hard to know what the precise problem is without seeing this information. I will say that 40 tables, is getting to the edge of what constitutes a safe operating range in SQL Server. I've seen joins up to 86 tables, but the compile time was over 3 minutes.

    From what you've said, again, no evidence, it sounds like your indexes are not laid out to the query, or the query isn't written to take advantage of existing indexes.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • csillag.yitzik (4/29/2010)


    I have a query with multiple inner joins and left joins (about 40). When I run the query it takes about 20 seconds to return 3000 records.

    When I get the records without the joins it takes 1 second. When I run the query with all the joins but returning columns from the main table only it also takes 1 second.

    How can I improve the query? Any tricks with joins? or is this a performance cost I have to pay for using joins?

    Sounds like an "agressive GET" for Hibernate or some other ORM and Hibernate (for Java) and nHibernate (for C#) have a bit of difficulty writing good code for such things for the same reason humans do... it thinks it has to all be done in a single query which, as you can see, can certainly be a huge performance problem.

    You'll need to analyze the code and find out where the "minimum result set" comes from and isolate those rows in a Temp table. Then, join on the Temp table so the query doesn't have to filter out so many rows from so many tables. I've turned many a 45 minute reporting query a nice gentle <3 second kitty cat using that same method.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • csillag.yitzik (4/29/2010)


    I have a query with multiple inner joins and left joins (about 40). When I run the query it takes about 20 seconds to return 3000 records.

    Can you give us some insight into why so many joins are necessary? It is unlikely that the current structure represents optimal design.

    Just a brief description of the data structures will do.

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply