Alternatives To Left Join: Poor Performance of Procedure

  • Hi,

    I have a question here for a query where the tables involved are not too big but I suspect that the left join in the query is slowing things little. I have 2 tables involved here: Player and PartnerPlayer. The PartnerPlayer has a field to be returned and there is one-on-one relation between these two tables on a column called PlayerId.

    So basically I am trying to get the data from Player table and 1 column from PartnerPlayer table. For the players who have entry in PartnerPlayer, an account number is displayed else NULL is returned for those rows from Player table who does not exist in the PartnerPlayer.

    Please let me know if the query can rewritten given that each of the tables involved have around 100k records.

    Thanks

    Chandan Jha

  • Sounds like a NOT EXISTS, but it's not going to be measurable difference in performance just from changing that.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi

    You have an existing index

    CREATE NONCLUSTERED INDEX [IX_Player_partnerIddtCreated] ON [dbo].[Player]

    (

    [partnerId] ASC,

    [dtCreated] ASC

    )

    INCLUDE ( [playerId])

    If you take a look in your execution plan you'll see a missing index. You can create it, but it's wiser if you could extend the already existing one [IX_Player_partnerIddtCreated] with the following definition:

    CREATE NONCLUSTERED INDEX [IX_Player_partnerIddtCreated]

    ON [dbo].[Player] ([partnerId] ASC,

    [dtCreated] ASC)

    INCLUDE ([playerId],[partnerId],,[alias],[forename],[surname],[firstGame],[firstMoneyGame])

    Tables are not big so you have a performance issue.

    Regards,

    IgorMi

    Igor Micev,My blog: www.igormicev.com

  • And ditch the NOLOCK hint.

    If the table has 100K of records, this looks like it's also returning over 100K of records. Is the WHERE clause not filtering at all?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (12/17/2013)


    And ditch the NOLOCK hint.

    If the table has 100K of records, this looks like it's also returning over 100K of records. Is the WHERE clause not filtering at all?

    Sir, this application has nolock hints everywhere because this is a very very heavy gaming database and this is what the management wants to stick to.

    Also, you are seeing so many records returned because the execution plan that I sent you is from a query for the partner which had most number of players.

    I think a lot of such joins occur because of bad schema during the initial phase and not many predict how the data is going to behave 1 year later. also, in this case any indexing is not helping.

    Thanks

    Chandan

  • Try to execute the procedure without using 'dbo.ConvertTimeZoneFromUTC'. Table 'PartnerPlayer' is missing an index. (Missing a Primary key, there Should be). You can also use NOT EXISTS instead of Left Outer Join.

  • chandan_jha18 (12/17/2013)


    Grant Fritchey (12/17/2013)


    And ditch the NOLOCK hint.

    If the table has 100K of records, this looks like it's also returning over 100K of records. Is the WHERE clause not filtering at all?

    Sir, this application has nolock hints everywhere because this is a very very heavy gaming database and this is what the management wants to stick to.

    Also, you are seeing so many records returned because the execution plan that I sent you is from a query for the partner which had most number of players.

    I think a lot of such joins occur because of bad schema during the initial phase and not many predict how the data is going to behave 1 year later. also, in this case any indexing is not helping.

    Thanks

    Chandan

    And management is fully aware that NOLOCK means that they can return bad data, inaccurate data, missing rows, extra rows, and they're OK with that? It's pretty rare the business that doesn't care if the same query run twice returns completely different results without data changes.

    The thing is, you're only joining a couple of tables. That's pretty normal for a relational database. In fact, joining together 6,10 even 20 tables isn't completely insane on a relational system. The issue is, you're looking at table scans. No matter how you do your joins, table scans are likely to lead to slow performance. So you need to determine how to reduce those scans AND get good performance on your joins and your order by statements. Further, it looks like PartnerPlayer is a heap table. That's likely going to cause a number of issues too.

    But, I'm still back to the amount of data. The table cardinality is 2.7K and this query is returning 1/2 of it. You're probably only ever going to get scans on that level of data retrieval. I get that this is an outlier in terms of the data, but simply retrieving it the same way as other data is going to lead to this situation, too much data going through for it to perform well. You need to filter more. Unless this is an export, no one consumes 100K rows of data. So they're likely doing something like looking at the top 100 or sorting by region or something, build that into the query so that you're filtering better and then you can apply meaningful indexes that help performance. As long as you're returning half the table, you're stuck.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Rewriting the query and/or adding nonclustered indexes won't help, unless you basically rewrite the entire table by including a gazillion in the nonclus index(es) -- and constantly maintain it as another column gets referenced in a query.

    The way to really solve these types of query issues is to first get the best clustered indexes.

    If you (almost) always specify a date range when querying dbo.Player, as in the sample queries/code you posted, change the indexes as scripted below.

    Naturally this may take quite a while, as the tables will have to be written/rewritten during this process. Of course remove the "ONLINE = ON," if that doesn't work on your system.

    ALTER TABLE dbo.PartnerPlayer DROP CONSTRAINT fk_PartnerPlayer_partnerIdPlayerId;

    DROP INDEX IX_Player_partnerIddtCreated ON dbo.Player;

    ALTER TABLE dbo.Player DROP CONSTRAINT PK_Player;

    CREATE CLUSTERED INDEX CL_Player ON dbo.Player ( dtCreated ) WITH ( FILLFACTOR = 99, SORT_IN_TEMPDB = ON ) ON [PRIMARY];

    CREATE UNIQUE CLUSTERED INDEX CL_Player ON dbo.Player ( dtCreated, partnerId, playerId ) WITH ( FILLFACTOR = 99, SORT_IN_TEMPDB = ON ) ON [PRIMARY];

    CREATE UNIQUE NONCLUSTERED INDEX IX_Player_partnerId_playerId ON dbo.Player ( partnerId, playerId ) WITH ( FILLFACTOR = 96, ONLINE = ON, SORT_IN_TEMPDB = ON ) ON [PRIMARY];

    CREATE CLUSTERED INDEX CL_PartnerPlayer ON dbo.PartnerPlayer ( partnerId, playerId ) WITH ( FILLFACTOR = 96, SORT_IN_TEMPDB = ON ) ON [PRIMARY];

    ALTER TABLE [dbo].[PartnerPlayer] WITH CHECK

    ADD CONSTRAINT [fk_PartnerPlayer_partnerIdPlayerId] FOREIGN KEY([partnerId], [playerId])

    REFERENCES [dbo].[Player] ([partnerId], [playerId]);

    ALTER TABLE [dbo].[PartnerPlayer] CHECK CONSTRAINT [fk_PartnerPlayer_partnerIdPlayerId];

    Edit: Changed index definition.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply