Help with SQL Query

  • I need help with a query for performance issue. We are running SQL Server 2008R2. I am developing a query for a report.
    I have a person table and a personPhoneNumber table.

    These table contain many more columns in reality but I am simplifying to narrow to the issue.

    Table: tblPerson
    PersonID int PK
    FirstName VARCHAR(100)
    LastName VARCHAR(100)

    Table: tblPersonPhoneNumber
    PersonPhoneNumberID int PK
    PersonID int
    PhoneNumber VARCHAR(20) 
    IsPrimary BIT
    PhoneNumberType int (1-Home, 2-Business, 3-Cell)
    LastUpdate DATETIME

    The report requirement is for me to pull the Primary Phone Number for each Person record JOINED by PersonID. That should have been simple BUT we have a data issue where
    for any given PersonID in the tblPersonPhoneNumber table, there can be many PhoneNumber records and PhoneNumberTypes. There SHOULD be only one record with the IsPrimary bit set
    to true. But the data issue is that we have more than one record with IsPrimary set to true. (ie. a Cell Phone and a Home record both set to true). So a new requirement was made to get around this and
    that is to chose the IsPrimary record that has the more recent LastUpdate date only. I have tried a few things below but the return results are taking too long. Best time was 35 seconds and worst was like 3 minutes from all the things I tried just to return one Person with Primary Phone. One problem is the tblPersonPhoneNumber contains over 5 million records.

    Here is what I have tried:

    CREATE TABLE #PersonPhoneNumberMaxLastUpdate
    (
     PersonID INT PRIMARY KEY NONCLUSTERED,
     PhoneNumber VARCHAR(20),
     LastUpdate DateTime 
    )
    CREATE INDEX IX_PPNMAX_LastUpdate ON #PersonPhoneNumberMaxLastUpdate(LastUpdate)

    INSERT INTO #PersonPhoneNumberMaxLastUpdate
    (
     PersonID,
     LastUpdate 
    )
    SELECT
     PersonID, 
     Max( LastUpdate )
    FROM tblPersonPhoneNumber
    WHERE IsPrimary = 1
    GROUP BY PersonID

    SELECT
    p.FirstName,
    p.Lastname,
    ppn.PhoneNumber
    FROM tblPerson p
    LEFT JOIN
    (
     SELECT a.PersonID,a.PhoneNumber FROM tblPersonPhoneNumber a
     INNER JOIN #PersonPhoneNumberMaxLastUpdate b ON a.PersonID = b.PersonID AND a.LastUpdate = b.LastUpdate
     WHERE a.IsPrimary = 1
    ) ppn ON ppn.PersonID = p.PersonID
    WHERE p.PersonID = 1

    DROP TABLE #PersonPhoneNumberMaxLastUpdate

    In the above code it was taking too long to load the temp table. .
    I have also tried embedding grouping within the join instead of loading a temp table below:

    SELECT
    p.FirstName,
    p.Lastname,
    ppn.PhoneNumber
    FROM tblPerson p
    LEFT JOIN
    (
     SELECT a.PersonID,a.PhoneNumber FROM tblPersonPhoneNumber a
     INNER JOIN (SELECT PersonID, Max( LastUpdate ) [LastUpdate]
        FROM tblPersonPhoneNumber WHERE IsPrimary = 1
        GROUP BY PersonID ) b ON a.PersonID = b.PersonID AND a.LastUpdate = b.LastUpdate
        WHERE a.IsPrimary = 1
          ) ppn ON ppn.PersonID = p.PersonID
    WHERE p.PersonID = 1

    and finally I tried putting sub select code in a View like this and joining to the view

    CREATE VIEW vwPersonPhoneNumberMaxLastUpdate
    AS
    SELECT a.PersonID,a.PhoneNumber FROM tblPersonPhoneNumber a
    INNER JOIN (SELECT PersonID, Max( LastUpdate ) [LastUpdate]
       FROM tblPersonPhoneNumber WHERE IsPrimary = 1
       GROUP BY PersonID ) b ON a.PersonID = b.PersonID AND a.LastUpdate = b.LastUpdate
       WHERE a.IsPrimary = 1
       

    None of the above options worked in getting the query time down. I would like to get the time down to no more
    than 5 seconds. Fixing the data issue isn't an option right now. So I have to come up with a way to get just one
    of the primary records for any given PersonID. Any help would be much appreciated in coming up with a query
    that can accomplish what I want and not have a big performance hit due to the 5 million records.

  • You could try to get the Primary number first ...

    WITH ctePhoneNumber AS (
    SELECT
      pn.PersonID,
      pn.PhoneNumber,
      rn = ROW_NUMBER() OVER (PARTITION BY BY pn.PersonID ORDER BY pn.LastUpdate DESC)
    FROM tblPersonPhoneNumber AS pn
    -- WHERE IsPrimary = 1
    )
    SELECT
    p.PersonID,
    p.FirstName,
    p.Lastname,
    ppn.PhoneNumber
    FROM tblPerson p
    LEFT JOIN ctePhoneNumber AS ppn
     ON p.PersonID = ppn.PersonID
    WHERE ppn.rn = 1
     OR ppn.rn IS NULL

  • Thank you. I will give this a try and let you know the results.

  • -- Here's the (fairly standard) APPLY version:

    SELECT

    p.PersonID,

    p.FirstName,

    p.Lastname,

    x.PhoneNumber

    FROM tblPerson p

    OUTER APPLY (

    SELECT TOP(1)

    pn.PhoneNumber

    FROM tblPersonPhoneNumber pn

    WHERE pn.PersonID = p.PersonID

    AND pn.IsPrimary = 1

    ORDER BY pn.LastUpdate DESC

    ) x

    -- you'll want an appropriate index for this to work well:

    -- CREATE INDEX ix_Whatever ON tblPersonPhoneNumber (PersonID,LastUpdate) INCLUDE (PhoneNumber)

    If it takes more than a second, post up an actual execution plan as a .sqlplan attachment

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply