Optimizing Query

  • I am trying to optimize the following query which is based on the AdventureWorksDW2008R2 database. Using the StartDate and EndDate fields the query is able to determine who has left the company and has been rehired. Finding out who was rehired was easy by using the query

    SELECT DISTINCT(A.EmployeeKey), A.DepartmentName, A.Title, A.LastName, A.FirstName, A.StartDate, A.EndDate, 'Y' AS Rehired

    FROM dbo.DimEmployee A

    JOIN dbo.DimEmployee B ON

    A.FirstName = B.FirstName AND

    A.LastName = B.LastName AND

    ISNULL(A.EndDate, '01/01/1900') <> ISNULL(B.EndDate, '01/01/1900')

    ORDER BY DepartmentName, Title, LastName, FirstName

    So I combined the above into a new query to indicate who was not rehired along with those who have been rehired. But I am pulling the same record set twice to do this.

    SELECT DISTINCT(EmployeeKey), DepartmentName, Title, LastName, FirstName, StartDate, EndDate, 'N' AS Rehired

    FROM dbo.DimEmployee

    WHERE EmployeeKey NOT IN (

    SELECT DISTINCT(A.EmployeeKey)

    FROM dbo.DimEmployee A

    JOIN dbo.DimEmployee B ON

    A.FirstName = B.FirstName AND

    A.LastName = B.LastName AND

    ISNULL(A.EndDate, '01/01/1900') <> ISNULL(B.EndDate, '01/01/1900')

    )

    UNION ALL

    SELECT DISTINCT(A.EmployeeKey), A.DepartmentName, A.Title, A.LastName, A.FirstName, A.StartDate, A.EndDate, 'Y' AS Rehired

    FROM dbo.DimEmployee A

    JOIN dbo.DimEmployee B ON

    A.FirstName = B.FirstName AND

    A.LastName = B.LastName AND

    ISNULL(A.EndDate, '01/01/1900') <> ISNULL(B.EndDate, '01/01/1900')

    ORDER BY DepartmentName, Title, LastName, FirstName

    Is there a better way to handle this type of query?

    Thanks Steve

  • Couple things first...

    DISTINCT is not a function, take the brackets out (round brackets are allowed around any column name)

    What you have there is completely equivalent to either of these:

    1) SELECT DISTINCT A.EmployeeKey, A.DepartmentName, A.Title, A.LastName, A.FirstName, A.StartDate, A.EndDate, 'Y' AS Rehired

    2) SELECT DISTINCT (A.EmployeeKey), (A.DepartmentName), (A.Title), (A.LastName), (A.FirstName), (A.StartDate), (A.EndDate), ('Y') AS Rehired

    Also Distinct is unnecessary inside an IN subquery

    Now, am I correct in saying that what you want is a list of all employees with a N AS Rehired if they were not rehired and Y if they were rehired?

    What's the exact definition of an employee being rehired? A second row with a different end date for the same name? (is name really unique?)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yes you are correct. The definition of being rehired is that the first and last names are equal. If there are two rows where the names equal and the end dates are different then I am assuming that the employee has been rehired.

    My main goal is to see if there is a better way to do exclusionary selects like this since I have other instances where I need to do a similar query but due to the number of rows in the tables the queries are taking forever to run. I'm trying to simplify the execution plan if possible using the above as an example.

  • slpgma (4/10/2012)


    Yes you are correct. The definition of being rehired is that the first and last names are equal. If there are two rows where the names equal and the end dates are different then I am assuming that the employee has been rehired.

    Not so sure that's a good idea for anything other than practice. If you did that in Bristol, RI, you're code would really be a problem because half of the folk's names there are "Manny Costa". 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If I may suggest... post the query that you are actually working with. I can see a very simple solution to this that may well not work for the real situation you have.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I wish I could post the query but I am not sure I have the right since it is part of a corporate application. The query I am using in the corporate app is identical to the one I posted but the table names and fields have been changed to protect the innocent.

  • Not so sure that's a good idea for anything other than practice. If you did that in Bristol, RI, you're code would really be a problem because half of the folk's names there are "Manny Costa". 😉

    It may be a bad business example but it was only public table I could find to demonstrate what I wanted to do. The whole goal was for me is that I needed to develop a query that would find all rows in a table where two fields were equal and a third field was not.

  • slpgma (4/11/2012)


    Not so sure that's a good idea for anything other than practice. If you did that in Bristol, RI, you're code would really be a problem because half of the folk's names there are "Manny Costa". 😉

    It may be a bad business example but it was only public table I could find to demonstrate what I wanted to do. The whole goal was for me is that I needed to develop a query that would find all rows in a table where two fields were equal and a third field was not.

    Not easy without a proper table/example.

    select *

    from a join b on a.id = b.id

    where a.col1 = b.col1 and a.col2 <> b.col2

    You can mix and match your ANDs and ORs as much as you want.


Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply