Query for non-existent entries

  • Hi,

    I'd like to build a query that produces the missing entries in a scores per week table. Say that you have a table with the following structure:

    ID|Name|noWeek|Score

    1 John 1 2

    2 John 2 3

    3 John 3 1

    4 John 5 7

    ..................

    Let's say the maximum week number is 5 for all possible people in this table. What I'd like is a query that lists the missing entries. Sth like this:

    - John 4 -

    I've got the number of weeks (1-2...5) in a lookup table, if that helps. So I guess we're looking at some kind of CROSS JOIN, or cartesian product, or the like.

    Any ideas of how this could be done? Thanks,

  • Left join the scores table to the weeks table. If you mock up sample data scripts, folks will show you exactly how.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris,

    Thanks for the advice... I tried a left join in the first place, but it didn't work.

    Here's my 2 tables:

    CREATE TABLE [dbo].[lkpWeeks](

    [ID] [numeric](18, 0) IDENTITY(1,1) NOT NULL,

    [WeekNo] [int] NOT NULL,

    [SortOrder] [numeric](18, 0) NULL,

    CONSTRAINT [PK_lkpWeeks] PRIMARY KEY CLUSTERED

    (

    [WeekNo] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    CREATE TABLE [dbo].[tblPAScores](

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [VolunteerID] [varchar](50) NOT NULL,

    [NoWeek] [int] NULL,

    [PA] [numeric](5, 2) NULL,

    CONSTRAINT [PK_tblPAScores] PRIMARY KEY CLUSTERED

    (

    [ID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    The weeks table stores values 1..5. The PAscores table stores the scores for different volunteers. I'd like to have the missing entries in tblPAscores; if volunteer John is missing week 4's score, the query would produce that, asa said above.

    Am I missing something? Thanks

  • a_ud (5/2/2013)


    <snipped>

    The weeks table stores values 1..5. The PAscores table stores the scores for different volunteers. I'd like to have the missing entries in tblPAscores; if volunteer John is missing week 4's score, the query would produce that, asa said above.

    Am I missing something? Thanks

    You may find this easier if you have a People table holding the names of all people.

    Once you have a Weeks table, a People table, and a Scores table, express the result you want in plain language:

    "Person and Week where the Person does not have a Score for that Week."

    That helps make it clear that you want to start with People and Weeks and eliminate PersonID/Week combinations for which Scores exist.

    Two possibilities:

    SELECT p.PersonID, w.WeekNbr

    FROM People p

    CROSS APPLY Weeks w

    EXCEPT

    SELECT s.PersonID, s.WeekNbr

    FROM Scores s

    SELECT p.PersonID, w.WeekNbr

    FROM People p

    CROSS APPLY Weeks w

    WHERE NOT EXISTS (SELECT 1 FROM Scores s WHERE s.PersonID = p.PersonID AND s.WeekNbr = w.WeekNbr)

    Querying for values that do NOT exist in a table is very costly without appropriate indexes. If you have a significant amount of data, you'll probably want to look into indexing your tblPAScores table on VolunteerID and NoWeek.

    BTW, you will find that using the prefix "tbl" on your table names, or "tibbling", as some call it, will provoke visceral negative reactions from some members of the SQL Server community. Including schema metadata ("tbl" in the table name identifies the object as a table) in object names violates ISO standards, which riles up some folks. Also, using "No" to stand for "Number" in object names can be confusing (does it mean "number" or "no"?) - I much prefer using "Number" or "Nbr". I use "Number" whenever feasible because it's unambiguous and easily human-readable. The 128-character limit on identifiers in SQL Server is very generous, so abbreviations are less necessary than they once were.

    Jason Wolfkill

  • Thanks, I'm still getting my head around it, I'm going to try it right now.

    I've never used a CROSS APPLY, but I guess this could be translated (done) also using subqueries which don't rely on a CROSS APPLY, right?

    I'm going to try now. Thnx again

  • a_ud (5/3/2013)


    Thanks, I'm still getting my head around it, I'm going to try it right now.

    I've never used a CROSS APPLY, but I guess this could be translated (done) also using subqueries which don't rely on a CROSS APPLY, right?

    I'm going to try now. Thnx again

    I'd guess that should read CROSS JOIN rather than CROSS APPLY. If you get stuck, post up some sample data (INSERT INTO ... SELECT ...).

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (5/3/2013)


    a_ud (5/3/2013)


    Thanks, I'm still getting my head around it, I'm going to try it right now.

    I've never used a CROSS APPLY, but I guess this could be translated (done) also using subqueries which don't rely on a CROSS APPLY, right?

    I'm going to try now. Thnx again

    I'd guess that should read CROSS JOIN rather than CROSS APPLY. If you get stuck, post up some sample data (INSERT INTO ... SELECT ...).

    Well, CROSS JOIN works, too, but both CROSS APPLY and CROSS JOIN create a Cartesian product of the People and Weeks table, which is what the OP needs - a rowset with rows for each WeekNbr for each PersonID to compare to the Scores table to see which PersonID/WeekNbr combinations do not exist in Scores.

    @a_ud, CROSS APPLY just means "apply every row of this rowset to each row of the current result set". You can use it with a table-valued function to apply the result of the function to each row or with a table, view, or subquery to apply every row from the table, view, or subquery to each row of the result set (a Cartesian product, as I noted above). CROSS APPLY works a little bit like an INNER JOIN - only rows that have a result from the CROSS APPLY will be returned. OUTER APPLY does the same thing as CROSS APPLY, but returns all rows whether or not they have a result from the CROSS APPLY, kind of like a LEFT OUTER JOIN.

    Jason Wolfkill

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply