Count duplicate occurances in relational tables

  • I am fairly new to SQL and pressed for time. I don't think this is difficult to do but I can't seem to work it out myself.

    I have two tables in a one-to-many relationship:

    Table1 contains records in which each key is unique and not duplicated. There are two fields: table1_key and table1_data.

    Table2 contains records with multiple duplicated keys which are also in table1. There are also two fields: table2_key and table2_data. The data here, for example, are record keys for a third table. The same key can occur many times, even hundreds as can the same data, but there are no records in which the key and data are both identical.

    For each unique record key in table1, I would like to return not only the data field from table1 but the count of the number of occurances of that record's key found in table2. I'm pretty sure this is possible, just not sure of the syntax.

    Hope someone can help.

    Thanks

  • Welcome to ssc. There's an excellent article linked in my signature block "please read this" - have a read, it will help you get faster, more accurate answers to your questions.

    What's the relationship between the two tables?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The relationship is via the key (first field) in the first table 'Subjects' and the second field 'SubjectsIKey' in the second.

    CREATE TABLE Subjects

    (

    SubjectsKey TEXT PRIMARY KEY ASC ON CONFLICT REPLACE,

    SubjectsHeading TEXT

    )

    CREATE TABLE Subjects_items

    (

    SubjectsItem INTEGER,

    SubjectsIKey TEXT,

    UNIQUE(SubjectsItem,SubjectsIKey)

    )

    The first field in the first table is unique, no duplicates. There may be many 'items', however, which have the same subject entry and these appear in the second table. Example:

    In the table 'Subjects'

    SubjectsKey SubjectsHeading

    DOGS Dogs

    CATS Cats

    In the table 'Subjects_items'

    SubjectsItem SubjectsIKey

    10 DOGS

    11 DOGS

    12 CATS

    13 DOGS

    What I'd like to do is find out how many occurances of SubjectsIKey in the second table ('Subjects_items') match each SubjectsKey in the first table. The result is two fields: SubjectsKey and a value showing the count. In the example, DOGS 3 and then CATS 1.

    I can do it using two SELECT statements, one to collect all the keys in the first table into an array or memory list then loop through them using another SELECT to count each heading, but my intuition says it should be possible using one single SELECT. The actual data could contain over 100,000 entries in Subjects and at least as many records in Subjects_items.

    Hope this is clear.

    Thanks.

  • dean-hodgson (10/9/2012)


    The relationship is via the key (first field) in the first table 'Subjects' and the second field 'SubjectsIKey' in the second.

    CREATE TABLE Subjects

    (

    SubjectsKey TEXT PRIMARY KEY ASC ON CONFLICT REPLACE,

    SubjectsHeading TEXT

    )

    CREATE TABLE Subjects_items

    (

    SubjectsItem INTEGER,

    SubjectsIKey TEXT,

    UNIQUE(SubjectsItem,SubjectsIKey)

    )

    ...

    Is this really the DDL for the tables? Are you really on SQL Server, version 7 or 2000?

    Here are three different ways to get the results you are looking for, the last one will only work on 2k5 and above:

    SELECT s.SubjectsKey, CountPerKey = ISNULL(d.CountPerKey,0)

    FROM Subjects s

    LEFT JOIN (

    SELECT SubjectsIKey, CountPerKey = COUNT(*)

    FROM Subjects_items

    GROUP BY SubjectsIKey

    ) d ON d.SubjectsIKey = s.SubjectsKey

    SELECT s.SubjectsKey, CountPerKey = COUNT(i.SubjectsIKey)

    FROM Subjects s

    LEFT JOIN Subjects_items i

    ON i.SubjectsIKey = s.SubjectsKey

    GROUP BY s.SubjectsKey

    SELECT s.SubjectsKey, CountPerKey = ISNULL(d.CountPerKey,0)

    FROM Subjects s

    OUTER APPLY (

    SELECT SubjectsIKey, CountPerKey = COUNT(*)

    FROM Subjects_items i

    WHERE i.SubjectsIKey = s.SubjectsKey

    GROUP BY SubjectsIKey

    ) d


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Thank you for that. I presented pseudocode rather than actual as I felt that would be sufficient. The actual statements are more complex and involve additional fields.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply