SQL Query to select one row and fill missing columns from another matching row.

  • I have a table with the following structure and need to retrieve only one record from it:

    Table: Person

    Id, Name, MartialStatus, EmploymentStatus, Email, JobId
    1, John, 8, 6, John@xxx.com, 99
    2, John, 10, 7, John@xxx.com, 100
    3, John, NULL, NULL, John@xxx.com, 101
    4, Max, 6, 5, max@emailreaction.org, 102

    Table: Job

    JobId, GroupId, Desc
    99, 50, "blah blah"
    100, 50, "blah blah"
    101, 50, "blah blah"
    102, 51, "blah blah"

    I want to retrieve a record from the Person table based on the group id.

    For example, for GroupId 50 I want the following record, which is a mix of records 3 and 2:

    ------------------------------------
    3, John, 10, 7, John@xxx.com, 101
    ------------------------------------

    Record 3 should be retrieved with the marital status and employment status taken from record 2, because they are NULL in record 3.

    Any help will be highly appreciated.

  • This works for the test data. If the second-latest row has NULLs too, it picks up the values from the third-latest row.

    --===== TEST DATA =========
    declare @Person table
        (Id int, Name varchar(20), MartialStatus int, EmploymentStatus int, Email varchar(50), JobId int);

    INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
    VALUES
        (1, 'John', 8, 6, 'John@xxx.com', 99),
        (2, 'John', 10, 7, 'John@xxx.com', 100),
        (3, 'John', NULL, NULL, 'John@xxx.com', 101),
        (4, 'Max', 6, 5, 'max@emailreaction.org', 102);
    --select * from @Person

    declare @Job table
        (JobId int, GroupId int, [Desc] varchar(30));

    INSERT INTO @Job (JobId, GroupId, [Desc])
    VALUES
        (99, 50, 'blah blah'),
        (100, 50, 'blah blah'),
        (101, 50, 'blah blah'),
        (102, 51, 'blah blah');
    --select * from @Job

    --I want to retrieve a record from the person table based on the group id...
    --for eg. for Group Id 50 I want the following record which is mix of record 3 and 2 as:
    --------------------------------------
    --3, John,10,7,John@xxx.com,101
    ---------------------------------------
    --Record num 3 should be retrieved with the marital status and employment status from record two as they were null.

    --====== SUGGESTED SOLUTION =========
    declare @GroupId int = 50;

    with CTE as
    (
        -- Number each person's rows within the group, latest JobId first.
        select
            RowId = ROW_NUMBER() over (partition by p.Name order by p.JobId desc),
            p.Id,
            p.Name,
            p.MartialStatus,
            p.EmploymentStatus,
            p.Email,
            p.JobId
        from @Job j
        inner join @Person p on j.JobId = p.JobId
        where j.GroupId = @GroupId
    )
    -- Start from the latest row (a) and fill its NULLs from the two rows before it (b, c).
    select a.Id,
           a.Name,
           MartialStatus = coalesce(a.MartialStatus, b.MartialStatus, c.MartialStatus),
           EmploymentStatus = coalesce(a.EmploymentStatus, b.EmploymentStatus, c.EmploymentStatus),
           a.Email,
           a.JobId
    from CTE a
    -- Match on Name as well, because RowId restarts at 1 for each Name.
    left outer join CTE b on b.Name = a.Name and b.RowId = a.RowId + 1
    left outer join CTE c on c.Name = a.Name and c.RowId = a.RowId + 2
    where a.RowId = 1;
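
    The fallback in the query above rests on COALESCE returning its first non-NULL argument. A minimal sketch with literal values standing in for the three rows (the column aliases are made up for illustration):

    ```sql
    -- Each call mimics coalesce(a.col, b.col, c.col) for rows a (latest), b and c.
    SELECT LatestRowWins  = COALESCE(8, 10, 6),          -- latest row has a value
           SecondRowWins  = COALESCE(NULL, 10, 6),       -- latest is NULL, next row fills it
           ThirdRowWins   = COALESCE(NULL, NULL, 6),     -- two NULLs, third row fills it
           -- All three rows NULL: nothing is left to fall back on, so the result is NULL.
           -- The CAST is needed because COALESCE rejects a list made only of untyped NULL constants.
           NothingToFill  = COALESCE(NULL, NULL, CAST(NULL AS int));
    ```

    The last column shows the limit of this approach: if the three latest rows all hold NULL, a fourth row is never consulted.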

  • Record 1 is also part of the same group as records 2 and 3 (GroupId 50), and you didn't say how you want it handled (unless ignoring it, as you did, is what you want).

    This approach may be a little simpler (it uses Laurie's setup data):

    SELECT Id=MAX(Id), Name=MAX(Name)
        ,MartialStatus=MAX(MartialStatus)
        ,EmploymentStatus=MAX(EmploymentStatus)
        ,Email=MAX(Email)
        ,JobId=MAX(a.JobId)
    FROM @Person a
    INNER JOIN @Job b ON a.JobId = b.JobId
    GROUP BY b.GroupId


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
    Since random numbers are too important to be left to chance, let's generate some!
    Learn to understand recursive CTEs by example.
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks, Laurie and dwain.c.

    dwain.c, I like your approach because it's a bit simpler, but could you please explain it so that it makes more sense to me? I always thought that the MAX() function was only used for numbers.

    I did a quick test and both of them seem to work, but unfortunately I can't use SQL 2008. Sorry, I didn't mention this earlier.

    Regards,

    kk

  • Record 3 should be retrieved with the marital status and employment status taken from record 2, because they are NULL in record 3.

    I read this as record 3 being later than record 2.

    The marital status and employment status happen to ascend in the sample data. If they don't, MAX() may return a value from any record, not necessarily the latest. The same goes for the other columns: for example, the email address may change to something that sorts lower, such as "john@aaa.com". So selecting MAX() may not return the latest values.
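
    A small sketch of that failure, using made-up data in which the latest row (Id 3) holds a lower marital status than an older one. The table is cut down to the columns that matter, and the inserts use UNION ALL so that it also runs on SQL 2005:

    ```sql
    -- Hypothetical data: the statuses no longer ascend with Id.
    DECLARE @Person TABLE (Id int, MartialStatus int, EmploymentStatus int, Email varchar(50));

    INSERT INTO @Person (Id, MartialStatus, EmploymentStatus, Email)
    SELECT 1, 8, 6, 'John@xxx.com'
    UNION ALL SELECT 2, 10, 7, 'John@xxx.com'
    UNION ALL SELECT 3, 4, NULL, 'john@aaa.com';

    SELECT Id               = MAX(Id),
           MartialStatus    = MAX(MartialStatus),     -- 10, taken from Id 2, not the 4 stored on Id 3
           EmploymentStatus = MAX(EmploymentStatus),  -- 7, which here happens to be the latest non-NULL value
           Email            = MAX(Email)              -- whichever address sorts highest under the collation in use
    FROM @Person;
    ```

    The result row claims to be Id 3 but carries a marital status that record 3 never had.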

  • Also, is it possible to automatically return all columns in the table, or do we need to explicitly specify every column in the SELECT query?

  • Unfortunately I can't use SQL 2008. Sorry, I didn't mention this earlier.

    What version are you using?

  • kk 93815 (10/2/2012)


    Also, is it possible to automatically return all columns in the table, or do we need to explicitly specify every column in the SELECT query?

    You should always specify the columns you want. If more columns are later added to the table, SELECT * will change your query results.

    It's also easier and clearer to read.
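
    As a rough illustration, here are the two styles side by side on a cut-down version of the thread's tables (one row each, just enough to run):

    ```sql
    DECLARE @Person TABLE (Id int, Name varchar(20), Email varchar(50), JobId int);
    DECLARE @Job TABLE (JobId int, GroupId int, [Desc] varchar(30));

    INSERT INTO @Person (Id, Name, Email, JobId) VALUES (3, 'John', 'John@xxx.com', 101);
    INSERT INTO @Job (JobId, GroupId, [Desc]) VALUES (101, 50, 'blah blah');

    -- Avoid: returns every column of both tables (JobId appears twice),
    -- and the result shape changes whenever a column is added to either table.
    -- SELECT * FROM @Person p INNER JOIN @Job j ON j.JobId = p.JobId;

    -- Prefer: name exactly the columns the caller needs.
    SELECT p.Id, p.Name, p.Email, p.JobId, j.GroupId
    FROM @Person p
    INNER JOIN @Job j ON j.JobId = p.JobId;
    ```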

  • Thanks, Laurie. We are using 2005.

  • kk 93815 (10/2/2012)


    Thanks, Laurie and dwain.c.

    dwain.c, I like your approach because it's a bit simpler, but could you please explain it so that it makes more sense to me? I always thought that the MAX() function was only used for numbers.

    I did a quick test and both of them seem to work, but unfortunately I can't use SQL 2008. Sorry, I didn't mention this earlier.

    Regards,

    kk

    Laurie's answer is correct on MAX. Remember that character strings also have an inherent collation sequence, which is used to resolve MAX/MIN.

    Laurie's double LEFT JOIN with COALESCE may be better at resolving multiple-record ties. Unfortunately, I don't think it will work if there happen to be 4 records that are all tied, because it only looks back two rows.

    I meant to say originally that you should probably try both approaches across a wider range of test data to see which works better for you. Often when you do that, you'll find there are issues, and then you can repost those cases so someone can suggest how best to handle them.
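
    To make the collation point concrete, here is a small sketch that applies MAX to the same two strings under two explicit collations. The table variable and column aliases are made up for the example; the collation names are standard SQL Server ones:

    ```sql
    DECLARE @Emails TABLE (Email varchar(50));

    INSERT INTO @Emails (Email)
    SELECT 'John@xxx.com'
    UNION ALL SELECT 'john@aaa.com';

    SELECT
        -- Case-insensitive: 'John@' and 'john@' compare equal, so 'xxx' beats 'aaa'.
        CaseInsensitiveMax = MAX(Email COLLATE Latin1_General_CI_AS),  -- John@xxx.com
        -- Binary: lowercase 'j' has a higher code than uppercase 'J', so it wins on the first character.
        BinaryMax          = MAX(Email COLLATE Latin1_General_BIN)     -- john@aaa.com
    FROM @Emails;
    ```

    Without a COLLATE clause, MAX uses the collation of the column, so the same data can give different answers on differently configured databases.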



  • kk 93815 (10/2/2012)


    Thanks, Laurie. We are using 2005.

    I think both solutions should run on SQL 2005.



  • This will work in 2005. Only the test data inserts and the variable declaration have changed.

    Please note: You should post on a 2005 forum for 2005 answers 🙂

    --===== TEST DATA =========
    declare @Person table
        (Id int, Name varchar(20), MartialStatus int, EmploymentStatus int, Email varchar(50), JobId int);

    -- One INSERT per row: multi-row VALUES lists need SQL Server 2008 or later.
    INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
    VALUES (1, 'John', 8, 6, 'John@xxx.com', 99);

    INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
    VALUES (2, 'John', 10, 7, 'John@xxx.com', 100);

    INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
    VALUES (3, 'John', NULL, NULL, 'John@xxx.com', 101);

    INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
    VALUES (4, 'Max', 6, 5, 'max@emailreaction.org', 102);
    --select * from @Person

    declare @Job table
        (JobId int, GroupId int, [Desc] varchar(30));

    INSERT INTO @Job (JobId, GroupId, [Desc])
    VALUES (99, 50, 'blah blah');

    INSERT INTO @Job (JobId, GroupId, [Desc])
    VALUES (100, 50, 'blah blah');

    INSERT INTO @Job (JobId, GroupId, [Desc])
    VALUES (101, 50, 'blah blah');

    INSERT INTO @Job (JobId, GroupId, [Desc])
    VALUES (102, 51, 'blah blah');
    --select * from @Job

    --I want to retrieve a record from the person table based on the group id...
    --for eg. for Group Id 50 I want the following record which is mix of record 3 and 2 as:
    --------------------------------------
    --3, John,10,7,John@xxx.com,101
    ---------------------------------------
    --Record num 3 should be retrieved with the marital status and employment status from record two as they were null.

    --====== SUGGESTED SOLUTION =========
    -- Declare and assign separately: DECLARE with an initializer needs SQL Server 2008 or later.
    declare @GroupId int;
    set @GroupId = 50;

    with CTE as
    (
        -- Number each person's rows within the group, latest JobId first.
        select
            RowId = ROW_NUMBER() over (partition by p.Name order by p.JobId desc),
            p.Id,
            p.Name,
            p.MartialStatus,
            p.EmploymentStatus,
            p.Email,
            p.JobId
        from @Job j
        inner join @Person p on j.JobId = p.JobId
        where j.GroupId = @GroupId
    )
    -- Start from the latest row (a) and fill its NULLs from the two rows before it (b, c).
    select a.Id,
           a.Name,
           MartialStatus = coalesce(a.MartialStatus, b.MartialStatus, c.MartialStatus),
           EmploymentStatus = coalesce(a.EmploymentStatus, b.EmploymentStatus, c.EmploymentStatus),
           a.Email,
           a.JobId
    from CTE a
    -- Match on Name as well, because RowId restarts at 1 for each Name.
    left outer join CTE b on b.Name = a.Name and b.RowId = a.RowId + 1
    left outer join CTE c on c.Name = a.Name and c.RowId = a.RowId + 2
    where a.RowId = 1;

  • dwain.c (10/2/2012)


    kk 93815 (10/2/2012)


    Thanks, Laurie. We are using 2005.

    I think both solutions should run on SQL 2005.

    Oops! I didn't look at the setup data.



  • Question to the OP: can the name change? If yes, you cannot partition by name.

    The numbers for MartialStatus and EmploymentStatus can change from higher to lower, so you cannot really use MAX(). I believe what you really want to see is the latest non-null value for each of them.

    Can Email be NULL? If so, then there should be some logic for Email as well.
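
    One way to get "the latest non-null value" per column, sketched here as an alternative rather than as anyone's posted solution, is a correlated TOP 1 subquery for each fillable column. It assumes that a higher Id means a later record (nothing in the thread confirms that yet) and selects by group only, so it does not depend on the name. It should run on SQL 2005:

    ```sql
    DECLARE @Person TABLE (Id int, Name varchar(20), MartialStatus int, EmploymentStatus int, Email varchar(50), JobId int);
    DECLARE @Job TABLE (JobId int, GroupId int);

    INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
    SELECT 1, 'John', 8, 6, 'John@xxx.com', 99
    UNION ALL SELECT 2, 'John', 10, 7, 'John@xxx.com', 100
    UNION ALL SELECT 3, 'John', NULL, NULL, 'John@xxx.com', 101
    UNION ALL SELECT 4, 'Max', 6, 5, 'max@emailreaction.org', 102;

    INSERT INTO @Job (JobId, GroupId)
    SELECT 99, 50
    UNION ALL SELECT 100, 50
    UNION ALL SELECT 101, 50
    UNION ALL SELECT 102, 51;

    DECLARE @GroupId int;
    SET @GroupId = 50;

    -- Outer query: the latest row of the group (assumption: highest Id = latest).
    -- Each subquery: the latest row of the same group where that column is not NULL.
    SELECT TOP 1
           p.Id,
           p.Name,
           MartialStatus    = (SELECT TOP 1 p2.MartialStatus
                               FROM @Person p2
                               INNER JOIN @Job j2 ON j2.JobId = p2.JobId
                               WHERE j2.GroupId = @GroupId AND p2.MartialStatus IS NOT NULL
                               ORDER BY p2.Id DESC),
           EmploymentStatus = (SELECT TOP 1 p2.EmploymentStatus
                               FROM @Person p2
                               INNER JOIN @Job j2 ON j2.JobId = p2.JobId
                               WHERE j2.GroupId = @GroupId AND p2.EmploymentStatus IS NOT NULL
                               ORDER BY p2.Id DESC),
           Email            = (SELECT TOP 1 p2.Email
                               FROM @Person p2
                               INNER JOIN @Job j2 ON j2.JobId = p2.JobId
                               WHERE j2.GroupId = @GroupId AND p2.Email IS NOT NULL
                               ORDER BY p2.Id DESC),
           p.JobId
    FROM @Person p
    INNER JOIN @Job j ON j.JobId = p.JobId
    WHERE j.GroupId = @GroupId
    ORDER BY p.Id DESC;
    -- For the sample data and GroupId 50 this returns: 3, John, 10, 7, John@xxx.com, 101
    ```

    Unlike MAX(), this does not care whether the values ascend, and unlike a fixed number of joins it looks back as far as needed. The cost is one extra lookup per column.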

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help

  • I thought it might be interesting to approach this with a recursive CTE, such as the following. Try this setup data:

    declare @Person table
        (Id int IDENTITY, Name varchar(20), MartialStatus int
        ,EmploymentStatus int, Email varchar(50), JobId int );

    INSERT INTO @Person ( Name, MartialStatus, EmploymentStatus, Email, JobId )
    SELECT 'John', 8, 6, 'John@xxx.com', 99
    UNION ALL SELECT 'John', 10, 7, 'John@xxx.com', 100
    UNION ALL SELECT 'John', NULL, NULL, 'John@xxx.com', 101
    UNION ALL SELECT 'Max', 6, 5, 'max@emailreaction.org', 102
    UNION ALL SELECT 'Dwain', NULL, 5, 'dwain@ssc.com', 103
    UNION ALL SELECT 'Dwain', 6, 2, 'dwain@ssc.com', 104
    UNION ALL SELECT 'Dwain', NULL, 10, 'dwain@yahoo.com', 105
    --select * from @Person

    declare @Job table (JobId int, GroupId int, [Desc] varchar(30) );

    INSERT INTO @Job ( JobId, GroupId, [Desc] )
    SELECT 99, 50, 'blah blah'
    UNION ALL SELECT 100, 50, 'blah blah'
    UNION ALL SELECT 101, 50, 'blah blah'
    UNION ALL SELECT 102, 51, 'blah blah'
    UNION ALL SELECT 103, 52, 'blah blah'
    UNION ALL SELECT 104, 52, 'blah blah'
    UNION ALL SELECT 105, 52, 'blah blah'

    And this script:

    ;WITH Grouper AS (
        -- Number the rows of each group, latest Id first.
        SELECT Id, Name, MartialStatus, EmploymentStatus, Email, a.JobId, GroupId
            ,n=ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY Id DESC)
        FROM @Person a
        INNER JOIN @Job b ON a.JobId = b.JobId
    ),
    PickLast AS (
        -- Anchor: the latest row of each group.
        SELECT Id, Name, MartialStatus, EmploymentStatus, Email, JobId, GroupId, n
        FROM Grouper
        WHERE n = 1
        UNION ALL
        -- Recursive step: while any of the three columns is still NULL, step back
        -- one row and fill the gaps from it. Id, Name and JobId are carried over
        -- from the latest row (b) so the result is still record 3 in the example.
        SELECT b.Id, b.Name
            ,ISNULL(b.MartialStatus, a.MartialStatus)
            ,ISNULL(b.EmploymentStatus, a.EmploymentStatus)
            ,ISNULL(b.Email, a.Email)
            ,b.JobId, a.GroupId, a.n
        FROM Grouper a
        INNER JOIN PickLast b ON a.GroupId = b.GroupId AND
            a.n = b.n + 1
        WHERE b.MartialStatus IS NULL OR b.EmploymentStatus IS NULL OR b.Email IS NULL
    )
    -- Keep only the deepest (most filled-in) row per group.
    SELECT Id, Name, MartialStatus, EmploymentStatus, Email, JobId, GroupId
    FROM (
        SELECT Id, Name, MartialStatus, EmploymentStatus, Email, JobId, GroupId
            ,n=ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY n DESC)
        FROM PickLast) a
    WHERE n = 1

    The PickLast rCTE steps back through the records, picking up the latest non-null value for each of the 3 fields: email, marital status (which is misspelled as MartialStatus, by the way) and employment status. So now it doesn't matter how many records there are, only that there is a unique identifier that specifies the order of insertion (I made Id an IDENTITY column for that reason).


