SQL Query to select one row and fill missing columns from another matching row.


kk 93815

SSC-Addicted

Points: 418
October 2, 2012 at 2:52 am

#282501

I have two tables with the following structure and need to retrieve only one record from the Person table:

Table: Person
Id, Name, MartialStatus, EmploymentStatus, Email, JobId
1, John, 8, 6, John@xxx.com, 99
2, John, 10, 7, John@xxx.com, 100
3, John, NULL, NULL, John@xxx.com, 101
4, Max, 6, 5, max@emailreaction.org, 102

Second table:
JobId, GroupId, Desc
99, 50, "blah blah"
100, 50, "blah blah"
101, 50, "blah blah"
102, 51, "blah blah"

I want to retrieve a record from the Person table based on the group id.
For example, for GroupId 50 I want the following record, which is a mix of records 3 and 2:
------------------------------------
3, John, 10, 7, John@xxx.com, 101
-------------------------------------
Record 3 should be retrieved with the marital status and employment status taken from record 2, because they are NULL in record 3.
Any help will be highly appreciated.


laurie-789651 SSCertifiable Points: 7680 · Answer 1

This works for the test data. If the second-most-recent row also has NULLs, it picks up the values from the third-most-recent row.

--===== TEST DATA =========
declare @Person table
    (Id int, Name varchar(20), MartialStatus int, EmploymentStatus int, Email varchar(50), JobId int);

INSERT INTO @Person
    (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
VALUES
    (1, 'John', 8, 6, 'John@xxx.com', 99),
    (2, 'John', 10, 7, 'John@xxx.com', 100),
    (3, 'John', NULL, NULL, 'John@xxx.com', 101),
    (4, 'Max', 6, 5, 'max@emailreaction.org', 102);

--select * from @Person

declare @Job table
    (JobId int, GroupId int, [Desc] varchar(30));

INSERT INTO @Job
    (JobId, GroupId, [Desc])
VALUES
    (99, 50, 'blah blah'),
    (100, 50, 'blah blah'),
    (101, 50, 'blah blah'),
    (102, 51, 'blah blah');

--select * from @Job

--I want to retrieve a record from the person table based on the group id...
--for eg. for Group Id 50 I want the following record which is mix of record 3 and 2 as:
--------------------------------------
--3, John,10,7,John@xxx.com,101
---------------------------------------
--Record num 3 should be retrieved with the marital status and employment status from record two as they were null.

--====== SUGGESTED SOLUTION =========
declare @GroupId int = 50;

-- Number each person's rows within the group, latest JobId first
with CTE as
(
    select
        RowId = ROW_NUMBER() over (partition by p.Name order by p.JobId desc),
        p.Id,
        p.Name,
        p.MartialStatus,
        p.EmploymentStatus,
        p.Email,
        p.JobId
    from @Job j
    inner join @Person p on j.JobId = p.JobId
    where j.GroupId = @GroupId
)
-- Take the latest row and fill its NULLs from the next two older rows
select a.Id,
    a.Name,
    MartialStatus = coalesce(a.MartialStatus, b.MartialStatus, c.MartialStatus),
    EmploymentStatus = coalesce(a.EmploymentStatus, b.EmploymentStatus, c.EmploymentStatus),
    a.Email,
    a.JobId
from CTE a
-- RowId restarts for each Name, so match on Name as well as RowId
left outer join CTE b on a.Name = b.Name and a.RowId + 1 = b.RowId
left outer join CTE c on a.Name = c.Name and a.RowId + 2 = c.RowId
where a.RowId = 1;

Dwain Camps SSC Guru Points: 86908 · Answer 2

Record 1 is also part of the group defined for 2 and 3 (groupid 50) and you didn't say how you wanted that to be handled (unless ignoring as you did is what you want).

This approach may be a little simpler (it uses Laurie's setup data):

SELECT Id=MAX(Id), Name=MAX(Name)
    ,MartialStatus=MAX(MartialStatus)
    ,EmploymentStatus=MAX(EmploymentStatus)
    ,Email=MAX(Email)
    ,JobId=MAX(a.JobId)
FROM @Person a
INNER JOIN @Job b ON a.JobId = b.JobId
GROUP BY b.GroupId

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.

kk 93815 SSC-Addicted Points: 418 · Answer 3

Thanks Laurie and dwain.c.

dwain.c, I like your approach as it's a bit simpler, but could you please give some explanation so that it makes more sense to me? I always thought that the MAX() function was used for numbers.

I did a quick test and both of them seem to work, but unfortunately I can't use SQL 2008. Sorry I didn't mention this earlier.

Regards,

kk

laurie-789651 SSCertifiable Points: 7680 · Answer 4

Record num 3 should be retrieved with the marital status and employment status from record two as they were null.

I read this as 3 being a later record than 2.

The marital status & employment status do ascend in the sample data, but if they don't, selecting MAX() may get a value from any record, not necessarily the latest. Same goes for the other columns e.g. email address may change to something which sorts lower e.g. "john@aaa.com". So selecting MAX() may not get the latest results.
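To make that concrete, here is a minimal sketch with made-up data (the values below are my assumption, not the OP's) where the status drops and the email changes to something that sorts lower. It is written to run on SQL 2005.

```sql
-- Hypothetical data: in the later rows the status goes down and the email sorts lower
DECLARE @Person TABLE (Id int, Name varchar(20), MartialStatus int, Email varchar(50));

INSERT INTO @Person (Id, Name, MartialStatus, Email)
SELECT 1, 'John', 10, 'john@xxx.com'
UNION ALL SELECT 2, 'John', 4, 'john@aaa.com'
UNION ALL SELECT 3, 'John', NULL, 'john@aaa.com';

-- MAX() is evaluated per column, so each value can come from a different row
SELECT Id = MAX(Id),
    MartialStatus = MAX(MartialStatus),
    Email = MAX(Email)
FROM @Person
GROUP BY Name;
-- Returns: 3, 10, john@xxx.com
-- The latest non-NULL status is 4 and the latest email is john@aaa.com
```

So the row looks plausible, but the status and email both come from the oldest record.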

kk 93815 SSC-Addicted Points: 418 · Answer 5

Also, is it possible to automatically return all columns in the table, or do we need to explicitly specify all the columns in a select query?

laurie-789651 SSCertifiable Points: 7680 · Answer 6

unfortunately I can't use SQL 2008, sorry didn't mention this earlier.

What version are you using?

laurie-789651 SSCertifiable Points: 7680 · Answer 7

kk 93815 (10/2/2012)
Also, is it possible to automatically return all columns in the table, or do we need to explicitly specify all the columns in a select query?

You should always specify the columns you want. If you use SELECT * and more columns are later added to the table, your query results will change.

It's also easier & clearer to read.

kk 93815 SSC-Addicted Points: 418 · Answer 8

October 2, 2012 at 4:55 am

#1544304

Thanks Laurie. We are using SQL 2005.

Dwain Camps SSC Guru Points: 86908 · Answer 9

kk 93815 (10/2/2012)
Thanks Laurie and dwain.c.
dwain.c, I like your approach as it's a bit simpler, but could you please give some explanation so that it makes more sense to me? I always thought that the MAX() function was used for numbers.
I did a quick test and both of them seem to work, but unfortunately I can't use SQL 2008. Sorry I didn't mention this earlier.
Regards,
kk

Laurie's answer is correct on MAX. Remember that character strings also have an inherent collation sequence, which is used to resolve MAX/MIN.
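As a small illustration of how the collation decides MAX on strings, here is a sketch with two made-up words. The two collations named are standard SQL Server collations; the derived table and column aliases are just for the example.

```sql
-- The same two strings, compared under two different collations
SELECT DictionaryMax = MAX(Word COLLATE Latin1_General_CI_AS),
    BinaryMax = MAX(Word COLLATE Latin1_General_BIN)
FROM (
    SELECT 'apple' AS Word
    UNION ALL SELECT 'Banana'
) AS w;
-- DictionaryMax = 'Banana' (dictionary order: a sorts before B)
-- BinaryMax     = 'apple'  (byte order: uppercase 'B' sorts before lowercase 'a')
```

Without a COLLATE clause, MAX uses the collation of the column, which is why the same query can pick a different "maximum" name or email on a differently configured database.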

Laurie's double LEFT JOIN with COALESCE may be better at resolving ties across multiple records. Unfortunately, I don't think it will work if there happen to be 4 records that are all tied.
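Here is a minimal sketch of that limit, reading "tied" as several of the most recent rows all being NULL. The four-row data set is hypothetical, and the query is a cut-down version of the double-join pattern (one status column, no Job table) rather than Laurie's exact script.

```sql
-- Hypothetical data: only the oldest of four rows has a value
DECLARE @Person TABLE (Id int, Name varchar(20), MartialStatus int, JobId int);

INSERT INTO @Person (Id, Name, MartialStatus, JobId)
SELECT 1, 'John', 8, 99
UNION ALL SELECT 2, 'John', NULL, 100
UNION ALL SELECT 3, 'John', NULL, 101
UNION ALL SELECT 4, 'John', NULL, 102;

WITH CTE AS
(
    SELECT RowId = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY JobId DESC),
        Id, Name, MartialStatus
    FROM @Person
)
SELECT a.Id,
    a.Name,
    MartialStatus = COALESCE(a.MartialStatus, b.MartialStatus, c.MartialStatus)
FROM CTE a
LEFT OUTER JOIN CTE b ON b.Name = a.Name AND b.RowId = a.RowId + 1
LEFT OUTER JOIN CTE c ON c.Name = a.Name AND c.RowId = a.RowId + 2
WHERE a.RowId = 1;
-- Returns: 4, John, NULL
-- The value 8 sits at RowId 4, and the two joins only reach RowId 2 and 3
```

Each extra level of look-back needs another join, so the pattern only works when you know the maximum depth in advance.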

I meant to say originally that you probably should try both approaches across a wider range of test data to see which works better for you. Often when you do that, you'll find there are issues and then you can repost those cases so someone can suggest how to best handle them.


Dwain Camps SSC Guru Points: 86908 · Answer 10

kk 93815 (10/2/2012)
Thanks Laurie. We are using SQL 2005.

I think both solutions should run on SQL 2005.


laurie-789651 SSCertifiable Points: 7680 · Answer 11

This will work in 2005. Only the test data has changed (& variable declaration).

Please note: You should post on a 2005 forum for 2005 answers 🙂

--===== TEST DATA =========
declare @Person table
    (Id int, Name varchar(20), MartialStatus int, EmploymentStatus int, Email varchar(50), JobId int);

-- SQL 2005 has no multi-row VALUES, so each row gets its own INSERT
INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
VALUES (1, 'John', 8, 6, 'John@xxx.com', 99);

INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
VALUES (2, 'John', 10, 7, 'John@xxx.com', 100);

INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
VALUES (3, 'John', NULL, NULL, 'John@xxx.com', 101);

INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
VALUES (4, 'Max', 6, 5, 'max@emailreaction.org', 102);

--select * from @Person

declare @Job table
    (JobId int, GroupId int, [Desc] varchar(30));

INSERT INTO @Job (JobId, GroupId, [Desc])
VALUES (99, 50, 'blah blah');

INSERT INTO @Job (JobId, GroupId, [Desc])
VALUES (100, 50, 'blah blah');

INSERT INTO @Job (JobId, GroupId, [Desc])
VALUES (101, 50, 'blah blah');

INSERT INTO @Job (JobId, GroupId, [Desc])
VALUES (102, 51, 'blah blah');

--select * from @Job

--I want to retrieve a record from the person table based on the group id...
--for eg. for Group Id 50 I want the following record which is mix of record 3 and 2 as:
--------------------------------------
--3, John,10,7,John@xxx.com,101
---------------------------------------
--Record num 3 should be retrieved with the marital status and employment status from record two as they were null.

--====== SUGGESTED SOLUTION =========
-- SQL 2005 cannot initialise a variable in DECLARE, so SET it separately
declare @GroupId int;
set @GroupId = 50;

-- Number each person's rows within the group, latest JobId first
with CTE as
(
    select
        RowId = ROW_NUMBER() over (partition by p.Name order by p.JobId desc),
        p.Id,
        p.Name,
        p.MartialStatus,
        p.EmploymentStatus,
        p.Email,
        p.JobId
    from @Job j
    inner join @Person p on j.JobId = p.JobId
    where j.GroupId = @GroupId
)
-- Take the latest row and fill its NULLs from the next two older rows
select a.Id,
    a.Name,
    MartialStatus = coalesce(a.MartialStatus, b.MartialStatus, c.MartialStatus),
    EmploymentStatus = coalesce(a.EmploymentStatus, b.EmploymentStatus, c.EmploymentStatus),
    a.Email,
    a.JobId
from CTE a
-- RowId restarts for each Name, so match on Name as well as RowId
left outer join CTE b on a.Name = b.Name and a.RowId + 1 = b.RowId
left outer join CTE c on a.Name = c.Name and a.RowId + 2 = c.RowId
where a.RowId = 1;

Dwain Camps SSC Guru Points: 86908 · Answer 12

dwain.c (10/2/2012)
kk 93815 (10/2/2012)
Thanks Laurie. We are using SQL 2005.
I think both solutions should run on SQL 2005.

Didn't look at the setup data.


Eugene Elutin SSC Guru Points: 59322 · Answer 13

Question to OP: Can the name change? If yes, you cannot partition by name.

The numbers for MartialStatus and EmploymentStatus can change from higher to lower, so you cannot really use MAX(). I believe what you really want to see is the latest non-null value for each of them.

Can Email be NULL? If so, then there should be some logic for email as well.
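One way to get "the latest non-null value" for each column independently is a TOP 1 subquery per column. This is only a sketch under my own assumptions: Id reflects insertion order, and each group holds a single person (nothing is partitioned by Name). It uses the OP's sample data and should run on SQL 2005.

```sql
DECLARE @Person TABLE
    (Id int, Name varchar(20), MartialStatus int, EmploymentStatus int, Email varchar(50), JobId int);
DECLARE @Job TABLE (JobId int, GroupId int, [Desc] varchar(30));

INSERT INTO @Person (Id, Name, MartialStatus, EmploymentStatus, Email, JobId)
SELECT 1, 'John', 8, 6, 'John@xxx.com', 99
UNION ALL SELECT 2, 'John', 10, 7, 'John@xxx.com', 100
UNION ALL SELECT 3, 'John', NULL, NULL, 'John@xxx.com', 101
UNION ALL SELECT 4, 'Max', 6, 5, 'max@emailreaction.org', 102;

INSERT INTO @Job (JobId, GroupId, [Desc])
SELECT 99, 50, 'blah blah'
UNION ALL SELECT 100, 50, 'blah blah'
UNION ALL SELECT 101, 50, 'blah blah'
UNION ALL SELECT 102, 51, 'blah blah';

DECLARE @GroupId int;
SET @GroupId = 50;

-- Outer query: the latest row in the group (highest Id)
SELECT TOP 1
    p.Id,
    p.Name,
    -- Each subquery finds the latest non-NULL value of one column in the same group
    MartialStatus = (SELECT TOP 1 p2.MartialStatus
                     FROM @Person p2
                     INNER JOIN @Job j2 ON j2.JobId = p2.JobId
                     WHERE j2.GroupId = @GroupId AND p2.MartialStatus IS NOT NULL
                     ORDER BY p2.Id DESC),
    EmploymentStatus = (SELECT TOP 1 p2.EmploymentStatus
                        FROM @Person p2
                        INNER JOIN @Job j2 ON j2.JobId = p2.JobId
                        WHERE j2.GroupId = @GroupId AND p2.EmploymentStatus IS NOT NULL
                        ORDER BY p2.Id DESC),
    Email = (SELECT TOP 1 p2.Email
             FROM @Person p2
             INNER JOIN @Job j2 ON j2.JobId = p2.JobId
             WHERE j2.GroupId = @GroupId AND p2.Email IS NOT NULL
             ORDER BY p2.Id DESC),
    p.JobId
FROM @Person p
INNER JOIN @Job j ON j.JobId = p.JobId
WHERE j.GroupId = @GroupId
ORDER BY p.Id DESC;
-- Returns: 3, John, 10, 7, John@xxx.com, 101
```

It does not matter how many older rows there are, and a status that goes from higher to lower is handled because the ordering is by Id rather than by the value itself.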

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help

Dwain Camps SSC Guru Points: 86908 · Answer 14

I thought it might be interesting to approach this with a recursive CTE, such as the following. Try this setup data:

declare @Person table
    (Id int IDENTITY, Name varchar(20), MartialStatus int
    ,EmploymentStatus int, Email varchar(50), JobId int);

INSERT INTO @Person (Name, MartialStatus, EmploymentStatus, Email, JobId)
SELECT 'John', 8, 6, 'John@xxx.com', 99
UNION ALL SELECT 'John', 10, 7, 'John@xxx.com', 100
UNION ALL SELECT 'John', NULL, NULL, 'John@xxx.com', 101
UNION ALL SELECT 'Max', 6, 5, 'max@emailreaction.org', 102
UNION ALL SELECT 'Dwain', NULL, 5, 'dwain@ssc.com', 103
UNION ALL SELECT 'Dwain', 6, 2, 'dwain@ssc.com', 104
UNION ALL SELECT 'Dwain', NULL, 10, 'dwain@yahoo.com', 105

--select * from @Person

declare @Job table (JobId int, GroupId int, [Desc] varchar(30));

INSERT INTO @Job (JobId, GroupId, [Desc])
SELECT 99, 50, 'blah blah'
UNION ALL SELECT 100, 50, 'blah blah'
UNION ALL SELECT 101, 50, 'blah blah'
UNION ALL SELECT 102, 51, 'blah blah'
UNION ALL SELECT 103, 52, 'blah blah'
UNION ALL SELECT 104, 52, 'blah blah'
UNION ALL SELECT 105, 52, 'blah blah'

And this script:

;WITH Grouper AS (
    -- Number the rows in each group, latest Id first
    SELECT Id, Name, MartialStatus, EmploymentStatus, Email, a.JobId, GroupId
        ,n=ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY Id DESC)
    FROM @Person a
    INNER JOIN @Job b ON a.JobId = b.JobId
),
PickLast AS (
    -- Anchor: the latest row of each group
    SELECT Id, Name, MartialStatus, EmploymentStatus, Email, JobId, GroupId, n
    FROM Grouper
    WHERE n = 1
    UNION ALL
    -- Recursive step: while any of the 3 fields is still NULL, fill it from the next older row.
    -- Note: Id, Name and JobId come from the older row (a); select b.Id, b.Name, b.JobId
    -- instead if the result should keep the latest row's identity.
    SELECT a.Id, a.Name
        ,ISNULL(b.MartialStatus, a.MartialStatus)
        ,ISNULL(b.EmploymentStatus, a.EmploymentStatus)
        ,ISNULL(b.Email, a.Email)
        ,a.JobId, a.GroupId, a.n
    FROM Grouper a
    INNER JOIN PickLast b ON a.GroupId = b.GroupId AND
        a.n = b.n + 1
    WHERE b.MartialStatus IS NULL OR b.EmploymentStatus IS NULL OR b.Email IS NULL
)
-- Keep the deepest recursion row per group, which holds the filled-in values
SELECT Id, Name, MartialStatus, EmploymentStatus, Email, JobId, GroupId
FROM (
    SELECT Id, Name, MartialStatus, EmploymentStatus, Email, JobId, GroupId
        ,n=ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY n DESC)
    FROM PickLast) a
WHERE n = 1

The PickLast rCTE loops through the records, picking up the latest non-null value for each of the 3 fields: email, marital status (which is misspelled, by the way) and employment status. So now it doesn't matter how many records there are, only that there is a unique identifier that specifies the order of insertion (I made Id IDENTITY for that reason).
