RE: SQL Query to select one row and fill missing columns from another matching row.

Dwain Camps

SSC Guru

Points: 86908

October 2, 2012 at 7:03 pm

#1544655

I thought it might be interesting to approach this with a recursive CTE, such as the following. Try this setup data:

declare @Person table

(Id int IDENTITY, Name varchar(20), MartialStatus int

,EmploymentStatus int, Email varchar(50), JobId int );

INSERT INTO @Person ( Name, MartialStatus, EmploymentStatus, Email, JobId )

SELECT 'John', 8, 6, 'John@xxx.com', 99

UNION ALL SELECT 'John', 10, 7, 'John@xxx.com', 100

UNION ALL SELECT 'John', NULL, NULL, 'John@xxx.com', 101

UNION ALL SELECT 'Max', 6, 5, 'max@emailreaction.org', 102

UNION ALL SELECT 'Dwain', NULL, 5, 'dwain@ssc.com', 103

UNION ALL SELECT 'Dwain', 6, 2, 'dwain@ssc.com', 104

UNION ALL SELECT 'Dwain', NULL, 10, 'dwain@yahoo.com', 105

--select * from @Person

declare @Job table (JobId int, GroupId int, [Desc] varchar(30) );

INSERT INTO @Job ( JobId, GroupId, [Desc] )

SELECT 99, 50, 'blah blah'

UNION ALL SELECT 100, 50, 'blah blah'

UNION ALL SELECT 101, 50, 'blah blah'

UNION ALL SELECT 102, 51, 'blah blah'

UNION ALL SELECT 103, 52, 'blah blah'

UNION ALL SELECT 104, 52, 'blah blah'

UNION ALL SELECT 105, 52, 'blah blah'

And this script:

;WITH Grouper AS (

SELECT Id, Name, MartialStatus, EmploymentStatus, Email, a.JobId, GroupId

,n=ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY Id DESC)

FROM @Person a

INNER JOIN @Job b ON a.JobId = b.JobId

),

PickLast AS (

SELECT Id, Name, MartialStatus, EmploymentStatus, Email, JobId, GroupId, n

FROM Grouper

WHERE n = 1

UNION ALL

SELECT a.Id, a.Name

,ISNULL(b.MartialStatus, a.MartialStatus)

,ISNULL(b.EmploymentStatus, a.EmploymentStatus)

,ISNULL(b.Email, a.Email)

,a.JobId, a.GroupId, a.n

FROM Grouper a

INNER JOIN PickLast b ON a.GroupID = b.GroupID AND

a.n = b.n + 1

WHERE b.MartialStatus IS NULL OR b.EmploymentStatus IS NULL OR b.Email IS NULL

)

SELECT Id, Name, MartialStatus, EmploymentStatus, Email, JobId, GroupId

FROM (

SELECT Id, Name, MartialStatus, EmploymentStatus, Email, JobId, GroupId

,n=ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY n DESC)

FROM PickLast) a

WHERE n = 1

The PickLast rCTE loops through the records picking up the latest non-null value for each of the 3 fields: email, marital status (which is misspelled by the way) and employment status. So now it doesn't matter how many records there are. Only that there is a unique identifier that specifies the order of insertion (I made Id IDENTITY for that reason).

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St