Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

2 records at a time? Expand / Collapse
Author
Message
Posted Friday, November 15, 2013 1:13 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 9:33 AM
Points: 8, Visits: 71
Hi everyone,

I'm not even sure how to word this succinctly enough to search for this topic, so I apologize in advance if this is covered elsewhere! It almost feels like a gaps and islands issue, but not quite....

Background Info:
I am developing a form for a mortgage company. There can be any number of borrowers on a given loan, and the business has asked that this form return only 2 borrowers at a time for a loan. For example, if there are 3 borrowers for a loan, they want the first copy of the form to print the first 2 borrowers and then another copy of the form to print the 3rd. No matter how many copies are printed, they want the borrower information to be labeled as 'Borrower1' xyz and 'Borrower2' xyz. Also, there will be a LOT more fields returned on the real form, so the sample information below is very simplified test data.

Sample Data:
CREATE TABLE #t (LoanID VARCHAR(5), BorrowerName VARCHAR(20), BorrowerOrder INT);
GO
INSERT INTO #t VALUES
('::E', 'John Smith', 0)
, ('::E', 'Jane Smith', 1)
, ('::E', 'Rob Jackson', 2)
, ('AF_CF', 'Sloan Burton', 1)
, ('AF_CF', 'Virginia Burton', 2)
, ('AF_CF', 'John Hardeman', 0)
, ('::E', 'Mary Lamb', 4)
, ('::E', 'Patrick Stewart', 3)

So for loan 'AF_CF', the form should print twice:
1st copy: Borrower1 = Sloan Burton, Borrower2 = Virginia Burton
2nd copy: Borrower1 = John Hardeman, Borrower2 = NULL

We are on 2012, so I tried to do this with LEAD():
SELECT
Borrower1 = BorrowerName
, Borrower2 = LEAD(BorrowerName) OVER(ORDER BY BorrowerOrder)
FROM
#t AS t
WHERE
FileID = 'AF_CF'

But this returns a record for every borrower, so the form is printing 3 times:
1st: Borrower1 = Sloan Burton, Borrower2 = Virginia Burton
2nd: Borrower1 = Virginia Burton, Borrower2 = John Hardeman
3rd: Borrower1 = John Hardeman, Borrower2 = NULL

I don't want that 2nd record to return. This result is what makes me think of gaps and islands, but I don't know if the 2nd record is really an island since it's (1) not stored this way...it's returning this way because of the query and (2) it's not sequential data....

Finally, I tried restricting this by putting this into a CTE and then returning only the odd numbered records like I have below. This runs pretty quickly when dealing with one loan. But...I am concerned that the CTE will be slow when we run batches of loans.

Attempt with CTE:
--With CTE
;WITH cte AS
(SELECT
Borrower1 = BorrowerName
, Borrower2 = LEAD(BorrowerName) OVER(ORDER BY BorrowerOrder)
, RowNumber = ROW_NUMBER() OVER(ORDER BY BorrowerOrder)

FROM
#t AS t
WHERE
FileID = '::E'
)
SELECT
*
FROM
cte
WHERE
RowNumber % 2 = 1

Is there a better, cleaner way to do this? Or is the CTE the best way to go?
Post #1514842
Posted Sunday, November 17, 2013 5:00 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 4:06 PM
Points: 6,924, Visits: 12,636
Since you already have the BorrowerOrder column I'm not sure why it would need to be so complicated...
Wouldn't a simple BorrowerOrder < 2 be enough?

SELECT
*
FROM
#t AS t
WHERE BorrowerOrder < 2





Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #1514973
Posted Sunday, November 17, 2013 8:26 AM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 4:44 PM
Points: 8,286, Visits: 8,736
LutzM (11/17/2013)
Since you already have the BorrowerOrder column I'm not sure why it would need to be so complicated...
Wouldn't a simple BorrowerOrder < 2 be enough?

SELECT
*
FROM
#t AS t
WHERE BorrowerOrder < 2


The trouble with that is that it's only going to return 1 row, so what happens when the file has more than 2 borrowers? Also, the sequence of the values in the BORROWERORDER field may contain gaps, so that may only get one borrower when it should get two. Or even worse, return no borrowers at all.

I think the OP's solution with a CTE is probably OK; roughly half of the CTE's rows are excluded from the output, but if the BorrowerOrder values are not guaranteed to be set of integers without gaps (and I imagine they can't be, since if they were there would be no need to use LEAD instead of just adding 1) there is no way to avoid that, and the performance penalty for having the extra data in the CTE is not enormous as all the rows with the right FILEID have to be read whether they are first of a pair or second of a pair. If the table is clustered on a key of which the FILEID is the first column, or there is a suitable covering index for this query where FILEID is the first column, the solution using a CTE will be pretty fast.
Of course a slightly different query consisting of a CTE which adds a row number and doesn't combine records and a main select which joins with a tally table might be faster, but it might actually be slower. Probably it would be easy enough to measure and compare the two approaches if there's adequate test data.


Tom
Post #1514990
Posted Sunday, November 17, 2013 10:11 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 10:30 PM
Points: 35,959, Visits: 30,251
jessica.green2312 (11/15/2013)
Hi everyone,

I'm not even sure how to word this succinctly enough to search for this topic, so I apologize in advance if this is covered elsewhere! It almost feels like a gaps and islands issue, but not quite....

Background Info:
I am developing a form for a mortgage company. There can be any number of borrowers on a given loan, and the business has asked that this form return only 2 borrowers at a time for a loan. For example, if there are 3 borrowers for a loan, they want the first copy of the form to print the first 2 borrowers and then another copy of the form to print the 3rd. No matter how many copies are printed, they want the borrower information to be labeled as 'Borrower1' xyz and 'Borrower2' xyz. Also, there will be a LOT more fields returned on the real form, so the sample information below is very simplified test data.

Sample Data:
CREATE TABLE #t (LoanID VARCHAR(5), BorrowerName VARCHAR(20), BorrowerOrder INT);
GO
INSERT INTO #t VALUES
('::E', 'John Smith', 0)
, ('::E', 'Jane Smith', 1)
, ('::E', 'Rob Jackson', 2)
, ('AF_CF', 'Sloan Burton', 1)
, ('AF_CF', 'Virginia Burton', 2)
, ('AF_CF', 'John Hardeman', 0)
, ('::E', 'Mary Lamb', 4)
, ('::E', 'Patrick Stewart', 3)

So for loan 'AF_CF', the form should print twice:
1st copy: Borrower1 = Sloan Burton, Borrower2 = Virginia Burton
2nd copy: Borrower1 = John Hardeman, Borrower2 = NULL

We are on 2012, so I tried to do this with LEAD():
SELECT
Borrower1 = BorrowerName
, Borrower2 = LEAD(BorrowerName) OVER(ORDER BY BorrowerOrder)
FROM
#t AS t
WHERE
FileID = 'AF_CF'

But this returns a record for every borrower, so the form is printing 3 times:
1st: Borrower1 = Sloan Burton, Borrower2 = Virginia Burton
2nd: Borrower1 = Virginia Burton, Borrower2 = John Hardeman
3rd: Borrower1 = John Hardeman, Borrower2 = NULL

I don't want that 2nd record to return. This result is what makes me think of gaps and islands, but I don't know if the 2nd record is really an island since it's (1) not stored this way...it's returning this way because of the query and (2) it's not sequential data....

Finally, I tried restricting this by putting this into a CTE and then returning only the odd numbered records like I have below. This runs pretty quickly when dealing with one loan. But...I am concerned that the CTE will be slow when we run batches of loans.

Attempt with CTE:
--With CTE
;WITH cte AS
(SELECT
Borrower1 = BorrowerName
, Borrower2 = LEAD(BorrowerName) OVER(ORDER BY BorrowerOrder)
, RowNumber = ROW_NUMBER() OVER(ORDER BY BorrowerOrder)

FROM
#t AS t
WHERE
FileID = '::E'
)
SELECT
*
FROM
cte
WHERE
RowNumber % 2 = 1

Is there a better, cleaner way to do this? Or is the CTE the best way to go?


This is actually pretty easy to do but I need some clarification on the Borrower Order column. 0 typically comes before 1 or 2 in the world of numbers. Why, in you example, do the BorrowOrder of 1 and 2 come before 0?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1515063
Posted Monday, November 18, 2013 7:48 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 9:33 AM
Points: 8, Visits: 71
Thank you all for your replies!

Tom, you have it correct. There are, at times, gaps in the BorrowerOrder numbers, and there are times when there might be up to 7 or 8 borrowers.

Jeff, I apologize! This was my error in creating the test data. For the example resultset I gave, it should have been Sloan - 0, Virginia - 1, John - 2. What do you recommend?
Post #1515232
Posted Tuesday, November 19, 2013 6:33 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 10:30 PM
Points: 35,959, Visits: 30,251
jessica.green2312 (11/18/2013)
Jeff, I apologize! This was my error in creating the test data. For the example resultset I gave, it should have been Sloan - 0, Virginia - 1, John - 2. What do you recommend?


Heh... my first recommendation is to stop cutting yourself short by posting test-data-creation code that works only in 2008 and above. There are a whole lot of some really helpful and smart folks that could help you that are still stuck in 2005 depending on what environment they might currently be sitting in. Other than that, you did an outstanding job of posting readily consumable data we can understand. Well done.

With that thought in mind, I've converted your test data creation script to work in 2005 and up. I've also made the correction you identified and I made the BorrowerOrder non-sequential as you suggested it may be in real life.
  CREATE TABLE #t (LoanID VARCHAR(5), BorrowerName VARCHAR(20), BorrowerOrder INT);
GO
INSERT INTO #t
SELECT '::E' , 'John Smith' , 0 UNION ALL
SELECT '::E' , 'Jane Smith' , 1 UNION ALL
SELECT '::E' , 'Rob Jackson' , 2 UNION ALL
SELECT 'AF_CF' , 'Sloan Burton' , 0 UNION ALL
SELECT 'AF_CF' , 'Virginia Burton' , 2 UNION ALL
SELECT 'AF_CF' , 'John Hardeman' , 4 UNION ALL
SELECT '::E' , 'Mary Lamb' , 4 UNION ALL
SELECT '::E' , 'Patrick Stewart' , 3
;

I may have read a little too much into your request but my understanding is that you actually need to control the letters and the copies of the letters. No better way to do that than to create a row-per-copy relationship for a result set so that all you have to do is "field replacement" from a table. In other words, denormalize the hell out of the data in the form of a result set. Here's the code to do that. This uses an ancient method known as a "CrossTab" and it can be much faster than a Pivot. Please see the following article for more on that subject (which is followed by the code for this problem)...
http://www.sqlservercentral.com/articles/T-SQL/63681/

A little math along with the CrossTab code makes short work of it all.
   WITH ctePreSort AS
( --=== Create unbroken sequences starting at zero for each LoanID in BorrowerOrder order
SELECT LoanID
,BorrowerName
,SortOrder = ROW_NUMBER() OVER (PARTITION BY LoanID ORDER BY BorrowerOrder)-1
FROM #t
) --=== Do a string aggregation (CrossTab) to pivot the data correctly for use as a print control table.
SELECT LoanID
,CopyNumber = SortOrder/2+1
,Borrower1 = MAX(CASE WHEN SortOrder%2 = 0 THEN BorrowerName ELSE '' END)
,Borrower2 = MAX(CASE WHEN SortOrder%2 = 1 THEN BorrowerName ELSE '' END)
FROM ctePreSort
GROUP BY LoanID,SortOrder/2
ORDER BY LoanID,SortOrder/2
;


Here are the results...
LoanID CopyNumber Borrower1     Borrower2
------ ---------- ------------- ---------------
::E 1 John Smith Jane Smith
::E 2 Rob Jackson Patrick Stewart
::E 3 Mary Lamb
AF_CF 1 Sloan Burton Virginia Burton
AF_CF 2 John Hardeman

(5 row(s) affected)


Finally, I tried restricting this by putting this into a CTE and then returning only the odd numbered records like I have below. This runs pretty quickly when dealing with one loan. But...I am concerned that the CTE will be slow when we run batches of loans.


I think you were right to be concerned. I haven't used it myself, yet, but I've seen some performance tests by people I trust and LEAD/LAG doesn't measure up to what I had expected (doesn't come close, actually). I can tell you that, especially with an index on the GROUP BY columns, the CrossTab type of string aggregation I did above absolutely flies (although I admit that I have NOT tested this particular bit of code but have severely tested nearly identical code).


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1515859
Posted Wednesday, November 20, 2013 6:31 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 6:03 PM
Points: 3,590, Visits: 5,098
I can second what Jeff said about performance of LAG/LEAD as I've done some testing myself. Frankly I was not impressed in the admittedly limited testing that I have done, even though they do tend to make the code look simpler (in some cases).

In this case, I think what Jeff offered looks pretty simple. At least to me. But perhaps I am not objective as I've used that same technique before.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1516287
Posted Thursday, November 21, 2013 8:43 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 9:33 AM
Points: 8, Visits: 71
Ahhh! Thank you so much, Jeff!! I've used crosstab before, but didn't realize that's what that was called. Scary, right? I feel like a kid who drops a perfectly good toy in favor of a shiny new one. Hahaha. This is great and works really well - I appreciate you helping me out. You explained everything so well and thank you for sharing the link to your crosstab/pivot article! I'm sharing this post and the article link with my group at work. And I also appreciate you pointing out the use of the backwards compatible test code to broaden my audience. I'll be sure to do this going forward.

Have a great day, everyone!
Post #1516459
Posted Thursday, November 21, 2013 8:44 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 9:33 AM
Points: 8, Visits: 71
Thank you, Dwain - good to know!
Post #1516460
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse