Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Get list of all workers having same SET of Rates


Get list of all workers having same SET of Rates

Author
Message
T.Ashish
T.Ashish
Right there with Babe
Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)

Group: General Forum Members
Points: 767 Visits: 582
Hi All,

I have attached some test data for you that has two temp tables "#worker" and "#worker_rate".

The issue is to find all workers who are sharing SAME SET of rate_codes.

I'm able to get the output as "workers sharing same rate_codes", but unfortunately I could not get the list of workers sharing same SET of rate_codes. Also definition of SAME SET is not defined.

I don't know what I'm missing. Any help will be appreciated.
Attachments
test_data.txt (90 views, 744 bytes)
Lowell
Lowell
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14980 Visits: 39002
nice job providing the sample data!

in this case, you need to get the "sets" organized first so they can be compared.


Someone else may have another method, but i thought using FOR XML to create a comma delimtied list would work fine.


select distinct T1.worker_id,AllRateCodes.Codez from #worker_rate T1
cross apply (SELECT Codez = STUFF((SELECT ',' + T2.rate_code
from #worker_rate T2
where T1.worker_id = T2.worker_id
ORDER BY rate_code
FOR XML PATH('')),1,1,'')) AllRateCodes




now that that is organized, I'm not sure if you need more than that, since it's visualized, ort if you need to join that resultset agaisnt itself so you can compare them?

i'm ASSUMING workerid is integers here really so i can order by and prevent duplicates(ie 1=2 and 2=1, which is repetitively redundant)


With MyCTE
AS
(
select distinct T1.worker_id,AllRateCodes.Codez from #worker_rate T1
cross apply (SELECT Codez = STUFF((SELECT ',' + T2.rate_code
from #worker_rate T2
where T1.worker_id = T2.worker_id
ORDER BY rate_code
FOR XML PATH('')),1,1,'')) AllRateCodes
)

SELECT * FROM MYCTE T1 INNER JOIN MyCTE T2
ON T1.Codez = T2.Codez
WHERE T1.worker_id <> T2.worker_id
AND T1.worker_id < T2.worker_id



Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

ChrisM@Work
ChrisM@Work
SSCrazy Eights
SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)

Group: General Forum Members
Points: 9025 Visits: 19036
Here's a pure TSQL method to compare with Lowell's string-concatenation method.
;WITH Rates AS (SELECT worker_id, rate_code, rate_count = COUNT(*) OVER(PARTITION BY worker_id) 
FROM #worker_rate r)
SELECT
rate_group = DENSE_RANK() OVER(ORDER BY r1.worker_id),
r2.worker_id
FROM Rates r1
INNER JOIN Rates r2
ON r2.worker_id >= r1.worker_id
AND r2.rate_code = r1.rate_code
AND r2.rate_count = r1.rate_count
GROUP BY r1.worker_id, r2.worker_id
HAVING MAX(r1.rate_count) = COUNT(*)



“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
dwain.c
dwain.c
SSCarpal Tunnel
SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)

Group: General Forum Members
Points: 4283 Visits: 6431
To the OP: Please specify precisely what your expected results are. Do not describe them. List out the rows you want to see.

I'm thinking this is a relational division problem and I'm curious which answer provided previously is correct.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
T.Ashish
T.Ashish
Right there with Babe
Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)Right there with Babe (767 reputation)

Group: General Forum Members
Points: 767 Visits: 582
Dear All,

Thanks for your kind support. It was really not easy to write a script specially when requirement is not clear. I made a few changes in suggested queries and it gave me desired results.


With MyCTE
AS
(
select distinct T1.worker_id,AllRateCodes.SET_OF_CODES from #worker_rate T1
cross apply
(
SELECT SET_OF_CODES = STUFF((SELECT ',' + T2.rate_code
from #worker_rate T2
where T1.worker_id = T2.worker_id
ORDER BY rate_code
FOR XML PATH('')),1,1,'')
) AllRateCodes
)
SELECT
T1.WORKER_ID, T1.SET_OF_CODES, DENSE_RANK() OVER (ORDER BY T1.SET_OF_CODES) AS SNO FROM MYCTE T1 ORDER BY SET_OF_CODES


The required output was like this:

If worker 1 has rates a/b/c
then we have to find other workers who have rates a/b/c.
If worker 2 has a/b/c/d
It should not show with worker 1 as a/b/c and with other worker as a/b/c/d.
So worker_id will be distinct in the list.


worker_id set_of_codes sno
4 1,2 1
1 1,2,3 2
2 1,2,3 2
3 1,2,3 2

WayneS
WayneS
SSCertifiable
SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)

Group: General Forum Members
Points: 6257 Visits: 10403
Another possibility might be to use the intersect operator

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
Links: For better assistance in answering your questions, How to ask a question, Performance Problems, Common date/time routines,
CROSS-TABS and PIVOT tables Part 1 & Part 2, Using APPLY Part 1 & Part 2, Splitting Delimited Strings

a.myasnikov
a.myasnikov
SSC Rookie
SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)

Group: General Forum Members
Points: 29 Visits: 141
Just wondering what the grand-masters will say about a checksum solution?

From what I know about checksums they are pretty reliable way to tell one set from the other... Any comments are welcome.



; WITH CTE
as (
SELECT worker_id,
CHECKSUM_AGG(CHECKSUM(rate_code)) chk
FROM #worker_rate
GROUP BY worker_id
)

SELECT worker_id,
DENSE_RANK() OVER (ORDER BY chk) SNO
FROM CTE



Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45347 Visits: 39936
a.myasnikov (4/8/2014)
Just wondering what the grand-masters will say about a checksum solution?

From what I know about checksums they are pretty reliable way to tell one set from the other... Any comments are welcome.



; WITH CTE
as (
SELECT worker_id,
CHECKSUM_AGG(CHECKSUM(rate_code)) chk
FROM #worker_rate
GROUP BY worker_id
)

SELECT worker_id,
DENSE_RANK() OVER (ORDER BY chk) SNO
FROM CTE




That would be a great idea and it definitely shows "thinking outside the box" but... CHECKSUM and CHECKSUM_AGG use a simple "Exlusive OR" (an "adder" by any other name) that allows for duplicates to occur.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
hunchback
hunchback
SSC-Enthusiastic
SSC-Enthusiastic (117 reputation)SSC-Enthusiastic (117 reputation)SSC-Enthusiastic (117 reputation)SSC-Enthusiastic (117 reputation)SSC-Enthusiastic (117 reputation)SSC-Enthusiastic (117 reputation)SSC-Enthusiastic (117 reputation)SSC-Enthusiastic (117 reputation)

Group: General Forum Members
Points: 117 Visits: 639
I do not get the right result with the CHECKSUM suggestion. May be I am doing something wrong.

This is the result I am getting:

worker_id SNO
4 1
1 2
2 2
3 2

This is the result from the XML approach:

worker_id Codez worker_id Codez
1 1,2,3 2 1,2,3
1 1,2,3 3 1,2,3
2 1,2,3 3 1,2,3

The nice thing about the XML approach is that you get also the list of rate codes. IMHO it would be nice having an ordered set function for string aggregation (http://connect.microsoft.com/SQLServer/feedback/details/728969/feature-request-ordered-set-functions-within-group-clause).

There is another solution I learned from Peter (Peso) which yield a much better performance. Here is a nice article comparing three different approaches.

http://social.technet.microsoft.com/wiki/contents/articles/22165.t-sql-relational-division.aspx


WITH C1 AS (
SELECT
worker_id,
COUNT(*) AS cnt,
MIN(rate_code) AS min_rc,
MAX(rate_code) AS max_rc
FROM
#worker_rate
GROUP BY
worker_id
)
, C2 AS (
SELECT
B.worker_id,
B.rate_code,
A.cnt,
A.min_rc,
A.max_rc
FROM
C1 AS A
INNER JOIN
#worker_rate AS B
ON A.worker_id = B.worker_id
)
SELECT
A.worker_id AS lwid,
B.worker_id AS rwid
FROM
C2 AS A
INNER JOIN
C2 AS B
ON A.worker_id < B.worker_id
AND A.rate_code = B.rate_code
AND A.cnt = B.cnt
AND A.min_rc = B.min_rc
AND A.max_rc = B.max_rc
GROUP BY
A.worker_id,
B.worker_id
HAVING
COUNT(*) = MIN(B.cnt);
GO





Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24263 Visits: 37984
May I ask, as it doesn't appear to be obvious to me at any rate, what exactly are you looking for as a final result set? Can you show us what this would look like based on your sample data?

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search