Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Permutations (difficult one) Expand / Collapse
Author
Message
Posted Monday, January 27, 2014 10:19 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, July 11, 2014 2:37 AM
Points: 8, Visits: 45
Hi,
We're tyring to find the best way to solve the following, hope you can help please.
We have a fairly large table of users (2500+) and the users belong to 1 or more groups (at the moment the max number of groups a user is in is 32 but it changes daily).
A sample data set looks like:
Usr | Grp
==============
SmithA | GrpA
SmithA | GrpB
MouseM | GrpD
MouseM | GrpB
MouseM | GrpC

The result set we need would be:
SmithA | GrpA
SmithA | GrpB
SmithA | GrpA,GrpB
MouseM | GrpB
MouseM | GrpC
MouseM | GrpD
MouseM | GrpB,GrpC
MouseM | GrpB,GrpD
MouseM | GrpC,GrpD
MouseM | GrpB,GrpC,GrpD

Note the alphabetical order of the grps - it's not particularly imporant but if they aren't sorted alphabetical then we'd need all combinations.

Any ideas please?
Post #1535110
Posted Monday, January 27, 2014 10:24 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 9:10 AM
Points: 13,230, Visits: 12,709
What a truly bizarre result set you need. You have to return the data in three different types of ways?

Can you post some ddl and sample data? I think you will need three different queries here, one for each "type" of result.


_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1535112
Posted Monday, January 27, 2014 10:28 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Today @ 1:08 PM
Points: 889, Visits: 5,701
I saw the first part and thought... oh, use STUFF...

and then I saw the odd requirement (show all permutations)

Here's the sample data part I did from the example... since he's new and all...

-- CREATE TABLES
CREATE TABLE #Person(
PersonID INT IDENTITY,
UserID VARCHAR(15) NOT NULL,
CONSTRAINT pkPerson PRIMARY KEY (PersonID));

CREATE TABLE #Groups(
GroupName VARCHAR(4) PRIMARY KEY );
GO

-- ADD DATA
INSERT INTO #Person(UserID) VALUES ('SmithA'),('MouseM');

INSERT INTO #Groups(GroupName) VALUES ('GrpA'),('GrpB'),('GrpC'),('GrpD');

As you'll note in several people's signatures - the best way to get a quick answer, or ANY answer - is to make it as easy as possible for people to recreate your problem. This includes CREATE TABLE and INSERT statements to at least recreate your situation.

For a newbie, nice job! But sample tables and data are a HUGE help.
Post #1535113
Posted Monday, January 27, 2014 5:21 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 4:53 AM
Points: 3,422, Visits: 5,368
Take a look at this article:
Generating n-Tuples with SQL

Then look into the discussion thread for a slight performance improvement on the approach:
http://www.sqlservercentral.com/Forums/Topic1301485-3122-5.aspx

Coding up the improved approach with your sample data yields this:

WITH SampleData (Usr, Grp) AS
(
SELECT 'SmithA','GrpA'
UNION ALL SELECT 'SmithA','GrpB'
UNION ALL SELECT 'MouseM','GrpD'
UNION ALL SELECT 'MouseM','GrpB'
UNION ALL SELECT 'MouseM','GrpC'
),
UNIQUEnTuples (n, Usr, Tuples, ID) AS
(
SELECT 1, Usr, CAST(Grp AS VARCHAR(8000)), Grp
FROM SampleData
UNION ALL
SELECT 1 + n.n, n.Usr, t.Grp + ',' + n.Tuples, Grp
FROM UNIQUEnTuples n
CROSS APPLY
(
SELECT Grp
FROM SampleData t
WHERE t.Grp < n.ID AND t.Usr = n.Usr
) t
-- WHERE n <= 5
)
SELECT *
FROM UNIQUEnTuples
ORDER BY Usr, n;


This has the potential to run like a pig if you really do have 32 groups possible for a Usr. You might be able to improve the performance by running the anchor leg of the query first to a Temp table and then using a WHILE loop to implement the recursive leg.

If you're not sure how to do this, let me know.

Edit: I added the commented WHERE clause, because you might want to run it with that WHERE clause uncommented the first time you try with your 32 groups, just to see how long it takes.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1535263
Posted Monday, January 27, 2014 8:45 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 4:53 AM
Points: 3,422, Visits: 5,368
pietlinden (1/27/2014)

...
and then I saw the odd requirement (show all permutations)
...


Technically what the OP is looking for is combinations (not permutations).



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1535285
Posted Tuesday, January 28, 2014 6:43 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, July 11, 2014 2:37 AM
Points: 8, Visits: 45
First of all that's amazing, it give exactly the required output. Thank you very much. Currently I'm really struggling to see how it works but I'll try and disect this evening.

There's a small problem though, if I run this on my real data, it runs out of memory after 30 minutes.
There are 474 users who belong to on average 6 groups (max 32).
The average length of a group name is 20 characters (max 47).

In the interests of being able to get the task nailed today (the business are badgering me) is there something we can do to optimise?

I've created a table of the 88 distinct groups with a int column 1-88 indexing them.
Every row in the user table now has an extra int column containing this GrpID.

Is that helpful and how would it change the query pls?
Post #1535411
Posted Tuesday, January 28, 2014 4:00 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 4:53 AM
Points: 3,422, Visits: 5,368
I would bet it's not memory that's the issue. It's row counts.

Consider the case where you said you have one user in 32 groups. That user alone would generate POWER(2,32)-1 rows (working from memory here so check the article to be sure). That's a lot of rows.

You can try the loop I suggested but it will still have to deal with ridiculous numbers of rows.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1535653
Posted Tuesday, January 28, 2014 5:27 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 9:51 AM
Points: 5,446, Visits: 7,616
dwain.c (1/28/2014)
That user alone would generate POWER(2,32)-1 rows (working from memory here so check the article to be sure). That's a lot of rows.


Amusingly:


select power(2 ,32)


Msg 232, Level 16, State 3, Line 1
Arithmetic overflow error for type int, value = 4294967296.000000.


Besides, if I remember these formulas right, it's 32P31.

Out of curiousity, what is the business requirement causing this technical requirement? Besides functionally painful, it's certainly nearly impossible for an end user to use at 32 groups crosswired into combination sets, so it strikes me as a midstep. What's the end goal from this source data?



- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
Post #1535662
Posted Tuesday, January 28, 2014 6:42 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, July 11, 2014 2:37 AM
Points: 8, Visits: 45
Ah ha, thought how to reduce the problem size down to 12 groups :)
Since the Roles with multiple groups consist of <=12 groups all I need to to do is filter the LDAP data and reduce it those entries that are in the set of groups that are part of multi group roles!...yippeee

Original problem is

A role consists of between 1 and x groups from a set of x groups.
A role can only have a particular group once.
A user may have more than one role
Example:

Role | Group
Helpdesk | Password Reset
Security | Password Reset,
Security | Enable User
Reception | Enable User
Reception | Disable User

The data I have is of the form

User | Group
SmithA | Password Reset
SmithA | Enable User
BloggsJ | Password Reset
MouseM | Enable User
MouseM | Disable User

From that I need to deduce the users have the following roles
User | Role
SmithA | Security
BloggsJ | Helpdesk
MouseM | Reception

Thanks
Post #1535688
Posted Tuesday, January 28, 2014 7:07 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 4:53 AM
Points: 3,422, Visits: 5,368
Evil Kraig F (1/28/2014)
dwain.c (1/28/2014)
That user alone would generate POWER(2,32)-1 rows (working from memory here so check the article to be sure). That's a lot of rows.


Amusingly:


select power(2 ,32)


Msg 232, Level 16, State 3, Line 1
Arithmetic overflow error for type int, value = 4294967296.000000.


...


Indeed, but this works:

SELECT POWER(CAST(2 AS BIGINT) ,32)

-- Result: 4294967296




My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1535690
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse