Recent PostsRecent Posts Popular TopicsPopular Topics
 Home Search Members Calendar Who's On

 Puzzle: vertex covers in SQL Rate Topic Display Mode Topic Options
Author
 Message
 Posted Monday, January 7, 2013 8:29 AM
 Mr or Mrs. 500 Group: General Forum Members Last Login: Thursday, April 30, 2015 10:03 AM Points: 560, Visits: 2,288
 Celko,How about me proposing this as a problem in Beyond Relational's TSQL Challenges? There are some very sharp minds there that could come up with something interesting.
Post #1403653
 Posted Monday, January 7, 2013 12:51 PM
 SSCommitted Group: General Forum Members Last Login: Thursday, April 30, 2015 5:07 AM Points: 1,945, Visits: 3,506
 I've never seen this problem, so I'm not even sure if I am solving it correctly. This code is probably quite inefficient, but I THINK it solves the Min Cover problem? It is an NP-Complete problem. NOBODY has an efficient answer. I just thought it might be fun. I would add a bit more to my posting to get use some tools: `CREATE TABLE Grid(v1 SMALLINT NOT NULL CHECK (v1 > 0), v2 SMALLINT NOT NULL CHECK (v2 > 0), v_cover SMALLINT DEFAULT 0 NOT NULL,PRIMARY KEY (v1, v2), CHECK (v1 < v2));INSERT INTO Grid (v1, v2)VALUES (1, 2), (1, 4), (2, 3), (2, 5), (2, 6); CREATE VIEW Vertexes (v)ASSELECT v1, v_cover FROM GridUNIONSELECT v2, v_cover FROM Grid; CREATE VIEW Vertex_Degrees (v, degree)ASSELECT V.v, COUNT(*) AS degree FROM Grid AS G, Vertexes AS V WHERE V.v IN (G.v1, G.v2) GROUP BY V.v;`If you make the vertex cover flag a SMALLINT, then we can play games with it. We probably use the degree for a greedy algorithms. Books in Celko Series for Morgan-Kaufmann PublishingAnalytics and OLAP in SQL Data and Databases: Concepts in Practice Data, Measurements and Standards in SQLSQL for SmartiesSQL Programming Style SQL Puzzles and Answers Thinking in SetsTrees and Hierarchies in SQL
Post #1403829
 Posted Monday, January 7, 2013 5:38 PM
 Hall of Fame Group: General Forum Members Last Login: Yesterday @ 5:14 AM Points: 3,893, Visits: 6,065
 Joe,When I was playing around with adding the additional column into the table, I chose to make all the vertices be a CHAR data type instead of SMALLINT. The reason being is that you're not likely to perform math (or would you?) on the identifier of a vertex.This also eliminates the need for the CASTs in the solution I did provide (presumably making it just a tad faster).Not sure why I couldn't get my version with the additional column to work. Well, actually it did work but it wasn't providing "as optimal" a solution as the solution I have posted using the TABLE VARIABLE. I'll probably try again at some point when I have some time, because even though it wasn't as optimal it was faster as I expected it might be. My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!My thought question: Have you ever been told that your query runs too fast?My advice:INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.Need to UNPIVOT? Why not CROSS APPLY VALUES instead?Since random numbers are too important to be left to chance, let's generate some!Learn to understand recursive CTEs by example.Splitting strings based on patterns can be fast!My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
Post #1403932
 Posted Tuesday, January 8, 2013 2:37 AM
 Mr or Mrs. 500 Group: General Forum Members Last Login: Thursday, April 30, 2015 10:03 AM Points: 560, Visits: 2,288
 I see two approaches:1. going from all vertices selected to a solution (top-down)2. going from no vertices selected to a solution (bottom-up)In both cases you will eventually run into a number of vertices where you will not arrive at a solution in a reasonable amount of time, i.e. you cannot traverse the full search tree. This is where you would need to apply an algorithm that uses a random approach and provides the best solution found within a given amount of time. It will not necessarily be the theoretically best solution.The random approach is divided into two (given N vertices):1. starting with K = N-1 generate all combinations K of N vertices and check for coverage; once a solution is found for given combination stop and decrease K by one and repeat; stop this when the number of combinations exceeds some threshold;2. generate K random vertices and check if it is a solution; if not then repeat; if yes then decrease K by 1 and repeat;Stop when exceeding a given time threshold.Now let me think about how to map this into an SQL approach...
Post #1404073
 Posted Tuesday, January 8, 2013 2:56 AM
 Hall of Fame Group: General Forum Members Last Login: Yesterday @ 5:14 AM Points: 3,893, Visits: 6,065
 Michael Meierruth (1/7/2013)Celko,How about me proposing this as a problem in Beyond Relational's TSQL Challenges? There are some very sharp minds there that could come up with something interesting.I would be interested to know how you would judge such a competition. Speed is important but so is a "nearer optimal" solution.I've seen this in bin packing solutions, where you usually have to trade off speed for better bin utilization. My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!My thought question: Have you ever been told that your query runs too fast?My advice:INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.Need to UNPIVOT? Why not CROSS APPLY VALUES instead?Since random numbers are too important to be left to chance, let's generate some!Learn to understand recursive CTEs by example.Splitting strings based on patterns can be fast!My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
Post #1404084
 Posted Tuesday, January 8, 2013 3:43 AM
 Mr or Mrs. 500 Group: General Forum Members Last Login: Thursday, April 30, 2015 10:03 AM Points: 560, Visits: 2,288
 For a challenge like this the requirement would be to find the optimal solution(s).For the perfomance test a sufficiently large number of vertices would be used such that the cpu time would not exceed 60 seconds for the best solution. Any solution exceeding 5 minutes would be eliminated. To prevent 'cheating' there would be two performance tests, one for which the data is made public before hand and one where the data is not made public until the challenge has been closed.
Post #1404102
 Posted Tuesday, January 8, 2013 6:48 AM
 Mr or Mrs. 500 Group: General Forum Members Last Login: Thursday, April 30, 2015 10:03 AM Points: 560, Visits: 2,288
 diamondgm (1/7/2013)I've never seen this problem, so I'm not even sure if I am solving it correctly.This code is probably quite inefficient, but I THINK it solves the Min Cover problem?Scuze the messiness, I'm at work and did it in a rush diamondgm,For the arbitrary data below your code produces a solution with 11 vertices. Dwain's code produces a solution with 6 vertices (and which is correct). Thus, by definition, yours is not a min cover solution. Dwain's solution looks to be min cover (but still checking that).values (1,2),(2,3),(3,4),(4,15),(14,15),(4,9),(9,13),(9,10),(10,12),(10,11),(7,11),(7,8),(8,9),(6,7),(5,6),(5,8),(4,5)Just looking at this sample data leads me to believe that a good approach is to select vertices with the most edges first.
Post #1404214
 Posted Tuesday, January 8, 2013 10:24 AM
 SSCommitted Group: General Forum Members Last Login: Thursday, April 30, 2015 5:07 AM Points: 1,945, Visits: 3,506
 When I was playing around with adding the additional column into the table, I chose to make all the vertices be a CHAR data type instead of SMALLINT. The reason being is that you're not likely to perform math (or would you?) on the identifier of a vertex., Agreed; I did numeric because the book I was using numbered the nodes, and fell into that notation. I cannot imagine any math on them. Books in Celko Series for Morgan-Kaufmann PublishingAnalytics and OLAP in SQL Data and Databases: Concepts in Practice Data, Measurements and Standards in SQLSQL for SmartiesSQL Programming Style SQL Puzzles and Answers Thinking in SetsTrees and Hierarchies in SQL
Post #1404369
 Posted Tuesday, January 8, 2013 5:32 PM
 Hall of Fame Group: General Forum Members Last Login: Yesterday @ 5:14 AM Points: 3,893, Visits: 6,065
 Here's another network you can validate it on, assuming you want to manually calculate the optimal cover (from the 3rd article in my signature line).Here's the setup data (note where the CHECK constraint has been removed because originally this was designed as a directed graph so the vertex traversal can go from larger to smaller).`CREATE TABLE #Grid(v1 CHAR(1) NOT NULL, -- CHECK (v1 > 0), v2 CHAR(1) NOT NULL, -- CHECK (v2 > 0), PRIMARY KEY (v1, v2)--, --CHECK (v1 < v2));INSERT INTO #GridSELECT 'A', 'B'UNION ALL SELECT 'A', 'C'UNION ALL SELECT 'A', 'D'UNION ALL SELECT 'B', 'E'UNION ALL SELECT 'B', 'F'UNION ALL SELECT 'C', 'G'UNION ALL SELECT 'C', 'H'UNION ALL SELECT 'D', 'I'UNION ALL SELECT 'D', 'J'UNION ALL SELECT 'E', 'K'UNION ALL SELECT 'F', 'L'UNION ALL SELECT 'G', 'K'UNION ALL SELECT 'G', 'M'UNION ALL SELECT 'H', 'L'UNION ALL SELECT 'I', 'M'UNION ALL SELECT 'J', 'M'UNION ALL SELECT 'K', 'N'UNION ALL SELECT 'K', 'O'UNION ALL SELECT 'L', 'O'UNION ALL SELECT 'M', 'O' UNION ALL SELECT 'O', 'Q'UNION ALL SELECT 'Q', 'R'UNION ALL SELECT 'J', 'P'UNION ALL SELECT 'N', 'Q'UNION ALL SELECT 'N', 'R'UNION ALL SELECT 'P', 'S'UNION ALL SELECT 'P', 'T'UNION ALL SELECT 'Q', 'W'UNION ALL SELECT 'Q', 'U'UNION ALL SELECT 'R', 'U'UNION ALL SELECT 'R', 'V'UNION ALL SELECT 'S', 'V'UNION ALL SELECT 'S', 'R'UNION ALL SELECT 'T', 'V'UNION ALL SELECT 'T', 'Y'UNION ALL SELECT 'U', 'W'UNION ALL SELECT 'U', 'X'UNION ALL SELECT 'V', 'X'UNION ALL SELECT 'V', 'Y'UNION ALL SELECT 'X', 'W'UNION ALL SELECT 'W', 'Z'UNION ALL SELECT 'Y', 'Z'`My solution should return these covers:`V V,R,S,T,X,YQ Q,N,O,R,U,WM M,G,I,J,OK K,E,G,N,OP P,J,S,TL L,F,H,OD D,A,I,JC C,A,G,HB B,A,E,FZ Z,W,Y` My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!My thought question: Have you ever been told that your query runs too fast?My advice:INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.Need to UNPIVOT? Why not CROSS APPLY VALUES instead?Since random numbers are too important to be left to chance, let's generate some!Learn to understand recursive CTEs by example.Splitting strings based on patterns can be fast!My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables