Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 1234»»»

Puzzle: vertex covers in SQL Expand / Collapse
Author
Message
Posted Saturday, January 5, 2013 4:13 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Friday, December 19, 2014 12:12 PM
Points: 1,945, Visits: 3,180
I have to do a book chapter on graph database. I already did “The Kevin Bacon” problem in SQL for a graph-base product company, so i wanted another one to demonstrate how these things work. The problem I picked is “Vertex covering” (http://en.wikipedia.org/wiki/Vertex_covering), which is known to be NP-complete and the best algorithms for it have an upper bound of twice the size.

In short, it sounds easy but it is not. Anyone want to play with it in SQL?

Specs:
Formally, a vertex cover of an undirecterd graph G is a set C of vertices such that each edge of G is incident to at least one vertex in C. The set C is said to cover the edges of G.

Informally, you have a weird street map and want to put up security cameras at intersections (nodes) in such a way that no street (edge) is not under surveillance.

Here is a simple table and a simple five edge graph, taken from the Wikipedia article just referenced.

CREATE TABLE Grid
(v1 SMALLINT NOT NULL CHECK (v1 > 0),
v2 SMALLINT NOT NULL CHECK (v2 > 0),
PRIMARY KEY (v1, v2),
CHECK (v1 < v2));

INSERT INTO Grid (v1, v2)
VALUES (1, 2), (1, 4), (2, 3), (2, 5), (2, 6);

{1, 2, 6} and {2, 4} are vertex covers. The second is the minimal cover. Can you prove it in SQL? How would you look for a cover? My first thought is grab random subset of (n) nodes and test them – brute force.

The obvious limitations are that all the vertices is a cover but not minimal, and that the empty set is not a cover until the graph is empty.


Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #1403293
Posted Sunday, January 6, 2013 5:29 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 11:50 PM
Points: 3,440, Visits: 5,397
Let's make that Wiki article a bit more accessible: http://en.wikipedia.org/wiki/Vertex_covering while I work on it.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1403380
Posted Sunday, January 6, 2013 6:23 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 11:50 PM
Points: 3,440, Visits: 5,397
Probably not an optimal solution but I think this will work.

CREATE TABLE #Grid
(v1 SMALLINT NOT NULL CHECK (v1 > 0),
v2 SMALLINT NOT NULL CHECK (v2 > 0),
PRIMARY KEY (v1, v2),
CHECK (v1 < v2));

INSERT INTO #Grid (v1, v2)
VALUES (1, 2), (1, 4), (2, 3), (2, 5), (2, 6);

DECLARE @Covers TABLE
(v SMALLINT -- Covering vertex
,vn VARCHAR(8000)) -- Covered vertices

WHILE EXISTS (
SELECT v
FROM #Grid
CROSS APPLY (VALUES (v1),(v2)) a (v)
EXCEPT
SELECT Item
FROM @Covers
CROSS APPLY DelimitedSplit8K(vn, ',')
)
BEGIN

WHILE EXISTS (
SELECT v
FROM #Grid
CROSS APPLY (VALUES (v1),(v2)) a (v)
EXCEPT
SELECT Item
FROM @Covers
CROSS APPLY DelimitedSplit8K(vn, ',')
)
BEGIN

;WITH Counts AS (
SELECT v, n=COUNT(v)
FROM (
SELECT v
FROM #Grid
CROSS APPLY (VALUES (v1),(v2)) a (v)
WHERE v NOT IN (
SELECT Item
FROM @Covers
CROSS APPLY DelimitedSplit8K(vn, ','))) a
GROUP BY v)
INSERT INTO @Covers
--OUTPUT INSERTED.*
SELECT v, CAST(a.v AS VARCHAR(5)) + (
SELECT ',' + CASE WHEN v=v2 THEN CAST(v1 AS VARCHAR(5)) ELSE CAST(v2 AS VARCHAR(5)) END
FROM #Grid b
WHERE a.v = b.v1 OR a.v = b.v2
FOR XML PATH(''))
FROM (
SELECT TOP 1 v
FROM Counts
ORDER BY n DESC, v) a

END

END

SELECT *
FROM @Covers

DROP TABLE #Grid


Best I could do on short notice.

Let the performance competition begin.

A question to you Joe. Could I alter your original table's structure? I'm thinking that if you add a "Covered" column the algorithm could be made more efficient by marking a vertex as covered once it is.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1403385
Posted Sunday, January 6, 2013 7:17 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 11:50 PM
Points: 3,440, Visits: 5,397
CELKO (1/5/2013)

{1, 2, 6} and {2, 4} are vertex covers. The second is the minimal cover. Can you prove it in SQL? How would you look for a cover? My first thought is grab random subset of (n) nodes and test them – brute force.



Instead of choosing a random subset of nodes, my loop solution selects the vertex remaining that has the most coverage.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1403393
Posted Sunday, January 6, 2013 7:24 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Friday, December 19, 2014 12:12 PM
Points: 1,945, Visits: 3,180
Probably not an optimal solution but I think this will work.


No "probably" about it; this is an NP-complete problem so we know that the best we can do is to get some answer that looks better than the other ones we found in a reasonable time.

Let the performance competition begin.


I can offer on one my books at prize! I will send a rare UN-autographed copy so the winner can take it to B&N, claim they lost the sales slip and get cash!

Could I alter your original table's structure? I'm thinking that if you add a "Covered" column the algorithm could be made more efficient by marking a vertex [sic: edge, not vertex] as covered once it is.


Sure; 80-95% of the work in SQL is in the DDL! There is a greedy algorithm that does this kind of marking; it is known to be between optimal and "twice optimal" set sizes.



Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #1403395
Posted Sunday, January 6, 2013 7:31 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 11:50 PM
Points: 3,440, Visits: 5,397
CELKO (1/6/2013)

I can offer on one my books at prize! I will send a rare UN-autographed copy so the winner can take it to B&N, claim they lost the sales slip and get cash!



I'll take one of the not-so-rare autographed ones as long as you pick up the shipping to Thailand.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1403396
Posted Sunday, January 6, 2013 7:48 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 11:50 PM
Points: 3,440, Visits: 5,397
Originally my second solution... withdrawn because I couldn't get it working correctly.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1403397
Posted Sunday, January 6, 2013 8:35 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 11:50 PM
Points: 3,440, Visits: 5,397
My apologies. Apparently I posted some buggy code!

I have fixed the first solution. Now working on the second.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1403400
Posted Sunday, January 6, 2013 11:47 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 11:50 PM
Points: 3,440, Visits: 5,397
Seems I couldn't get my second solution to work so I withdrew it.

For now, I only have the one option for you, which I made a slight correction to above.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1403424
Posted Monday, January 7, 2013 5:06 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, November 19, 2014 12:10 AM
Points: 141, Visits: 860
I've never seen this problem, so I'm not even sure if I am solving it correctly.
This code is probably quite inefficient, but I THINK it solves the Min Cover problem?
Scuze the messiness, I'm at work and did it in a rush

USE tempdb
GO
IF OBJECT_ID('tempdb.dbo.Grid') IS NOT NULL DROP TABLE Grid
DECLARE @MinCover TABLE (Node INT)
DECLARE @Node INT
CREATE TABLE dbo.Grid
(
v1 INT
,v2 INT
,Processed BIT
);

INSERT INTO Grid (v1, v2, Processed)
VALUES (1, 2, 0), (1, 4, 0), (2, 3, 0), (2, 5, 0), (2, 6, 0);

WHILE EXISTS (SELECT * FROM Grid WHERE Processed = 0)
BEGIN
;WITH NodeOcc AS
(
SELECT Node = v1
FROM Grid
UNION ALL
SELECT v2
FROM Grid
)
INSERT INTO @MinCover
SELECT TOP 1 Node
FROM NodeOcc
WHERE Node NOT IN (SELECT Node FROM @MinCover)
GROUP BY Node
ORDER BY COUNT(*) DESC

UPDATE G
SET Processed = 1
FROM Grid G
JOIN @MinCover NN ON G.v1 = NN.Node OR G.v2 = NN.Node
END

SELECT *
FROM @MinCover


Post #1403550
« Prev Topic | Next Topic »

Add to briefcase 1234»»»

Permissions Expand / Collapse