EdgeOfKnowWare (2/22/2016)
Alan, I agree with you, if you're just wanting one ranking. If we expand upon the Original Posters issue (which is what I was facing) and say that you want to different groupings, how would you handle that?So, let's take the example and add a column.
--NTILE (and others, presumably) gotcha: NULL inputs are included as belonging to the bottom tile,
--displacing other values to higher tiles.
IF OBJECT_ID('tempdb.dbo.#t', 'U') IS NOT NULL DROP TABLE #t;
CREATE TABLE #t
(
ID INT IDENTITY(1,1) PRIMARY KEY,
val INT NULL,
val2 INT NULL
);
--INSERT 10 values: 3 NULL, 7 integers
INSERT INTO #t SELECT NULL,1;
INSERT INTO #t SELECT NULL,2;
INSERT INTO #t SELECT NULL,3;
INSERT INTO #t SELECT 5,4;
INSERT INTO #t SELECT 7,5;
INSERT INTO #t SELECT 8,6;
INSERT INTO #t SELECT 9,NULL;
INSERT INTO #t SELECT 9,NULL;
INSERT INTO #t SELECT 11,NULL;
INSERT INTO #t SELECT 12,NULL;
--Compute quartiles.
--With 7 non-NULL values, this should have 2-2-2-1 rows per quartile and the NULLs ignored (to my thinking).
--Instead, we get back tiles based on 10 rows of input, with 3-3-2-2 rows per quartile.
--The lowest real values of 5 and 7 are in Q2, not Q1, b/c the 3 NULLs are all the rows in Q1:
Suppose you want the results of this:
IDvalQval2Q2
1NULLNULL11
2NULLNULL21
3NULLNULL32
45142
57153
68264
792NULLNULL
893NULLNULL
9113NULLNULL
10124NULLNULL
How would you write the query then? That's the direction I was going.
For something like that I would do this:
WITH
Q1 AS
(
SELECT
ID,
val,
Q = NULL,
val2
FROM #t
WHERE val IS NULL
UNION ALL
SELECT
ID,
val,
Q = NTILE(4) OVER (ORDER BY val),
val2
FROM #t
WHERE val IS NOT NULL
),
Q2 AS
(
SELECT ID, val2, Q2 = NTILE(4) OVER (ORDER BY val2)
FROM Q1
WHERE val2 IS NOT NULL
)
SELECT
Q1.ID,
val,
Q,
Q1.val2,
Q2
FROM Q1
LEFT JOIN Q2 ON Q1.ID = Q2.ID;
If you went that route you would really want this index:
CREATE NONCLUSTERED INDEX nc_t_val1_temp ON #t(val, val2) INCLUDE (ID);
and would be able to further improve performance using this index:
CREATE NONCLUSTERED INDEX nc_t_val2_temp ON #t(val2) INCLUDE (ID, val);
-- Itzik Ben-Gan 2001