July 24, 2013 at 3:31 pm
Hi SSC,
I've got a table which is wasting a lot of space in it's current narrow format and I'd like to pivot it out. The table contains about 44 million rows. The example below is a scaled down version of what I'm dealing with. The idea being to pivot out the [Measure] values into their own columns. In the real data there are about 12 different measures which will become their own columns.
My question is, is there a way to index the underlying table, or perhaps an analogous method to Pivot which would get index seeks or better performance over a data set of this size?
--Setup raw table
if object_id('tempdb.dbo.#Raw') is not null drop table #Raw
create table #Raw
(
Id varchar(25),
TradeDate int,
Measure varchar(100),
Value float
primary key clustered (Id, TradeDate, Measure)
)
--Populate sample data
;with nums as --numbers table
(
select num = row_number() over (order by [object_id])
from sys.objects with (nolock)
), measures as --measures
(
select Measure = 'Alpha' union all
select 'Beta' union all
select 'Gamma' union all
select 'Delta' union all
select 'Epsilon'
)
insert into #Raw
(
Id,
TradeDate,
Measure,
Value
)
select
Id = ids.num,
TradeDate = 40000 + dates.num,
Measure = m.Measure,
Value = checksum(newid()) % 10000
from nums ids
cross join nums dates
cross join measures m
where ids.Num between 1 and 100
/*********************
** Actual Statement **
*********************/
select Id, TradeDate, Alpha = max(Alpha), Beta = max(Beta), Gamma = max(Gamma), Delta = max(Delta), Epsilon = max(Epsilon)
from (select Id, TradeDate, Measure, Value
from #Raw) src
pivot (max(Value) for Measure in (Alpha, Beta, Gamma, Delta, Epsilon)) pvt
group by Id, TradeDate
July 24, 2013 at 3:55 pm
Maybe this index will help
CREATE NONCLUSTERED INDEX [IndexName] ON [dbo].[TableNAme]
(
[Id] ASC,
[TradeDate] ASC
)
INCLUDE ( [Value]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF)
But what should work even better is to use a cross-tab query instead of pivot.
select Id,
TradeDate,
Alpha = max(CASE WHEN Measure = 'Alpha' THEN Value END),
Beta = max(CASE WHEN Measure = 'Beta' THEN Value END),
Gamma = max(CASE WHEN Measure = 'Gamma' THEN Value END),
Delta = max(CASE WHEN Measure = 'Delta' THEN Value END),
Epsilon = max(CASE WHEN Measure = 'Epsilon' THEN Value END)
from (select Id, TradeDate, Measure, MAX(Value)
from #Raw
GROUP BY Id, TradeDate, Measure) src
GROUP BY Id,
TradeDate
The only way to be sure is to test and test again.
You can find a detailed explanation on this at the following articles
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy