November 29, 2018 at 7:08 am
Hi All,
Need index suggestion for the below query which can improve the performance. (Assume that table has enough data)
SELECT SalesDate, SalesPersonID FROM dbo.Sales WHERE ProductID='PD1’ Order by productname;
I Would like to create index as below. Am I correct or anything needs to be changed?
CREATE INDEX <IDX___> ON dbo.Sales(ProductID) INCLUDE (SalesDate, SalesPersonID, productname)
Thanks,
Sam
November 29, 2018 at 7:16 am
Sam
Looks like you need to do a bit of normalising first - you shouldn't have ProductID and productname in a table of sales. Assuming changing that isn't an option, get rid of the ORDER BY clause, since all rows will have the same productname, won't they. Since you've taken that column out of the query, you no longer need it in the index, either.
John
November 29, 2018 at 7:51 am
samantha.sqldba - Thursday, November 29, 2018 7:08 AMHi All,Need index suggestion for the below query which can improve the performance. (Assume that table has enough data)
SELECT SalesDate, SalesPersonID FROM dbo.Sales WHERE ProductID='PD1’ Order by productname;
I Would like to create index as below. Am I correct or anything needs to be changed?
CREATE INDEX <IDX___> ON dbo.Sales(ProductID) INCLUDE (SalesDate, SalesPersonID, productname)Thanks,
Sam
CREATE INDEX <IDX___> ON dbo.Sales(ProductID,productname ) INCLUDE (SalesDate, SalesPersonId)
November 29, 2018 at 7:54 pm
Thank you all.
November 30, 2018 at 8:00 am
John Mitchell-245523 - Thursday, November 29, 2018 7:16 AMSamLooks like you need to do a bit of normalising first - you shouldn't have ProductID and productname in a table of sales. Assuming changing that isn't an option, get rid of the ORDER BY clause, since all rows will have the same productname, won't they. Since you've taken that column out of the query, you no longer need it in the index, either.
John
I'm thinking that having ProductID and ProductName in a sales table would actually be a form of denormalization for the purpose of rapid display. Since sales tables border on the fringe of being a history table, it's not a problem but it's also not normalization.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 30, 2018 at 8:02 am
Jonathan AC Roberts - Thursday, November 29, 2018 7:51 AMsamantha.sqldba - Thursday, November 29, 2018 7:08 AMHi All,Need index suggestion for the below query which can improve the performance. (Assume that table has enough data)
SELECT SalesDate, SalesPersonID FROM dbo.Sales WHERE ProductID='PD1’ Order by productname;
I Would like to create index as below. Am I correct or anything needs to be changed?
CREATE INDEX <IDX___> ON dbo.Sales(ProductID) INCLUDE (SalesDate, SalesPersonID, productname)Thanks,
Sam
CREATE INDEX <IDX___> ON dbo.Sales(ProductID,productname ) INCLUDE (SalesDate, SalesPersonId)
I'm curious. If you have the ProductID, why do you believe that ProductName needs to be a Key Column rather than just as an INCLUDE?
--Jeff Moden
Change is inevitable... Change for the better is not.
November 30, 2018 at 8:21 am
Jeff Moden - Friday, November 30, 2018 8:02 AMJonathan AC Roberts - Thursday, November 29, 2018 7:51 AMsamantha.sqldba - Thursday, November 29, 2018 7:08 AMHi All,Need index suggestion for the below query which can improve the performance. (Assume that table has enough data)
SELECT SalesDate, SalesPersonID FROM dbo.Sales WHERE ProductID='PD1’ Order by productname;
I Would like to create index as below. Am I correct or anything needs to be changed?
CREATE INDEX <IDX___> ON dbo.Sales(ProductID) INCLUDE (SalesDate, SalesPersonID, productname)Thanks,
Sam
CREATE INDEX <IDX___> ON dbo.Sales(ProductID,productname ) INCLUDE (SalesDate, SalesPersonId)
I'm curious. If you have the ProductID, why do you believe that ProductName needs to be a Key Column rather than just as an INCLUDE?
It's needed for the "Order by productname"
November 30, 2018 at 8:30 am
Jonathan AC Roberts - Friday, November 30, 2018 8:21 AMJeff Moden - Friday, November 30, 2018 8:02 AMJonathan AC Roberts - Thursday, November 29, 2018 7:51 AMsamantha.sqldba - Thursday, November 29, 2018 7:08 AMHi All,Need index suggestion for the below query which can improve the performance. (Assume that table has enough data)
SELECT SalesDate, SalesPersonID FROM dbo.Sales WHERE ProductID='PD1’ Order by productname;
I Would like to create index as below. Am I correct or anything needs to be changed?
CREATE INDEX <IDX___> ON dbo.Sales(ProductID) INCLUDE (SalesDate, SalesPersonID, productname)Thanks,
Sam
CREATE INDEX <IDX___> ON dbo.Sales(ProductID,productname ) INCLUDE (SalesDate, SalesPersonId)
I'm curious. If you have the ProductID, why do you believe that ProductName needs to be a Key Column rather than just as an INCLUDE?
It's needed for the "Order by productname"
Understood but I don't believe it needs to be a part of the key to be effective for sorting. The good part is, you got me thinking about this and I'm going to play with it a bit over the weekend. Thanks, Jonathan.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 30, 2018 at 8:43 am
No, I agree - it would need to be the first column in the key. But like I said before, it's not needed at all, unless the product ID PD1 has more than one product name.
John
November 30, 2018 at 9:18 am
John Mitchell-245523 - Friday, November 30, 2018 8:43 AMNo, I agree - it would need to be the first column in the key. But like I said before, it's not needed at all, unless the product ID PD1 has more than one product name.John
Because ProductId is specified in the SQL as a fixed value, SQL Server can find the product key and read the values in order to end up with the sort already done. At least the optimiser should be able to work that out. So I think the order of the columns in the index is correct.SELECT SalesDate, SalesPersonID FROM dbo.Sales WHERE ProductID='PD1’ Order by productname;
You're right about it not being needed if there is only one productname per ProductId, but then I thought to myself the Order by would not needed either. So I assumed that there is more than one productname per productId.
November 30, 2018 at 10:11 am
John Mitchell-245523 - Friday, November 30, 2018 8:43 AMNo, I agree - it would need to be the first column in the key. But like I said before, it's not needed at all, unless the product ID PD1 has more than one product name.John
This demonstrates it works, it takes a few minutes to execute though:
if OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL
drop table #TestTable
go
select t2.N ProductId,convert(varchar(50),NEWID()) productName, getdate() SalesDate, t1.N SalesPersonId
into #TestTable
from dbo.tally t1
inner join dbo.tally t2 on t2.N <= 10
inner join dbo.tally t3 on t3.N <= 10
inner join dbo.tally t4 on t4.N <= 100
inner join dbo.tally t5 on t5.N <= 10
inner join dbo.tally t6 on t6.N <= 10
inner join dbo.tally t7 on t7.N <= 10
where t1.N <= 10
create index IX_#TestTable_1 on #TestTable(ProductId) INCLUDE (productName, SalesDate, SalesPersonId)
set statistics io,time on
select top(100) * from #TestTable WHERE ProductId = 3 ORDER BY productName
set statistics io,time off
drop index IX_#TestTable_1 on #TestTable
create index IX_#TestTable_2 on #TestTable(ProductId, productName) INCLUDE (SalesDate, SalesPersonId)
set statistics io,time on
select top(100) * from #TestTable WHERE ProductId = 3 ORDER BY productName
Query with index keyed only on ProductId
SQL Server Execution Times:
CPU time = 3293 ms, elapsed time = 1036 ms.
Query with index keyed on ProductId, productName
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 35 ms.
November 30, 2018 at 8:59 pm
Heh... ok. I'm confused. Here's the original query posted by the OP.
SELECT SalesDate, SalesPersonID FROM dbo.Sales WHERE ProductID='PD1’ Order by productname;
Can someone please tell me why anyone would find the output given by the above query to be more useful because it was sorted by ProductName? Without having the ProductName in the output, the sort is totally useless to anyone having to work with the output.
--=================================================================================================
-- Create the test table to simulate the real one according to what we know.
-- Please refer to the following articles for how to quickly make gobs of random but
-- constrained data.
-- http://www.sqlservercentral.com/articles/Data+Generation/87901/
-- http://www.sqlservercentral.com/articles/Test+Data/88964/
--=================================================================================================
--===== If the test table already exists, drop it to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Sales','U') IS NOT NULL
DROP TABLE #Sales
;
--===== Create and populate the test table on-the-fly.
-- Depending on your machine, this will take somewhere between 24 and 40 seconds to create the
-- 5 million rows. Details are in the tail comments.
WITH cteGenData AS
(
SELECT TOP 5000000
SalesDate = RAND(CHECKSUM(NEWID())) -- 2010-01-01 up to 2020-01-01
* DATEDIFF(dd,'2010','2020') -- (Temporal Closed/Open)
+ CONVERT(DATETIME,'2010')
,SalesPersonID = ABS(CHECKSUM(NEWID())%100)+1 -- 1 thru 100
,ProductNo = ABS(CHECKSUM(NEWID())%50)+1 -- 1 thru 50
FROM sys.all_columns ac1 --Cross Join forms a "Pseudo Cursor" row source which
CROSS JOIN sys.all_columns ac2 --replaces a loop
) --=== Finalize the output for the test table.
-- The ISNULLs make the columns NOT NULL in the test table.
-- Note that the ProductID and ProductName will always match, just like in real life.
SELECT SalesDate = ISNULL(SalesDate,0)
,SalesPersonID = ISNULL(SalesPersonID,0)
,ProductID = ISNULL(
CHAR(ProductNo%26+65) -- First letter based on ProductNo
+ CONVERT(VARCHAR(10),ProductNo)
,'')
,ProductName = ISNULL(
CHAR(ProductNo%26+65) -- First letter based on ProductNo
+'-Product Name # ' -- Literal
+ CONVERT(VARCHAR(10),ProductNo)-- ProductNo converted to characters
,'')
,OtherColumns = CONVERT(CHAR(100),'Simto other columns bulk')
INTO #Sales
FROM cteGenData
ORDER BY SalesDate
;
--===== Let's see the first 100 rows so we know what we're dealing with.
SELECT TOP 100 *
FROM #Sales
;
--=================================================================================================
-- The following is identical to the original query except for the literal for ProductID.
-- So, here's my question... DOES IT MAKE ANY SENSE AT ALL TO SORT THIS OUTPUT BY PRODUCTNAME?
-- Without anything having to do with the product in the SELECT list, ordering by ProductName
-- makes no sense to me what-so-ever!
--=================================================================================================
SELECT SalesDate
,SalesPersonID
FROM #Sales
WHERE ProductID = 'U20'
ORDER BY ProductName
;
--Jeff Moden
Change is inevitable... Change for the better is not.
December 3, 2018 at 8:30 am
Jonathan AC Roberts - Friday, November 30, 2018 10:11 AMJohn Mitchell-245523 - Friday, November 30, 2018 8:43 AMNo, I agree - it would need to be the first column in the key. But like I said before, it's not needed at all, unless the product ID PD1 has more than one product name.John
This demonstrates it works, it takes a few minutes to execute though:
if OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL
drop table #TestTable
go
select t2.N ProductId,convert(varchar(50),NEWID()) productName, getdate() SalesDate, t1.N SalesPersonId
into #TestTable
from dbo.tally t1
inner join dbo.tally t2 on t2.N <= 10
inner join dbo.tally t3 on t3.N <= 10
inner join dbo.tally t4 on t4.N <= 100
inner join dbo.tally t5 on t5.N <= 10
inner join dbo.tally t6 on t6.N <= 10
inner join dbo.tally t7 on t7.N <= 10
where t1.N <= 10create index IX_#TestTable_1 on #TestTable(ProductId) INCLUDE (productName, SalesDate, SalesPersonId)
set statistics io,time on
select top(100) * from #TestTable WHERE ProductId = 3 ORDER BY productName
set statistics io,time off
drop index IX_#TestTable_1 on #TestTablecreate index IX_#TestTable_2 on #TestTable(ProductId, productName) INCLUDE (SalesDate, SalesPersonId)
set statistics io,time on
select top(100) * from #TestTable WHERE ProductId = 3 ORDER BY productNameQuery with index keyed only on ProductId
SQL Server Execution Times:
CPU time = 3293 ms, elapsed time = 1036 ms.Query with index keyed on ProductId, productName
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 35 ms.
Mmmm... good shout. One test, a thousand opinions and all that!
John
December 3, 2018 at 8:41 am
Still looking for why anyone would find the sorting to be useful considering the output and the fact that there's noting to preserve that order if used elsewhere.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 3, 2018 at 9:06 am
Jeff Moden - Monday, December 3, 2018 8:41 AMStill looking for why anyone would find the sorting to be useful considering the output and the fact that there's noting to preserve that order if used elsewhere.
You are correct. The OP's accepted answer is: "get rid of the ORDER BY clause, since all rows will have the same productname"
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply