May 7, 2013 at 11:43 pm
I'm looking for a SQL statement that will output 'new vs duplicate' data on a monthly bases, but I can't seem to come up SQL statement that can accomplish what I need. See example below to get a better idea of what I'm trying to do since this is hard to explain.
Example data:
id, num, create_date
// January
1, 34250, 2012-01-01
2, 25383, 2012-01-03
3, 93482, 2012-01-13
4, 52934, 2012-01-20
5, 34823, 2012-01-23
// Febuary
6, 09943, 2012-02-03
7, 32845, 2012-02-05
8, 34250, 2012-02-16 -- same num as id 1
9, 57325, 2012-02-28
10, 34823, 2012-02-28 -- same num as id 5
// March
11, 95234, 2012-03-05
12, 25383, 2012-03-03-- same num as id 2
13, 32845, 2012-03-16 -- same num as id 7
14, 34250, 2012-03-28-- same num as id 1, 8
15, 12412, 2012-03-28
// April
16, 53535, 2012-04-05
17, 23232, 2012-04-03
18, 57325, 2012-04-16 -- same num as id 9
19, 34250, 2012-04-28-- same num as id 1, 8, 14
20, 12343, 2012-04-28
Example results from data above would be something like below (just looking for new vs duplicate numbers):
January: 5 new and 0 duplicates (no duplicates will ever be found the first month)
Febuary: 3 new and 2 duplicates (checks for duplicates in the month of Jan only)
March: 2 new and 3 duplicates (checks for duplicates in the month of Jan & Feb only)
April: 3 new and 2 duplicates (checks for duplicates in the month of Jan, Feb & March only)
...and so on for each month after that
Hopefully you can see what I'm trying to do. Also, the total amount (new + duplicate) will always be 5 based on the data above. Any real smart person out there that can help?
Edit: Put in wrong forum (I'm using SQL Server 2005)
May 8, 2013 at 1:52 pm
<deleted>....not working as required
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
May 9, 2013 at 9:33 am
avitron2020 (5/7/2013)
I'm looking for a SQL statement that will output 'new vs duplicate' data on a monthly bases, but I can't seem to come up SQL statement that can accomplish what I need. See example below to get a better idea of what I'm trying to do since this is hard to explain.Example data:
id, num, create_date
// January
1, 34250, 2012-01-01
2, 25383, 2012-01-03
3, 93482, 2012-01-13
4, 52934, 2012-01-20
5, 34823, 2012-01-23
// Febuary
6, 09943, 2012-02-03
7, 32845, 2012-02-05
8, 34250, 2012-02-16 -- same num as id 1
9, 57325, 2012-02-28
10, 34823, 2012-02-28 -- same num as id 5
// March
11, 95234, 2012-03-05
12, 25383, 2012-03-03-- same num as id 2
13, 32845, 2012-03-16 -- same num as id 7
14, 34250, 2012-03-28-- same num as id 1, 8
15, 12412, 2012-03-28
// April
16, 53535, 2012-04-05
17, 23232, 2012-04-03
18, 57325, 2012-04-16 -- same num as id 9
19, 34250, 2012-04-28-- same num as id 1, 8, 14
20, 12343, 2012-04-28
Example results from data above would be something like below (just looking for new vs duplicate numbers):
January: 5 new and 0 duplicates (no duplicates will ever be found the first month)
Febuary: 3 new and 2 duplicates (checks for duplicates in the month of Jan only)
March: 2 new and 3 duplicates (checks for duplicates in the month of Jan & Feb only)
April: 3 new and 2 duplicates (checks for duplicates in the month of Jan, Feb & March only)
...and so on for each month after that
Hopefully you can see what I'm trying to do. Also, the total amount (new + duplicate) will always be 5 based on the data above. Any real smart person out there that can help?
Edit: Put in wrong forum (I'm using SQL Server 2005)
This works, assuming that your ID column is ever-increasing (i.e., IDs are assigned in the same order as create_date) - otherwise, you'll need to compare create_dates in the CROSS APPLY subquery:
-- Please remove the extraneous letter "x" from the CxREATE and DxROP keywords - my company filters internet traffic that includes certain T-SQL DDL keywords.
CxREATE TABLE dbo.testData (ID int PRIMARY KEY, number int, create_date datetime)
GO
INSERT INTO dbo.testData (ID, number, create_date)
SELECT 1, 34250, '2012-01-01' UNION ALL
SELECT 2, 25383, '2012-01-03' UNION ALL
SELECT 3, 93482, '2012-01-13' UNION ALL
SELECT 4, 52934, '2012-01-20' UNION ALL
SELECT 5, 34823, '2012-01-23' UNION ALL
SELECT 6, 09943, '2012-02-03' UNION ALL
SELECT 7, 32845, '2012-02-05' UNION ALL
SELECT 8, 34250, '2012-02-16' UNION ALL
SELECT 9, 57325, '2012-02-28' UNION ALL
SELECT 10, 34823, '2012-02-28' UNION ALL
SELECT 11, 95234, '2012-03-05' UNION ALL
SELECT 12, 25383, '2012-03-03' UNION ALL
SELECT 13, 32845, '2012-03-16' UNION ALL
SELECT 14, 34250, '2012-03-28' UNION ALL
SELECT 15, 12412, '2012-03-28' UNION ALL
SELECT 16, 53535, '2012-04-05' UNION ALL
SELECT 17, 23232, '2012-04-03' UNION ALL
SELECT 18, 57325, '2012-04-16' UNION ALL
SELECT 19, 34250, '2012-04-28' UNION ALL
SELECT 20, 12343, '2012-04-28'
GO
SELECT DATEPART(MONTH, td.create_date) AS monthNumber
,DATENAME(MONTH, td.create_date) AS [monthName]
,SUM(CASE WHEN xd.duplicate IS NULL THEN 1 ELSE 0 END) AS nNew
,COUNT(td.ID) - SUM(CASE WHEN xd.duplicate IS NULL THEN 1 ELSE 0 END) as nDuplicates
FROM dbo.testData td
OUTER APPLY (SELECT TOP(1) t.ID AS duplicate FROM dbo.testData t WHERE t.ID < td.ID AND t.number = td.number) xd
GROUP BY DATEPART(MONTH, td.create_date), DATENAME(MONTH, td.create_date)
ORDER BY DATEPART(MONTH, td.create_date)
GO
DxROP TABLE dbo.testData
Two notes about this code:
- It works no matter how many rows you have per month.
- For a larger data set, proper indexing will yield better performance.
Also, notice how I provided DDL for the dbo.testData table and converted your sample data into an INSERT . . . SELECT statement to make it "consumable" for others who may want to work on your question. It's considered good form on this forum to provide DDL, *consumable* test data, and expected results when you ask a question about manipulating specific data so that we can easily set up test environments and provide tested code.
Jason Wolfkill
May 10, 2013 at 10:54 am
That worked great. Thanks a lot wolfkillj.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply