July 22, 2010 at 6:21 pm
First, thank you for posting the table DDL and expected output.
Second, it would really help if you were to actually post insert statements for your data. Please read the article in the first link in my signature for how to automatically generate this. (FYI, it took me longer to get a working environment with your data set up than what it took to code a solution for you.)
And this should do what you're looking for. You may need to adjust your DATEFIRST setting.
;WITH CTE AS
(
    SELECT [CP Code],
           StartDate = DATEADD(day, -DATEPART(weekday, [Date]) + 1, [Date]),
           [Total Pageviews],
           MB = CONVERT(numeric(20,4), [Total Volume in MB])
      FROM DSA
)
SELECT [CP Code],
       StartDate,
       PageViews = SUM([Total Pageviews]),
       MB = SUM(MB)
  FROM CTE
 GROUP BY [CP Code], StartDate
 ORDER BY [CP Code], StartDate
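For reference (an editorial illustration, not part of Wayne's post): DATEPART(weekday) depends on the DATEFIRST setting, which is why the result may need adjusting.

SET DATEFIRST 7;  -- Sunday = day 1 (the U.S. default; check yours with SELECT @@DATEFIRST)

DECLARE @d date = '2010-07-08';  -- a Thursday
SELECT DATEADD(day, -DATEPART(weekday, @d) + 1, @d) AS WeekStart;
-- With DATEFIRST 7, DATEPART(weekday) returns 5 for a Thursday, so this
-- backs up 4 days to 2010-07-04, the Sunday that starts the week.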
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
July 22, 2010 at 6:23 pm
Just for future reference, it helps if you put your data in a usable format like this:
CREATE TABLE #Temp
(
[CP Code] varchar(50),
[Date] date,
[Total Pageviews] bigint,
[Total Volume in MB] decimal(16,4)
)
INSERT INTO #Temp
SELECT 51014, '2010-06-27', 2623221, 99636.0095 UNION ALL
SELECT 51014, '2010-06-28', 1985718, 88588.3845 UNION ALL
SELECT 51014, '2010-06-29', 1815712, 81359.7489 UNION ALL
SELECT 51014, '2010-06-30', 2126104, 93943.4801 UNION ALL
SELECT 51014, '2010-07-01', 1702672, 82257.0021 UNION ALL
SELECT 51014, '2010-07-02', 1666468, 79573.3334 UNION ALL
SELECT 51014, '2010-07-03', 2019123, 86187.5129 UNION ALL
SELECT 51014, '2010-07-04', 1783373, 82654.0857 UNION ALL
SELECT 51014, '2010-07-05', 1890672, 86256.3637 UNION ALL
SELECT 51014, '2010-07-06', 2424314, 98458.3025 UNION ALL
SELECT 51014, '2010-07-07', 2197743, 89729.0344 UNION ALL
SELECT 51014, '2010-07-08', 2097884, 93912.9365 UNION ALL
SELECT 51014, '2010-07-09', 2029226, 87494.0143 UNION ALL
SELECT 51014, '2010-07-10', 1989296, 84737.8993 UNION ALL
SELECT 51014, '2010-07-11', 1833109, 80668.2048 UNION ALL
SELECT 51014, '2010-07-12', 2398424, 97892.0837 UNION ALL
SELECT 51014, '2010-07-13', 2365550, 100190.4429 UNION ALL
SELECT 51014, '2010-07-14', 1810926, 85285.4454 UNION ALL
SELECT 51014, '2010-07-15', 2483211, 100816.9027 UNION ALL
SELECT 51014, '2010-07-16', 2327007, 95540.1672 UNION ALL
SELECT 51014, '2010-07-17', 1758154, 81510.6331 UNION ALL
SELECT 53037, '2010-06-27', 6059, 117383.7422 UNION ALL
SELECT 53037, '2010-06-28', 6310, 124464.7691 UNION ALL
SELECT 53037, '2010-06-29', 6394, 115046.3884 UNION ALL
SELECT 53037, '2010-06-30', 4627, 111166.2992 UNION ALL
SELECT 53037, '2010-07-01', 3402, 101552.575 UNION ALL
SELECT 53037, '2010-07-02', 3115, 93708.7741 UNION ALL
SELECT 53037, '2010-07-03', 3032, 87156.6502 UNION ALL
SELECT 53037, '2010-07-04', 2995, 92872.0334 UNION ALL
SELECT 53037, '2010-07-05', 3122, 102125.7448 UNION ALL
SELECT 53037, '2010-07-07', 3040, 94791.2053 UNION ALL
SELECT 53037, '2010-07-08', 3638, 106615.8714 UNION ALL
SELECT 53037, '2010-07-09', 3767, 97252.8935 UNION ALL
SELECT 53037, '2010-07-10', 3745, 94446.8815 UNION ALL
SELECT 53037, '2010-07-11', 3576, 93185.0068 UNION ALL
SELECT 53037, '2010-07-12', 3860, 104106.1198 UNION ALL
SELECT 53037, '2010-07-13', 3123, 96444.6582 UNION ALL
SELECT 53037, '2010-07-14', 3458, 98725.4491 UNION ALL
SELECT 53037, '2010-07-15', 3516, 96911.039 UNION ALL
SELECT 53037, '2010-07-16', 3233, 90902.3766 UNION ALL
SELECT 53037, '2010-07-17', 3176, 90590.8028
Also, please note that I changed the [Total Volume in MB] from a varchar to a decimal. It's really not a good idea to store numeric data in varchars.
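A quick editorial illustration of why (not from the original post): varchar comparisons work character by character, not numerically.

-- As varchars, '99636.0095' sorts AFTER '100000' because '9' > '1':
SELECT CASE WHEN '99636.0095' < '100000' THEN 'less' ELSE 'greater' END;
-- Returns 'greater', even though 99636.0095 < 100000 numerically.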
And the query:
SELECT [CP Code],
DATEADD(dd, -1, DATEADD(wk, DATEDIFF(wk,0,[Date]), 0)) as [Date],
SUM([Total Pageviews]) as [Total Pageviews],
SUM([Total Volume in MB]) as [Total Volume in MB]
FROM #Temp
GROUP BY [CP Code], DATEDIFF(wk,0,[Date])
ORDER BY [CP Code], DATEDIFF(wk,0,[Date])
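For readers unfamiliar with the idiom above, here is how the DATEDIFF/DATEADD expression works, step by step (an editorial sketch, not part of the original post):

DECLARE @d date = '2010-07-08';  -- a Thursday
SELECT DATEDIFF(wk, 0, @d)                                  AS WholeWeeks,   -- weeks since 1900-01-01 (a Monday)
       DATEADD(wk, DATEDIFF(wk, 0, @d), 0)                  AS MondayOfWeek, -- 2010-07-05
       DATEADD(dd, -1, DATEADD(wk, DATEDIFF(wk, 0, @d), 0)) AS WeekStart;    -- 2010-07-04, the Sunday
-- DATEDIFF(wk, ...) ignores DATEFIRST and always treats Sunday as the first
-- day of the week, so GROUP BY DATEDIFF(wk, 0, [Date]) buckets each
-- Sunday-through-Saturday run together, matching the sample data.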
If you want more information about date functions, Seth Phelabaum wrote a nice article on this site, and it can be found here.
July 22, 2010 at 7:45 pm
Wayne and BT...
Ummm... what happens if you're missing a week of data with either of your fine solutions? 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
July 22, 2010 at 8:51 pm
Jeff Moden (7/22/2010)
Wayne and BT...Ummm... what happens if you're missing a week of data with either of your fine solutions? 🙂
Well, it depends! That would obviously be a problem if reporting on missing weeks is part of his requirement, but nothing a little join on a tally wouldn't fix.
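For the curious, one shape that tally-based fix could take (an editorial sketch with assumed names and date range; a complete solution would also cross join to the list of CP Codes so each code reports its own gaps):

DECLARE @first date = '2010-06-27', @last date = '2010-07-17';

WITH Tally AS
(   -- one sequential number per week in the range (any row source with enough rows works)
    SELECT TOP (DATEDIFF(dd, @first, @last) / 7 + 1)
           N = CONVERT(int, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1)
      FROM sys.all_columns
),
Weeks AS
(   SELECT WeekStart = DATEADD(dd, 7 * N, @first)
      FROM Tally
)
SELECT w.WeekStart,
       PageViews = ISNULL(SUM(t.[Total Pageviews]), 0)  -- a missing week shows up as 0 instead of vanishing
  FROM Weeks w
  LEFT JOIN #Temp t
    ON t.[Date] >= w.WeekStart
   AND t.[Date] <  DATEADD(dd, 7, w.WeekStart)
 GROUP BY w.WeekStart
 ORDER BY w.WeekStart;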
July 22, 2010 at 9:00 pm
bteraberry (7/22/2010)
Jeff Moden (7/22/2010)
Wayne and BT...Ummm... what happens if you're missing a week of data with either of your fine solutions? 🙂
Well, it depends! That would obviously be a problem if reporting on missing weeks is part of his requirement, but nothing a little join on a tally wouldn't fix.
Heh... of course it depends. So let's pretend it's your neck on the line... would you make it so it catches missing weeks even if it weren't written into a requirement? I know I would. At least I would ask the question. So why not give the OP the same break and ask the question? 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
July 22, 2010 at 9:40 pm
Joe Celko (7/22/2010)
Thank you for the DDL, but it stinks....
I can tell you never read a book on data modeling.
Heh... there ya go being all friendly like again, Joe. 😛 You really need to work on your people skills. I'm sure there's an ISO document on the subject. :w00t:
Seriously. Technically, you're mostly correct but your habit is to kill the patient. People might actually listen to you (translation: Buy more of your books ;-)) if you left the daggers at home, Joe. I know killing the patient can be fun but it's more challenging to teach instead. 😛
pageview_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
Just a suggestion... I wouldn't put a default on this column. It allows the user to make the huge mistake of not knowing the correct value for this important column.
You do not convert it; you aggregate it. Very different mindset. What kind of weeks? Fiscal? ISO? US? Let's use the ISO-8601 Standard for the week names.
CREATE TABLE WeekRanges
(week_name CHAR() NOT NULL PRIMARY KEY
CHECK (week_name LIKE '[12][0-9][0-9][0-9]W[0-9][0-9][0-9]'),
week_start_date DATE NOT NULL UNIQUE,
week_end_date DATE NOT NULL,
CHECK (week_start_date < week_end_date)
);
Is that check for week number correct? 3 Digits? If so, that would be a good reason to not follow ISO standards. :hehe: And CHAR()? Since when did you fall into the habit of trusting defaults?
Also, "week_end_date" should be "next_week_start_date" so that the WeekRanges table works with dates AND times. Not all queries are going to be "date-only". Of course, if you actually did that, then the check would be CHECK (week_start_date = DATEADD(dd, -7, next_week_start_date)) instead of allowing the user to enter ANY date less than the end date. While we're at it, you should also add a check to validate the start date.
Now do a JOIN and use a BETWEEN predicate:
SELECT W.week_name, DSA.cp_code,
SUM (pageview_cnt) AS wkly_pageview_cnt,
SUM (something_volume) AS wkly_something_volume
FROM DSA, WeekRanges AS W
WHERE DSA.pageview_date BETWEEN W.week_start_date AND W.week_end_date
GROUP BY W.week_name, DSA.cp_code;
Ummmm... nope. Like I said, the WeekRanges table should be able to handle times as well. BETWEEN is a great way to get yourself in trouble with dates and times. That WHERE clause should be...
WHERE DSA.pageview_date >= W.week_start_date
AND DSA.pageview_date < W.next_week_start_date
Of course, that whole query needs a bit of rework because the requirement is to return the starting date of the week... not the week name as you have it.
If you're going to be nasty to people, you should probably get stuff like that right, ya know? 😉
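Pulling those corrections together, the calendar-table approach might look like this (an editorial sketch combining Joe's column names with Jeff's fixes; the Monday check is just one possible way to validate the start date):

CREATE TABLE WeekRanges
(
    week_name            CHAR(7) NOT NULL PRIMARY KEY  -- e.g. '2010W27'
        CHECK (week_name LIKE '[12][0-9][0-9][0-9]W[0-5][0-9]'),
    week_start_date      DATETIME NOT NULL UNIQUE,
    next_week_start_date DATETIME NOT NULL,
    CHECK (week_start_date = DATEADD(dd, -7, next_week_start_date)),
    CHECK (DATEDIFF(dd, 0, week_start_date) % 7 = 0)  -- day 0 (1900-01-01) was a Monday, so this forces Monday starts
);

SELECT W.week_start_date, DSA.cp_code,
       SUM(pageview_cnt)     AS wkly_pageview_cnt,
       SUM(something_volume) AS wkly_something_volume
  FROM DSA
  JOIN WeekRanges AS W
    ON DSA.pageview_date >= W.week_start_date
   AND DSA.pageview_date <  W.next_week_start_date
 GROUP BY W.week_start_date, DSA.cp_code;

Note that ISO-8601 weeks run Monday through Sunday, while the sample data clearly starts its weeks on Sunday, so the "what kind of weeks?" question would still have to be settled with the OP.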
--Jeff Moden
Change is inevitable... Change for the better is not.
July 23, 2010 at 12:13 pm
Hi
First of all, thank you very much guys for your help. Looks like it's working.
Actually, I'm importing data from a CSV file with BULK INSERT, and the problem is that the CSV file requires a lot of cleanup before I can BULK INSERT it into the table. Even after I clean it up, BULK INSERT still isn't able to load any data. So I first save it as a tab-delimited .txt file, create a table with all VARCHAR data types, and load the data; after that, I change the data types and column names. So my first priority was just to get the data loaded and then do the other tasks. My table actually has 34 columns and about 250 rows; I cut it short just to understand how it is done.
My data doesn't have any unique values, and hence no primary key. I tried using an IDENTITY column, but I have to load data every week and my table keeps growing, so after adding the IDENTITY column to the table I cannot BULK INSERT more data into it, as I don't have an IDENTITY column in my CSV file. I tried using a format file, but for some reason it's not working. It gives me an "Invalid # of Columns" error.
I'm new to all this and have really been looking for a better way to do it. I'm using this data to build reports in Report Builder, where weekly data is required. Also, I was not sure if all these tasks could be done in Report Builder 1.0, so I decided to do it in SQL.
Thanks for your help guys.... I'll ask if I have any other problem.
July 23, 2010 at 2:44 pm
Hi
When I run this query as:
CREATE VIEW WeeklyData
AS
WITH CTE AS
(
    SELECT [CP Code],
           StartDate = DATEADD(day, -DATEPART(weekday, [Date]) + 1, [Date]),
           [Total Pageviews],
           MB = CONVERT(numeric(20,4), [Total Volume in MB])
      FROM DSA
)
SELECT [CP Code],
       StartDate,
       PageViews = SUM([Total Pageviews]),
       MB = SUM(MB)
  FROM CTE
 GROUP BY [CP Code], StartDate
It gives me the required result, i.e. on a weekly basis.
But I want to delete "MB = CONVERT(numeric(20,4), [Total Volume in MB])" and "MB = SUM(MB)" from the query, as I don't want that column.
Can this query be used with any number of columns?
Thanks
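(Editorial note: dropping the MB column is just a matter of removing those two lines, and the same pattern extends to any number of columns, with one SUM per column to aggregate. The trimmed view would look like this:)

CREATE VIEW WeeklyData
AS
WITH CTE AS
(
    SELECT [CP Code],
           StartDate = DATEADD(day, -DATEPART(weekday, [Date]) + 1, [Date]),
           [Total Pageviews]
      FROM DSA
)
SELECT [CP Code],
       StartDate,
       PageViews = SUM([Total Pageviews])
  FROM CTE
 GROUP BY [CP Code], StartDate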
July 23, 2010 at 3:56 pm
Novicejatt (7/23/2010)
Hi. First of all, thank you very much guys for your help. Looks like it's working.
Actually, I'm importing data from a CSV file with BULK INSERT, and the problem is that the CSV file requires a lot of cleanup before I can BULK INSERT it into the table. Even after I clean it up, BULK INSERT still isn't able to load any data. So I first save it as a tab-delimited .txt file, create a table with all VARCHAR data types, and load the data; after that, I change the data types and column names. So my first priority was just to get the data loaded and then do the other tasks. My table actually has 34 columns and about 250 rows; I cut it short just to understand how it is done.
My data doesn't have any unique values, and hence no primary key. I tried using an IDENTITY column, but I have to load data every week and my table keeps growing, so after adding the IDENTITY column to the table I cannot BULK INSERT more data into it, as I don't have an IDENTITY column in my CSV file. I tried using a format file, but for some reason it's not working. It gives me an "Invalid # of Columns" error.
I'm new to all this and have really been looking for a better way to do it. I'm using this data to build reports in Report Builder, where weekly data is required. Also, I was not sure if all these tasks could be done in Report Builder 1.0, so I decided to do it in SQL.
Thanks for your help guys.... I'll ask if I have any other problem.
The "invalid number of columns" error can be fixed in the format file, but it's actually easier to create a view that looks like the file and import into the view. So far as a PK goes, the view approach would still let you use an IDENTITY column, but don't the "code" and "date" columns produce unique entries?
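A sketch of that view idea (editorial; the file path, options, and object names are hypothetical). The view exposes exactly the columns that exist in the file, so the IDENTITY column on the base table never gets in the load's way:

CREATE TABLE dbo.DSA
(
    RowID                int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [CP Code]            varchar(50)   NOT NULL,
    [Date]               date          NOT NULL,
    [Total Pageviews]    bigint        NOT NULL,
    [Total Volume in MB] decimal(16,4) NOT NULL
);
GO
CREATE VIEW dbo.DSA_Import
AS
SELECT [CP Code], [Date], [Total Pageviews], [Total Volume in MB]
  FROM dbo.DSA;
GO
-- Load the tab-delimited file through the view; RowID fills in by itself.
-- FIRSTROW = 2 assumes a header row; drop it if the file has none.
BULK INSERT dbo.DSA_Import
FROM 'C:\imports\weekly_extract.txt'  -- hypothetical path
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', FIRSTROW = 2);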
--Jeff Moden
Change is inevitable... Change for the better is not.
July 23, 2010 at 4:01 pm
I get data daily for the same code, so it's repeating, and I have 30 such codes, which means I have 30 codes with the same date. So no column has unique values.
Thanks
July 25, 2010 at 1:03 pm
Novicejatt (7/23/2010)
I get data daily for the same code, so it's repeating, and I have 30 such codes, which means I have 30 codes with the same date. So no column has unique values. Thanks
Sorry for the delay...
You've lost me just a bit there... are you saying that combinations of CP_Code and Date form duplicates in the data?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 26, 2010 at 2:28 pm
Hi
This is how my table looks, with 3 weeks of data on a daily basis for two CP Codes (I have data for 30 such codes):
CP Code   Date         Total Pageviews
51014     2010-06-27   2623221
51014     2010-06-28   1985718
51014     2010-06-29   1815712
51014     2010-06-30   2126104
51014     2010-07-01   1702672
51014     2010-07-02   1666468
51014     2010-07-03   2019123
51014     2010-07-04   1783373
51014     2010-07-05   1890672
51014     2010-07-06   2424314
51014     2010-07-07   2197743
51014     2010-07-08   2097884
51014     2010-07-09   2029226
51014     2010-07-10   1989296
51014     2010-07-11   1833109
51014     2010-07-12   2398424
51014     2010-07-13   2365550
51014     2010-07-14   1810926
51014     2010-07-15   2483211
51014     2010-07-16   2327007
51014     2010-07-17   1758154
53037     2010-06-27   6059
53037     2010-06-28   6310
53037     2010-06-29   6394
53037     2010-06-30   4627
53037     2010-07-01   3402
53037     2010-07-02   3115
53037     2010-07-03   3032
53037     2010-07-04   2995
53037     2010-07-05   3122
53037     2010-07-07   3040
53037     2010-07-08   3638
53037     2010-07-09   3767
53037     2010-07-10   3745
53037     2010-07-11   3576
53037     2010-07-12   3860
53037     2010-07-13   3123
53037     2010-07-14   3458
53037     2010-07-15   3516
53037     2010-07-16   3233
53037     2010-07-17   3176
After converting this data to a weekly basis, it looks like this (it's actually a view that I created from the daily data):
CP Code   WeekStartDate   Total Pageviews
51014     2010-06-27      13939018
51014     2010-07-04      14412508
51014     2010-07-11      14976381
53037     2010-06-27      32939
53037     2010-07-04      20307
53037     2010-07-11      23942
"Total Pageviews" is the aggregate for each week.
I have created two more views from the same table. They have the same CP Code column in them. Is there any solution to this? I'm using this for Report Builder, and having no PK in any of the views doesn't give the desired results.
Thanks