Sum Daily Data Into Weekly Data

  • First, thank you for posting the table DDL and expected output.

Second, it would really help if you were to actually post INSERT statements for your data. Please read the article in the first link in my signature for how to generate these automatically. (FYI, it took me longer to get a working environment set up with your data than it took to code a solution for you.)

And this should do what you're looking for. You may need to adjust your DATEFIRST setting (a DATEFIRST-independent variant is sketched after the query).

;WITH CTE AS
(
    SELECT [CP Code],
           StartDate = DATEADD(day, -DATEPART(weekday, [Date]) + 1, [Date]),
           [Total Pageviews],
           MB = CONVERT(numeric(20,4), [Total Volume in MB])
    FROM DSA
)
SELECT [CP Code],
       StartDate,
       PageViews = SUM([Total Pageviews]),
       MB = SUM(MB)
FROM CTE
GROUP BY [CP Code], StartDate
ORDER BY [CP Code], StartDate;
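
If you don't want to depend on the session's DATEFIRST at all, here's a sketch of one common DATEFIRST-independent variant of the StartDate expression (not tested against your data; it normalizes the weekday offset with @@DATEFIRST so the week always starts on Sunday):

-- Sunday-start week, regardless of the session's DATEFIRST setting:
SELECT StartDate = DATEADD(day,
           -(DATEPART(weekday, [Date]) + @@DATEFIRST - 1) % 7, [Date])
FROM DSA;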

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Just for future reference, it helps if you put your data in a usable format like this:

CREATE TABLE #Temp
(
    [CP Code] varchar(50),
    [Date] date,
    [Total Pageviews] bigint,
    [Total Volume in MB] decimal(16,4)
);

INSERT INTO #Temp
SELECT 51014, '2010-06-27', 2623221, 99636.0095 UNION ALL
SELECT 51014, '2010-06-28', 1985718, 88588.3845 UNION ALL
SELECT 51014, '2010-06-29', 1815712, 81359.7489 UNION ALL
SELECT 51014, '2010-06-30', 2126104, 93943.4801 UNION ALL
SELECT 51014, '2010-07-01', 1702672, 82257.0021 UNION ALL
SELECT 51014, '2010-07-02', 1666468, 79573.3334 UNION ALL
SELECT 51014, '2010-07-03', 2019123, 86187.5129 UNION ALL
SELECT 51014, '2010-07-04', 1783373, 82654.0857 UNION ALL
SELECT 51014, '2010-07-05', 1890672, 86256.3637 UNION ALL
SELECT 51014, '2010-07-06', 2424314, 98458.3025 UNION ALL
SELECT 51014, '2010-07-07', 2197743, 89729.0344 UNION ALL
SELECT 51014, '2010-07-08', 2097884, 93912.9365 UNION ALL
SELECT 51014, '2010-07-09', 2029226, 87494.0143 UNION ALL
SELECT 51014, '2010-07-10', 1989296, 84737.8993 UNION ALL
SELECT 51014, '2010-07-11', 1833109, 80668.2048 UNION ALL
SELECT 51014, '2010-07-12', 2398424, 97892.0837 UNION ALL
SELECT 51014, '2010-07-13', 2365550, 100190.4429 UNION ALL
SELECT 51014, '2010-07-14', 1810926, 85285.4454 UNION ALL
SELECT 51014, '2010-07-15', 2483211, 100816.9027 UNION ALL
SELECT 51014, '2010-07-16', 2327007, 95540.1672 UNION ALL
SELECT 51014, '2010-07-17', 1758154, 81510.6331 UNION ALL
SELECT 53037, '2010-06-27', 6059, 117383.7422 UNION ALL
SELECT 53037, '2010-06-28', 6310, 124464.7691 UNION ALL
SELECT 53037, '2010-06-29', 6394, 115046.3884 UNION ALL
SELECT 53037, '2010-06-30', 4627, 111166.2992 UNION ALL
SELECT 53037, '2010-07-01', 3402, 101552.575 UNION ALL
SELECT 53037, '2010-07-02', 3115, 93708.7741 UNION ALL
SELECT 53037, '2010-07-03', 3032, 87156.6502 UNION ALL
SELECT 53037, '2010-07-04', 2995, 92872.0334 UNION ALL
SELECT 53037, '2010-07-05', 3122, 102125.7448 UNION ALL
SELECT 53037, '2010-07-07', 3040, 94791.2053 UNION ALL
SELECT 53037, '2010-07-08', 3638, 106615.8714 UNION ALL
SELECT 53037, '2010-07-09', 3767, 97252.8935 UNION ALL
SELECT 53037, '2010-07-10', 3745, 94446.8815 UNION ALL
SELECT 53037, '2010-07-11', 3576, 93185.0068 UNION ALL
SELECT 53037, '2010-07-12', 3860, 104106.1198 UNION ALL
SELECT 53037, '2010-07-13', 3123, 96444.6582 UNION ALL
SELECT 53037, '2010-07-14', 3458, 98725.4491 UNION ALL
SELECT 53037, '2010-07-15', 3516, 96911.039 UNION ALL
SELECT 53037, '2010-07-16', 3233, 90902.3766 UNION ALL
SELECT 53037, '2010-07-17', 3176, 90590.8028;

    Also, please note that I changed the [Total Volume in MB] from a varchar to a decimal. It's really not a good idea to store numeric data in varchars.

    And the query:

SELECT [CP Code],
       DATEADD(dd, -1, DATEADD(wk, DATEDIFF(wk, 0, [Date]), 0)) AS [Date],
       SUM([Total Pageviews]) AS [Total Pageviews],
       SUM([Total Volume in MB]) AS [Total Volume in MB]
FROM #Temp
GROUP BY [CP Code], DATEDIFF(wk, 0, [Date])
ORDER BY [CP Code], DATEDIFF(wk, 0, [Date]);
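
A quick sanity check of that week-floor expression, using a date from the sample data (the comment shows the expected result):

-- 2010-06-30 is a Wednesday; the expression snaps it back to the
-- Sunday starting that week, independent of the DATEFIRST setting
SELECT DATEADD(dd, -1, DATEADD(wk, DATEDIFF(wk, 0, '20100630'), 0));
-- returns 2010-06-27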

If you want more information about date functions, Seth Phelabaum wrote a nice article on this site, and it can be found here.

    └> bt



Forum Etiquette: How to post data/code on a forum to get the best help

  • Wayne!

    └> bt



Forum Etiquette: How to post data/code on a forum to get the best help

  • Wayne and BT...

    Ummm... what happens if you're missing a week of data with either of your fine solutions? 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (7/22/2010)


    Wayne and BT...

    Ummm... what happens if you're missing a week of data with either of your fine solutions? 🙂

    Well, it depends! That would obviously be a problem if reporting on missing weeks is part of his requirement, but nothing a little join on a tally wouldn't fix.
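
For what it's worth, here's a sketch of that tally idea against the #Temp table above (the inline Weeks list is just a stand-in for a real tally or calendar table, and the first Sunday is taken from the sample data):

WITH Weeks AS
(
    SELECT WeekStart = DATEADD(wk, n.n, '20100627')  -- first Sunday in the data
    FROM (VALUES (0),(1),(2)) AS n(n)                -- stand-in for a tally table
)
SELECT c.[CP Code], w.WeekStart,
       [Total Pageviews] = SUM(t.[Total Pageviews]) -- NULL where a week is missing
FROM Weeks AS w
CROSS JOIN (SELECT DISTINCT [CP Code] FROM #Temp) AS c
LEFT JOIN #Temp AS t
       ON t.[CP Code] = c.[CP Code]
      AND t.[Date] >= w.WeekStart
      AND t.[Date] <  DATEADD(dd, 7, w.WeekStart)
GROUP BY c.[CP Code], w.WeekStart;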

    └> bt



Forum Etiquette: How to post data/code on a forum to get the best help

  • bteraberry (7/22/2010)


    Jeff Moden (7/22/2010)


    Wayne and BT...

    Ummm... what happens if you're missing a week of data with either of your fine solutions? 🙂

    Well, it depends! That would obviously be a problem if reporting on missing weeks is part of his requirement, but nothing a little join on a tally wouldn't fix.

    Heh... of course it depends. So let's pretend it's your neck on the line... would you make it so it catches missing weeks even if it weren't written into a requirement? I know I would. At least I would ask the question. So why not give the OP the same break and ask the question? 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Joe Celko (7/22/2010)


    Thank you for the DDL, but it stinks.

    ...

    I can tell you never read a book on data modeling.

    Heh... there ya go being all friendly like again, Joe. 😛 You really need to work on your people skills. I'm sure there's an ISO document on the subject. :w00t:

    Seriously. Technically, you're mostly correct but your habit is to kill the patient. People might actually listen to you (translation: Buy more of your books ;-)) if you left the daggers at home, Joe. I know killing the patient can be fun but it's more challenging to teach instead. 😛

pageview_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,

    Just a suggestion... I wouldn't put a default on this column. It allows the user to make the huge mistake of not knowing the correct value for this important column.

You do not convert it; you aggregate it. Very different mindset. What kind of weeks? Fiscal? ISO? US? Let's use the ISO-8601 standard for the week names.

CREATE TABLE WeekRanges
(week_name CHAR() NOT NULL PRIMARY KEY
    CHECK (week_name LIKE '[12][0-9][0-9][0-9]W[0-9][0-9][0-9]'),
 week_start_date DATE NOT NULL UNIQUE,
 week_end_date DATE NOT NULL,
 CHECK (week_start_date < week_end_date)
);

    Is that check for week number correct? 3 Digits? If so, that would be a good reason to not follow ISO standards. :hehe: And CHAR()? Since when did you fall into the habit of trusting defaults?

    Also, "week_end_date" should be "next_week_start_date" so that the WeekRange table works with dates AND times. Not all queries are going to be "date-only". Of course, if you actually did that, then the check would be (week_start_date = DATEADD(dd, -7, next_week_start_date) instead of allowing the user to enter ANY date less than the end date. While were at it, you should also do a check to validate the start date.

    Now do a JOIN and use a BETWEEN predicate:

SELECT W.week_name, DSA.cp_code,
       SUM(pageview_cnt) AS wkly_pageview_cnt,
       SUM(something_volume) AS wkly_something_volume
FROM DSA, WeekRanges AS W
WHERE DSA.pageview_date BETWEEN W.week_start_date AND W.week_end_date
GROUP BY W.week_name, DSA.cp_code;

    Ummmm... nope. Like I said, the WeekRange table should be able to handle times, as well. BETWEEN is a great way to get yourself in trouble with dates and times. The highlighted line above should be...

WHERE DSA.pageview_date >= W.week_start_date
  AND DSA.pageview_date <  W.next_week_start_date

    Of course, that whole query needs a bit of rework because the requirement is to return the starting date of the week... not the week name as you have it.

    If you're going to be nasty to people, you should probably get stuff like that right, ya know? 😉
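
To see the pitfall in action, here's a tiny assumed example (a date-only end bound converts to midnight, so anything later that day is silently excluded):

DECLARE @d datetime = '2010-07-03T12:00:00';
SELECT CASE WHEN @d BETWEEN '20100627' AND '20100703'
            THEN 'included' ELSE 'missed' END;
-- returns 'missed': noon on 07-03 is past midnight of the end bound;
-- the >= / < form above picks it up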

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi

First of all, thank you very much, guys, for your help. It looks like it's working.

Actually, I'm importing data from a CSV file with BULK INSERT, and the problem is that the CSV file requires a lot of cleanup before I can BULK INSERT it into a table. Even after I clean it up, BULK INSERT still isn't able to load any data. So I first save it as a tab-delimited .txt file, create a table with all VARCHAR data types, and load the data; after that, I change the data types and column names. My first priority was just to load the data and then do the other tasks. My table actually has 34 columns and about 250 rows; I cut it short here just to understand how this is done. My data doesn't have any unique values, and hence no primary key. I tried using an IDENTITY column, but I have to load data every week and my table keeps growing, so after adding the IDENTITY column to the table I can no longer BULK INSERT more data into it, since there is no IDENTITY column in my CSV file. I tried using a format file, but for some reason it's not working; it gives me an "Invalid # of Columns" error.

I'm new to all this and have really been looking for a better way to do it. I'm using this data to build reports in Report Builder, where weekly data is required. Also, I was not sure whether all these tasks could be done in Report Builder 1.0, so I decided to do it in SQL.

Thanks for your help, guys... I'll ask if I have any other problems.

  • If you're doing a lot of cleanup before you can import, I would suggest that you take a look at SSIS. It's a very handy tool with a lot of flexibility.

    └> bt



Forum Etiquette: How to post data/code on a forum to get the best help

  • Hi

When I run this query as:

CREATE VIEW WeeklyData
AS
WITH CTE AS
(
    SELECT [CP Code],
           StartDate = DATEADD(day, -DATEPART(weekday, [Date]) + 1, [Date]),
           [Total Pageviews],
           MB = CONVERT(numeric(20,4), [Total Volume in MB])
    FROM DSA
)
SELECT [CP Code],
       StartDate,
       PageViews = SUM([Total Pageviews]),
       MB = SUM(MB)
FROM CTE
GROUP BY [CP Code], StartDate;

It gives me the required result, i.e., on a weekly basis.

But I want to delete "MB = convert(numeric(20,4), [Total Volume in MB])" and "MB = sum(MB)" from the query, as I don't want that column.

Can this query be used with any number of columns?

    Thanks

  • Yes, you can just take that out if you don't need it.
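
For example, the same view with that column removed (just your posted code, minus the two MB lines):

CREATE VIEW WeeklyData
AS
WITH CTE AS
(
    SELECT [CP Code],
           StartDate = DATEADD(day, -DATEPART(weekday, [Date]) + 1, [Date]),
           [Total Pageviews]
    FROM DSA
)
SELECT [CP Code],
       StartDate,
       PageViews = SUM([Total Pageviews])
FROM CTE
GROUP BY [CP Code], StartDate;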

    └> bt



Forum Etiquette: How to post data/code on a forum to get the best help

  • Novicejatt (7/23/2010)


    Hi

First of all, thank you very much, guys, for your help. It looks like it's working.

Actually, I'm importing data from a CSV file with BULK INSERT, and the problem is that the CSV file requires a lot of cleanup before I can BULK INSERT it into a table. Even after I clean it up, BULK INSERT still isn't able to load any data. So I first save it as a tab-delimited .txt file, create a table with all VARCHAR data types, and load the data; after that, I change the data types and column names. My first priority was just to load the data and then do the other tasks. My table actually has 34 columns and about 250 rows; I cut it short here just to understand how this is done. My data doesn't have any unique values, and hence no primary key. I tried using an IDENTITY column, but I have to load data every week and my table keeps growing, so after adding the IDENTITY column to the table I can no longer BULK INSERT more data into it, since there is no IDENTITY column in my CSV file. I tried using a format file, but for some reason it's not working; it gives me an "Invalid # of Columns" error.

I'm new to all this and have really been looking for a better way to do it. I'm using this data to build reports in Report Builder, where weekly data is required. Also, I was not sure whether all these tasks could be done in Report Builder 1.0, so I decided to do it in SQL.

Thanks for your help, guys... I'll ask if I have any other problems.

The "invalid number of columns" error can be fixed in the format file, but it's actually easier to create a view that looks like the file and import into the view. As far as a PK goes, the view approach would still let you use an IDENTITY column on the table, but don't the "code" and "date" columns produce unique entries?
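
A sketch of that "import through a view" idea (all object names, columns, and the file path here are made up for illustration; adjust to your 34 columns):

-- The base table carries the IDENTITY column; the view exposes only
-- the columns that exist in the file, so the file needs no ID column.
CREATE TABLE dbo.DSA_Import
(
    ID                int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [CP Code]         varchar(50),
    [Date]            varchar(50),
    [Total Pageviews] varchar(50)
);
GO
CREATE VIEW dbo.DSA_ImportFile
AS
SELECT [CP Code], [Date], [Total Pageviews]
FROM dbo.DSA_Import;
GO
BULK INSERT dbo.DSA_ImportFile
FROM 'C:\data\weekly_feed.txt'   -- hypothetical path
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', FIRSTROW = 2);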

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

I get data daily for the same code, so it repeats, and I have 30 such codes, which means I have 30 codes with the same date. So no column has unique values.

    Thanks

  • Novicejatt (7/23/2010)


I get data daily for the same code, so it repeats, and I have 30 such codes, which means I have 30 codes with the same date. So no column has unique values.

    Thanks

    Sorry for the delay...

    You've lost me just a bit there... are you saying that combinations of CP_Code and Date form duplicates in the data?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi

This is how my table looks, with 3 weeks of daily data for two CP Codes:

I have data for 30 such codes.

CP Code  Date        Total Pageviews
51014    2010-06-27  2623221
51014    2010-06-28  1985718
51014    2010-06-29  1815712
51014    2010-06-30  2126104
51014    2010-07-01  1702672
51014    2010-07-02  1666468
51014    2010-07-03  2019123
51014    2010-07-04  1783373
51014    2010-07-05  1890672
51014    2010-07-06  2424314
51014    2010-07-07  2197743
51014    2010-07-08  2097884
51014    2010-07-09  2029226
51014    2010-07-10  1989296
51014    2010-07-11  1833109
51014    2010-07-12  2398424
51014    2010-07-13  2365550
51014    2010-07-14  1810926
51014    2010-07-15  2483211
51014    2010-07-16  2327007
51014    2010-07-17  1758154
53037    2010-06-27  6059
53037    2010-06-28  6310
53037    2010-06-29  6394
53037    2010-06-30  4627
53037    2010-07-01  3402
53037    2010-07-02  3115
53037    2010-07-03  3032
53037    2010-07-04  2995
53037    2010-07-05  3122
53037    2010-07-07  3040
53037    2010-07-08  3638
53037    2010-07-09  3767
53037    2010-07-10  3745
53037    2010-07-11  3576
53037    2010-07-12  3860
53037    2010-07-13  3123
53037    2010-07-14  3458
53037    2010-07-15  3516
53037    2010-07-16  3233
53037    2010-07-17  3176

After converting this data to a weekly basis, it looks like this (it's actually a view that I created from the daily data):

CP Code  WeekStartDate  Total Pageviews
51014    2010-06-27     13939018
51014    2010-07-04     14412508
51014    2010-07-11     14976381
53037    2010-06-27     32939
53037    2010-07-04     20307
53037    2010-07-11     23942

    "Total Pageviews" is the aggregate for each week.

I have created two more views from the same table; they have the same CP Code column in them. Is there any solution to this? I'm using these views in Report Builder, and having no PK in any of them doesn't give the desired results.

    Thanks
