SQL Query Divide Amount between months

  • Hi All,

    Background:

    Void Start Date: When a property becomes empty or vacant

    Let Date: When the property is filled in again

    I have the sample data below and like to show the void loss per month basis as below:

    1) Allocate the amount from voidloss column between months based on voiddays:

    for example for propcode 3698 the amount 13,612.56 needs to be divided between September and October based on the VoidDays. i.e of the 39 voiddays, 25 where in September and 14 in October hence 8726 will be allocated to September and 4886.56 to October

    2) After allocating the amount sum the amount by controlgroup and total the voiddays per month. It will be great if we can divide the voiddays between months and sum them by controlgroup as well.

    So in the end result we should have

    ControlGroup, Month, Year, VoidLoss, VoidsdayinMonth

    106 September 2014 8726 25

    106 October 2014 4886.56 14

    106 December 2014 2940 7

    Declare @voidloss Table

    ( History_IND INT

    ,PropCode VARCHAR(10)

    ,VoidCategory VARCHAR(10)

    ,ControlGroup VARCHAR(10)

    ,VoidStartDate date

    ,VoidMonth VARCHAR(10)

    ,VoidYear VARCHAR(10)

    ,LetDate date

    ,LetMonth VARCHAR(10)

    ,LetYear VARCHAR(10)

    ,VoidDays VARCHAR(10)

    ,MarketRent VARCHAR(10)

    ,VoidLoss VARCHAR(10)

    )

    INSERT INTO @voidloss (HISTORY_IND,PropCode,VoidCategory,ControlGroup,VoidStartDate,VoidMonth,VoidYear,LetDate,LetMonth,LetYear,VoidDays,MarketRent,VoidLoss)

    VALUES

    (1,3724,'VOI' ,106,'2014/12/01','December',2014,'2014/12/08','December',2014,7,420,2940)

    ,(1,1702,'HB' ,120,'2014/11/19','November',2014,'2014/12/11','December',2014,22,239.73,5274.06)

    ,(2,3730,'ATL' ,108,'2014/11/06','November',2014,'2014/11/27','November',2014,21,390,8190)

    ,(1,3390,'HB' ,100,'2014/10/03','October',2014,'2014/10/16','October',2014,13,400,5200)

    ,(2,3725,'VOI' ,108,'2014/10/03','October',2014,'2014/10/24','October',2014,21,385.01,8085.21)

    ,(2,3698,'ATL' ,106,'2014/09/06','September',2014,'2014/10/15','October',2014,39,250.04,13612.56)

    ,(1,3965,'VOI' ,100,'2015/02/05','February',2015,'2015/02/06','February',2015,1,475,475)

    ,(1,3969,'VOI' ,100,'2015/02/13','February',2015,'2015/02/19','February',2015,6,550,3300)

    ,(3,3791,'ATL' ,108,'2015/01/05','January',2015,'2015/01/23','January',2015,18,360,6480)

    Select * from @voidloss

  • Use a calendar table or equivalent:

    WITH Months AS

    (

    SELECT m=1, mn='January'

    UNION ALL SELECT 2, 'February'

    UNION ALL SELECT 3, 'March'

    UNION ALL SELECT 4, 'April'

    UNION ALL SELECT 5, 'May'

    UNION ALL SELECT 6, 'June'

    UNION ALL SELECT 7, 'July'

    UNION ALL SELECT 8, 'August'

    UNION ALL SELECT 9, 'September'

    UNION ALL SELECT 10, 'October'

    UNION ALL SELECT 11, 'November'

    UNION ALL SELECT 12, 'December'

    )

    SELECT ControlGroup, [Month]=MAX(mn), [Year]=MAX(VoidYear), VoidLoss=MAX(VoidLoss)

    ,VoidAllocated=MAX(CAST(VoidLoss AS MONEY))*COUNT(*)/MAX(VoidDays)

    FROM @VoidLoss a

    CROSS APPLY dbo.GenerateCalendar(VoidStartDate, DATEDIFF(day, VoidStartDate, LetDate)) b

    JOIN Months c ON b.[Month] = c.m

    --WHERE a.ControlGroup = 106

    GROUP BY ControlGroup, YYYYMM;

    The GenerateCalendar function referenced (my calendar table equivalent) can be found here:

    Calendar Tables in T-SQL[/url]

    Note that your VoidLoss column should not be VARCHAR, hence my needing the explicit CAST when using it in the calculation of VoidAllocated above.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • CELKO (3/19/2015)


    Please follow basic Netiquette and post the DDL we need to answer this.

    Uh, I think he did. That's how I was able to give him a tested and working solution.

    CELKO (3/19/2015)


    Follow industry and ANSI/ISO standards in your data. You should follow ISO-11179 rules for naming data elements. You should follow ISO-8601 rules for displaying temporal data. For future reference, the only version allowed in ANSI/ISO Standard SQL (and many other ISO Standards) use a dash and not a slash.

    You made serious errors here. The goal of a database (Network, RDBMS, multi-valued, etc) is to reduce redundancy; you increased it! The goal of any tiered architecture is to do all display formatting in a presentation layer; you embed it in the base table. WHY?

    What math do you do with “history_ind”? That is the only reason to use a numeric data type. Why “void”? Isn't the term “vacant” or “vacancy” in Real Estate? Void is a legal term for a contract having no legal status. Your “let_date” looks like a “vacancy_end_date”; if the property is destroyed, sold or otherwise removed from the market without a rental, you schema does not include it. Finally, the count of vacancy days is computed from the start and end dates, so the column is redundant. If you wish you can use a computed column or a VIEW.

    We need to know the data types, keys and constraints on the table. Avoid dialect in favor of ANSI/ISO Standard SQL. But you live in a magic world of VARCHAR(10) null-able columns! There is no way to have a key. Most encoding schemes use fixed length strings, so that they are easier to display and to validate with a regular expression.

    No arguments from me here. There are a lot of design issues that could have been done better.

    CELKO (3/19/2015)


    Putting a comma at the start of a line is old COBOL convention from punch cards. We did this so could re-arrange and re-use the cards.

    But I like my commas at the beginning! OK, OK, I grew up around punch cards too. Maybe we're both dinosaurs.

    CELKO (3/19/2015)


    This is why you need to read and download the PDF for:

    https://www.simple-talk.com/books/sql-books/119-sql-code-smells/

    CREATE TABLE Vacancy_Losses

    (-- history_ind INTEGER NOT NULL, --- what is this?

    prop_code CHAR(4) NOT NULL PRIMARY KEY, --- my guess

    CHECK (prop_code LIKE '[0-9][0-9][0-9][0-9]'),

    vacancy_category CHAR(3) NOT NULL,

    control_grp CHAR(3) NOT NULL,

    vacancy_start_date DATE NOT NULL,

    vacancy_end_date DATE NOT NULL,

    CHECK (vacancy_start_date < vacancy_end_date),

    market_rent DEC(10,2) NOT NULL,

    vacancy_loss_amt DEC(10,2) NOT NULL);

    INSERT INTO Vacancy_Losses

    VALUES

    ('3724', 'VOI', '106', '2014-12-01', '2014-12-08', 420.00, 2940.00),

    ('1702', 'HB ', '120' '2014-11-19', '2014-12-11', 239.73, 5274.06),

    ('3730', 'ATL', '108' '2014-11-06', '2014-11-27', 390.00, 8190.00),

    ('3390', 'HB ', '100', '2014-10-03', '2014-10-16', 400.00, 5200.00),

    ('3965', 'VOI', '100', '2015-02-05', '2015-02-06', 475.00, 475.00),

    ('3969', 'VOI', '100', '2015-02-13', '2015-02-19', 550.00, 3300.00),

    ('3698', 'ATL', '106', '2014-09-06', '2014-10-15', 250.04, 13612.56),

    ('3725', 'VOI', '108' '2014-10-03', '2014-10-24', 385.01, 8085.21),

    ('3791', 'ATL', '108', '2015-01-05', '2015-01-23', 360.00, 6480.00);

    Since SQL is a database language, we prefer to do look ups and not calculations. They can be optimized while temporal math messes up optimization and your denormalized strings are totally useless. A useful idiom is a report period calendar that everyone uses so there is no way to get disagreements in the DML.

    Calculations are so much more fun though!

    CELKO (3/19/2015)


    The report period table gives a name to a range of dates that is common to the entire enterprise.

    CREATE TABLE Month_Periods

    (month_name CHAR(10) NOT NULL PRIMARY KEY

    CHECK (month_name LIKE <pattern>),

    month_start_date DATE NOT NULL,

    month_end_date DATE NOT NULL,

    CONSTRAINT date_ordering

    CHECK (month_start_date <= month_end_date),

    etc);

    How many years of data do you propose gets stored in that bad boy? Looks like a lot of near-redundancy to me, but then I'm not much on relational purity either.

    CELKO (3/19/2015)


    These report periods can overlap or have gaps. I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantages are that it will sort with the ISO-8601 data format required by Standard SQL and it is language independent. The pattern for validation is '[12][0-9][0-9][0-9]-00-00' and '[12][0-9][0-9][0-9]-[01][0-9]-00'

    Is this enough or do you need more help?

    Isn't the MySQL convention for dates you mention dialectical?

    Oh my! I seem to have fallen into the trap of other posters to this forum that are critical of your robustly critical suggestions. My bad! And my apologies!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • CELKO (3/19/2015)


    The goal of any tiered architecture is to do all display formatting in a presentation layer; you embed it in the base table.

    {snip}

    Most encoding schemes use fixed length strings, so that they are easier to display and to validate with a regular expression.

    I may be misunderstanding something here, but that seems to be a direct contradiction.

    Putting a comma at the start of a line is old COBOL convention from punch cards. We did this so could re-arrange and re-use the cards.

    They still do it for the very same reason except it's electronic now. I didn't like it when I first saw it but it makes life real easy to add tail comments even if you never intend to rearrange the "cards".

    Is this enough or do you need more help?

    Now that you've shown what you think the data should look like (see Dwain's comments on THAT), you could actually post some code that solves the original problem. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply