average of percentage

  • Hello, I am looking to get average of percentages (to be calculated in same query).

     

    create table #temp(coldate date, col1 varchar(20), col2 varchar(50), amt numeric(15,5), col3 varchar(20))

    insert into #temp
    select '2022-11-01', 'a1234', 'Tom Alter', 200.50, 'Yes' union all
    select '2022-10-08', 'd9583', 'Ben Richardson', 68.93, 'No' union all
    select '2022-11-12', 'x4826', 'Kenny Patrick', 5.24, 'Yes' union all
    select '2022-10-24', 'g2507', 'Jiten Patel', 60.85, 'Yes' union all
    select '2022-11-05', 't9462', 'Arthur Mesh', 153.66, 'No' union all
    select '2022-09-10', 'b4483', 'Joseph', 182.20, 'No' union all
    select '2022-10-01', 'u4927', 'Sam Terry', 202.35, 'Yes' union all
    select '2022-09-22', 's1836', 'Rashid Khan', 18.90, 'No' union all
    select '2022-09-14', 'w9382', 'Marcus Ben', 38.65, 'Yes' union all
    select '2022-10-07', 'r2454', 'Nancy Joel', 74.37, 'Yes' union all
    select '2022-08-28', 'v5642', 'Rakesh Sharma', 112.85, 'No'

    select * from #temp order by 1
    -- I am trying to make a query to get data like this on rolling basis, means whichever month i run, it gives avg of current plus past three months:
    -- Avg of %:: (% of Aug + % of Sep + % of Oct + % of Nov)/4
    select ((1/1)*100 + (2/3)*100 + (1/4)*100 + (1/3)/100)/4-- (% of (Total 'No' per month/Total count per month)/4-- as there are 4 months
  • By default SQL Server performs integer division on integers which means decimal remainders are truncated.  The DB Engine switches  to floating point math if the denominator of a fraction is DECIMAL or FLOAT.  The code below multiplies the denominator by 1.0 (in order to avoid integer truncation)

    select year(coldate) yr, month(coldate) mo,
    sum(iif(col3='No', 1, 0))/(count(*)*1.0)*100 as calc_pct
    from #temp
    group by year(coldate), month(coldate);

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Hi Steve, thanks but what you are providing here is calculating percentage. My ask is to get an average of 4 month percentage (or 5 or 6 month depending on requirement) in a way where % for each month is calculated first and then average is taken of the % values. like:

    Running in Nov month = (% value for Aug + % value for Sep + % value for Oct + % value for Nov)/4

  • What's the new question?

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • The original question is how to calculate average of percentages in most efficient way.

  • Steve Collins wrote:

    What's the new question?

    I apologize if am not able to put my question in better way but i was looking to get average of percentages in one query.

  • sqlenthu 89358 wrote:

    Hi Steve, thanks but what you are providing here is calculating percentage. My ask is to get an average of 4 month percentage (or 5 or 6 month depending on requirement) in a way where % for each month is calculated first and then average is taken of the % values. like:

    Running in Nov month = (% value for Aug + % value for Sep + % value for Oct + % value for Nov)/4

    You missed the point that Steve made.  You posted the following...

    Select ((1/1)*100 + (2/3)*100 + (1/4)*100 + (1/3)/100)/4

    What percentage would you expect from the following, for example...

    SELECT (2/3)*100

    I'm thinking that you didn't expect it to return a big, fat goose-egg like it does.  And that's what Steve is talking about.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    sqlenthu 89358 wrote:

    Hi Steve, thanks but what you are providing here is calculating percentage. My ask is to get an average of 4 month percentage (or 5 or 6 month depending on requirement) in a way where % for each month is calculated first and then average is taken of the % values. like:

    Running in Nov month = (% value for Aug + % value for Sep + % value for Oct + % value for Nov)/4

    You missed the point that Steve made.  You posted the following...

    Select ((1/1)*100 + (2/3)*100 + (1/4)*100 + (1/3)/100)/4

    What percentage would you expect from the following, for example...

    SELECT (2/3)*100

    I'm thinking that you didn't expect it to return a big, fat goose-egg like it does.  And that's what Steve is talking about.

     

    Hi Jeff & Steve, Apologies for late response but I was out due to health issues. To answer your (& Steve's) question, the above sample which is:

    Select ((1/1)*100 + (2/3)*100 + (1/4)*100 + (1/3)/100)/4

    will give me the following average of percentage:

    (100.00 + 66.67 + 25.00 + 33.33)/4

    = 56.25

    I am giving a new example now:

    drop table if exists #standards
    drop table if exists #fact


    create table #standards
    (std_id int, std_name varchar(50), std_start_dt datetime, std_end_date datetime)

    insert into #standards
    select 1, '% of xyz', '2022-01-01', '9999-12-31' union all
    select 1, '% of abc', '2022-04-01', '9999-12-31' union all
    select 1, '% of mnop', '2022-05-01', '9999-12-31'

    create table #fact
    (cobdate datetime, rationale varchar(50), amount numeric(28,2), is_needed bit, is_done bit, createdate datetime)

    insert into #fact
    select '2022-01-01', 'abc', 20.580, 1, 1, getdate() union all
    select '2022-01-01', 'abc', 232.32, 0, 1, getdate() union all
    select '2022-03-01', 'abc', 23.532, 1, 1, getdate() union all
    select '2022-02-01', 'abc', 684.2323, 1, 0, getdate() union all
    select '2022-04-01', 'abc', 895.2343, 1, 0, getdate() union all
    select '2022-05-01', 'abc', 756.123, 1, 1, getdate() union all
    select '2022-05-01', 'abc', 47.50, 1, 0, getdate() union all
    select '2022-01-01', 'xyz', 85.8967, 1, 1, getdate() union all
    select '2022-02-01', 'xyz', 93.567, 1, 1, getdate() union all
    select '2022-02-01', 'xyz', 84.938, 1, 0, getdate() union all
    select '2022-01-01', 'xyz', 54.68, 1, 1, getdate() union all
    select '2022-03-01', 'xyz', 12.43, 1, 1, getdate() union all
    select '2022-06-01', 'xyz', 95.263, 0, 1, getdate() union all
    select '2022-06-01', 'xyz', 952.574, 1, 0, getdate() union all
    select '2022-05-01', 'xyz', 857.23, 1, 1, getdate() union all
    select '2022-05-01', 'xyz', 87.65, 1, 0, getdate() union all
    select '2022-05-01', 'xyz', 389.50, 1, 1, getdate() union all
    select '2022-04-01', 'xyz', 12.85, 1, 1, getdate() union all
    select '2022-04-01', 'mnop', 195.50, 1, 1, getdate() union all
    select '2022-04-01', 'mnop', 85.875, 0, 1, getdate() union all
    select '2022-06-01', 'mnop', 25.25, 1, 1, getdate() union all
    select '2022-05-01', 'mnop', 8947.38, 1, 1, getdate() union all
    select '2022-05-01', 'mnop', 895.37, 1, 0, getdate() union all
    select '2022-05-01', 'ftad', 85.75, 1, 1, getdate() union all
    select '2022-05-01', 'ftad', 90.20, 1, 0, getdate()


    select * from #standards
    select * from #fact

    From this example i want a result like this:

    select s.std_name, 
    (sum(case when f.is_needed = 1 and f.is_done = 1 then 1.0 else 0 end)/sum(case when f.is_needed = 1 then 1.0 else 0 end))*100
    from #fact f cross join #standards s
    where s.std_name = '% of abc'
    and f.rationale = 'abc'
    and f.cobdate >= s.std_start_dt
    group by s.std_name

    Here as you see, I am getting percentage as output for the std_name but it is for entire set of records which were created after the "std_start_dt" of the std. But I need the "average of %" for 4 months. So for std_name '% of abc', if I am running the report for '2022-05-01' then i want the avg((% of feb 2022), (% of mar 2022), (% of apr 2022), (% of may 2022)). I am not sure how to do it in one single query.

  • This sounds like it's supposed to be done in PowerBI, using DAX. Doesn't sound like a SQL question at all.

  • The following SQL solution should get you there

    DECLARE @ReportDate   date = '2022-06-05';
    DECLARE @ReportMonths int = 6; -- This value will be the number of months reported per std_name
    DECLARE @WindowMonths int = 4; -- This value will be the number of months in the rolling window per std_name

    DECLARE @StartMonth date = DATEADD(mm, DATEDIFF(mm, 0, @ReportDate) -@ReportMonths +1, 0);


    -- Get a list of Month Start Dates to work with
    ;WITH H2 (N) AS ( SELECT 1 FROM (VALUES
    (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
    (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
    ) AS V(N))
    , NUMS(N) AS (SELECT TOP(@ReportMonths) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM H2)

    -- Pre-aggregate the data for the calculations
    , cteAggData AS (
    SELECT s.std_name
    , DataMonth = DATEADD(mm, DATEDIFF(mm, 0, f.cobdate), 0)
    , NumNeeded = SUM(CAST(f.is_needed AS decimal(18,0)))
    , NumDone = SUM(CAST(f.is_done AS decimal(18,0)))
    FROM #fact AS f
    CROSS JOIN #standards AS s
    WHERE s.std_name = '% of ' + f.rationale
    AND f.cobdate >= s.std_start_dt
    AND f.cobdate < s.std_end_date
    GROUP BY s.std_name, DATEADD(mm, DATEDIFF(mm, 0, f.cobdate), 0)
    )

    -- Create a matrix of data to query the rolling windows
    , cteMatrix AS (
    SELECT src.std_name
    , ReportMonth = DATEADD(mm, x.N -1, @StartMonth)
    , NumNeeded = MAX(CASE WHEN src.DataMonth = DATEADD(mm, x.N -1, @StartMonth) THEN src.NumNeeded ELSE 0 END)
    , NumDone = MAX(CASE WHEN src.DataMonth = DATEADD(mm, x.N -1, @StartMonth) THEN src.NumDone ELSE 0 END)
    -- The next line depends on whether you want to show 0% or 100% if the NumNeeded = 0
    , percentDone = MAX(CASE WHEN src.DataMonth = DATEADD(mm, x.N -1, @StartMonth) THEN src.NumDone / CASE WHEN src.NumNeeded = 0 THEN 1 ELSE src.NumNeeded END * 100 ELSE 0 END) -- When 0/0 = 0%
    --, percentDone = MIN(CASE WHEN src.DataMonth = DATEADD(mm, x.N -1, @StartMonth) THEN src.percDone ELSE 100 END) -- When 0/0 = 100%
    FROM NUMS AS x
    CROSS JOIN cteAggData AS src
    GROUP BY src.std_name, x.N
    )
    SELECT m.std_name
    , m.ReportMonth
    , m.NumNeeded
    , m.NumDone
    , m.percentDone
    , rolPercAvg = SUM(m.percentDone) OVER (PARTITION BY m.std_name ORDER BY m.ReportMonth
    ROWS BETWEEN 3 PRECEDING AND CURRENT ROW -- NOTE!!! this value must be hard-coded to [ @WindowMonths -1 ]
    ) / @WindowMonths
    FROM cteMatrix AS m
    ORDER BY m.std_name, m.ReportMonth;

  • sqlenthu 89358 wrote:

    Jeff Moden wrote:

    sqlenthu 89358 wrote:

    Hi Steve, thanks but what you are providing here is calculating percentage. My ask is to get an average of 4 month percentage (or 5 or 6 month depending on requirement) in a way where % for each month is calculated first and then average is taken of the % values. like:

    Running in Nov month = (% value for Aug + % value for Sep + % value for Oct + % value for Nov)/4

    You missed the point that Steve made.  You posted the following...

    Select ((1/1)*100 + (2/3)*100 + (1/4)*100 + (1/3)/100)/4

    What percentage would you expect from the following, for example...

    SELECT (2/3)*100

    I'm thinking that you didn't expect it to return a big, fat goose-egg like it does.  And that's what Steve is talking about.

    Hi Jeff & Steve, Apologies for late response but I was out due to health issues. To answer your (& Steve's) question, the above sample which is:

    Select ((1/1)*100 + (2/3)*100 + (1/4)*100 + (1/3)/100)/4

    will give me the following average of percentage:

    (100.00 + 66.67 + 25.00 + 33.33)/4

    = 56.25

    I am giving a new example now:

    drop table if exists #standards
    drop table if exists #fact


    create table #standards
    (std_id int, std_name varchar(50), std_start_dt datetime, std_end_date datetime)

    insert into #standards
    select 1, '% of xyz', '2022-01-01', '9999-12-31' union all
    select 1, '% of abc', '2022-04-01', '9999-12-31' union all
    select 1, '% of mnop', '2022-05-01', '9999-12-31'

    create table #fact
    (cobdate datetime, rationale varchar(50), amount numeric(28,2), is_needed bit, is_done bit, createdate datetime)

    insert into #fact
    select '2022-01-01', 'abc', 20.580, 1, 1, getdate() union all
    select '2022-01-01', 'abc', 232.32, 0, 1, getdate() union all
    select '2022-03-01', 'abc', 23.532, 1, 1, getdate() union all
    select '2022-02-01', 'abc', 684.2323, 1, 0, getdate() union all
    select '2022-04-01', 'abc', 895.2343, 1, 0, getdate() union all
    select '2022-05-01', 'abc', 756.123, 1, 1, getdate() union all
    select '2022-05-01', 'abc', 47.50, 1, 0, getdate() union all
    select '2022-01-01', 'xyz', 85.8967, 1, 1, getdate() union all
    select '2022-02-01', 'xyz', 93.567, 1, 1, getdate() union all
    select '2022-02-01', 'xyz', 84.938, 1, 0, getdate() union all
    select '2022-01-01', 'xyz', 54.68, 1, 1, getdate() union all
    select '2022-03-01', 'xyz', 12.43, 1, 1, getdate() union all
    select '2022-06-01', 'xyz', 95.263, 0, 1, getdate() union all
    select '2022-06-01', 'xyz', 952.574, 1, 0, getdate() union all
    select '2022-05-01', 'xyz', 857.23, 1, 1, getdate() union all
    select '2022-05-01', 'xyz', 87.65, 1, 0, getdate() union all
    select '2022-05-01', 'xyz', 389.50, 1, 1, getdate() union all
    select '2022-04-01', 'xyz', 12.85, 1, 1, getdate() union all
    select '2022-04-01', 'mnop', 195.50, 1, 1, getdate() union all
    select '2022-04-01', 'mnop', 85.875, 0, 1, getdate() union all
    select '2022-06-01', 'mnop', 25.25, 1, 1, getdate() union all
    select '2022-05-01', 'mnop', 8947.38, 1, 1, getdate() union all
    select '2022-05-01', 'mnop', 895.37, 1, 0, getdate() union all
    select '2022-05-01', 'ftad', 85.75, 1, 1, getdate() union all
    select '2022-05-01', 'ftad', 90.20, 1, 0, getdate()


    select * from #standards
    select * from #fact

    From this example i want a result like this:

    select s.std_name, 
    (sum(case when f.is_needed = 1 and f.is_done = 1 then 1.0 else 0 end)/sum(case when f.is_needed = 1 then 1.0 else 0 end))*100
    from #fact f cross join #standards s
    where s.std_name = '% of abc'
    and f.rationale = 'abc'
    and f.cobdate >= s.std_start_dt
    group by s.std_name

    Here as you see, I am getting percentage as output for the std_name but it is for entire set of records which were created after the "std_start_dt" of the std. But I need the "average of %" for 4 months. So for std_name '% of abc', if I am running the report for '2022-05-01' then i want the avg((% of feb 2022), (% of mar 2022), (% of apr 2022), (% of may 2022)). I am not sure how to do it in one single query.

    I'm even more confused now...

    You say that for the report date of May 2022, that you want Feb, Mar, Apr, and May but the join to you #Standards table limits the data to only Apr 2022 and above, which totally ignores Feb and Mar.  You also have data for ABC prior to Apr 2022 but your "#standards" won't let us see Feb and Mar.

    Would you please explain the disparities between the actual data and the requirements that you've provided so far?

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If you want to build dynamic rolling dates, consider building and joining to a date dimension (as in a data warehouse).

    The date dimension would have an integer datekey and a granularity of one row per day.

    Then each date would have a month offset so that as of today (12th Jan 2023) anything in Jan 2023 would have a month offset of 0, Dec 2022 an offset of 1, Nov 22 an offset of 2 and so on. Think of this month offset as "months ago".

    You can do the same for day, week and year

    The table gets refreshed each night to reflect the change in dates

    You would then limit your data based on the SQL clause of:

    WHERE DateDimension.MonthOffset in (0,1,2,3)

    In terms of your earlier assertion, have you tried running the SQL?

    Select ((1/1)*100 + (2/3)*100 + (1/4)*100 + (1/3)/100)/4

    will actually give you the following average of percentage: 25 (let's forget the typo after (1/3) which should be a * not a / as it will still give you 25.

    Bizarrely:

    Select ((1.0/1.0)*100.0 + (2.0/3.0)*100.0 + (1.0/4.0)*100.0 + (1.0/3.0)*100.0)/4.0

    will actually return 56.2499750000

    SELECT (100.00 + 66.67 + 25.00 + 33.33)/4

    will give you the answer 56.25

     

  • This was removed by the editor as SPAM

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply