Finding the Percentage

  • Comments posted to this topic are about the item Finding the Percentage

  • A typo may cause some people to disregard one of the answers,

    "CASE one of the COUNT() statements to a decimal."

    Should have said

    CAST not CASE

  • Wouldn't just putting 100.0 at the front of the calculation be the more elegant and parsimonious way to go?

  • Personally I would cast everything in sight to a decimal - implicit conversions are a complete pain, explicit casting will guarantee the correct result.

  • Should the first COUNT not be a SUM?

    5ilverFox
    Consulting DBA / Developer
    South Africa

  • @5ilver Fox
    No, as that would count the total number of runs, rather than the number of batsmen that had hit the runs.

    @MorlinkD
    I THINK that would just multiply 0 by 100.0.

    Just me, but I'd definitely parenthesise the COUNT bits to make the desired precedence clearer, as well as what Toreador said.

  • I've been doing this in Dapper a lot recently - casting everything to decimal has been the way to go!

  • Japie Botma - Friday, October 19, 2018 2:10 AM

    Should the first COUNT not be a SUM?

    It would work with SUM but it works with COUNT because when hr < 40 the the CASE statement will return NULL and that won't be included in the count.

  • Toreador - Friday, October 19, 2018 1:56 AM

    Personally I would cast everything in sight to a decimal - implicit conversions are a complete pain, explicit casting will guarantee the correct result.

    same

    ---------------------------------------------------------------------------------------
    The more you know, the more you know that you dont know

  • Jonathan AC Roberts - Friday, October 19, 2018 6:12 AM

    Japie Botma - Friday, October 19, 2018 2:10 AM

    Should the first COUNT not be a SUM?

    It would work with SUM but it works with COUNT because when hr < 40 the the CASE statement will return NULL and that won't be included in the count.

    Jonathan is correct, as illustrated via the following queries, both of which return the same result:

    SELECT COUNT( CASE
           WHEN [schema_id] = 1 THEN     1
             -- ELSE NULL -- optional: more readable, but same calculation
          END
        ) / CONVERT(FLOAT, COUNT(*)) * 100
    FROM [master].sys.objects;
    -- 11.8181818181818

    SELECT SUM( CASE
           WHEN [schema_id] = 1 THEN     1
             -- ELSE 0 -- optional: more readable, but same calculation
          END
        ) / CONVERT(FLOAT, COUNT(*)) * 100
    FROM [master].sys.objects;
    -- 11.8181818181818

    You can uncomment the ELSEs  to see that the song calculation remains the same.

    Take care, Solomon..

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • morlindk - Friday, October 19, 2018 1:38 AM

    Wouldn't just putting 100.0 at the front of the calculation be the more elegant and parsimonious way to go?

    Correct, as demonstrated by the following query:

    SELECT 100.0 * COUNT( CASE
           WHEN [schema_id] = 1 THEN 1
           ELSE NULL -- optional: more readable, but same calculation
          END
        ) / COUNT(*)
    FROM [master].sys.objects;
    -- 11.818181818181

    However, it is only slightly more elegant because at this point you are really just playing chicken with operator precedence and order of operations. Moving the "100.0" to the front works due to rules that are not obvious to most. Relying upon default / implied ordering can very easily lead to changes in calculations if someone updates this later, making what might appear to be a superficial change, or a change to the calculation that is not expected to have the effect that it will.

    call.copse - Friday, October 19, 2018 2:59 AM

    @5ilver Fox
    No, as that would count the total number of runs, rather than the number of batsmen that had hit the runs.

    @MorlinkD
    I THINK that would just multiply 0 by 100.0.

    Just me, but I'd definitely parenthesise the COUNT bits to make the desired precedence clearer, as well as what Toreador said.

    For #1, as also mentioned by Jonathon, it would work due to the implied "ELSE NULL". Remember, only COUNT(*) counts NULLs. This is probably a reason to prefer using "SUM(CASE ... THEN 1 ELSE 0 END)" instead as it is more readable.
    For #2, that actually would work, as shown directly above in this post, and for the reason that you a recommending to always explicitly group operations with parenthesis :). So, I completely agree with your recommendation, and with Toreador's recommendation to CAST / CONVERT everything. Doing both of these things makes the calculation quite explicitly stated, more easily understood for someone new looking at it the first time, and less error-prone as it is not subject to implied default rules about operator and datatype precedence.

    For me, the ideal expression of this calculation is as follows (explicitly controlling both the order of operations via parenthesis, and the datatypes via converting everything; if this were a financial calculation, I would have converted to DECIMAL(x,y) instead):

    SELECT (CONVERT(FLOAT, SUM(
          CASE
           WHEN [schema_id] = 1 THEN 1
           ELSE 0 -- optional: more readable, but same calculation
          END
         )) / CONVERT(FLOAT, COUNT(*))
       ) * 100.0
    FROM [master].sys.objects;
    -- 11.8181818181818

    Take care, Solomon..

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Is anyone else more disturbed by the query allowing for the possibility of a divide by zero error than by the correct way to cast to a decimal?

  • lmalatesta - Friday, October 19, 2018 9:54 AM

    Is anyone else more disturbed by the query allowing for the possibility of a divide by zero error than by the correct way to cast to a decimal?

    No, I don't believe there is the possibility of a divide by zero error. In order for COUNT(*) to return 0 there would need to be 0 rows. And if there aren't any rows, then the SELECT is not processed and only an empty result set is returned. For example, the following does get a "divide by zero" error:

    SELECT 1 / 0;

    But the following query does not:

    SELECT 1 / 0 WHERE 1 = 0;

    Take care, Solomon..

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Solomon Rutzky - Friday, October 19, 2018 10:11 AM

    No, I don't believe there is the possibility of a divide by zero error. In order for COUNT(*) to return 0 there would need to be 0 rows. And if there aren't any rows, then the SELECT is not processed and only an empty result set is returned.

    The query as supplied does not have a where clause.
    SELECT COUNT( CASE WHEN hr >= 40 THEN  1 END  ) / COUNT(*)  * 100
    FROM dbo.Batting AS b;

    This query most certainly will return 0 rows if dbo.Batting has no rows.

  • lmalatesta - Friday, October 19, 2018 10:25 AM

    Solomon Rutzky - Friday, October 19, 2018 10:11 AM

    No, I don't believe there is the possibility of a divide by zero error. In order for COUNT(*) to return 0 there would need to be 0 rows. And if there aren't any rows, then the SELECT is not processed and only an empty result set is returned.

    The query as supplied does not have a where clause.
    SELECT COUNT( CASE WHEN hr >= 40 THEN  1 END  ) / COUNT(*)  * 100
    FROM dbo.Batting AS b;

    This query most certainly will return 0 rows if dbo.Batting has no rows.

    I see. I just tested and yes, you are correct, as shown in the two queries below, both of which get a "divide by zero" error:

    DECLARE @Batting TABLE (hr INT);
    SELECT COUNT( CASE WHEN hr >= 40 THEN 1 END ) / COUNT(*) * 100
    FROM @Batting AS b;

    DECLARE @test-2 TABLE (hr INT);
    SELECT 1 / 0 FROM @test-2;

    Still, for the purpose of coming up with a minimal example to express the question, I don't think it is all that disturbing. Certainly easy enough to fix with a simple HAVING:

    DECLARE @Batting TABLE (hr INT);
    SELECT COUNT( CASE WHEN hr >= 40 THEN 1 END ) / COUNT(*) * 100
    FROM @Batting AS b
    HAVING COUNT(*) > 0;

    Take care,
    Solomon...

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply