Cost Threshhold for Parallelism and MDOP - will these always have an effect?

  • Sorry for the newb question, but as I explore the Cost Threshhold for Parallelism and Max Degree of Parallelism configurations in SQL Server and talk with colleagues, I'm getting more confused.

    My understanding is that the SQL Server engine will recognize larger queries expected to take a lot of time/resources, and will automatically make use of parallelism, or processing different parts of the query through multiple processors.  Someone I'm speaking with seems to think it is pointless to set these values for certain applications that "don't support multithreading," but I would have thought any large query could be subjected to parallelism.  I would greatly appreciate any help in having some of this explained to me.

  • multi-threading is completely unrelated to how SQL server internally uses multiple cores.

    a multi threaded operation is how the applciation does it's thing, SQL server is a black box as far as the app is concerned.

    SQL server evaluates every query and assigns it a cost value; that number is used to determine whether the SQL engine would divide the work across multiple cores, and then assemble the data together and send it back.

    part of that logic is to use an appropriate number of threads. if i have 64 cores, i do not want every query trying to use all 64 cores, potentially, so a maxdop of 4 or 8 is usually what i set up, unless it's a data warehouse situation, where even more cores and heavier processing takes place.

    the default value of cost threshhold for parallelism is 5, a throwback to the SQL  2000 days;

    that means even relatively simple queries could use multiple threads; for simple queries, that means they could actually be a bit slower than if it was using only one thread, because of the divide+process+reassemble results

    most DBA's use a cost threshold for parallelism of at least 50 until they know a bit more about the serve.r

    my query below is what i use to analyze. it gets me means and medians and averages of the current workload in the cache; my idea is i want the 50% easiest queries to be single threaded, and anything above that 50% to use parallelism.

    on one of my servers, for example, i get a value of 69 for my average, so i would set that server to 69 or 70 instead of my typical value of 50.

    DECLARE @ServerLastRestarted DATETIME
    SELECT @ServerLastRestarted=create_date FROM sys.databases WHERE name = 'tempdb';
    IF OBJECT_ID('tempdb..[#SubtreeCost]') IS NOT NULL
    DROP TABLE [#SubtreeCost]


    DECLARE @Value VARCHAR(30);
    SELECT @Value = CONVERT(VARCHAR(50), [Value]) FROM sys.[configurations] WHERE name = 'cost threshold for parallelism';

    ;WITH XMLNAMESPACES
    (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')

    SELECT
    TRY_CAST(n.value('(@StatementSubTreeCost)[1]', 'VARCHAR(128)') AS DECIMAL(18,2)) AS StatementSubtreeCost
    INTO #SubtreeCost
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp
    CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS qn(n)
    WHERE n.query('.').exist('//RelOp[@PhysicalOp="Parallelism"]') = 1;

    --SELECT StatementSubtreeCost
    --FROM #SubtreeCost
    --ORDER BY 1;

    SELECT
    @ServerLastRestarted AS ServerLastRestarted,
    AVG(StatementSubtreeCost) AS AverageSubtreeCost,
    COUNT(*) AS StatementCount,@Value AS CurrentCostThreshold
    FROM #SubtreeCost;

    SELECT @ServerLastRestarted AS ServerLastRestarted,
    ((SELECT TOP 1 StatementSubtreeCost
    FROM
    (
    SELECT TOP 50 PERCENT StatementSubtreeCost
    FROM #SubtreeCost
    ORDER BY StatementSubtreeCost ASC
    ) AS A
    ORDER BY StatementSubtreeCost DESC
    )
    +
    (SELECT TOP 1 StatementSubtreeCost
    FROM
    (
    SELECT TOP 50 PERCENT StatementSubtreeCost
    FROM #SubtreeCost
    ORDER BY StatementSubtreeCost DESC
    ) AS A
    ORDER BY StatementSubtreeCost ASC))
    /2 AS MEDIAN,
    @Value AS CurrentCostThreshold;

    SELECT TOP 1 @ServerLastRestarted AS ServerLastRestarted,StatementSubtreeCost AS MODE,COUNT(*) AS StatementCount,@Value AS CurrentCostThreshold
    FROM #SubtreeCost
    GROUP BY StatementSubtreeCost
    ORDER BY COUNT(1) DESC;

     

     

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Hi Lowell.  Thank you for the very helpful response (and for that Cost Threshhold for Parallelism query)!

    I'd be curious to hear a bit more about your approach for choosing a suitable MAXDOP, especially when there are fewer cores involved, say 2-4.  The know the default is 0, indicating that all cores can be involved for a given query.

  • Well for MaxDop recommendations, I've been using this pair of scripts  which have  calculations that someone else came up with.

    this script takes into account numa nodes and microsofts recommendations, and the updated takes into account that in 2017 and above, you can have MaxDop at the database level now.

    http://dbamastery.com/performance-tuning/maxdop-calculator/

    https://github.com/dbamaster/DBA-Mastery/tree/master/MAXDOP%20Calculator

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thanks again Lowell.  I'll give those scripts a look!

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply