SQLServerCentral.com Best Practices Clinic: Part 5

  • SET Twisted mind on

    I've had another inspiration about the execution plan. I think this was deliberately left out, simply because some of the greatest DBAs have all had a whack at it. Let's assume at least Brad and Steve, if not 5-6 more of the same caliber. My best guess at this point is that this info is considered completely useless, or they simply want us to consider other options because that path has been beaten to death already. Missing indexes are fine, but there's no way in h*ll they haven't considered that before. So while this seems like a viable option, I'm pretty sure they expect something completely different.

    With that in mind: do you guys really need to track those clicks in real time? Are you open to creating a logging table with an identity bigint and article_key, just inserting into that table, and applying a single update to the base table every 15 minutes? Or, god forbid, track that in ASPX session variables and flush to the db once a minute (or whatever makes sense for your needs)?
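
    Just to make the idea concrete, here is a minimal sketch of that logging-table approach. The object names (ClickLog, FlushClickLog) and the flush details are my own assumptions, not the actual schema, and it glosses over AnalysisLog's per-date rows and the first-click INSERT.

    -- Hypothetical narrow, append-only click log: one row per page view.
    CREATE TABLE dbo.ClickLog
    (
        ClickID    bigint      IDENTITY(1,1) NOT NULL PRIMARY KEY,
        ArticleKey varchar(50) NOT NULL,   -- same value AnalysisLog stores in [Data]
        ClickTime  datetime    NOT NULL DEFAULT (GETDATE())
    );
    GO

    -- Run from a SQL Agent job every 15 minutes: one aggregated update
    -- instead of one update per click.
    CREATE PROCEDURE dbo.FlushClickLog
    AS
    BEGIN
        SET NOCOUNT ON;
        BEGIN TRAN;

        UPDATE al
        SET    al.[Count] = al.[Count] + c.Clicks
        FROM   dbo.AnalysisLog AS al
        JOIN  (SELECT ArticleKey, COUNT(*) AS Clicks
               FROM   dbo.ClickLog
               GROUP  BY ArticleKey) AS c
               ON c.ArticleKey = al.[Data];

        DELETE dbo.ClickLog;   -- simplistic: a real version would only delete the rows it aggregated

        COMMIT;
    END;
    GO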

    AnalysisLog.Data is varchar(50). That's a hell of a long key for something that gets this much activity. Any openness to changing that key to an identity int (4 billion articles should get you a long way)? I have no idea of the real cost of doing this, as I don't know how much code change it would entail, nor the actual cost of adding FKs to point to it and possibly new indexes to support that operation.
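
    Something along these lines is what I have in mind; it's only a minimal sketch, and the names (ArticleLookup, ArticleID) are mine, not the real schema. The idea is to store the long varchar(50) value exactly once in a lookup table and key everything else off a 4-byte int.

    -- Hypothetical lookup table: the varchar(50) value is stored exactly once.
    CREATE TABLE dbo.ArticleLookup
    (
        ArticleID int         IDENTITY(1,1) NOT NULL PRIMARY KEY,
        Data      varchar(50) NOT NULL UNIQUE
    );
    GO

    -- AnalysisLog (and anything else keyed on Data) would then carry the narrow
    -- int key with an FK back to the lookup table, roughly:
    -- ALTER TABLE dbo.AnalysisLog ADD ArticleID int NULL
    --     CONSTRAINT FK_AnalysisLog_ArticleLookup REFERENCES dbo.ArticleLookup (ArticleID);
    -- (backfill ArticleID from Data, then make it NOT NULL and retire the old column)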

    Same thing for PostName nvarchar(256).

    Are you guys using compression, and if so, page or row? I checked in SM but I couldn't find the SQL version info anywhere (sorry if I missed it, and also if it's not available there). The standard trade-off is more CPU for less disk activity. You seem to have a lot of headroom on the CPUs, so maybe that's a nice hit to take there to save on disks / RAM (even if the cache hit ratio is 100%). All in all maybe a 0 to 5% gain, but still something worth trying.
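
    For what it's worth, here's the kind of thing I'd run first to size it up (assuming Enterprise Edition, since data compression requires it on SQL Server 2008). AnalysisLog is just my pick for the example; the same applies to any hot table.

    -- Estimate what PAGE compression would buy before committing to it.
    EXEC sp_estimate_data_compression_savings
         @schema_name      = 'dbo',
         @object_name      = 'AnalysisLog',
         @index_id         = NULL,   -- all indexes
         @partition_number = NULL,   -- all partitions
         @data_compression = 'PAGE';

    -- If the estimate looks good, rebuild with compression (can also be done per index).
    ALTER TABLE dbo.AnalysisLog REBUILD WITH (DATA_COMPRESSION = PAGE);
    ALTER INDEX ALL ON dbo.AnalysisLog REBUILD WITH (DATA_COMPRESSION = PAGE);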

    Do you have a job that runs every hour around H:20-23? You get a 5X spike in page splits around that time. I would guess it's something like the top posters and other leaderboards being updated, but I can't tell for sure. Log shipping? Replication?

    Do you do anything special around 11:00 to 17:00 on the SQLServerCentral db? It gets a hell of a spike in log flushes and transactions... or is that just your normal business hours?
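
    In case it helps anyone reproduce what I'm seeing, here's a quick look at the two counters straight from the DMVs. The values are cumulative since the instance started, so you'd sample twice and diff them (or just watch the rates in PerfMon / SQL Monitor).

    -- Page Splits/sec (Access Methods) and Log Flushes/sec (per database).
    SELECT object_name,
           counter_name,
           instance_name,
           cntr_value          -- cumulative; diff two samples to get a rate
    FROM   sys.dm_os_performance_counters
    WHERE  (counter_name = 'Page Splits/sec' AND object_name LIKE '%Access Methods%')
       OR  (counter_name = 'Log Flushes/sec' AND object_name LIKE '%Databases%');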

    Any way we can get a trace for replay so we can prove our ideas make a real impact, or are we supposed to guess all the way to the end of the contest?

  • bradmcgehee@hotmail.com (8/2/2011)


    Ninja's_RGR'us (8/2/2011)


    If we don't want to "win" SM, can we pick another tool?

    Don't know this answer, but I will ask.

    P.S. I'd like a tool-belt upgrade. If that's too much, I can accept just a SQL Prompt upgrade. Support is nice, but considering all the new suggestions I make for the products I use, I think you should at least take that out of the "value" of the gift...

    Not to mention all the times I recommend your stuff on these forums. Must have been 100+ times over the last few months :Whistling:

  • --Windows 2008 (64-bit)

    --SQL Server 2008 (64-bit)

    Which editions? And which SP, for that matter?

  • Re MERGE:

    My first thought was: nice idea. But then I realized that you do all that extra work for an event that happens in less than 0.01% of the lifetime of the article.

    Same thing for ISNULL(). I didn't have time to test and prove this, but I think separating those out is a better path to take.

    Also, the three-part UPDATE exists (or OUTPUT), so there's no need for an UPDATE + SELECT there. I know all article pages display "viewed n times" in the title, so maybe you can remove an extra SELECT or JOIN (excluded from the top 4) by doing something like this >

    UPDATE dbo.AnalysisLog SET @Count = Count = Count + 1 WHERE ...

    maybe add the "missing" index... depends on the datatype option.
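
    To spell out what I mean by the three-part UPDATE (and the OUTPUT alternative): both let you grab the new count in the same statement that increments it, so the page can show "viewed n times" without a second SELECT. The variable values and WHERE clause below are placeholders, not the real code.

    -- Placeholder values; the real @RoundedTime / @data come from the application.
    DECLARE @Count       int,
            @RoundedTime datetime    = '20110802',
            @data        varchar(50) = 'some-article-key';

    -- Three-part UPDATE: @Count ends up holding the post-increment value.
    UPDATE dbo.AnalysisLog
    SET    @Count = [Count] = [Count] + 1
    WHERE  [Time] = @RoundedTime
      AND  [Data] = @data;

    -- OUTPUT alternative: capture the new value without a separate SELECT.
    DECLARE @NewCount TABLE ([Count] int);

    UPDATE dbo.AnalysisLog
    SET    [Count] = [Count] + 1
    OUTPUT inserted.[Count] INTO @NewCount ([Count])
    WHERE  [Time] = @RoundedTime
      AND  [Data] = @data;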

  • Ninja's_RGR'us (8/2/2011)


    UPDATE dbo.AnalysisLog SET @Count = Count = Count + 1 WHERE ...

    Nevermind that part, I had missed the fact that there's 1 row per date where there's action on the page.

  • Contest Entry:

    For query 1

    I would put an index on Time and Data for the analysis log table.
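
    Something like this, though the index name is just a placeholder of mine:

    CREATE NONCLUSTERED INDEX IX_AnalysisLog_Time_Data
        ON dbo.AnalysisLog ([Time], [Data]);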

    For query 2

    The query 1 item would take care of this query as well.

    For queries 1 and 2, the interesting thing would be to cache the day's counts in server memory or another temporary table, then run a nightly or hourly job to write it all down to the database or copy it to the analysis log. One other architectural change I would make is to not use a varchar(50) if possible: can these data events be keyed by an int or bigint? That would far improve the indexing. Maybe provide a lookup table to find the description.

    For query 3, remove the ORDER BY if possible; the sort might be better done when selecting out of the temp table later if needed, so add a PostDate column there. Alternately, add an index on cs_Posts on ThreadID, IsApproved, PostDate, PostID, PostLevel, ApplicationPostType (though this is wide).
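
    The wide-index option would look roughly like this; the index name is mine, and the split between key and included columns is a guess that would need testing against the actual plan:

    CREATE NONCLUSTERED INDEX IX_cs_Posts_Thread_Approved_PostDate
        ON dbo.cs_Posts (ThreadID, IsApproved, PostDate)
        INCLUDE (PostID, PostLevel, ApplicationPostType);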

    Query 4:

    It looks like an index is needed on cs_weblog_Weblogs on SectionID, but selectivity might just be poor. I would table-alias ApplicationType in the WHERE clause. Also, is W.* really needed? Expand the W.* columns out. And rewrite the comma join in ANSI join style, just as a best practice.

  • Thanks to everyone who has contributed to the contest so far. The goal of the contest was to try an interactive article format with reader feedback on "SQLServerCentral.com Best Practices Clinic: Part 5", and to have a little fun trying to solve a puzzle of sorts with the information provided. I feel that there is enough information in the article to provide a "good enough" answer. Sure, with more information you might be able to provide a "better" answer, but the puzzle was designed this way so as not to place too much demand on our readers' time. Please keep the answers coming!

    Thanks!

    Brad M. McGehee
    DBA

  • Ninja's_RGR'us (8/3/2011)


    Ninja's_RGR'us (8/2/2011)


    UPDATE dbo.AnalysisLog SET @Count = Count = Count + 1 WHERE ...

    Nevermind that part, I had missed the fact that there's 1 row per date where there's action on the page.

    We both did. I was led to believe it was a single row, first generated when the first click on the page happened, primarily because of this quote from the article:

    When a web page is accessed, Query #2 runs first, looking to see if that particular article has ever been viewed before or not. If not, then another query (which is not in the top 10 list) is run to create a new record where clicks are tracked in a table;

    Brad, can you clarify please?


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Craig Farrell (8/4/2011)


    Ninja's_RGR'us (8/3/2011)


    Ninja's_RGR'us (8/2/2011)


    UPDATE dbo.AnalysisLog SET @Count = Count = Count + 1 WHERE ...

    Nevermind that part, I had missed the fact that there's 1 row per date where there's action on the page.

    We both did. I was led to believe it was a single row, first generated when the first click on the page happened, primarily because of this quote from the article:

    When a web page is accessed, Query #2 runs first, looking to see if that particular article has ever been viewed before or not. If not, then another query (which is not in the top 10 list) is run to create a new record where clicks are tracked in a table;

    Brad, can you clarify please?

    Based on a Profiler trace I took, here is what appears to happen. Query 1 runs first, and if it doesn't find a record for the page that is viewed, then an INSERT is performed into the table so that its views can be tracked. This INSERT query is not in the top 10, so I did not list it. If Query 1 does find a record, then Query 2 runs, updating the record for the page that is viewed. The developer of this code was a contractor and did not document exactly what happens or why, so I have had to infer what happens based on what I see in the trace.

    Brad M. McGehee
    DBA

  • Contest Entry

    --------------------------------------------------------------

    Recommendation For Query #1 and Query #2

    --------------------------------------------------------------

    Understanding: It seems that the AnalysisLog table is used to keep a count of clicks on each article (Data), per date.

    Logical change recommended: first execute the UPDATE query to increase the count (Query #1).

    If no row is affected by the UPDATE, then execute the INSERT query to create the first record. This eliminates the execution of Query #2.

    Solution:

    UPDATE dbo.AnalysisLog SET [Count] = [Count] + 1 WHERE [Time] = @RoundedTime AND [Data] = @data

    IF @@ROWCOUNT = 0

    **Insert query**
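
    For completeness, the full pattern might look like the sketch below. The INSERT column list ([Time], [Data], [Count]) is an assumption based on the columns the top queries reference; the real insert (which is not in the top 10) may set more columns.

    UPDATE dbo.AnalysisLog
    SET    [Count] = [Count] + 1
    WHERE  [Time] = @RoundedTime
      AND  [Data] = @data;

    IF @@ROWCOUNT = 0
    BEGIN
        -- First click for this article on this date: create the row.
        INSERT INTO dbo.AnalysisLog ([Time], [Data], [Count])
        VALUES (@RoundedTime, @data, 1);
    END;

    Under heavy concurrency two sessions could both see zero rows affected, so the INSERT may still need to tolerate a duplicate-key error or take an appropriate lock.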

    --------------------------------------------------------------

    Recommendation For Query #3

    --------------------------------------------------------------

    Recommendation #1: use NOLOCK with the first SELECT query, which is used to get ThreadID and PostID.

    Recommendation #2: create an index on [cs_Posts].[PostDate]; this saves sorting time, because in the execution plan of the INSERT INTO #PageIndex query the sort cost is 36%.

    Solution:

    CREATE NONCLUSTERED INDEX [IX_cs_Posts_PostDate] ON [dbo].[cs_Posts] ([PostDate])
    GO

    SELECT @ThreadID = p.ThreadID, @PostID = p.PostID
    FROM   cs_Posts p WITH (NOLOCK)
    WHERE  p.SectionID = @SectionID
      AND  p.PostName = @PostName
    GO

    --------------------------------------------------------------

    Recommendation For Query #4

    --------------------------------------------------------------

    1. We can use an inner join between cs_Sections F and cs_weblog_Weblogs W on F.SectionID = W.SectionID.

    2. We can use NOLOCK with both tables.

    3. Create nonclustered indexes on F.SectionID and W.SectionID.

    Solution:

    CREATE NONCLUSTERED INDEX [IX_cs_Sections_SectionID] ON [dbo].[cs_Sections] ([SectionID])
    GO

    CREATE NONCLUSTERED INDEX [IX_cs_weblog_Weblogs_SectionID] ON [dbo].[cs_weblog_Weblogs] ([SectionID])
    GO

    SELECT
        F.[SectionID], F.[SettingsID], F.[IsActive], F.[ParentID], F.[GroupID], F.[Name],
        F.[NewsgroupName], F.[Description], F.[DateCreated], F.[Url], F.[IsModerated],
        F.[DaysToView], F.[SortOrder], F.[TotalPosts], F.[TotalThreads], F.[DisplayMask],
        F.[EnablePostStatistics], F.[EnablePostPoints], F.[EnableAutoDelete],
        F.[EnableAnonymousPosting], F.[AutoDeleteThreshold], F.[MostRecentThreadID],
        F.[MostRecentThreadReplies], F.[PostsToModerate], F.[ForumType], F.[IsSearchable],
        F.[ApplicationType], F.[ApplicationKey], F.PropertyNames AS SectionPropertyNames,
        F.PropertyValues AS SectionPropertyValues, F.[DefaultLanguage], F.[DiskUsage], W.*,
        NULL AS LastUserActivity
    FROM cs_Sections F WITH (NOLOCK)
    INNER JOIN cs_weblog_Weblogs W WITH (NOLOCK) ON F.SectionID = W.SectionID
    WHERE F.SettingsID = @SettingsID
      AND ApplicationType = 1
    GO

    --------------------------------------------------------------

    Pradeep Chaurasia

    cpradeep@pisoftek.com

    P. I. Softek Ltd

    Sec-62, Noida, UP


  • Sorry for the delay in announcing the winner of this contest, but I was out traveling last week. After reviewing the entries and talking about them with several other DBAs, we selected WilliamD's response as the winning entry. We all thought his answer was the most complete. Thanks to everyone who participated in this contest.

    Brad M. McGehee
    DBA
