Indexing and Querying a Frequently Updated Table
Posted Tuesday, October 15, 2013 6:31 PM
Grasshopper


Group: General Forum Members
Last Login: Friday, July 25, 2014 5:08 AM
Points: 12, Visits: 69
Apologies in advance if this is posted under the wrong heading - it's really a general performance tuning question. I have a table in an Azure SQL Database that is used to log messages from a variety of applications. It is written to frequently, let's say an average of 10 inserts per second. I need to query this table hourly in order to notify users if certain messages have been logged and then less frequently in order to perform cleanup operations such as deleting messages older than a certain date.

So what's the best strategy for handling this? I don't want to slow writes down with a lot of indexing, but that leaves me running select queries against a large and poorly indexed table. Those select queries don't execute very quickly, so I worry about locking. Because of this, whenever I run select queries I use the WITH (NOLOCK) hint - it's not the end of the world if I get dirty reads, but it seems like there has to be a better way.

Should I leave the logging table optimized for writes and copy data out to another table that is well indexed for querying? If that is the solution, is there a best practice for implementing that copy operation so that it's quick and doesn't impact my writes?
Post #1505035
Posted Wednesday, October 16, 2013 6:53 AM


SSCrazy


Group: General Forum Members
Last Login: Yesterday @ 5:01 AM
Points: 2,840, Visits: 3,968
You can think about data partitioning here; in your scenario it will help. You can also take advantage of the sliding-window data archival approach to remove the unwanted historical data.
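
A minimal sketch of that setup, assuming a monthly sliding window on the log table's date column (all object names here are illustrative, and partitioning support in Azure SQL Database depends on the service version, so verify it is available first):

-- Hypothetical monthly partitioning setup for a sliding window.
-- RANGE RIGHT places each boundary date in the partition to its right,
-- so every partition holds exactly one month of rows.
CREATE PARTITION FUNCTION pf_LogByMonth (datetime2)
AS RANGE RIGHT FOR VALUES ('2013-08-01', '2013-09-01', '2013-10-01', '2013-11-01');

CREATE PARTITION SCHEME ps_LogByMonth
AS PARTITION pf_LogByMonth ALL TO ([PRIMARY]);

-- The log table is then created (or rebuilt) on the scheme, e.g.:
-- CREATE TABLE dbo.LogTable (...) ON ps_LogByMonth (DateStamp);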


-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done
Post #1505167
Posted Wednesday, October 16, 2013 6:56 AM


SSCrazy


Group: General Forum Members
Last Login: Yesterday @ 5:01 AM
Points: 2,840, Visits: 3,968
bcampbell-1136357 (10/15/2013)
I'm running select queries against a large and poorly indexed table.
And why are you living with poor indexes? That is what you need to work on first.


-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done
Post #1505168
Posted Wednesday, October 16, 2013 7:37 AM


SSChampion


Group: General Forum Members
Last Login: Today @ 12:40 PM
Points: 12,995, Visits: 12,412
bcampbell-1136357 (10/15/2013)
Because of this, whenever I run select queries I'm using the WITH (NOLOCK) hint - it's not the end of the world if I get dirty reads, but it seems like there has to be a better way.


The NOLOCK hint produces dirty reads, but the effect is potentially far worse than just seeing a couple of transactions that haven't yet been committed: you can (and will) get missing and/or duplicate rows. Given how frequently this table is written to, this is likely to occur very often.

http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/11/10/1280.aspx

http://blogs.msdn.com/b/davidlean/archive/2009/04/06/sql-server-nolock-hint-other-poor-ideas.aspx

http://www.jasonstrate.com/2012/06/the-side-effect-of-nolock/

Assuming you need to query this data frequently, I would recommend copying the data you need to query to another table that is properly indexed. The copy process has the potential to be slow because you do have to run a query against the raw data, but the advantage is that you can then run multiple queries against the new table; those should be fast and will have no impact on your high-frequency table.
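
As a sketch of what that query-side table might look like (the column names and index choices are assumptions, not the OP's actual schema):

-- Hypothetical reporting copy, indexed for the hourly notification query.
CREATE TABLE dbo.LogReport
(
    LogID      int            NOT NULL PRIMARY KEY,  -- matches the source log ID
    LogMessage nvarchar(4000) NOT NULL,
    DateStamp  datetime2      NOT NULL
);

-- Lets "messages logged in the last hour" queries seek on the date
-- instead of scanning the whole table.
CREATE NONCLUSTERED INDEX IX_LogReport_DateStamp
    ON dbo.LogReport (DateStamp)
    INCLUDE (LogMessage);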


_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1505197
Posted Wednesday, October 16, 2013 10:49 AM
Grasshopper


Group: General Forum Members
Last Login: Friday, July 25, 2014 5:08 AM
Points: 12, Visits: 69
Thanks for the suggestions. That is the direction I've been leaning. But that leads to the question: what is the best method for copying the data out to another table without impacting the performance of the logging table? If I run a query hourly that copies over new data, something like...

-- Copy only rows added since the last run, tracked by the highest LogID copied.
INSERT INTO dbo.NewTable (LogID, LogMessage, DateStamp)
SELECT LogID, LogMessage, DateStamp
FROM dbo.OldTable
WHERE LogID > @MaxPreviouslyInsertedID;

... without NOLOCK, won't that query negatively impact the writes on the logging table?
Post #1505337
Posted Wednesday, October 16, 2013 12:03 PM


SSCertifiable


Group: General Forum Members
Last Login: Yesterday @ 9:18 AM
Points: 6,829, Visits: 13,290
I'm not sure what you mean by "a lot of indexing"... The number of indexes you described is moderate, at least from my perspective...

Depending on the ratio of messages relevant for sending the hourly email, it might be worth trying a filtered index limited to the relevant messages and the columns required to send the email. To avoid frequent page splits, you could lower the fill factor for that index.
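
A sketch of such an index, assuming a hypothetical MessageType column marks the rows that trigger notifications:

-- Hypothetical filtered index covering only notification-relevant rows.
-- The reduced fill factor leaves free space in leaf pages, lowering the
-- chance of page splits on this heavily inserted table.
CREATE NONCLUSTERED INDEX IX_LogTable_Notifications
    ON dbo.LogTable (DateStamp)
    INCLUDE (LogID, LogMessage)
    WHERE MessageType = 'Error'
    WITH (FILLFACTOR = 80);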

Assuming the insert is performed using a stored procedure, another option would be a separate "notification table" holding all messages that require notification (again, depending on the ratio of relevant messages and the information that needs to get sent by email). After sending the email I'd probably delete the rows no longer needed to keep this table small.
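
Roughly along these lines (a sketch only; the procedure and table names, and the rule for what counts as notification-relevant, are all assumptions):

-- Hypothetical insert procedure that also feeds a narrow notification table.
CREATE PROCEDURE dbo.LogMessageInsert
    @LogMessage  nvarchar(4000),
    @MessageType varchar(20)
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.LogTable (LogMessage, MessageType, DateStamp)
    VALUES (@LogMessage, @MessageType, SYSUTCDATETIME());

    -- Only messages that require a notification are duplicated here;
    -- the hourly job reads this small table and deletes what it has sent.
    -- Assumes LogTable.LogID is an IDENTITY column.
    IF @MessageType = 'Error'
        INSERT INTO dbo.LogNotification (LogID, LogMessage, DateStamp)
        VALUES (SCOPE_IDENTITY(), @LogMessage, SYSUTCDATETIME());
END;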

Regarding the delete issue: depending on the number of rows to deal with (in total and to be deleted), it might be worth looking into horizontal partitioning. You could then "just" switch the partition out of the original log table instead of performing the delete operation.
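
The cleanup then becomes a metadata-only operation, something like this (a sketch, assuming a monthly partition function like the one sketched earlier in the thread and an empty archive table with an identical structure):

-- Hypothetical sliding-window cleanup: switch the oldest partition out,
-- then retire its boundary and add next month's.
ALTER TABLE dbo.LogTable
    SWITCH PARTITION 1 TO dbo.LogTable_Archive;

ALTER PARTITION FUNCTION pf_LogByMonth() MERGE RANGE ('2013-08-01');

ALTER PARTITION SCHEME ps_LogByMonth NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pf_LogByMonth() SPLIT RANGE ('2013-12-01');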




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #1505356
Posted Thursday, October 17, 2013 2:47 AM


SSCrazy


Group: General Forum Members
Last Login: Yesterday @ 5:01 AM
Points: 2,840, Visits: 3,968
bcampbell-1136357 (10/16/2013)
then leads to the question of what is the best method for copying out the data to another table without impacting the performance of the logging table?
Use a batch approach: http://www.sqlservercentral.com/articles/Top/63301/
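
In that spirit, a minimal sketch of a batched cleanup (the batch size and retention cutoff are arbitrary):

-- Hypothetical batched delete: each iteration is its own short transaction,
-- so writers are blocked only briefly, if at all.
DECLARE @BatchSize int = 5000;
DECLARE @Cutoff datetime2 = DATEADD(DAY, -90, SYSUTCDATETIME());

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.LogTable
    WHERE DateStamp < @Cutoff;

    IF @@ROWCOUNT < @BatchSize BREAK;
END;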


-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done
Post #1505536
Posted Thursday, October 17, 2013 12:31 PM


SSC Journeyman


Group: General Forum Members
Last Login: Monday, September 22, 2014 11:08 AM
Points: 80, Visits: 260
I think Bhuvnesh hit the mark. Optimize the indexes on the table if they are poor.

We have some logging tables, and the development team's initial fear was to not create any indexes because they would impact performance! After actually testing, we found that a couple of narrow indexes hurt write performance far less than their absence hurt read performance.

Get Red Gate Data Generator and a dev box, and populate that table both with and without the indexes. Confirm the fears before you act on them.
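
A crude harness for that comparison might look like this (a sketch; the table and row counts are placeholders, and a data generator gives more realistic distributions):

-- Hypothetical timing test: run once without the candidate indexes and once
-- with them in place, then compare the elapsed times.
DECLARE @i int = 0;
DECLARE @start datetime2 = SYSUTCDATETIME();

WHILE @i < 100000
BEGIN
    INSERT INTO dbo.LogTableTest (LogMessage, DateStamp)
    VALUES (N'sample log message', SYSUTCDATETIME());
    SET @i += 1;
END;

SELECT DATEDIFF(MILLISECOND, @start, SYSUTCDATETIME()) AS ElapsedMs;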


Why is it that people who can't take advice always insist on giving it? - James Bond, Casino Royale
Post #1505864
Posted Thursday, October 17, 2013 2:57 PM
Grasshopper


Group: General Forum Members
Last Login: Friday, July 25, 2014 5:08 AM
Points: 12, Visits: 69
Lots of food for thought here, thanks. The table has existed for a while, so I certainly have data to test with - it has about 3 million rows and is growing as we add more users to our application. I have tried carefully adding several indexes to the table to help with reads, but my main concern is making sure that the writes don't suffer. And there are occasional ad hoc queries that I have to run against the table to diagnose problems, which aren't always covered by an index. Anyway, I've got some options to try out. I particularly like the idea of batching the cleanup operations, which should help with availability.
Post #1505918