Database Management for SharePoint 2010

With each revision, SharePoint becomes more a SQL Server Database application, with everything that implies for planning and deployment. There are advantages to this: SharePoint can make use of mirroring, data-compression and remote BLOB storage. It can employ advanced tools such as data file compression, and object-level restore. DBAs can employ familiar techniques to speed SharePoint applications. Bert explains the way that SharePoint and SQL Server interact.

Introduction

If you’re an IT Pro or dedicated DBA, and your company will soon be rolling out SharePoint within the organization, then this article is for you. If you’re a SharePoint consultant and you need to talk to the DBAs in your company, then hopefully this will help. I’m going to walk you through a  core overview of SharePoint 2010’s relationship with its databases, and what you should know before you start working with either.

As you probably know, SharePoint cannot do without databases. SharePoint’s relation to databases is like tracks to a train, like a reader to a smartcard… you get the idea. Specifically, SharePoint stores all of its information, both configuration data as well as content, in SQL databases. As much as this was true for Microsoft Office SharePoint Server 2007 (or MOSS, as some like to call it), it’s even more the case for SharePoint Server 2010 (SP2010).

In this article, to start with, we’ll touch upon how SharePoint’s relationship has changed in the its most recent incarnation, and then we’ll look at what’s stayed the same, and what new factors you’ll need to bear in mind. So, map in hand, let’s dive in.

Where There were Few, Now There are Many

You see, in MOSS, the number of databases in a typical farm would range from five to seven, depending on the way the farm was setup. That number will increase dramatically when you move to SharePoint 2010. Just for reference purposes, the table below lists the databases in a typical MOSS deployment:

Database

Purpose

Remarks

SharePoint_Config

Holds farm configuration data

One database per farm

SharedServiceProvider_Config

Holds configuration data for the Shared Service Provider, such as search configuration, My Sites configuration, Excel Services

One database per Shared Service Provider (SSP)

SharedServiceProvider_Search

Holds Search-related data, such as document metadata and NTFS permissions

One database per SSP

SharedServiceProvider_Content

Holds the content of the SSP management pages

One database per SSP

MySites_Content

Holds the My Site host web application, and a site collection for each My Site

At least one per SSP

<SiteName>_Content

Holds the content for a web application and its site collections

At least one, probably more, per web application.

Table 1: Databases in Office SharePoint Server 2007

Note: Companies often store site collections in their own database, which increases the number of databases significantly.

With the introduction of Service Applications as a replacement for the Shared Service Provider in SP2010, the number of databases has increased dramatically to at least 11 for a typical farm (i.e. without any fancy stuff, such as Excel Services, PerformancePoint Services or the Office Web Applications). To give you a sense of what that means, just Search, User Profiles, and Managed Metadata are configured in the example below:

Note: In the table below, I chose the database names myself, based on my own naming convention. We’ll get to naming conventions later but, for the sake of clarity, when SharePoint creates the databases through the Configuration Wizard, it will append GUIDs to the databases names, like this: User Profile Service Application_ProfileDB_bb44087df8ed4c57a19a0d4c3305b6ff. In my convention and the table below, the content databases use the _Content suffix, and the databases that are associated with a Service application use the _SA suffix.

Database

Purpose

Remarks

Service Application related databases

SharePoint_Config

Holds farm configuration data

One database per farm

Search_Crawler_SA

Holds crawl information, such as URLs and paths to crawl

Search_Admin_SA

Maintains Search Administration data

Search_Index_SA

Holds metadata and NTFS permissions for crawled content

ManagedMetadata_SA

Holds Managed Metadata configuration and content

UserProfiles_SA

Holds the user profile data, which is imported from Active Directory or another source

UserProfiles_Social_SA

Holds all social tagging information

 

UserProfiles_Staging_SA

Contains data that is to be synchronized to Active Directory or another source

Content related databases for web applications

MetadataHub_Content

Holds a site collection for publishing site columns and content types.

Required for the Managed Metadata service, and recommended to be a dedicated web application

My_Site_Content

Holds the MySite Host site collection, and a site collection for each user’s MySite.

Portal_Content

Holds content that is associated with the Corporate Portal

Table 2: SQL Databases in SharePoint Server 2010

Some Things Haven’t Changed: Planning and Deployment

While the number of databases has changed, what didn’t change is how you should manage those databases from initial creation to configuration and administration. The principles that were true for MOSS still apply to SP2010, and that’s generally a good thing for DBAs and SharePoint folks.

Planning your deployment is ever-so-important in SharePoint 2010, especially for a controlled and error-free roll out in either production or a test environments. Having scripts available for a fully automated installation will, of course, greatly reduce the time involved for IT-Pros, especially when you have an DTAP environment. How to write these scripts and what they should include is enough material for another article, so let’s not dip into that right now.

Database Servers

As you might expect, SharePoint will only (and I do mean exclusively) accept Microsoft SQL Server as its database server, keeping it all in the family. Depending on your version of SharePoint, you do have a choice as far as versions and builds are concerned, but keep in mind that SharePoint is a 64-bit application from front to back, and so SQL Server (together with your OS) must also be 64-bit. Although you could get a farm up-and-running with a 32-bit SQL Server, it’s just not supported.

SharePoint Version

SQL Server Version (64-bit)

SharePoint Server 2010

SQL Server 2008 R2

SQL Server 2008 SP2

SQL Server 2005 SP3

Office SharePoint Server 2007

Office SharePoint Server 2007 SP1

Office SharePoint Server 2007 SP2

SQL Server 2000 SP4

SQL Server 2005 SP1

SQL Server 2008

SQL Server 2008 R2

SharePoint Portal Server 2003

SharePoint Portal Server 2003 SP2

SQL Server 2000

SQL Server 2005

Table 3: SharePoint and SQL versions

For specific information regarding hardware and software requirements for SQL Server, read this article on Hardware and software requirements (SharePoint Server 2010).

Creating databases

Let’s start with creating the databases for SharePoint, for which there are two methods: you’ve guessed it – the default way, and the way that DBAs and SharePoint professionals prefer. We’re going to take a quick look at these two methods, and touch upon some points that the DBAs amongst us will want to know.

The default way to create a new database is to have SharePoint do it for you automatically during the creation of a web application, via the Central Administration console. In most cases you’ll be asked for a database name, and the database will be created on whichever server is designated as the default database server.

The problem with this is that SharePoint will happily ignore whatever the DBA has configured in the model database, and the databases will be created with whatever settings are configured as the server default. So, if you let SharePoint create your databases for you, be prepared to go in afterwards and modify settings.

In case you don’t know:

SQL uses the model database as a template for new databases, and this model is configured with specific configuration settings, such as file location, growth settings and more. Whenever a database is created from within SQL Management Studio, those settings are also applied to the new database.

The preferred way for creating databases is to have your friendly local DBA pre-create them for you, so that you can just reference the database names when you configure your farm, or create service or web applications. If they want to do this using PowerShell scripts, that is of course entirely up to them, and that is also the method which I would highly recommend. Either way, the goal is to have the databases available ahead of time, complete with all the settings required by SharePoint.

On PowerShell scripting:

The TechNet Script Center is a good starting point for finding and creating scripts to create your databases. You might like to start with this article on how to Create a SQL Database (using VBScript). Another script, using PowerShell and DMO (SQL Database Management Objects) can be found in this blog post by Phil Factor.

Things to Bear in Mind

To be fair, with regards to database creation, SharePoint only really cares about the Collation, which should be LATIN1_General_CI_AS_KS_WS. Growth settings, file locations and the like are configured at your own discretion. Naturally, you should figure them out when planning your deployment, taking into account the fact that content databases will grow larger (and more quickly) than configuration databases, and that databases for Search can experience some heavy hits in large environments.

You should also take into account the fact that Web applications are not restricted to a single content database, which means that you can spread site collections within that web application over multiple content databases, keeping them manageable from a backup point of view.

Security

SharePoint has this concept of a Farm Administrator account, which is responsible for everything that goes on in the farm, and therefore needs access to everything in the farm; I usually have an account named DOMAIN\SPFarm created for this role. During setup and configuration, this user account will be used to create new databases, or attach pre-created databases to web applications and service applications.

For manageability purposes, when a new content database is created for a web application that runs under the identity of another user account (e.g. DOMAIN\SPAppPool), that user account will be granted the appropriate permissions by the Farm Administrator. Therefore, to make that possible, this Farm Administrator account requires dbcreator and securityadmin privileges on the SQL box.

Naming Convention

Having a naming scheme that allowed for quick identification of a databases purpose and associated application already made sense in SharePoint 2007 and, with the vast increase in the number of databases in 2010, that argument now makes even more sense.

One rather annoying, albeit very useful habit of SharePoint is the use of GUIDS, which are all over the place. You’ll find them in logs, for instance, and ‘under the hood’, where they’re called CorrelationIDs. This latter implementation actually make sense, because a GUID allows you to track things rather well, and the CorrelationID is associated with a particular event, which can then be very easily traced in the logs.

However, when GUIDs start appearing in database names, that’s another story, and that’s also when it becomes frustrating. Unfortunately, in the situations where SharePoint creates databases for you (that is, without you specifying the exact database name), it will do exactly that.

The only way to avoid this is to use PowerShell for creating the different components that have associated databases, and provide the exact names you want to use, according to your preferred naming convention. I like to use prefix of SP2010_, followed by the name of the service or web application, and whether it is a content or service oriented database (as seen in Table 2). It should therefore be obvious what these two databases represent:

SP2010_UserProfileTagging_Service

The social tagging information of the UserProfile Service Application

SP2010_Portal_Content

The content that is in the Portal Web Application

Failover

SharePoint 2010 has the ability to take advantage of SQL’s database mirroring feature, which allows for automatic failover to another database server instance if the currently-active one should fail. For practically all databases, you can enter the name of a failover instance which SharePoint will revert to after a small timeout.

As you might expect, this requires teamwork and cooperation between SQL and SharePoint folks, so be prepared to discuss this between the teams.

Remote Blob Storage

Imagine a company that deals with large files, and I mean really, really, large files. Think images, video files, sound clips, etc. Files that require the information management capabilities of SharePoint in terms of versioning, workflows, or perhaps just search. These files could potentially take up so much space in the database that it would significantly (and negatively) influence backup schemes, and possibly impact on system performance.

To avoid these problems, SQL Server 2008 provides Remote Blob Storage or RBS, which moves (streams) Binary Large Objects (BLOB) from the database to the file system. By doing so, it decreases database size while still maintaining control over the files as if they were in the database, which means that SharePoint won’t see the difference when accessing these files.

My recommendation is that you only use RBS if you’ll need it, not just because you can. Make sure there’s a business case for it, because it requires specific expertise on the part of your DBAs.

Read more about Remote Blob Storage here in this overview of Remote BLOB Storage (SharePoint Server 2010)

Database Maintenance

In addition to the fact that SharePoint will generate a lot of databases, the size of all these new databases can grow at really high speed, depending on how popular your deployment is. Trying to regain some of the disk space that a database uses might be tempting, but be careful. As with all of your high-use systems, plan database shrink activities during normal maintenance windows, rather than implementing auto shrink through maintenance plans on a database.

Shrinking a database is possibly the most resource-intensive operation that SQL can perform, and you certainly don’t want to do that during office hours. Plus, it would only be beneficial to you when your database has lost a lot of its content, e.g. after moving a site collection.

There is plenty more information on database maintenance available in this Whitepaper: Database Maintenance for Microsoft SharePoint Products and Technologies.

Summary

If you didn’t know already, then you’ll certainly have gather from this article that SharePoint relies heavily on databases. Knowing the roles of those databases and how to manage them can have a dramatically positive impact on the performance of your SharePoint deployment, and hopefully this article was helpful in that regard. If you’re already familiar with SharePoint 2007, then the basic concepts behind the management of the relevant databases will also be familiar territory, albeit with a few new considerations. Either way, the pointers we’ve covered here should give you the edge you need to make sure you manage the databases, not the other way around.

If you want to learn more about managing the database elements of SharePoint 2010, here are the links mentioned in this article for further reading:

1555-ANTS_60x60_Logo_CP.gif How does your SharePoint application perform? The Beta release of ANTS Performance Profiler 8 adds support for SharePoint 2013, helping rapidly identify and understand SharePoint performance issues. Try the Beta.