Connection Strings - What is they? Why is they important?
A connection string is a string version of the initialization properties needed to connect to a data store. The connection string can be stored for future use and exchanged between applications. Without a connection string, one would need to create, manage and exchange a lot of complex data structures to access the required data. However, a connection to a Microsoft SQL Server serves a lot of purposes and is much more than just a link between the application and the SQL server. Just as bad roads slow the progress of a region, a bad connection string fails to make optimal use of the raw power that SQL Server is able to provide. A good connection string can, at a high-level:
- Ensure that all communication to the SQL Server is safe and secure
- Help the SQL Server and it's administrators identify the application and the originator initiating the data access request
- Aid in troubleshooting
- Determine the overall behaviour, performance and availability of the application
Connection Strings and Security
Privacy and security are two major aspects of our lives today. It is very true when said that the systems are only as secure as their weakest link. The weakest link in any networked system is the network itself. Communication of an application to a remote server is not secure if not encrypted.
Not only can a connection string enforce whether or not the data transfer will be encrypted, it can also prevent malicious use of the currently active connections. If the application code is compromised, and an attempt is made to fetch the connection string off a currently active connection, the connection string can ensure that the security credentials, if any, are stripped out before passing the connection string along.
SQL Server's SQL Native Client also provides the option to use and validate a SQL Server certificate for the connection, making it an absolutely secure session.
Security is not only about bringing security in what we transfer. Also important is how we transfer, or more precisely, how we initiate the transfer. All of us would have heard about something being whisked out of the back door while the inspection is on at the front door.
At the basic level, the application can specify a default database, protocol, port and connection timeout values for use during the particular session. Most production servers that I have seen do not have their SQL Servers listening on the default port# 1433. In such cases, attempt to connect to the SQL Server using the default values would simply fail. In other cases, different instances of SQL Server on the same physical machine listen in on different ports. For such scenarios, SQL Server allows the application to specify a custom address and/or port information for use. Also, depending upon the type of network (Internet or LAN) the application would be used in, the SQL Server administrator would decide which network protocols are to be allowed for use by the SQL Server (Reference: Choosing a network protocol, see references below). Instead of massively changing the application code and possibly the design, all that is required to be done is to change a single parameter on the connection string.
A connection string therefore, not only helps to establish a connection to the SQL Server, but also helps to efficiently and securely maintain the connection.
I was recently reading a novel by a very famous author about a stowaway on a ship with no native country and legal identification. The unfortunate passenger is denied entry into multiple countries simply because he does not have the proper identification. Similarly, no administrator would like to have unknown applications and people accessing their server, especially not from unknown locations. In fact, in most production systems, checks are in place to highlight and possibly block any such unknown or alien access. Just as our passports ensure safe travel across borders, our applications can help us and our administrators ensure protection from unauthorized or unknown access.
Moving away from the security context for a while, identification is also necessary for troubleshooting. Quite often, with legacy or poorly developed systems, we may find a server with the processor spiked and flat-lined at a particular time of the day. The first steps towards identifying the problem are answering the questions - what is being done on the server, and who is doing it? Answering the first question is reasonably easy with the proper profiling tools, but the second question cannot be answered without proper identification of the application and the originator of the request.
The SQL Server connection string provides parameters that would help us identify the application name and also additional information about the user and the workstation which issued the request. Once identified, either the user/workstation/operation can be blocked or further troubleshooting be done.
Specifically for ODBC connections
Connection strings used by ODBC provide us additional options for auditing and troubleshooting such as:
- Query logging for long running queries
- Saving connection attributes to a file if the connection is successful
- Capture of SQL Server Native Client ODBC driver performance data
Overall application behaviour
Security, auditing and identification are all silent features of the connection string. While they play a very important role, they do not actively influence the execution and response of the application and are hence some of the most neglected parameters in a connection string.
Effect on Application Performance & Availability
As most of us know, and as has been discussed multiple times on multiple blogs, forums and websites; performance is the biggest pain point of any application. Because this is the most visible to the users, any performance impact will never remain hidden for long. More often than not, it has been shown that performance is not a one-sided problem - the performance issues are equally on the code end of things, as they are on the data store end.
In addition to the usual suspects like improper use of connection pools, a major reason for poor performance is the mismatch between the SQL Server packet size and the packet size configured for the rest of the network. I have personally seen this in the networks at my employer's setups. We used a custom network packet size, and when left to the default, our data transfer to & from the SQL Servers became terribly slow. The moment the packet sizes were aligned, performance improved dramatically.
A connection string allows you to tweak the application performance by specifying a range of parameters specifically directed towards improving performance of an application. A few of these are:
- MARS Support (MARS = Multiple Active Result Sets)
- Packet Size specification (Mismatching packet sizes will cause the data transfer to slow down)
- Timeouts (Smaller time outs will cause frequent disconnects when working with large data sets)
- Communication Protocol (E.g.. TCP/IP is great for both LAN & Internet connections, Named Pipes may be faster for LAN, but will be slower for Internet connections)
- Failover partner information
Effect on Application Behaviour
The great Shakespeare wrote "All's the world a stage; and all the men and women merely players." These golden words stand true in today's world as well. With businesses increasing their reach across borders to the other side of the globe, the underlying software systems also need to keep up with the changing trends. Instead of being closed to a single region and geography, these systems also need to be able to comprehend and communicate data in different languages, numbering systems and measuring standards. A retail POS system for a global organization in the US must accept and process weights in ounces & pounds, while the same system for the same organization in the UK must be able to process weights in grams & kilograms - deducting the same quantity from a central stock or warehouse. While these conversions and standards scare the living daylights out of anyone, the SQL Server connection strings help to ease things out a little bit. The following are a few parameters that help in determining the overall behaviour of the application:
- Auto-translate between ANSI/OEM characters
- Data-type compatibility level
- Current Language
- ANSI NPW (NULL/Padding/Warnings behaviour)
- Regional Settings
- Quoted Identifiers
NOTE: An important behaviour to take note of is that SQL Server does not honour changes done to the application's culture information after the connection is established. All changes to the culture must be done before establishing the connection. Altering the regional behavior of a data source can cause application failure.
The above introduction justifies the theory that all computing systems are based on the fundamental principle of GIGO - Garbage In Garbage Out. A bad connection string, will hurt not only the application, but may also compromise the integrity of the data and the security of the organization.
More often that not, applications written by junior developers or those currently studying programming seem to have issues with getting their SQL connections right. This, along with the lack of a concise reference is what inspired me to write this article as a guide to help those new to SQL Integration and development to create better and efficient connection strings. I hope that this helps to make our jobs a little bit easier.
Before I close out on this introduction to SQL Server connection strings, I would request all my kind readers to also visit Appendix A and B, which contain further information on how to generate connection strings using MDAC and a concise lookup chart for the various connection string options for ADO, ODBC and OLE DB technologies.
Have a great rest-of-the-day! Happy coding!
Appendix A - Demo - Generating Connection Strings using MDAC
The Microsoft Data Access Components (MDAC) provide the Data Link Properties dialog box as the common user interface for specifying connection information to a data provider on Windows 2000 and later operating systems. You can use the Data Link Properties dialog box to save connection information in a universal data link (.udl) file. We will be using this *.udl file to configure the most common connection string parameters and also help us get the connection string for us in our application.
Disclaimer: This demo is designed around creating a very basic OLE DB connection string for SQL Server. The purpose of the demo is just to get the reader started on the use of this tool.
Please find below a step-by-step guide on using the Data Link Properties dialog box to generate the connection strings.
- On the desktop or in any folder in the Explorer, right click on an empty area, and say "New" -> "Text Document"
- Rename the Text Document to *.udl (eg. "SqlConn.udl")
- Double-click the newly created file. The "Data Link Properties" dialog box should come up
- In the "Providers" tab, specify the required provider
- In the "Connections" tab specify the Server/instance name, authentication method and default database
- You can specify Impersonation Level and other security details in the "Advanced" tab
- In the "All" tab, you will get a summary of all 3 tabs, plus the option to specify the Application Name and other connection string parameters
- Go back to the "Connections" tab and click "Test Connections" to ensure that you are able to connect successfully
- Click "OK" to close the "Data Link Properties" window
- Open the .udl file created & modified above in Notepad
- You will see the connection string ready for you (In the screenshot below, the Word Wrap feature has been enabled)
Appendix B - Connection Strings Quick Reference Sheet
Please find the Quick Reference Sheet attached with this article.