If you've read any Microsoft literature, you know the party line is to use Windows authentication whenever possible. This is sensible from a security perspective because it follows the concept of single sign-on. If you're not familiar with single sign-on, it's pretty easy to understand: a user should only have to sign on one time to gain access to any resources that user might need.
By using SQL Server authentication, most users are going to have to go through two sign-ons. The first sign-on comes when they log onto their computer systems. The second comes when they logon to SQL Server. Admittedly, the user may not manually enter the username and password but there are still two logins. Each additional login is another one for the user to keep track of. Too many logins and eventually users will start writing down username and password combinations or store them in unsecured Excel worksheets. However, if we can tie a user's rights within SQL Server (and the ability to access SQL Server) to the Windows login, we avoid the user having to keep track of his or her SQL Server logon. We can rely on the operating system and SQL Server to perform the tasks for authentication behind the scenes. It's all transparent to the user. Just what most people want! And there you have one reason why Windows authentication is preferred.
Another one is administration. If I grant access for a Windows group, any members of that group have access. If a system administrator adds a person to the group, the person has access and I as a lazy DBA have not had to lift a finger. If a system administrator terminates a user account in the domain, that drops the user from the group. Said user no longer has access to my system. Again, I've not had to lift a finger. This is the life!
Well, there's more to it than appeasing my laziness. Consider the scenario where you have a rogue user. The system administrator disables the account immediately and punts the user off any logged on sessions the user might have (there are ways to do this but I won't go into them). But said user has a laptop and has just enough time to plug up, connect to the SQL Server and grab some data. If the user has to rely on a Windows login the user is now out of luck. When the system administrator terminated said account, the user's access to SQL Server was subsequently terminated. However, if the user had access through a SQL Server login and the DBAs haven't been made aware of the situation, the user is in.
Of course, this type of scenario is rare. More often is the case where a user leaves a company and the Windows account is disabled, but the SQL Server logins aren't cleaned up on all systems. Slips in communication do happen. Even if you have nifty scripts to go and scrub all your database servers they are no good if you don't know to remove the user. So in the end, Windows authentication comes to the forefront for ease of use and ease of administration. But that's not what this article is about.
This article is about two well-known weaknesses in the SQL Server login passwords. One is a weakness as the password is transmitted across the wire. The other is a weakness in how the password is stored. None of this is new material. What does that mean? It means those who care about breaking into SQL Server systems have known about them for a while. I do need to point out there hasn't been any major compromises in the news because of these weaknesses. Keep that in mind as you read through the article (it puts things in perspective). However, a good DBA is a well-informed one. So if you're not familiar with these two vulnerabilities, read on.
Weak Encryption Across the Wire
Unless the communications between a SQL Server 2000 server and its client is encrypted, an attacker can watch the data exchanged by the two and be able to sift through and grab usernames and passwords. The one issue is the attacker has to be in a position to capture the network traffic. In today’s switched environment, this isn’t easy as it once was. Even in small office installations, most servers and workstations are connected via switches and not hubs. The one issue with hubs is that everyone sees everyone else’s traffic. Enter in the switch. The switch isolates my traffic. The only traffic I should see is traffic intended for just my system as well as any traffic intended for all systems (broadcast traffic). If I look at things from a performance perspective, this is what I want. I do not want to contend for bandwidth with the other 15 people I'm plugged in with on a 16-port network hub. Give me a 16-port switch any day so I get my bandwidth! Because of the advantages of a switch strictly from a performance viewpoint, hubs are quickly being phased out. With the reduced prices on switches, even the most miserly individual has a hard time cost-justifying a hub rather than a switch for a particular environment.
This is great from a security perspective because it isolates data streams so that I, by sniffing on my network port, can’t see what John is doing on his. This isolation mitigates the SQL Server password vulnerability with respect to most users. But it doesn’t eliminate everyone. The malicious attacker who also happens to be employed as a network engineer can gain the setup I’ve described above. While I don’t mean to suggest all network engineers are necessarily looking to do such things, the truth of the matter is there are some individuals who would. I don't have any hard numbers, but I wouldn't even put it at 1% (1 in 100). The problem is we have to deal with that one out of very many who would. Money is a big motivator and if such an individual can secure access to a critical SQL Server and walk away with valuable information, expect the individual to do just that. As a result, if the SQL Server passwords are easily decrypted (and they are), using SQL server logins should be an option of last resort, one used when Windows authentication just doesn’t work. How weak is the encryption? Let’s find out!
There are a lot of methods for encrypting data and quite a few of them are very secure. Some of these methods produce output that even if you know the encryption algorithm, you still need the key used to encrypt the data in order to decrypt the data within your lifetime. There are also methods that can be pulled apart by ten year-olds. Encryption by XOR fits the latter category. And that’s how a SQL Server password is encrypted when it goes across the network. Bruce Schneier, one of the best-known people in the field of cryptography, makes the comment, “An XOR might keep your kid sister from reading your files, but it won’t stop a cryptanalyst for more than a few minutes.” Ouch! But he’s absolutely right. XOR is a logic algorithm, not an encryption one. However, some applications tout having a fast encryption algorithm and it’s nothing more than XOR. So what’s wrong with XOR? To understand that, let’s look at the XOR operation in more detail.
For those that may not be familiar with XOR, I’ll start with the OR operation. If I’m comparing two items, if either one is true, my end result is true. In other words:
Item1 OR Item2 = ???
For instance, I may want to compare the following 2 statements:
|Item1:||The United States flag has the colors red, white, and blue.|
|Item2:||The Canadian flag has the colors red, white, and blue.|
If I look at these two items, I know Item1 is true. The US flag does have these three colors. Item2, however, is false because the Canadian flag only has red and white. The OR operation only cares if at least one of the two statements is true. Since Item1 is true, I can evaluate my expression:
Item1 OR Item2 = TRUE
The only way an OR operation evaluates to false is if both items are false. For instance:
|Item1:||Tokyo is the capital of the United States.|
|Item2:||Richmond is the capital of Canada.|
I have a scenario where both statements are false. In this case my OR operation will evaluate to false. So:
Item1 OR Item2 = FALSE
If both statements are true, I’ll get a true result as well. For instance:
|Item1:||The United States flag has the colors red, white, and blue.|
|Item2:||The Canadian flag has the colors red and white.|
Item1 and Item2 are both true. The OR operation only cares if at least one of the two statements is true. If both statements happen to be true, the OR operation will still evaluate to true. In fact, if the first statement happens to be true, there’s usually no reason to evaluate the second. Because of this, programming languages like C# and Java will shortcut the evaluation if Item1 is true. There are mechanisms to force the evaluation of the second item, because sometimes it’s necessary to carry out some programming code that’s part of the evaluation process. But typically, if the first statement is found to be true, there’s no need to look at the second. And here is where we see a big difference between OR and XOR.
The XOR operation is also known as “exclusive or.” Unlike the OR operation, XOR will evaluate true only if one and only one of the two items is true. If both items are true, the XOR operation will return false. The exclusive part of the name means one and only one side can be true for the XOR operation to evaluate to true. So in the case of my previous example where both Item1 and Item2 were true, XOR will evaluate to false.
Item1 XOR Item2 = FALSE
XOR at the Bit Level
I’ve intentionally used the work “item” because I wanted to keep what was being XORed abstract. In logic, statements are typically compared. However, when dealing with computers we will sometimes compare statements but other times compare bits. That’s why you’ll sometimes see XOR referred to as a “bit-wise” operation. It is an operation that is often applied at the bit level because there are some definite uses for it. If you’ve not done much work with logic in school, this all may seem a bit confusing (pun intended). One of the helpful things I was shown in logic was the truth table. A truth table is simply a matrix of all the statements and what they evaluate to for all cases of true and false. Table 1 is a classic truth table for XOR from a logic class.
Notice I’ve used p and q instead of item1 and item2. The letters p and q are often substituted for statements as a shortcut measure.
Table 1. Logic Truth Table for XOR
|p||q||p XOR q|
By looking at the truth table given in Table 1, I can quickly see what happens when XOR the two statements. I can do the same for bits. In Table 2 I show the values, except I’ll be using bit1 and bit2 instead of p and q. I’ll also use 1 and 0 instead of “True” and “False”
Table 2. Bit-wise Truth Table for XOR
|Bit1||Bit2||Bit1 XOR Bit2|
When we compare two sets of bits, we line them up and check each pair of bits individually. Table 3 shows this process:
Table 3. Bit-wise XOR on two bit streams
XOR is a simple operation to carry out. As a result, companies looking for “encryption” may decide to use it because simple operations tend to be very fast and XOR fits the bill perfectly. The common user won’t have any idea of how to decipher the data, so it appears secure. The operation is very quick, so the end user also doesn’t see a huge performance hit. But the problem is it’s too simple. The average end user may not realize how to decrypt the data, but any attacker worth his or her salt will. To make things worse, we reverse the XOR operation by using XOR yet again. Observe in Table 4 I’ve taken the result of Table 3. I then XOR it with Stream2 from Table 3 and I have Stream1 again.
Table 4. Undoing XOR
Applying XOR to SQL Server Passwords
If I know the key that was used, reversing XOR is trivial. I simply XOR the “encrypted” data by the key and I get my original data back. If I don’t know the key, I do have methods available to determine the length of the key and figure out what the key is. Those methods are beyond the scope of this article and aren’t necessary in the case of SQL Server passwords. The reason they aren’t necessary is because when a SQL Server password is transmitted across the network, each byte is XORed with the character 0xA5 (in hexadecimal representation). So my key is a stream of 0xA5 characters, one for each character in the original password. Since I know ahead of time the password stream has been XORed with 0xA5, I simply perform an XOR using 0xA5 and I get the stream as it existed before the XOR.
Microsoft does throw in a step prior to XOR when encrypting the password. That step involves flipping sets of bits. A byte is made up on eight bits. Half a byte, or four bits, is sometimes referred to as a nibble. If I look at a byte, I can split it down the middle and get two nibbles. For instance, a byte of 10101100 has a nibble 1010 and another nibble 1100. What Microsoft does is flip the nibbles. So my byte 10101100 becomes 11001010 and it is this second byte that gets XORed.
Keep in mind that Unicode characters are represented by two bytes. Each byte is treated separately with regards to flipping the nibbles. But in the case where a byte is 00000000, the flipped byte would look the same. The reason I bring this up is while the password is passed as a Unicode string, the second byte for most Latin characters (A, B, c, d) is 00000000 or 0x00. This little bit of information can often help us find the right packet of a network trace, even if we know nothing of how to tell which frames are for logins and which are for data.
Decrypting a SQL Server Password
If you know how to read Tabular Data Stream (TDS) frames, you know what to look for to identify which ones correspond to logging in to SQL Server. Remember I said Latin charcters would have the second byte as 0x00? Well, 0x00 XORed by 0xA5 is 0xA5. So even if you don't know the frame codes, you can look for a stream of hexadecimal codes that have A5 as every other code (if you're dealing with passwords requiring Unicode characters, you'll have to look for a certain code to identify what type of TDS packet - I'll cover this in a later article). An example would be this:
A2 A5 B3 A5 92 A5 92 A5 D2 A5 53 A5 82 A5 E3 A5
If I'm dealing with Latin characters, I can drop the A5's and I get:
A2 B3 92 92 D2 53 82 E3
Once I find the stream, I can decipher the password. I would start by XORing 0xA5 against each character. Then I flip the nibbles and I'm left with the ascii value for the particular letter. Once I look up the ascii value I have my letter. If I do this for all the letters and I have my password. Table 5 demonstrates the deciphering process. The first three streams are in hexadecimal.
Table 5. Deciphering the Password
The original password was simply "password" and I have a match. All SQL Server passwords go across the wire with this weak encryption algorithm. Once you have the right bit of info (the character stream), you can crack this thing with a scientific calculator and an ascii table. Since the SQL Server password is so easily deciphered, encrypting the connection between client and server becomes a necessary evil. Even given the use of Windows authentication, I would suggest a secure connection using SSL or IPSec because even if the login information isn’t being passed in plaintext (unencrypted) or something nearly as weak, the data will be.
David Litchfield’s paper “Threat Profiling Microsoft SQL Server” describes the XOR against 0xA5 but it doesn’t discuss the flipping of the bits, which is part of the password “encryption.” A company calling themselves Network Intelligence India Pvt. Ltd. posted a correction to Mr. Litchfield’s paper. You can find a link to both in the Additional Resources section. Before I get ahead of myself, let me say that a secure channel is an important part of our overall security for SQL Server, but it, in and of itself, isn’t a cure-all for SQL Server logins. David Litchfield and crew of NGSSoftware also found a weakness in the hash SQL Server generates to secure SQL Server passwords. This hash means anyone who manages to get sysadmin rights to our SQL Servers can potentially crack the passwords and use them against us.
Hash Weakness in Password Storage
SQL Server login passwords are stored in the sysxlogins system table but thankfully, only sysadmins have the ability to query against it directly. Therefore, this hash weakness isn’t one that can be exploited by a typical end user. The end user would have to somehow get sysadmin level privileges on a SQL Server. One of the ways a user could do this is by exploiting a vulnerability that leads to privilege escalation, something that's always a possibility. The hash weakness was reported in a paper entitled “Microsoft SQL Server Password (Cracking the password hashes)” that was released by NGSSoftware on June 24, 2002. You’ll find a link to the paper under Additional Resources.
SQL Server doesn’t store user passwords in plaintext (unencrypted), but instead encrypts them. When SQL Server encrypts a password, it uses an undocumented function called pwdencrypt(). This function produces a hash. Since hash can mean different things based on context let me define what I mean by a hash (also called a hash value) and a hash function. A hash or hash function is some function that takes a stream of bits or a string of characters and transforms them into another stream of bits or string of characters, usually smaller and of a fixed-length. A good hash function will return very few duplicate hashes, the fewer the better. The reason a good hash function should tend to return unique hashes is because these hashes are often used for comparison, such as with a password check. Hash functions are usually one-way functions, meaning I can’t reverse engineer the original bit stream or string of characters from the hash (as opposed to XOR which is a reverse function of itself). As a result, if I am doing a password check I’ll get the password from the user and I’ll then throw it through the hash function and generate a hash value. I’ll do a comparison of the hash value I’ve just generated against what I have previously stored for the user. If I have a match, I’ll let the user in. As a result, the less chance of a duplicate hash being generated, the better.
The pwdencrypt() function is a hash function. It takes a plaintext password and converts it into a hash. Actually, it’s more correct to say two hashes. The pwdencrypt() function first generates what is called a “salt.” In cryptography what we mean by salt is a random string of data that is added to the plaintext before being sent through the hash function. In the case of our pwdencrypt() function, the salt is basically a random integer. It’s a bit more complicated that that, but not by a whole lot. The salt is time-dependent, however. How can we tell? If we execute
We’ll get different results even if we’re only a second apart. For instance, the first time I ran this query, I received the following hash:
The second time I ran the query, I received this hash (which is clearly different):
First, the pwdencrypt() function takes the password and converts it to Unicode if it isn’t already. It then adds the salt to the end of the password. This is the plaintext it sends through an algorithm known as the Secure Hashing Algorithm (SHA). SHA will generate a ciphertext (the encrypted characters) that pwdencrypt() will temporarily put to the side. Then pwdencrypt() takes the password and makes it all uppercase. Once again, it’ll append the salt to the end and send the resulting combination through SHA. Finally, pwdencrypt() will combine a standard static code (0x0100 in hexadecimal), the salt, the first ciphertext (password in the original case), and the second ciphertext (password in all uppercase) to create the password hash stored in sysxlogins.
I’m not sure why the all-uppercase version of the password is included, but needless to say, it weakens the SQL Server password "hash." Since I only have to match against uppercase letters, I’ve eliminated 26 possible characters (the lowercase ones) to figure out what the password is. Granted, once I discover the password I won’t know the case of the individual characters, but to figure out the case is trivial. If I can find out that a user has a password of say “TRICERATOPS,” I can then build a quick little program to try every possibility of case for the word triceratops. Triceratops has 11 letters, so there are 211 possible combinations. That’s only 2048 different possibilities. A script or program can test each possibility until it gets a match. Remember SQL Server 7 and 2000 do not have account lockout policies for too many login failures.
Consider that if I didn't have the all-uppercase version of the password I’d have to brute force every single dictionary word and every single possible case. That means just to test triceratops to see if it were the password (regardless of case), I’d have to run up to 2048 attempts instead of one. I would have to test every possible case combination for every single word. I couldn’t just test the word. But since the all-uppercase version is part of what is stored in sysxlogins, the number of attempts I may have to make to crack the password decreases drastically.
Let's look at an example. An 8-character dictionary word has 256 (28) possible case combinations. I’ve been told the SQL Server account uses 1 of 8 words, all of them 8 characters in length (a controlled test). If I have to run through these 8 words and I have to potentially try every single case combination, I may have to try up to 256 x 8 = 2048 combinations.
If I can test just all-uppercase words to find a match, I would have to test just 8 times to get the word. Then I’d have to run up to 256 combinations to find the exact password. Instead of 256 x 8, I’m looking at a maximum of 256 + 8 = 264 combinations. Now extrapolate this out to the entire Webster’s dictionary.
The algorithm to attempt a dictionary attack against a SQL Server password hash isn’t very long or difficult. I’ve pretty much explained it in this section. And when NGSSoftware put out the paper revealing the weakness, they also included source code in VC++ to attempt such a crack. The program isn’t hard and it isn’t very complex, but it does require the Windows 2000 Software Development Kit (SDK) because it needs the CryptoAPI that’s part of the SDK. Figure 1 shows an example of the compiled source code in action against one of the password hashes from earlier.
Figure 1. Cracking the password hash.
NGSSoftware has additional tools that are GUI-based to perform similar tests but with a much nicer interface and a few more features. These two tools are called NGSSQLCrack and NGSSQuirrel. NGSSQLCrack does have the ability to perform a brute force attack should the dictionary attack fails. I've included a link to Steve Jones' reviews of both products in the Additional Resources section.
Most password hacking programs will attempt a dictionary attack first. Since dictionaries are easy to find in electronic form, people who use a password found in a dictionary are opening themselves up to having their passwords hacked. Too many programs can run through an entire dictionary listing in a very short time. SQL Server passwords are no different. In reality, if a user chooses a strong password, one with alphabetic and numeric characters as well as a special character that’s at least six characters long, the password is reasonably secure. I say reasonably, because someone who can bring the proper computer resources to bear will eventually be able to crack the password. The mechanism encrypting SQL Server passwords isn’t such that it is unreasonable for an attacker to be able to crack them, should the hacker get a hold of the hash.
Tip: When I attempted to compile the VC++ code presented in NGSSoftware’s article on cracking SQL Server passwords, VC++ did return 1 compile error with regards to the following line of code: wp = &uwttf; The error VC++ returned indicated that it wouldn’t carry out the implicit conversion. I had to modify the line to read: wp = (char *) &uwttf; in order to generate a successful compile. As they say on the newsgroups, “Your mileage may vary.”
Microsoft recommends Windows authentication because of single sign-on and also to reduce administrative overhead. These two reasons are good enough to use Windows authentication whenever possible. However, there are times when DBAs are forced to use SQL Server logins because that's all a program will support. There's not a whole lot we can do about the authentication method in those cases. But in cases where we do have a choice, such as a home grown application, the choice should usually point in the direction of Windows authentication. If addition to Microsoft's two main reasons, another reason is due to weaknesses in how the passwords are transmitted and how they are stored.
I did say weaknesses but keep in mind to consider the mitigating circumstances. To dispel the FUD (Fear, Uncertainty, and Doubt), let's consider a couple of things. In the first case, you typically have to have a rogue network engineer. If that's the case, SQL Server access isn't the only, nor necessarily the most critical issue facing an organization. Anyone with half a lick of creativity can imagine what such an empowered individual could do. This doesn't mean we shouldn't take steps to reduce our vulnerability, but is also doesn't mean we should go around with our hands in the air screaming, "The sky is falling!" In the second case, you need sysadmin privileges to access the sysxlogins table. Yes, even without a rogue DBA, there is always the possibility of a privilege escalation where a process somehow gets itself to sysadmin rights. NGSSoftware has a paper on that very possibility. But keep in mind that passwords aren't the only things that will be vulnerable. The data is there, too. Also, the more complex the password, the harder it is to crack, even if you do have an advantage of only having to get the upper-case letters. The fact is, if you don't mix in numbers and symbols, the passwords become relatively easy to crack. It's all about password complexity.
- Threat Profiling Microsoft SQL Server - NGSSoftware
- Weak Password Obfuscation Scheme (Modified) in MS SQL Server - Network Intelligence India Pvt. Ltd.
- Microsoft SQL Server Passwords (Cracking the password hashes) - NGSSoftware
- Review: MSSQLCrack - Steve Jones
- Review: NGSSquirrel - Steve Jones