Note to moderators: please move this post to the appropriate category if this one is intended only for Security Skills Exam discussions.
2nd Edit: Okay, back to the original claim. Further testing in PHP while developing a workaround indicates that the problem is not with the PHP parser: the string is not altered until after it has been passed to the db client or the interface to the db client.
Edit: I just realized the problem might be related to the PHP parser treating 'backslash newline' as a line continuation sequence and removing it before calling the SQL client. And I just proved it. Odd that it would remove 'backslash newline' from the middle of a string, but it seems that it does. In PHP, this query, "insert unittest (teststring) values ('e\\\\\x0a\x0ae')" results in 'e\[LF]e', which is what I wanted. By adding an extra backslash before and an extra newline after, the parser removes the inner pair but leaves the outer pair alone (thankfully). I am going to leave this post up for the next person with a similar problem. (Note: in the actual result string the newline character displays as a square box; I have written it as [LF] here so that it shows up in this post.)
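To check what PHP itself hands to the client, it helps to dump the bytes of the payload. The sketch below shows the hex of the string from the query above and then applies a hypothetical model of the stripping behavior (removing each 'backslash newline' pair in one pass). The function name strip_backslash_newline is made up for illustration; it only imitates the observed result and is not the actual code of any client library.

```php
<?php
// The payload portion of the query, written exactly as in the post.
$payload = "e\\\\\x0a\x0ae";

// Bytes PHP actually holds: e, backslash, backslash, LF, LF, e.
echo bin2hex($payload), "\n"; // 655c5c0a0a65

// Hypothetical model of the stripping layer: remove each
// backslash+LF pair in a single left-to-right pass.
function strip_backslash_newline(string $s): string
{
    return str_replace("\\\n", "", $s);
}

// The inner backslash+LF pair is removed, the outer pair survives.
echo bin2hex(strip_backslash_newline($payload)), "\n"; // 655c0a65
```

If the first hex dump shows both backslashes and both newlines, PHP has not altered the string, and whatever removes the pair sits further down the line.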
I have been security testing SQL injection attack vectors from a PHP application. Most articles note that a single quote has to be escaped as two single quotes to prevent early termination of the input string, and that any other potentially dangerous character sequences must be validated, but they fail to mention what those 'potentially dangerous character sequences' (aka PDCS) are.
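For reference, the single-quote rule on its own looks something like this in PHP. This is a minimal sketch: escape_single_quotes is a name I made up, and it covers only the quote-doubling rule, not any of the other PDCS this post is trying to identify.

```php
<?php
// Double every single quote so it cannot terminate a T-SQL string literal.
function escape_single_quotes(string $value): string
{
    return str_replace("'", "''", $value);
}

// Hypothetical usage with the unittest table used elsewhere in this post.
$input = "O'Brien";
$sql = "insert unittest (teststring) values ('" . escape_single_quotes($input) . "')";
echo $sql, "\n"; // insert unittest (teststring) values ('O''Brien')
```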
I set up a unit test in PHP to brute-force all character permutations and identify PDCS. I found odd behavior, with the dblib and ADO clients altering my test strings. It appears that the client libraries treat the backslash character as an escape character introducer, but only before the following characters (found so far): 0x0a (LF), 0x0d (CR), 0x10 (DLE), and 0x3d (=).
The test strings are 4-character permutations of byte values ranging from 0 to 255. I've tested SQL Server 2000 to ensure that varchar will correctly insert and select these strings without altering them. I use the default collation, SQL_Latin1_General_CP1_CI_AS.
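A generator along these lines can produce such test strings. This is only a sketch of the idea, not my actual test harness: byte_strings is a made-up name, and it assumes that repeated bytes are allowed within a string. The full 0-255 range at length 4 is 256^4 (about 4.3 billion) strings, so the usage below takes a small alphabet.

```php
<?php
// Yield every string of $length bytes drawn from $byteValues (ints 0-255).
// Repeated bytes are allowed (an assumption about the test design).
function byte_strings(array $byteValues, int $length): Generator
{
    if ($length === 0) {
        yield '';
        return;
    }
    foreach ($byteValues as $value) {
        foreach (byte_strings($byteValues, $length - 1) as $suffix) {
            yield chr($value) . $suffix;
        }
    }
}

// Small demo alphabet: LF, backslash, and 'e'.
foreach (byte_strings([0x0a, 0x5c, 0x65], 2) as $candidate) {
    echo bin2hex($candidate), "\n";
}
```

Passing range(0, 255) as the alphabet covers every byte value; the generator keeps memory use flat because it never builds the whole list.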
In Query Analyzer, the following T-SQL inserts the string without error or alteration:
schema: create table unittest (utID int IDENTITY, teststring varchar(10))
declare @s varchar(10)
set @s = 'e\'+char(10)+'e'
insert unittest (teststring) values (@s)
When using a client, the backslash and LF are stripped out of the string and only 'ee' is stored. I can store 'e\e' and 'e'+char(10)+'e' without error, but not 'e\'+char(10)+'e'.
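One possible workaround, modeled on the Query Analyzer statement above, is to have PHP build the value as a T-SQL expression in which every non-printable byte (and the single quote) is written as char(n), so that the query text never contains a backslash directly followed by a raw newline. This is a sketch only: tsql_string_expression is a made-up name, and I have not confirmed that it avoids the stripping with dblib or ADO.

```php
<?php
// Build a T-SQL string expression: printable ASCII runs become quoted
// literals, everything else (and the single quote) becomes char(n).
function tsql_string_expression(string $bytes): string
{
    if ($bytes === '') {
        return "''";
    }
    $parts = [];
    $run = '';
    foreach (str_split($bytes) as $byte) {
        $code = ord($byte);
        if ($code >= 0x20 && $code <= 0x7e && $byte !== "'") {
            $run .= $byte; // safe to keep inside a quoted literal
            continue;
        }
        if ($run !== '') {
            $parts[] = "'" . $run . "'";
            $run = '';
        }
        $parts[] = 'char(' . $code . ')';
    }
    if ($run !== '') {
        $parts[] = "'" . $run . "'";
    }
    return implode('+', $parts);
}

echo tsql_string_expression("e\\\ne"), "\n"; // 'e\'+char(10)+'e'
```

The output for the problem string is the same expression used in the set @s statement above, so it could be dropped into values (...) in place of a quoted literal.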
I set up the DB-Library Options in the Client Network Utility with 'Automatic ANSI to OEM conversion' unchecked, to prevent the client from altering some characters in the 128-255 range.
I've tried escaping the escape introducer (backslash) without success.
I've also searched the internet for the last two weeks for any hint of what is occurring here and how to disable or hack around it. Got nothing.
Any help or insight here will be much appreciated.