Apostrophes and Double Quotes - Should They Be Allowed in Table Text-Type Columns?

  • As a best practice, should apostrophes and double quotes be removed from text in SQL tables?

    I can see how it should be allowed. For example, names such as "O'Hare" contain the apostrophe, so that is the best way to store them. The apostrophe is part of their name, after all.

    But if you allow it, will it trip up programming, not just in SQL, but 3rd party software using the data? And if they are allowed, do third party applications have to run each text string through a function to avoid errors?

    I'd also like to know if your data entry (or other data input) is validated so that apostrophes and/or double quotes are prohibited. Why or why not?

    The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge. - Stephen Hawking

  • Yes, they should be allowed. The system should NEVER NEVER NEVER change the data. The job of SQL is to store and retrieve data. If there are challenges to retrieving that data, then they should be handled by SQL.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (1/30/2012)


    Yes, they should be allowed. The system should NEVER NEVER NEVER change the data. The job of SQL is to store and retrieve data. If there are challenges to retrieving that data, then they should be handled by SQL.

    Isn't allowing them just asking for trouble? I recently witnessed a 3rd party application throw an error simply because of a quote. So is it the 3rd party developer's fault, then? I'm playing devil's advocate here, but I can see both sides of this.

  • mtillman-921105 (1/30/2012)


    Sean Lange (1/30/2012)


    Yes, they should be allowed. The system should NEVER NEVER NEVER change the data. The job of SQL is to store and retrieve data. If there are challenges to retrieving that data, then they should be handled by SQL.

    Isn't allowing them just asking for trouble? I recently witnessed a 3rd party application throw an error simply because of a quote. So is it the 3rd party developer's fault, then? I'm playing devil's advocate here, but I can see both sides of this.

    Yes, that's a problem with the software.

    It almost certainly means they are using some very weak methods of preventing SQL injection attacks, using string manipulation where they should be using query parameterization.

    Those methods of preventing injection attacks have been obsolete for over a decade, but there are people who still use them without understanding that they don't actually work properly, never have, and never will.
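
    A minimal sketch of the difference, assuming Python's built-in sqlite3 module as a stand-in for whatever driver the 3rd party application really uses. The people table and last_name column are made up for illustration. The first attempt builds the statement by string concatenation and breaks on the apostrophe; the second passes the value as a parameter and stores it unchanged.

```python
import sqlite3

# Hypothetical in-memory table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (last_name TEXT)")

name = "O'Hare"

# Fragile: the value is pasted into the SQL text. The apostrophe closes the
# string literal early, so the statement no longer parses.
try:
    conn.execute("INSERT INTO people (last_name) VALUES ('" + name + "')")
    concat_error = None
except sqlite3.OperationalError as exc:
    concat_error = exc

# Parameterized: the value is sent separately from the SQL text, so nothing
# has to be escaped or stripped and the data is stored exactly as given.
conn.execute("INSERT INTO people (last_name) VALUES (?)", (name,))
stored = conn.execute("SELECT last_name FROM people").fetchall()
```

    The same idea applies to other drivers, though the placeholder syntax (? here) varies between them.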

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • If some 3rd party app can't handle a single quote in the data then the third party vendor should deal with it. It really boils down to what sql is supposed to do. It should store and retrieve data, not change it so some developers can be lazy.

  • Sean Lange (1/30/2012)


    If some 3rd party app can't handle a single quote in the data then the third party vendor should deal with it. It really boils down to what sql is supposed to do. It should store and retrieve data, not change it so some developers can be lazy.

    I was also thinking about performance. So I wonder how the developers get around the issue - do they have to run every text field through a function, stripping out the quotes in case there are any there? Wouldn't that cause a performance hit?

  • If the software uses parameters, quotes are never a problem. Performance could even improve, because a parameterized query can be cached and reused: the parameters are typed, so the same query works for other values.
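
    As a rough sketch of "one query, many values", again assuming Python's sqlite3 module and a made-up notes table: the SQL text stays identical for every row and only the bound values change. Whether and how a plan gets cached depends on the database engine, so this shows the reuse pattern only, not the caching itself.

```python
import sqlite3

# Hypothetical in-memory table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

# One SQL text for every row; only the bound values differ.
insert_sql = "INSERT INTO notes (body) VALUES (?)"
values = ["O'Hare", 'He said "hello"', "it's a \"mixed\" case"]
conn.executemany(insert_sql, [(v,) for v in values])

# Read the rows back in insertion order to confirm nothing was altered.
rows = [r[0] for r in conn.execute("SELECT body FROM notes ORDER BY rowid")]
```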

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Stripping that kind of thing from a string does, obviously, take some CPU cycles, but it's usually minimal. It might make a difference on a server that was already close to hardware overload, but it's a "straw that broke the camel's back" kind of thing. If it's not already overloaded, you'll probably never see the difference in performance.

  • Well, if they're not already doing so, I hope that new programming languages start using a delimiter other than quotes for strings. Even brackets "[]" would have been better characters to use, at least for English, since those characters aren't normally necessary.

  • mtillman-921105 (1/30/2012)


    Well, if they're not already doing so, I hope that new programming languages start using a delimiter other than quotes for strings. Even brackets "[]" would have been better characters to use, at least for English, since those characters aren't normally necessary.

    Any standard character on the keyboard is a liability for this kind of thing. But well-written code doesn't have problems with it.

  • GSquared (1/31/2012)


    Any standard character on the keyboard is a liability for this kind of thing. But well-written code doesn't have problems with it.

    If I understand correctly, I think that having to wrap every text field in a function, just in case there are quotes in it, is a design flaw. Ideally, that would be unnecessary.

    But thanks for all the information Gus, I want to look into this further.

  • You don't have to wrap every field in a function. If your queries are parameterized, which they should be anyway, then it is a non-issue.

  • mtillman-921105 (1/31/2012)


    GSquared (1/31/2012)


    Any standard character on the keyboard is a liability for this kind of thing. But well-written code doesn't have problems with it.

    If I understand correctly, I think that having to wrap every text field in a function, just in case there are quotes in it, is a design flaw. Ideally, that would be unnecessary.

    But thanks for all the information Gus, I want to look into this further.

    What I'm saying is, you don't need to wrap them in a function unless there's something wrong with the code. The reason people strip these things out is to prevent SQL injection, and it's the wrong way to do that. It's completely unnecessary if you do it the right way.

  • mtillman-921105 (1/30/2012)


    Well, if they're not already doing so, I hope that new programming languages start using a delimiter other than quotes for strings. Even brackets "[]" would have been better characters to use, at least for English, since those characters aren't normally necessary.

    1) The best delimiter for separating units of ASCII data is the Unit Separator, ASCII 31 (0x1F). Record Separator is ASCII 30 (0x1E), Group Separator is ASCII 29 (0x1D), and File Separator is ASCII 28 (0x1C); these have been part of ASCII since it was defined in the early 1960s, though they have fallen out of use.

    2) Delimiters shouldn't be relevant if you have good field to field mappings between app and database; regrettably, some of us don't have that luxury.

    In response to the original post, as everyone else said, it's the app's problem; the database is there to take exactly what it's given, whether it's a Q, a š, a ©, or an ', and return that upon request.

    As far as CPU hit, with current hardware, in general, properly escaping each character in a string is not likely to cost more CPU than checking on the database connection and authentication, formatting the data for the connection type, transmitting the data, getting a result, validating the result, and other database connection overhead tasks.
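
    The separator characters from point 1 can be sketched as follows, in Python. The encode and decode helpers are hypothetical names for this illustration, and the sketch assumes the data itself never contains the separator characters.

```python
UNIT_SEP = "\x1f"    # ASCII 31, separates fields within a record
RECORD_SEP = "\x1e"  # ASCII 30, separates records

def encode(records):
    """Join fields with the Unit Separator and records with the Record Separator."""
    return RECORD_SEP.join(UNIT_SEP.join(fields) for fields in records)

def decode(text):
    """Reverse encode(); assumes the fields contain neither separator."""
    return [record.split(UNIT_SEP) for record in text.split(RECORD_SEP)]

# Quotes, apostrophes, commas and brackets need no escaping at all.
records = [["O'Hare", 'said "hi"'], ["Smith, Jr.", "a [bracketed] note"]]
round_trip = decode(encode(records))
```

    Because none of the printable characters act as delimiters, the text comes back exactly as it went in.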

  • Nadrek (2/1/2012)


    mtillman-921105 (1/30/2012)


    Well, if they're not already doing so, I hope that new programming languages start using a delimiter other than quotes for strings. Even brackets "[]" would have been better characters to use, at least for English, since those characters aren't normally necessary.

    1) The best delimiter for separating units of ASCII data is the Unit Separator, ASCII 31 (0x1F). Record Separator is ASCII 30 (0x1E), Group Separator is ASCII 29 (0x1D), and File Separator is ASCII 28 (0x1C); these have been part of ASCII since it was defined in the early 1960s, though they have fallen out of use.

    2) Delimiters shouldn't be relevant if you have good field to field mappings between app and database; regrettably, some of us don't have that luxury.

    In response to the original post, as everyone else said, it's the app's problem; the database is there to take exactly what it's given, whether it's a Q, a š, a ©, or an ', and return that upon request.

    As far as CPU hit, with current hardware, in general, properly escaping each character in a string is not likely to cost more CPU than checking on the database connection and authentication, formatting the data for the connection type, transmitting the data, getting a result, validating the result, and other database connection overhead tasks.

    The ASCII separator characters are control characters: they aren't human-visible and don't have keys on a regular keyboard, so, while they work beautifully for computers, they don't work at all well for people. That's almost certainly why they've fallen out of use. Same reason we don't program in Assembler.
