﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / Article Discussions / Article Discussions by Author / Discuss content posted by Joe Celko  / Stairway to Data, Step 7: Data Encoding Schemes - Part I / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Sat, 25 May 2013 04:11:23 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: Stairway to Data, Step 7: Data Encoding Schemes - Part I</title><link>http://www.sqlservercentral.com/Forums/Topic1108184-1604-1.aspx</link><description>Good article But there is a slight logic error in the section on validation and verification.[quote]Validation of an encoding means that the [b]syntax[/b] is correct. Verification means that there is a trusted source or procedure for the encoding, so we know the [b]value and the syntax[/b] is correct.[/quote] (my emphasis)[quote]Consider the five digit ZIP code. A quick validation can be the regular expression “zip_code CHAR(5) NOT NULL CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]')” which says that ZIP code is five digits. But this rule is not correct; the lowest ZIP code is 01001 which is for (Agawam, MA) and the highest ZIP code is 99950 (Ketchikan, AK). All zeros and all nines will never be used; they are deliberate dummy values for sorting missing data to the front or back of a list of ZIP codes. Are we done? Nope! There are unused ZIP codes between the highest and lowest code values; for example, Austin, TX has ZIP codes 78739, and 78741 but not 78740.[/quote]All of the examples you give show why the validation is not verification, but do not invalidate the validation. In fact the validation you show is [b]a correct validation rule for the syntax[/b]. Specifically, the syntax of a 5-digit ZIP code is five numerical digits, and that regular expression validates that syntax. The fact that some ZIP codes are currently not in use or are reserved for special purposes is not a validation (i.e, allowed sytax) issue, but a verification (i.e, allowed values) issue.Creating a validation rule which checked the entered value against a list of known ZIP codes would be incorrect, because it 1) would perform verification, not validation, and 2) could break as soon as the ZIP code list changed (which [b]does[/b] happen.)</description><pubDate>Wed, 10 Aug 2011 10:02:39 GMT</pubDate><dc:creator>sknox</dc:creator></item><item><title>RE: Stairway to Data, Step 7: Data Encoding Schemes - Part I</title><link>http://www.sqlservercentral.com/Forums/Topic1108184-1604-1.aspx</link><description>Having different codes for subtle variations like 'unassigned', 'not known', 'missing data'  ,'not applicable' , 'none' only helps if all of the people entering data can be made to appreciate the differences and always enter the data correctly. Am I being too cynical in believing that the chances of that happening are close to zero in most systems ?And why is 5000 bananas in a lifetime unreasonable ? If you live for about 30000 days that's roughly one a week, not allowing for buying for your family.(This message sponsored by the Banana Marketing Foundation ...)</description><pubDate>Wed, 10 Aug 2011 02:10:16 GMT</pubDate><dc:creator>archie flockhart</dc:creator></item><item><title>Stairway to Data, Step 7: Data Encoding Schemes - Part I</title><link>http://www.sqlservercentral.com/Forums/Topic1108184-1604-1.aspx</link><description>Comments posted to this topic are about the item [B]&lt;A HREF="/articles/Stairway+Series/Data+Encoding+schemes/72663/"&gt;Stairway to Data, Step 7: Data Encoding Schemes - Part I&lt;/A&gt;[/B]</description><pubDate>Thu, 12 May 2011 22:12:53 GMT</pubDate><dc:creator>CELKO</dc:creator></item></channel></rss>