SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Why is data integrity important?


Why is data integrity important?

Author
Message
alen teplitsky
alen teplitsky
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2922 Visits: 4674
mtassin (8/31/2010)
Eric Russell 13013 (8/30/2010)
Data integrity and constraint begins with choosing the appropriate data type. Obviously (to most of us) integers should be contained in an Integer data type and monetary values in a decimal with 2 decimal places. However, what I see happen all the time, even in major ISV applications, are date/time values contained in a varchar instead of a proper date data type. Not only does the reporting process take a performance hit with the data conversion, but inevitably there will be the occasional Feb 30. Even worse is when different users or applications use a different coding scheme and you have to sort out (in a where clause at runtime) the context of '3/11/09 6:22'.


Amen,

I've seen so many of these in older databases where I am... we work to squish them... but in many cases the business won't prioritize developer time to do so.

Even better is when you find tables that violate 1NF. And have something like a column called groups with values like 'ADMIN;USER;MANAGER' in it.

Come on... :'(


is that where people try to use SQL instead of LDAP?
mtassin
mtassin
SSCarpal Tunnel
SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)

Group: General Forum Members
Points: 4920 Visits: 72519
alen teplitsky (8/31/2010)


is that where people try to use SQL instead of LDAP?


Sort of... it's for forms based authentication on a web page. Instead of creating the M:N association table between users and groups... they stick the groups into the users table... We cry when we see this one... then cry more when we realise how much legacy code is tied to it with such awesome joins as
WHERE
'%' + group.code + '%' LIKE users.groups



--Mark Tassin
MCITP - SQL Server DBA
Proud member of the Anti-RBAR alliance.
For help with Performance click this link
For tips on how to post your problems
david_wendelken
david_wendelken
SSC Veteran
SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)

Group: General Forum Members
Points: 272 Visits: 480
Eric Russell 13013 (8/30/2010)
alen teplitsky (8/30/2010)
i've seen this no NULL's allowed OCD thing being taken to extremes

say you sign up a customer today and promise some service that will be done in the near future but you're not sure of the date. since the column is not null you have to put in a crazy date like 1/1/2049. or if something was done in the past then 1/1/1901

so if a customer calls in a few days the CSR will happily tell them they can expect to see someone come to their site in 2049




Yep. You can also end up in jail after asking the officer "Is this a joke?" after he informs you your driver's license was suspended almost 50 years before you were born. The suspension date was entered as 11/11/11 and the year was circa 1985, when I was in my 20's. The officer was so stupid he didn't know what he had just said.
RalphWilson
RalphWilson
Valued Member
Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)

Group: General Forum Members
Points: 70 Visits: 162
mtassin (8/31/2010)
RalphWilson (8/30/2010)
A "skipped" question is one that has not been answered but has a Question Number that is lower than the highest Question number that has been answered.



Why not just store the results as the questions are asked?

Table_Answers(
Answerid int identity(1,1) PRIMARY KEY,
Testid int NOT NULL,
questionid int NOT NULL,
TestAnswer int,
TestScore int NOT NULL)

As the questions are asked and answered/skipped just store either
1 in testscore if it's correct
-3 in testscore if it's wrong
-1 in testscore if it's skipped

Then left join Table_Answers to Table_questions if you care about the unanswered questions.

If Testanswer can be more than multiple choices throw in a varchar(max) or nvarchar(max) column for the free form answers... but those will be harder to grade via SQL anyway.


To put it mildly, that really screws up trying to do any overall input and analysis of the test results because the input process is a specialized reader for those Mark The Bubble tests. There is a key premarked on the sheet that mtaches the sheet tot he particular set/arrangement of questions that were answered. So, it becomes a bit more awkward to try to snag which questions were skipped on the fly . . . the reader application simply reads the sheets and records the answers as a row in a table for the most part. The essay questions are handled slightly differently but most of the tests (and the majority of all tests) are either T/F or multiple choice (usually 5 options . . . with the odd 4 option question often getting option 5 selected ;-).

Ralph D. Wilson II
Development DBA

"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
david_wendelken
david_wendelken
SSC Veteran
SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)SSC Veteran (272 reputation)

Group: General Forum Members
Points: 272 Visits: 480
RalphWilson (8/30/2010)
Now that we appear to be including DW in the mix, I have a question for y'all.

Suppose I have a database in which I am storing the answers that are given during the various exams that a students might take for a course. The design of the exam is such that it is anticipated that no student will complete the exam (thus, there are NULLs expected for some number of answers to questions for each student for each exam). The scoring on the exam is such that 1 point is added for correct answers, 3 points are subtracted for wrong answers and any _un_answered_ (ie. skipped) questions result in a 1 point subtraction. (Thus, the test is designed to try to suppress pure guesses but to also encourage students to answer all questions before skipping them.)

How do you differentuiate between the NULL of a skipped question (i.e. "No Answer Given") vs the NULL of the question that has not yet been reached (i.e. the "Not Applicable")?

This has occassionally come up in my experience and I have developed sevral diffent qork-arounds. I am just seeking advise here. ;-)


So, are the students instructed to start at question 1 and proceed to the last question? Because in a "normal" test, the order in which the questions are asked does not matter, so it's perfectly acceptable to start at the end and work forward.
mtassin
mtassin
SSCarpal Tunnel
SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)SSCarpal Tunnel (4.9K reputation)

Group: General Forum Members
Points: 4920 Visits: 72519
RalphWilson (8/31/2010)
[quote]mtassin (8/31/2010)
[quote]RalphWilson (8/30/2010)


To put it mildly, that really screws up trying to do any overall input and analysis of the test results because the input process is a specialized reader for those Mark The Bubble tests. There is a key premarked on the sheet that mtaches the sheet tot he particular set/arrangement of questions that were answered. So, it becomes a bit more awkward to try to snag which questions were skipped on the fly . . . the reader application simply reads the sheets and records the answers as a row in a table for the most part. The essay questions are handled slightly differently but most of the tests (and the majority of all tests) are either T/F or multiple choice (usually 5 options . . . with the odd 4 option question often getting option 5 selected ;-).


Didn't realize these were bubble questions... figured it had more to do with the last classes I taught which had a specialized testing application that worked this way. You got question one, either answered or skipped it, then got question two. All being stored back end as you answered it.

Gosh... I didn't know anybody still used Scantron. Smile

What does the scanner return on a non-answered question?



--Mark Tassin
MCITP - SQL Server DBA
Proud member of the Anti-RBAR alliance.
For help with Performance click this link
For tips on how to post your problems
Eric M Russell
Eric M Russell
SSChampion
SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)SSChampion (12K reputation)

Group: General Forum Members
Points: 12928 Visits: 10696
david_wendelken (8/31/2010)
Yep. You can also end up in jail after asking the officer "Is this a joke?" after he informs you your driver's license was suspended almost 50 years before you were born. The suspension date was entered as 11/11/11 and the year was circa 1985, when I was in my 20's. The officer was so stupid he didn't know what he had just said.

I've seen cases where an application required the user to supply a date, and over time they adopted this "work around" process of stubbing something like 11/11/11 when they didn't know what to enter. Of course this was undocumented, and they often forgot to go back re-enter a real date. That's one strong and compelling reason for allowing some columns to be NULL-able. It's fine so long as there is also a column indicating record completion status to prevent half baked data from getting sent to the shop floor or ending up on a report.


"The universe is complicated and for the most part beyond your control, but your life is only as complicated as you choose it to be."
RalphWilson
RalphWilson
Valued Member
Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)

Group: General Forum Members
Points: 70 Visits: 162
mtassin (8/31/2010)
RalphWilson (8/31/2010)
[quote]mtassin (8/31/2010)
[quote]RalphWilson (8/30/2010)


To put it mildly, that really screws up trying to do any overall input and analysis of the test results because the input process is a specialized reader for those Mark The Bubble tests. There is a key premarked on the sheet that mtaches the sheet tot he particular set/arrangement of questions that were answered. So, it becomes a bit more awkward to try to snag which questions were skipped on the fly . . . the reader application simply reads the sheets and records the answers as a row in a table for the most part. The essay questions are handled slightly differently but most of the tests (and the majority of all tests) are either T/F or multiple choice (usually 5 options . . . with the odd 4 option question often getting option 5 selected ;-).


Didn't realize these were bubble questions... figured it had more to do with the last classes I taught which had a specialized testing application that worked this way. You got question one, either answered or skipped it, then got question two. All being stored back end as you answered it.

Gosh... I didn't know anybody still used Scantron. Smile

What does the scanner return on a non-answered question?


The scanner returns a space character. (The other options are returned as the characters '1' through '5'. T/F are generally coded as '1' and '2'.) These are not the richest of testing facilities. ;-)

Ralph D. Wilson II
Development DBA

"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
Rich Mechaber
Rich Mechaber
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1595 Visits: 3665
I just spent 2 weeks(!!) cleaning up a lookup table in a third-party ERP application that allowed (and still allows, but for the now-rescinded, application level user permissions) free-form entry for states and countries.

Fun fact: the United States of America has 786 states within its borders, including China, Formosa, Bhopal, and Chicago.

Referential Integrity??? "We don't need no stinkin' referential integrity!!!"Smooooth

Rich
RalphWilson
RalphWilson
Valued Member
Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)Valued Member (70 reputation)

Group: General Forum Members
Points: 70 Visits: 162
rmechaber (8/31/2010)
I just spent 2 weeks(!!) cleaning up a lookup table in a third-party ERP application that allowed (and still allows, but for the now-rescinded, application level user permissions) free-form entry for states and countries.

Fun fact: the United States of America has 786 states within its borders, including China, Formosa, Bhopal, and Chicago.

Referential Integrity??? "We don't need no stinkin' referential integrity!!!"Smooooth

Rich

Some time back, I successfully made the case for an organization to have a look up table for countries & states (if the country was the US) based upon having located rows indicating Oklahoma City was in something like 30 of the 50 states and at least 15 foreign countries. (I have strong reason to believe that, at the very least, "Oklahoma City, TX" was incorrect. ;-) Cool )

Ralph D. Wilson II
Development DBA

"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search