Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««12345»»

Why is data integrity important? Expand / Collapse
Author
Message
Posted Tuesday, August 31, 2010 7:51 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Tuesday, October 14, 2014 12:10 PM
Points: 1,414, Visits: 4,541
mtassin (8/31/2010)
Eric Russell 13013 (8/30/2010)
Data integrity and constraint begins with choosing the appropriate data type. Obviously (to most of us) integers should be contained in an Integer data type and monetary values in a decimal with 2 decimal places. However, what I see happen all the time, even in major ISV applications, are date/time values contained in a varchar instead of a proper date data type. Not only does the reporting process take a performance hit with the data conversion, but inevitably there will be the occasional Feb 30. Even worse is when different users or applications use a different coding scheme and you have to sort out (in a where clause at runtime) the context of '3/11/09 6:22'.


Amen,

I've seen so many of these in older databases where I am... we work to squish them... but in many cases the business won't prioritize developer time to do so.

Even better is when you find tables that violate 1NF. And have something like a column called groups with values like 'ADMIN;USER;MANAGER' in it.

Come on... :'(


is that where people try to use SQL instead of LDAP?


https://plus.google.com/100125998302068852885/posts?hl=en
http://twitter.com/alent1234
x-box live gamertag: i am null
[url=http://live.xbox.com/en-US/MyXbox/Profile[/url]
Post #977999
Posted Tuesday, August 31, 2010 8:00 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Tuesday, September 30, 2014 7:42 AM
Points: 3,688, Visits: 72,435
alen teplitsky (8/31/2010)


is that where people try to use SQL instead of LDAP?


Sort of... it's for forms based authentication on a web page. Instead of creating the M:N association table between users and groups... they stick the groups into the users table... We cry when we see this one... then cry more when we realise how much legacy code is tied to it with such awesome joins as
WHERE
'%' + group.code + '%' LIKE users.groups





--Mark Tassin
MCITP - SQL Server DBA
Proud member of the Anti-RBAR alliance.
For help with Performance click this link
For tips on how to post your problems
Post #978013
Posted Tuesday, August 31, 2010 9:15 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, September 19, 2013 2:00 PM
Points: 183, Visits: 479
Eric Russell 13013 (8/30/2010)
alen teplitsky (8/30/2010)
i've seen this no NULL's allowed OCD thing being taken to extremes

say you sign up a customer today and promise some service that will be done in the near future but you're not sure of the date. since the column is not null you have to put in a crazy date like 1/1/2049. or if something was done in the past then 1/1/1901

so if a customer calls in a few days the CSR will happily tell them they can expect to see someone come to their site in 2049




Yep. You can also end up in jail after asking the officer "Is this a joke?" after he informs you your driver's license was suspended almost 50 years before you were born. The suspension date was entered as 11/11/11 and the year was circa 1985, when I was in my 20's. The officer was so stupid he didn't know what he had just said.

Post #978100
Posted Tuesday, August 31, 2010 9:36 AM


Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Friday, November 11, 2011 1:21 PM
Points: 50, Visits: 162
mtassin (8/31/2010)
RalphWilson (8/30/2010)
A "skipped" question is one that has not been answered but has a Question Number that is lower than the highest Question number that has been answered.



Why not just store the results as the questions are asked?

Table_Answers(
Answerid int identity(1,1) PRIMARY KEY,
Testid int NOT NULL,
questionid int NOT NULL,
TestAnswer int,
TestScore int NOT NULL)

As the questions are asked and answered/skipped just store either
1 in testscore if it's correct
-3 in testscore if it's wrong
-1 in testscore if it's skipped

Then left join Table_Answers to Table_questions if you care about the unanswered questions.

If Testanswer can be more than multiple choices throw in a varchar(max) or nvarchar(max) column for the free form answers... but those will be harder to grade via SQL anyway.


To put it mildly, that really screws up trying to do any overall input and analysis of the test results because the input process is a specialized reader for those Mark The Bubble tests. There is a key premarked on the sheet that mtaches the sheet tot he particular set/arrangement of questions that were answered. So, it becomes a bit more awkward to try to snag which questions were skipped on the fly . . . the reader application simply reads the sheets and records the answers as a row in a table for the most part. The essay questions are handled slightly differently but most of the tests (and the majority of all tests) are either T/F or multiple choice (usually 5 options . . . with the odd 4 option question often getting option 5 selected .


Ralph D. Wilson II
Development DBA

"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
Post #978132
Posted Tuesday, August 31, 2010 10:11 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, September 19, 2013 2:00 PM
Points: 183, Visits: 479
RalphWilson (8/30/2010)
Now that we appear to be including DW in the mix, I have a question for y'all.

Suppose I have a database in which I am storing the answers that are given during the various exams that a students might take for a course. The design of the exam is such that it is anticipated that no student will complete the exam (thus, there are NULLs expected for some number of answers to questions for each student for each exam). The scoring on the exam is such that 1 point is added for correct answers, 3 points are subtracted for wrong answers and any _un_answered_ (ie. skipped) questions result in a 1 point subtraction. (Thus, the test is designed to try to suppress pure guesses but to also encourage students to answer all questions before skipping them.)

How do you differentuiate between the NULL of a skipped question (i.e. "No Answer Given") vs the NULL of the question that has not yet been reached (i.e. the "Not Applicable")?

This has occassionally come up in my experience and I have developed sevral diffent qork-arounds. I am just seeking advise here.


So, are the students instructed to start at question 1 and proceed to the last question? Because in a "normal" test, the order in which the questions are asked does not matter, so it's perfectly acceptable to start at the end and work forward.

Post #978171
Posted Tuesday, August 31, 2010 10:45 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Tuesday, September 30, 2014 7:42 AM
Points: 3,688, Visits: 72,435
RalphWilson (8/31/2010)
[quote]mtassin (8/31/2010)
[quote]RalphWilson (8/30/2010)


To put it mildly, that really screws up trying to do any overall input and analysis of the test results because the input process is a specialized reader for those Mark The Bubble tests. There is a key premarked on the sheet that mtaches the sheet tot he particular set/arrangement of questions that were answered. So, it becomes a bit more awkward to try to snag which questions were skipped on the fly . . . the reader application simply reads the sheets and records the answers as a row in a table for the most part. The essay questions are handled slightly differently but most of the tests (and the majority of all tests) are either T/F or multiple choice (usually 5 options . . . with the odd 4 option question often getting option 5 selected .


Didn't realize these were bubble questions... figured it had more to do with the last classes I taught which had a specialized testing application that worked this way. You got question one, either answered or skipped it, then got question two. All being stored back end as you answered it.

Gosh... I didn't know anybody still used Scantron. :)

What does the scanner return on a non-answered question?




--Mark Tassin
MCITP - SQL Server DBA
Proud member of the Anti-RBAR alliance.
For help with Performance click this link
For tips on how to post your problems
Post #978201
Posted Tuesday, August 31, 2010 10:51 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 8:13 AM
Points: 1,708, Visits: 4,854
david_wendelken (8/31/2010)
Yep. You can also end up in jail after asking the officer "Is this a joke?" after he informs you your driver's license was suspended almost 50 years before you were born. The suspension date was entered as 11/11/11 and the year was circa 1985, when I was in my 20's. The officer was so stupid he didn't know what he had just said.

I've seen cases where an application required the user to supply a date, and over time they adopted this "work around" process of stubbing something like 11/11/11 when they didn't know what to enter. Of course this was undocumented, and they often forgot to go back re-enter a real date. That's one strong and compelling reason for allowing some columns to be NULL-able. It's fine so long as there is also a column indicating record completion status to prevent half baked data from getting sent to the shop floor or ending up on a report.
Post #978203
Posted Tuesday, August 31, 2010 11:02 AM


Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Friday, November 11, 2011 1:21 PM
Points: 50, Visits: 162
mtassin (8/31/2010)
RalphWilson (8/31/2010)
[quote]mtassin (8/31/2010)
[quote]RalphWilson (8/30/2010)


To put it mildly, that really screws up trying to do any overall input and analysis of the test results because the input process is a specialized reader for those Mark The Bubble tests. There is a key premarked on the sheet that mtaches the sheet tot he particular set/arrangement of questions that were answered. So, it becomes a bit more awkward to try to snag which questions were skipped on the fly . . . the reader application simply reads the sheets and records the answers as a row in a table for the most part. The essay questions are handled slightly differently but most of the tests (and the majority of all tests) are either T/F or multiple choice (usually 5 options . . . with the odd 4 option question often getting option 5 selected .


Didn't realize these were bubble questions... figured it had more to do with the last classes I taught which had a specialized testing application that worked this way. You got question one, either answered or skipped it, then got question two. All being stored back end as you answered it.

Gosh... I didn't know anybody still used Scantron. :)

What does the scanner return on a non-answered question?


The scanner returns a space character. (The other options are returned as the characters '1' through '5'. T/F are generally coded as '1' and '2'.) These are not the richest of testing facilities.


Ralph D. Wilson II
Development DBA

"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
Post #978210
Posted Tuesday, August 31, 2010 6:16 PM


Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: 2 days ago @ 7:50 AM
Points: 705, Visits: 3,019
I just spent 2 weeks(!!) cleaning up a lookup table in a third-party ERP application that allowed (and still allows, but for the now-rescinded, application level user permissions) free-form entry for states and countries.

Fun fact: the United States of America has 786 states within its borders, including China, Formosa, Bhopal, and Chicago.

Referential Integrity??? "We don't need no stinkin' referential integrity!!!"

Rich
Post #978416
Posted Wednesday, September 1, 2010 8:10 AM


Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Friday, November 11, 2011 1:21 PM
Points: 50, Visits: 162
rmechaber (8/31/2010)
I just spent 2 weeks(!!) cleaning up a lookup table in a third-party ERP application that allowed (and still allows, but for the now-rescinded, application level user permissions) free-form entry for states and countries.

Fun fact: the United States of America has 786 states within its borders, including China, Formosa, Bhopal, and Chicago.

Referential Integrity??? "We don't need no stinkin' referential integrity!!!"

Rich

Some time back, I successfully made the case for an organization to have a look up table for countries & states (if the country was the US) based upon having located rows indicating Oklahoma City was in something like 30 of the 50 states and at least 15 foreign countries. (I have strong reason to believe that, at the very least, "Oklahoma City, TX" was incorrect. )


Ralph D. Wilson II
Development DBA

"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
Post #978803
« Prev Topic | Next Topic »

Add to briefcase «««12345»»

Permissions Expand / Collapse