October 20, 2016 at 2:28 pm
Here is my brilliant idea (that I am sure someone else already thought of...)
We have a situation where we regularly get tables from another department and we have to check every text-related column to see if the users actually "over-typed" something. What I mean is that they were looking away from the keyboard and typed more than 255 chars. without realizing it - saved the record - and never knew they had done this.
So the boss asked if I could write a SProc to detect this kind of thing. E.g. look at the columns, and see if they were "full" (for lack of a better term). So the first thing I did was to run:
EXEC sp_columns @table_name = 'WHATEVER_THE_TABLE_IS' and this gives me a nice display of ALL the internal makeup of the table - BUT!!! - I don't need all of it. I just need a few columns like LENGTH and PRECISION.
So here is the simple (I hope) question...
Can I develop a query that would allow me to specify the columns out of what I presume is Master? Again, I don't need all of it - just a few.
Thanks in advance for any help you can offer.
October 20, 2016 at 2:39 pm
select
TABLE_QUALIFIER = DB_NAME()
, TABLE_OWNER = OBJECT_SCHEMA_NAME(object_id)
, TABLE_NAME = OBJECT_NAME(object_id)
, *
from sys.columns
where object_id = OBJECT_ID('dbo.WHATEVER_THE_TABLE_IS', 'U')
October 20, 2016 at 2:43 pm
Perhaps something like this can help you start?
SELECT 'SELECT * FROM ' + t.TABLE_SCHEMA + '.' + t.TABLE_NAME
+ ' WHERE ' + STUFF(( SELECT ' OR LEN( ' + c.COLUMN_NAME + ') = ' + CAST( c.CHARACTER_MAXIMUM_LENGTH AS varchar(4))
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_NAME = t.TABLE_NAME
AND c.TABLE_SCHEMA = t.TABLE_SCHEMA
AND c.CHARACTER_MAXIMUM_LENGTH > 0
FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(max)'), 1, 3, '')
FROM INFORMATION_SCHEMA.TABLES t
EDIT: missed the last line of code.
October 20, 2016 at 2:53 pm
So, you're saying that the users are entering values that are considered invalid due to their excessive length? You can avoid this by implementing check constraints on each of these columns.
create table MyTable
(
col1 varchar(255) check ( len(col1) <= 40 ),
col2 varchar(255) check ( len(col2) <= 40 )
)
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
October 21, 2016 at 3:30 am
Eric M Russell (10/20/2016)
So, you're saying that the users are entering values that are considered invalid due to their excessive length? You can avoid this by implementing check constraints on each of these columns.create table MyTable
(
col1 varchar(255) check ( len(col1) <= 40 ),
col2 varchar(255) check ( len(col2) <= 40 )
)
If you limit the length of the value by a CHECK, why not limit the datatype instead?
create table MyTable
(
col1 varchar(40) ),
col2 varchar(40) )
)
October 21, 2016 at 4:23 am
HanShi (10/21/2016)
Eric M Russell (10/20/2016)
So, you're saying that the users are entering values that are considered invalid due to their excessive length? You can avoid this by implementing check constraints on each of these columns.create table MyTable
(
col1 varchar(255) check ( len(col1) <= 40 ),
col2 varchar(255) check ( len(col2) <= 40 )
)
If you limit the length of the value by a CHECK, why not limit the datatype instead?
create table MyTable(
col1 varchar(40) ),
col2 varchar(40) )
)
I agree with this solution. Just limit the field. And if it's a problem that the app is letting more than 255 characters through, just do a LTRIM on the string coming back to the database to only grab those first 255 characters.
EDIT: When you say you're getting tables from this other department, exactly what do you mean by that? Is there an SSIS package picking stuff up? A flat file coming into your department? A linked server?
October 21, 2016 at 6:24 am
My thanks to all who responded, but please, in the future, read the question - I DID NOT say "Users are entering data", what I said was "We get tables" - meaning these are tables exported from other systems where the limits are not the same as SQL Server. So a few of you got off on a tangent answering a question I never asked.
To those who presented good, workable solutions, many thanks - it's most helpful. To the others, I honor your excellent knowledge; you're way ahead of me - but deep knowledge is kind of useless if you're answering a question that was in fact, never asked.
No offense intended here to anyone, but as the wise man says; "intelligence is useless without common sense."
October 21, 2016 at 6:30 am
BLandry474 (10/21/2016)
My thanks to all who responded, but please, in the future, read the question - I DID NOT say "Users are entering data", what I said was "We get tables" - meaning these are tables exported from other systems where the limits are not the same as SQL Server. So a few of you got off on a tangent answering a question I never asked.
Actually, I did ask that question on my edit. Read it again. How are you getting these tables? What process is bringing them in?
October 21, 2016 at 6:52 am
My apologies Brandi - you're right - you did say that in your edit. However, you also just addressed constraining the columns entry when I asked a very simple question: How can I generate a query that produces results just like sp_columns (which unfortunately returns all tables and columns, when I only need a few). No one could produce that answer - which seems simple to me in what may well be my ignorance, but after all there is sp_columns so even more simply put - I want to be able to run that query and specify tables, and columns I know I need.
To answer your additional question: we are the largest financial services company in the world and we get databases from our clients who run a wide variety of systems (probably about 25-50 different ones). So we often get columns or "fields" if you will, that we need to convert into our internal SQL Server systems. Sometimes we use SSIS, but not always since we see databases from ancient systems like DataPoint, dBase, FoxPro, Paradox, Oracle, Sybase, etc. etc. - too many to name. Unfortunately for us, there is no "one size fits all". Worse, we often have to map data fields in other systems to the "best guess' in our own. This is the genesis of what we do.
The core problem we are trying to address is that Rules and Regs (SEC, Government, etc.) require (!!!) that we NEVER truncate data and yet we run into it all the time. Right now this is all done by 'eyeball' and I thought if I had a good query that could examine a column and its length, then be able to trim the incoming contents and check its length - I could then produce a report showing what records just are not going to fit. This is basically what I am trying to address, along with wishing daily that the world used just one database!!
Sorry if I sounded miffed, its just the work we do - day after day of analysis in the "coalmines" of data manipulation.
October 21, 2016 at 7:07 am
BLandry474 (10/21/2016)
My apologies Brandi - you're right - you did say that in your edit. However, you also just addressed constraining the columns entry when I asked a very simple question: How can I generate a query that produces results just like sp_columns (which unfortunately returns all tables and columns, when I only need a few). No one could produce that answer - which seems simple to me in what may well be my ignorance, but after all there is sp_columns so even more simply put - I want to be able to run that query and specify tables, and columns I know I need.To answer your additional question: we are the largest financial services company in the world and we get databases from our clients who run a wide variety of systems (probably about 25-50 different ones). So we often get columns or "fields" if you will, that we need to convert into our internal SQL Server systems. Sometimes we use SSIS, but not always since we see databases from ancient systems like DataPoint, dBase, FoxPro, Paradox, Oracle, Sybase, etc. etc. - too many to name. Unfortunately for us, there is no "one size fits all". Worse, we often have to map data fields in other systems to the "best guess' in our own. This is the genesis of what we do.
The core problem we are trying to address is that Rules and Regs (SEC, Government, etc.) require (!!!) that we NEVER truncate data and yet we run into it all the time. Right now this is all done by 'eyeball' and I thought if I had a good query that could examine a column and its length, then be able to trim the incoming contents and check its length - I could then produce a report showing what records just are not going to fit. This is basically what I am trying to address, along with wishing daily that the world used just one database!!
Sorry if I sounded miffed, its just the work we do - day after day of analysis in the "coalmines" of data manipulation.
I understand miffed. Really. But let me ask you some follow up questions.
How are you getting this data into your system if you're not doing SSIS?
At what point in the process (pre-load or post-load or during-load) are you wanting to check the length of the columns?
Regarding the Rules & Regs, how strict are we talking about? I would think you'd be able to keep the data in the original format somewhere (therefore you're obeying the no-truncate rule) but then do a trim to get it working properly in your system. But it also depends on what you're doing with the data when you get it, which leads me to...
...What is the final data getting dished up as and why? Because helping you get through this requires us to understand the end point of this data at least in a general technical sense.
October 21, 2016 at 7:31 am
BLandry474 (10/21/2016)
My apologies Brandi - you're right - you did say that in your edit. However, you also just addressed constraining the columns entry when I asked a very simple question: How can I generate a query that produces results just like sp_columns (which unfortunately returns all tables and columns, when I only need a few). No one could produce that answer - which seems simple to me in what may well be my ignorance, but after all there is sp_columns so even more simply put - I want to be able to run that query and specify tables, and columns I know I need.
Yesterday I was in a hurry, so I just left a basic query that would return code to check that string columns would have their max length used (I just checked and hierarchyid types were included but you could exclude them easily). You just had to refine the query to return a single table and store the result in a variable to execute it as dynamic SQL.
Was this a problem? Did you need more help? You mentioned that you only needed a place to start.
October 21, 2016 at 7:36 am
I Brandie,
Let me try to provide you some answers...
We use SSIS when its truly needed, but sometimes we actually use old DTS, or even just an ASCII dump directly into Excel. It really depends on the incoming data. Sometimes its a really good match and you can do it very easily, but the larger part of the time it requires SSIS where there is much more control through mapping.
I want to check the data POST-LOAD, because what we do is use "staging" tables that are SQL based. So we move the incoming data to SQL, and THEN it would be nice to add truncation control to our analysis procedures BEFORE the data goes into Production.
Hmmm... How strict? Well, in this case we are talking purely about text-based fields, usually notes, international phone numbers, and this kind of thing. All other data goes through other working scrub routines - so its just this text data that is taking up a good deal of our efforts - too much if you ask me.
As for what and why... We are moving client data to our own cloud and systems.
I hope this helps to answer your questions - thanks for the help.
October 21, 2016 at 7:38 am
Luis,
No - it was not a problem - I saw what you were doing but it wasn't quite what I was after (my fault, not yours).
October 21, 2016 at 7:57 am
So, the question remains. Do you need any more help?
The information that you need come from the system views (sys.schemas, sys.tables, sys.columns, sys.types) or the information_schema views (which I used). Both options have their advantages and disadvantages. Some people will prefer the system views as they're more reliable and have more information, and some will prefer the information_schema views because they're easier to query.
Viewing 15 posts - 1 through 15 (of 37 total)
You must be logged in to reply to this topic. Login to reply