Data Scrubbing

  • Hello Guys,

    I have a requirement to scrub some data. I loaded data from a flat file into SQL Server tables, and some columns contain negative values such as -111 or -11, while others contain empty spaces. How can I replace all of these values with NULL in every table of my database?

    Thanks

  • update tablename
    set column = NULL
    where cast(column as int) < 0

  • Thanks for the reply. I think this UPDATE statement only handles one particular column, but I have to update every column in every table of my database. Many fields have negative values or empty spaces, and I have to change them all to NULL. Is there any way to do that?

  • SQL_Learning (5/23/2013)


    Thanks for the reply. I think this UPDATE statement only handles one particular column, but I have to update every column in every table of my database. Many fields have negative values or empty spaces, and I have to change them all to NULL. Is there any way to do that?

    There is but...

    It will be horribly slow and incredibly inefficient because you will have to use dynamic sql to look at every single column in every single row of every single table.

    Are you wanting to do this only for certain datatypes? Like only integers or do you need a more generic thing?

    Most likely you will have to use sys.objects and sys.columns to build dynamic sql that you can then execute. There is no magic "change all values in all columns in all tables" process.

    I can help you get started if you can answer my questions above.
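
    As a rough starting point, the catalog-view approach could look something like the sketch below. It assumes only nullable varchar columns matter and only blanks out empty strings; it uses sys.tables rather than sys.objects to skip views, and the @execute switch is a hypothetical safety flag so the statements are printed instead of run by default.

    ```sql
    -- Sketch only: builds one UPDATE per nullable varchar column in user tables.
    DECLARE @sql nvarchar(max);
    DECLARE @execute bit = 0;  -- hypothetical safety switch: 0 = print only, 1 = run

    DECLARE stmt_cursor CURSOR LOCAL FAST_FORWARD FOR
        SELECT N'UPDATE ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
             + N' SET ' + QUOTENAME(c.name) + N' = NULL'
             + N' WHERE LTRIM(RTRIM(' + QUOTENAME(c.name) + N')) = '''';'
        FROM sys.tables AS t
        JOIN sys.schemas AS s ON s.schema_id = t.schema_id
        JOIN sys.columns AS c ON c.object_id = t.object_id
        JOIN sys.types AS ty ON ty.user_type_id = c.user_type_id
        WHERE ty.name = N'varchar'
          AND c.is_nullable = 1;   -- NOT NULL columns cannot be set to NULL

    OPEN stmt_cursor;
    FETCH NEXT FROM stmt_cursor INTO @sql;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @execute = 1
            EXEC sys.sp_executesql @sql;  -- run the generated statement
        ELSE
            PRINT @sql;                   -- review the statement first
        FETCH NEXT FROM stmt_cursor INTO @sql;
    END;
    CLOSE stmt_cursor;
    DEALLOCATE stmt_cursor;
    ```

    Printing first keeps the whole-database sweep reviewable, since every generated statement scans a full table.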

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • All of my columns are varchar; there are no other datatypes. I loaded the data into a staging table, and some negative values and empty spaces came in from the file, so per the requirement I have to convert them to NULL.

  • A crude but quick way of doing this is to run a query like the one below, then copy and run its output:

    select 'update ' + table_name +
    ' set ' + column_name + ' = NULL
    where ISNUMERIC(' + column_name + ') = 1 AND cast(' + column_name + ' as int) < 0'
    from INFORMATION_SCHEMA.COLUMNS
    where DATA_TYPE = 'varchar'

    arrghhh I LOVE these views I don't know why MS is getting rid of them! :crazy:

    Actually this will probably not work with columns that have mixed data as the cast would fail. hmmmm
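
    One possible way around the failing cast, assuming SQL Server 2012 or later, is TRY_CAST, which returns NULL instead of raising an error when a value cannot be converted. The temp table and column below are made up purely for illustration.

    ```sql
    -- Hypothetical staging table and column, for illustration only.
    CREATE TABLE #staging_demo (amount varchar(20) NULL);
    INSERT INTO #staging_demo (amount) VALUES ('-111'), ('42'), ('abc'), ('');

    -- TRY_CAST yields NULL for values that are not valid integers,
    -- so rows like 'abc' fail the "< 0" test instead of aborting the UPDATE.
    UPDATE #staging_demo
    SET amount = NULL
    WHERE TRY_CAST(amount AS int) < 0
       OR LTRIM(RTRIM(amount)) = '';

    SELECT amount FROM #staging_demo;

    DROP TABLE #staging_demo;
    ```

    Note that this only catches negative integers; a value like '-1.5' does not convert to int, so it would be left alone unless the target type is changed.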

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Abu Dina (5/23/2013)


    A crude but quick way of doing this is to run a query like the one below, then copy and run its output:

    select 'update ' + table_name +
    ' set ' + column_name + ' = NULL
    where ISNUMERIC(' + column_name + ') = 1 AND cast(' + column_name + ' as int) < 0'
    from INFORMATION_SCHEMA.COLUMNS
    where DATA_TYPE = 'varchar'

    arrghhh I LOVE these views I don't know why MS is getting rid of them! :crazy:

    Actually this will probably not work with columns that have mixed data as the cast would fail. hmmmm

    I agree about those views. The columns one is particularly helpful. I suspect that there will be many people who will end up writing their own as a replacement. 😉

    I modified your code slightly so it will also capture empty strings.

    select 'update ' + table_name +
    ' set ' + column_name + ' = NULL
    where (ISNUMERIC(' + column_name + ') = 1 AND cast(' + column_name + ' as int) < 0) or ' + COLUMN_NAME + ' = '''''
    from INFORMATION_SCHEMA.COLUMNS
    where DATA_TYPE = 'varchar'
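
    A slightly more defensive variant of the same generator might look like the sketch below. It is only a suggestion under a few assumptions: TRY_CAST requires SQL Server 2012 or later, the join to INFORMATION_SCHEMA.TABLES is there to skip views, and QUOTENAME plus the schema name guard against odd or non-dbo object names.

    ```sql
    -- Sketch only: generates one schema-qualified UPDATE per nullable varchar column.
    SELECT 'UPDATE ' + QUOTENAME(c.TABLE_SCHEMA) + '.' + QUOTENAME(c.TABLE_NAME)
         + ' SET ' + QUOTENAME(c.COLUMN_NAME) + ' = NULL'
         + ' WHERE TRY_CAST(' + QUOTENAME(c.COLUMN_NAME) + ' AS int) < 0'
         + ' OR LTRIM(RTRIM(' + QUOTENAME(c.COLUMN_NAME) + ')) = '''';' AS update_stmt
    FROM INFORMATION_SCHEMA.COLUMNS AS c
    JOIN INFORMATION_SCHEMA.TABLES AS t
      ON t.TABLE_SCHEMA = c.TABLE_SCHEMA
     AND t.TABLE_NAME = c.TABLE_NAME
    WHERE t.TABLE_TYPE = 'BASE TABLE'   -- exclude views
      AND c.DATA_TYPE = 'varchar'
      AND c.IS_NULLABLE = 'YES'         -- NOT NULL columns cannot be set to NULL
    ORDER BY c.TABLE_SCHEMA, c.TABLE_NAME, c.ORDINAL_POSITION;
    ```

    Like the original, this only produces the statements as text; nothing is changed until the output is copied and run.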


  • Thanks, guys, for the query.

    I executed it, and it returned all the table and column names where the condition is met.

  • Yup, so now you just need to copy the result of the query, paste it into a new query window, and run it (preferably one update at a time, to be on the safe side).
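
    One way to stay on the safe side when running a single generated update is to wrap it in a transaction and check the row count before committing. The table and column names below are hypothetical placeholders for one line of the generated output.

    ```sql
    BEGIN TRANSACTION;

    UPDATE dbo.staging_customers   -- hypothetical table
    SET phone = NULL               -- hypothetical column
    WHERE phone = '';

    -- Row count of the UPDATE above; compare it with what you expected.
    SELECT @@ROWCOUNT AS rows_changed;

    -- After inspecting the data, replace the ROLLBACK with
    -- COMMIT TRANSACTION to keep the change.
    ROLLBACK TRANSACTION;
    ```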


  • Yes, got it! Thanks again for the query.
