Importing only rows with column data in specific columns.

  • Hi all -

    I have an interesting scenario that i would like some input on. I have a requirement to import data from one system into another. The source table is not normalized to the destination.

    The first (source) table will have a few pieces of common info, and the last several columns will have data in various columns. As an example, each row looks as follows:

    name, addy, ph, grade1, grade2, grade3, grade4, grade5

    Dan, 123 Main, 555-5555, A, NULL, C+, NULL, NULL

    ...

    Now, in the destination table, the import needs to omit the records with any NULL grades. As an example (based on the above):

    name, addy, ph, grade

    Dan, 123 Main, 555-5555, A

    Dan, 123 Main, 555-5555, C+

    ...

    My question I guess is, how is the best way to do this? Ultimately, I will want this in an integration serv package. My first thought was to pre-process the records on import with a CASE / IMPORT type solution, but not sure if that will work using case...

    Any suggestions would be appreciated.

    Thanks,

    Dan

  • Should do something like this:

    with basedata (

    name,

    addy,

    ph,

    grade

    ) as (

    select

    name,

    addy,

    ph,

    grade1

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    grade2

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    grade3

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    grade4

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    grade5

    from

    sourcetable

    )

    select

    name,

    addy,

    ph,

    grade

    from

    BaseData

    where

    grade is not null;

  • Dan, I think Lynn's solution is probably better, but just for academic purposes, you could try something like this...

    IF OBJECT_ID('tempdb..#test','u') IS NOT NULL

    DROP TABLE #test;

    SELECT

    'Dan' AS [name],

    '123 Main' AS addy,

    '555-5555' AS ph,

    CAST('A' AS VARCHAR(20)) AS grade1,

    CAST(NULL AS VARCHAR(20)) AS grade2,

    CAST('C+' AS VARCHAR(20)) AS grade3,

    CAST(NULL AS VARCHAR(20)) AS grade4,

    CAST(NULL AS VARCHAR(20)) AS grade5

    INTO #test;

    SELECT

    [name],addy,ph,grade

    FROM

    (

    SELECT

    [name],

    addy,

    ph,

    grade1,

    grade2,

    grade3,

    grade4,

    grade5

    FROM #test) t

    UNPIVOT

    (

    grade

    FOR x IN ([grade1],[grade2],[grade3],[grade4],[grade5])

    ) AS unpvt

    WHERE grade IS NOT NULL

    Greg
    _________________________________________________________________________________________________
    The glass is at one half capacity: nothing more, nothing less.

  • Thanks! I understand the unions and can see how it puts it together in a serial fashion, not sure on the 'with basedata' portion though.... could you elaborate on that just a bit?

    Thanks

    Dan

  • Lynn's solution is the way to go.

    In her solution, basedata is a CTE - effectively a subquery in this case - from which the normalised data is being selected.

    If you wanted to do the entire thing in SSIS, that is also possible, though quite advanced, through the use of an asynchronous Script Component. Advanced ... and slower - but possible.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • To understand the WITH basedata, you should read this:

    http://msdn.microsoft.com/en-us/library/ms175972(v=sql.105).aspx

    and this:

    http://msdn.microsoft.com/en-us/library/ms190766(v=SQL.105).aspx

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Okay... i read through them, and have a point or two of confusion. In my example above, my source has 'grade1', 'grade2', etc.... however it seems that she is referencing a column that doesn't exist - 'grade'... and after i convert this to my scenerio, it states that the 'grade column doesn't exist and errors out.

    Also, the very last portion (select outside the parents) seems to be missing the FROM object reference...

    Do I need to add the 'grade' column somewhere, in the queried table perhaps?

    Dan

  • ok.... i got it... the object referenced at the end is the basedata. so looking at the result set, i see all of the values.... but that leaves me with another issue... i see the 'grades' but not the subjects... this is frustrating...lol

  • You never mentioned the subjects.

    You have to add them to the query that constructs the basedata (the selects with union all).

    with basedata (

    name,

    addy,

    ph,

    subject,

    grade

    ) as (

    select

    name,

    addy,

    ph,

    'Subject 1',

    grade1

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    'Subject 2',

    grade2

    from

    sourcetable

    union all

    ...

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Phil Parkin (9/7/2012)


    Lynn's solution is the way to go.

    In her solution, basedata is a CTE - effectively a subquery in this case - from which the normalised data is being selected.

    If you wanted to do the entire thing in SSIS, that is also possible, though quite advanced, through the use of an asynchronous Script Component. Advanced ... and slower - but possible.

    Should read:

    In her his solution, basedata is a CTE - effectively a subquery in this case - from which the normalised data is being selected.

    Not a biggie, happens rather frequently.

  • robert.baird 2778 (9/7/2012)


    ok.... i got it... the object referenced at the end is the basedata. so looking at the result set, i see all of the values.... but that leaves me with another issue... i see the 'grades' but not the subjects... this is frustrating...lol

    Tell you what, it wouldn't be frustrating if you provided all the information up front instead of providing things piecemeal.

    The best place to start, please read the first article I reference below in my signature block about asking for help. It gives a step by step approach on what and how to post the information needed to provide you with the best possible answers. I haven't read it in a while myself, so I think it may leave out that we also need to know the expected results based on the sample data you need to post.

  • Thanks guys.... i worked it out... now, if i were to turn the select into an insert statement?

    with basedata (

    name,

    addy,

    ph,

    grade

    ) as (

    select <--PUT INSERTS HERE?

    name,

    addy,

    ph,

    grade1

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    grade2

    from

    sourcetable

  • Nope, more like this:

    with basedata (

    name,

    addy,

    ph,

    grade

    ) as (

    select

    name,

    addy,

    ph,

    grade1

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    grade2

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    grade3

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    grade4

    from

    sourcetable

    union all

    select

    name,

    addy,

    ph,

    grade5

    from

    sourcetable

    )

    INSERT INTO dest_table(

    name,

    addy,

    ph,

    grade

    )

    select

    name,

    addy,

    ph,

    grade

    from

    BaseData

    where

    grade is not null;

  • No, you put it outside the WITH statement.

    Here's the reference to the syntax:

    http://msdn.microsoft.com/en-us/library/ms174335(v=sql.105).aspx

    And here's is the example:

    http://msdn.microsoft.com/en-us/library/dd776381(v=sql.105).aspx#OtherTables

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • The UNPIVOT (one table scan) looks much better to me than fully scanning the table once for every grade column.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 15 posts - 1 through 15 (of 29 total)

You must be logged in to reply to this topic. Login to reply