How do I find the differences?

  • I have en excel file with a list if id's that we use to identify our students. I also have all this info in the database. What i need to do is this,

    I need to compare all the id's on this excel file (i assume i need to upload these into a table?) to whats currently in the database for these people and find anyone that is NOT on the excel list so i can update that list. How would i go about doing this?

    The excel file looks like this

    People Code Id-----------First name-----------Last Name

    000001------------------John----------------Doe

    000002------------------Jane----------------Smith

  • I'm not sure if this is the best approach or not but I would use a DTS package. Load the excel file into a temporary table. Update it (add the missing rows). Then export the data back over the old spreadsheet.

    There may be a way in DTS to append to an excel worksheet but I'm honestly not sure.

    Kenneth

    Kenneth FisherI was once offered a wizards hat but it got in the way of my dunce cap.--------------------------------------------------------------------------------For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]Link to my Blog Post --> www.SQLStudies.com[/url]

  • But im not sure how to "update it to add missing rows"

  • You may need to have 3rd party software to do this. If I require to do the same thing, I may ask our informatica developers to do the job. Alternatively, you may create DTS for this purpose.

  • Once you have the excel data into a table do something like this :

    INSERT INTO ExcelTable

    SELECT Field1, Field2, ...

    FROM DBTable

    WHERE KeyInfo NOT IN (SELECT KeyInfo FROM ExcelTable)

    At least that is one way. There are probably a couple of dozen others.

    Kenneth FisherI was once offered a wizards hat but it got in the way of my dunce cap.--------------------------------------------------------------------------------For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]Link to my Blog Post --> www.SQLStudies.com[/url]

  • I have uploaded the data into a table named "upload" the columns are:

    first_name, last_name, people_code_id

    my query to return ALL of our people code id's that i want/need is this

    select distinct p.first_name

    ,p.last_name

    ,p.people_code_id

    from people as p

    inner join peopletype as pt

    on p.people_code_id=pt.people_code_id

    where p.deceased_flag='n'

    and pt.people_type='alum'

    Now how can i modify that query to join the upload table,and return only the results that are NOT in the upload table? ?

  • Would something like this work?

    select distinct p.first_name

    ,p.last_name

    ,p.people_code_id

    from people as p

    inner join peopletype as pt

    on p.people_code_id=pt.people_code_id

    left outer join upload as u

    on p.people_code_id=u.people_code_id

    where p.deceased_flag='n'

    and pt.people_type='alum'

    and p.people_code_id not in (select people_code_id

    from upload)

  • select distinct p.first_name

    ,p.last_name

    ,p.people_code_id

    from people as p

    inner join peopletype as pt

    on p.people_code_id=pt.people_code_id

    where p.deceased_flag='n'

    and pt.people_type='alum'

    AND NOT EXISTS (SELECT * FROM Upload WHERE Upload.First_Name = p.First_Name

    AND Upload.Last_Name = p.Last_Name

    AND Upload.People_Code_Id = p.People_Code_Id )

    Kenneth FisherI was once offered a wizards hat but it got in the way of my dunce cap.--------------------------------------------------------------------------------For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]Link to my Blog Post --> www.SQLStudies.com[/url]

  • Kenneth Fisher (10/11/2007)


    select distinct p.first_name

    ,p.last_name

    ,p.people_code_id

    from people as p

    inner join peopletype as pt

    on p.people_code_id=pt.people_code_id

    where p.deceased_flag='n'

    and pt.people_type='alum'

    AND NOT EXISTS (SELECT * FROM Upload WHERE Upload.First_Name = p.First_Name

    AND Upload.Last_Name = p.Last_Name

    AND Upload.People_Code_Id = p.People_Code_Id )

    Thanks! That looks good, but how come you have to do the u.first_name=p.first_name ect...how come you couldnt just use not exists (select * from upload)?

  • Because you have to have some kind of condition to test on. If you did a NOT EXISTS (SELECT * FROM TABLE1) with no condition then the where clause would fail as long as there was a row in TABLE1. This way you are tieing the 2 tables together .. the one in the subquery and the one outside of it. You may want to look up the NOT EXISTS clause in BOL. It should have alot more detail.

    Kenneth FisherI was once offered a wizards hat but it got in the way of my dunce cap.--------------------------------------------------------------------------------For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]Link to my Blog Post --> www.SQLStudies.com[/url]

  • Will do, thanks for your help!!

  • That wasnt the solution, i have several people who appear who are already on my original list (members2007a.txt) several people are being pulled on the query, even though the data is already in the "uploaded" table. This is the query i used, I need to find ALL of our alum (indicated by people type "alum") in our system who are NOT on the excel file that i uploaded into the table "upload" any ideas why this isnt working?

    create table upload -- Create table to upload data into, same column

    (first_name varchar(50) --order as excel file

    ,last_name varchar(50)

    ,people_code_id varchar(50))

    bulk insert upload --Upload text file located on

    from 'c:\members2007a.txt' --SCT server, required (fieldterminator = '\t')

    with (fieldterminator = '\t') --indicating its a tab delimited file

    select distinct p.first_name

    ,p.last_name

    ,p.people_code_id

    from people as p

    inner join peopletype as pt

    on p.people_code_id=pt.people_code_id

    where p.deceased_flag='n'

    and pt.people_type='alum'

    and not exists (select *

    from Upload

    where Upload.People_Code_Id = p.People_Code_Id

    and upload.first_name=p.first_name

    and upload.last_name=p.last_name)

    order by last_name

  • select distinct p.first_name,p.last_name,p.people_code_id

    from people as p

    inner join peopletype as pt

    on p.people_code_id=pt.people_code_id

    left outer join upload as up

    on p.people_code_id = up.people_code_id

    where p.deceased_flag='n' and pt.people_type='alum'

    and up.people_code_id is null

    the left outer join works with only having poeple_code_id provided the people_code_id (your student code) is unique.

  • The people code id is unique. The query with the left join returns IDENTICAL results to my other query. Ive verified the data is in the excel fil ei uploaded, and in the table, yet it still returns several people who already are on the list. I cant figure out why? ?

  • select distinct p.first_name,p.last_name,p.people_code_id

    from people as p

    inner join peopletype as pt

    on p.people_code_id=pt.people_code_id

    where p.deceased_flag='n'

    and pt.people_type='alum'

    and not exists (select *

    from Upload

    where Upload.People_Code_Id = p.People_Code_Id

    and upload.first_name=p.first_name

    and upload.last_name=p.last_name)

    order by last_name

    The above statement should works. You may take a look your deceased_flag in people table and your peopletype table whether it has corresponding people_code_id.

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply