merging user records into one record

  • in our database we have permenant staff with 00141459 permanent number, tempory staff with ts number TS-12345 and contractors with TS-23456 Number, what i need to do is find the records for all the permanent staff members look for them in ts numbers then consolidate the name id email address and other filed into the permannet user name that was last used to login as it would be the most recent user account. we also have a field in the user table called dtlastlogin.

    thnx guys for all the help in advance..

  • Am I understanding this:

    You have one table with a employee id, and in that id you two possible cases: straight number and number with a TS prefix.

    You have some employees who are both in there as a straight number, and as a TS prefix with the same number after the prefix as their employee number.

    - OR -

    you have some employees who have different numbers, but the same name, and you want to find them, and consolidate them to the straight number and get rid of the other one and set the dtlastlogin to the most recent login date.

    Can you also provide a couple of examples of employee numbers that are hits as well as some misses?

  • okay let me describe it the following way:

    in my user table tbllmsuser i have the following fields:

    3pkiUsernumeric90

    0varUserIDnvarchar501

    0varNamenvarchar1001

    0varSurnamenvarchar1001

    0nBUIDnumeric91

    0varEmailnvarchar1001

    0varTelnvarchar501

    0varPasswordnvarchar501

    0varTitlenvarchar1001

    0nLMSGroupdecimal91

    0tiUserActivetinyint11

    0tiUserAdmintinyint11

    0varFaxnvarchar501

    0BUReportertinyint11

    0nBUDepartmentIdnumeric91

    0nBUDivisionIdnumeric91

    0nSecurityProfilenumeric91

    0bYesNoActivebit11

    0dtExpDatedatetime81

    0iCounterint41

    0varSourcevarchar501

    0fkiLineManagernumeric91

    0dtLastLogindatetime81

    0nLoginCountnumeric91

    0dtCurrentLogindatetime81

    0varCellNumbervarchar501

    0varEmploymentStatusvarchar501

    0varInitialsvarchar501

    0npasswordnumeric91

    0tiOnlinetinyint11

    0fkiLocationnumeric91

    0tiRegistertinyint11

    0nPersonnelNumbervarchar501

    0tiTemporarytinyint11

    0dtTemporaryAccessExpirationdatetime81

    this table is also linked to the userdetail table tbllmsuserdetail through pkiuser and fkiuser:

    3pkiLMSUserDetailnumeric90

    0fkiUsernumeric90

    0varAddressTypevarchar501

    0varAddress1varchar2551

    0varAddress2varchar2551

    0varAddress3varchar2551

    0varCityvarchar2551

    0varPostalCodevarchar501

    0tiSAOthertinyint11

    0varNationalityvarchar2551

    0varIDNovarchar501

    0varIDTypevarchar2551

    0dtDateOfBirthdatetime81

    0varCellNumbervarchar501

    0varPreferredCommunicationvarchar201

    0varIBTCostCentervarchar501

    0varConsultantNumbervarchar501

    0varRacevarchar501

    0tiGendertinyint11

    0tiDisabledtinyint11

    0varPhysicalLocationvarchar10001

    0varOrganisationalGroupingvarchar501

    0fkiLMSOccupationalnumeric91

    0varJFLgradevarchar2551

    0tiPrimarySecondarytinyint11

    0tiSuperUsertinyint10

    0varEmploymentStatusvarchar501

    0tiAssessorModeratortinyint11

    0varInitialsvarchar101

    0varHomeLanguagevarchar2551

    0varCitizenResidenceStatusvarchar501

    0tiInternalExternaltinyint10

    0varProvincialRegionvarchar501

    0tiRegisteredStatustinyint10

    0fkiRegistrationvarchar2551

    0dtDateRegistrationStatusdatetime81

    0dtDateOfTrainingdatetime81

    0varTrainedByvarchar2551

    0tiDocumentationFiledtinyint10

    0tiDocumentationSenttinyint10

    0varRegistrationReasonvarchar2551

    0varCountryvarchar501

    0fkiVenuenumeric91

    0fkiCountrynumeric51

    and the user table is also linked to tbllmsuserprofile through the pkiuser and fkiuser:

    2pkiProfilenumeric90

    0varProfileNamevarchar500

    0bActivebit10

    0bFacilitatorbit10

    0bAssessorbit10

    0bModeratorbit10

    1bDeveloperbit10

    there is other tables i have to update after merging the records so we dont loose the users learning objects he completed:

    from the following tables :

    tblUserLearningObjectSession

    3pkiUserLearningObjectSessionnumeric90

    0fkiLearningObjectSessionnumeric91

    0fkiUsernumeric91

    0nMarknumeric91

    0nAttemptsnumeric91

    0fkiMovedTonumeric91

    0bitCompetentbit11

    0dtAttempteddatetime81

    tblUserSessionMap

    3pkiUserSessionMapnumeric90

    0fkiLearningObjectSessionnumeric90

    0fkiUsernumeric90

    0nScorenumeric91

    0tiCompetenttinyint11

    0nAttemptsnumeric90

    0dtLastAttempteddatetime81

    0tiActivetinyint10

    0tiCompletedtinyint10

    0GUIDLastSessionvarchar2551

    0fkiSessionVenueMapnumeric91

    0[cmi.core.lesson_location]varchar2551

    0[cmi.core.lesson_status]varchar2551

    0[cmi.core.score.raw]varchar501

    0[cmi.core.score.max]varchar501

    0[cmi.core.score.min]varchar501

    0[cmi.core.exit]varchar501

    0[cmi.core.session_time]varchar501

    0[cmi.suspend_data]varchar50001

    0[cmi.core.entry]varchar501

    0[cmi.core.credit]varchar501

    0[cmi.core.total_time]varchar501

    0varLessonModevarchar201

    0nTimesPassednumeric91

    i have to merge user records so that the user that remains keeps the user name that last logged in and update the fields in the 3 user tables that reference for that user, secondly update the users learning objects.

    thanks guys for the speedy response and help

  • I see what you are going after here; is the employee number you listed in your first post an example of one that would need to be merged?

    Do you have a straight one-to-one merge, or can an employee have multiple contractor id's ( merge many to one )?

  • Also, what can you use to identify the records that should be merged? Is the base employee number part of the key, or do you have to match based on first name, last name, or some other criteria?

  • Grasshopper the field we would use is the varuserid which will be te employee number and unique further if need be we can filter on varname and varsurname and also the pkiuser as these fields should all be unique

    thanx once again

  • rhcomputer:

    1. Grasshopper is the rank on the site; the username is metawizard2 😀

    2. Other way around:

    What is common among the user records that you can hook on?

    In your first post you said that you have internal users with scheme:

    12345

    and external users with scheme

    TS-3456

    So, lets take an actual user, Bob Noodle.

    Bob is an internal employee with ID 12345. He is also an extrnal employee with ID TS-34567.

    Is that a good representation?

  • metawizard2

    sorry my bad

    the representation is correct the only problem that i for see is where the user has more than one user :

    fkiuser - varuserid - varname - varsurname - varemail - dtlastlogin

    234560 - 012345 - roy - heaney - - 25052008

    fkiuser - varuserid - varname - varsurname - varemail - dtlastlogin

    TS-34560 - 02345 - - heaney - rh@rh.co.za - 10012008

    fkiuser - varuserid - varname - varsurname - varemail - dtlastlogin

    TS-4560 - 01765 - roy - - - 05032008

    the fields listed above are the only fields unique enough to hook on, now we have to merge the data missing from permanaent record from the others keep pki for search to update the learning objects.

    we also have to then update the userdetail table to remove mappings to users merged and remove the unnecsesarry userdetails, aswell as the userprofile table.

    i hope this is clear enough

    thnx a million

    What is common among the user records that you can hook on?

    In your first post you said that you have internal users with scheme:

    12345

    and external users with scheme

    TS-3456

    So, lets take an actual user, Bob Noodle.

    Bob is an internal employee with ID 12345. He is also an extrnal employee with ID TS-34567.

    Is that a good representation?

  • Where you have that exception some logic can be run to find possible matches and a human will have to interceed for those; with those exceptions, the names will match between the records? And how many records do you have good versus bad ( determines best approach ) - ballpark?

  • OK, I have a solution for you. 😀

    This script will automatically handle generating the SQL statements to clean up your tables and foreign keys.

    There is a table variable called @WorkingList that will contain all the unresolved users that might be duplicates.

    To fix them, in the source data, fix the names to be the same. In the case of roy heaney, if all three rows are in fact the same user, set the first and last name to be roy heaney in all three rows.

    Run the script below in your development database ( or anywhere you want to play with it ) to see exactly how it would work.

    If you have any questions or need help, feel free to ask.

    /*

    This will create a table called acz_users, which has a userid, first and last name,

    and a last login date.

    There are duplicate users that we want to merge in this table, based on this criteria:

    When a user is unique, leave them alone.

    When a user is clearly a duplicate, keep the row that has the most recent last login.

    When a user is not unique, and possibly a duplicate, set them aside for someone

    to determine what should be done.

    This script will generate the SQL statements to clean up the a user table as provided,

    and also generate some statements to keep other files foreign keys in sync

    */

    IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[acz_users]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)

    DROP TABLE [dbo].[acz_users]

    CREATE TABLE [dbo].[acz_users](

    [pkiuser] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [FirstName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [LastName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [LastLogin] [datetime] NULL

    ) ON [PRIMARY]

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( '123','roy','heaney','01/01/00')

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS345','','heaney','08/01/00')

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS456','roy','','03/01/00')

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( '564','Bob','Noodle','05/01/00')

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( '789','Andy','Griffith','06/01/00')

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS358','Andy','Griffith','04/01/00')

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS765','','Outsideguy','07/01/00')

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS234','Andy','','09/01/00')

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( '901','Roy','Fokker','04/01/00')

    Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS735','Roy','Fokker','07/01/00')

    -- Set Up Tables

    -- @WorkingList holds all and has them removed as they are cleared

    declare @WorkingList table(

    useridvarchar(50),

    WorkingFirstNamevarchar(50),

    WorkingLastNamevarchar(50),

    WorkingLastLogindatetime,

    WorkingFullNamevarchar(100) ,

    CountFirstnumeric,

    CountLastnumeric,

    CountFullnumeric,

    CountTotalnumeric )

    -- @RenumList holds rows that are going to be renumbered

    declare @RenumList table(

    useridvarchar(50),

    RenumFirstNamevarchar(50),

    RenumLastNamevarchar(50),

    RenumLastLogindatetime,

    RenumFullNamevarchar(100) )

    -- @RenumList holds rows that are going to be renumbered

    declare @Renum2List table(

    userid2varchar(50),

    Renum2FirstNamevarchar(50),

    Renum2LastNamevarchar(50),

    Renum2LastLogindatetime,

    Renum2FullNamevarchar(100) )

    -- @CleanList holds records that will not be purged

    declare @CleanList table(

    useridvarchar(50),

    CleanFirstNamevarchar(50),

    CleanLastNamevarchar(50),

    CleanLastLogindatetime,

    CleanFullNamevarchar(100) )

    -- @PurgeList holds the records that will be deleted from production

    declare @PurgeList table(

    useridvarchar(50),

    PurgeFirstNamevarchar(50),

    PurgeLastNamevarchar(50),

    PurgeLastLogindatetime,

    PurgeFullNamevarchar(100) )

    -- @Output holds the sql statements to update fk tables

    declare @Output table(

    ResolveFkLearningSQLvarchar(1000),

    ResolveFkSessionSQLvarchar(1000),

    CleanUserTableSQLvarchar(1000),

    CleanUserDetailTableSQLvarchar(1000)

    )

    -- @Xref holds old key to new key translation

    declare @Xref table(

    OldUserIdvarchar(50),

    OldFullNamevarchar(100),

    NewUserIdvarchar(50),

    NewFullNamevarchar(100) )

    -- Load the working list

    Insert into @WorkingList( userid, WorkingFirstName, WorkingLastName, WorkingLastLogin, WorkingFullName )

    Select pkiuser, FirstName, LastName, LastLogin, IsNull( FirstName, '' ) + ' ' + IsNull( LastName, '' )

    From acz_users

    -- Determine what automatically goes to the CleanList because it is unique

    Update @WorkingList

    Set CountFirst = ( Select Count( FirstName ) from acz_users where WorkingFirstName = FirstName and FirstName <> '' )

    Update @WorkingList

    Set CountLast = ( Select Count( LastName ) from acz_users where WorkingLastName = LastName and LastName <> '')

    Update @WorkingList

    Set CountFull = ( Select Count( FirstName + ' ' + LastName ) from acz_users where FirstName + ' ' + LastName = WorkingFullName )

    -- Good full name that is unique

    Update @WorkingList

    set CountTotal = 1 where

    CountFirst = 1 and CountLast = 1

    -- Unique first name with no last name ( would never be able to resolve )

    Update @WorkingList

    set CountTotal = 1 where

    CountFirst = 1 and CountLast = 0

    -- Unique last name with no first name ( would never be able to resolve )

    Update @WorkingList

    set CountTotal = 1 where

    CountFirst = 0 and CountLast = 1

    Update @WorkingList

    Set CountTotal = 0 where CountTotal is null

    -- Move all the resolved ones to the ClearList and clear out the WorkingList

    Insert into @CleanList ( userid, CleanFirstName, CleanLastName, CleanLastLogin, CleanFullName )

    Select userid, WorkingFirstName, WorkingLastName, WorkingLastLogin, WorkingFullName from @WorkingList

    where CountTotal = 1

    Delete from @WorkingList where CountTotal = 1

    -- Handle Matching Full Names

    Insert into @RenumList( userid, RenumFirstName, RenumLastName, RenumLastLogin, RenumFullName )

    Select userid, WorkingFirstName, WorkingLastName, WorkingLastLogin, WorkingFullName from @WorkingList

    where CountFull = 2

    -- Setup the duplicate renumlist

    Insert into @Renum2List

    Select * from @RenumList

    Delete from @WorkingList where CountFull = 2

    -- For the duplicate full names, get the max last login date and move to clean list

    Insert into @CleanList ( userid, CleanFirstName, CleanLastName, CleanLastLogin, CleanFullName )

    Select userid, RenumFirstName, RenumLastName, RenumLastLogin, RenumFullName from @RenumList

    where RenumLastLogin = ( select Renum2LastLogin from @Renum2List

    where Renum2LastLogin = ( select max( Renum2LastLogin ) from @Renum2List where

    Renum2FullName = RenumFullName ) )

    delete from @RenumList where userid in ( select userid from @CleanList )

    delete from @Renum2List where userid2 in ( select userid from @CleanList )

    -- Build the Xref file for the renumbers

    Insert into @Xref( OldUserId, OldFullName )

    select userid, RenumFullName from @RenumList

    Update @Xref

    set NewUserId = ( select userid from @CleanList where CleanFullName = OldFullName )

    Update @Xref

    set NewFullName = ( select CleanFullName from @CleanList where CleanFullName = OldFullName )

    -- Now Move the @RenumList to the @PurgeList

    Insert into @PurgeList( userid, PurgeFirstName, PurgeLastName, PurgeLastLogin, PurgeFullName )

    Select userid, RenumFirstName, RenumLastName, RenumLastLogin, RenumFullName from @RenumList

    Delete from @RenumList

    Delete from @Renum2List

    -- Create the SQL Statements to update the other tables

    Insert into @Output( ResolveFkLearningSQL )

    Select 'Update tblUserLearningObjectSession set userid = ' + NewUserId + ' where userid = ' + OldUserId

    from @Xref

    Insert into @Output( ResolveFkSessionSQL )

    Select 'Update tblUserSessionMap set userid = ' + NewUserId + ' where userid = ' + OldUserId

    from @Xref

    Insert into @Output( CleanUserTableSQL )

    Select 'Delete from tbllmsuser where userid = ' + userid

    from @PurgeList

    Insert into @Output( CleanUserDetailTableSQL )

    Select 'Delete from tbllmsuserdetail where userid = ' + userid

    from @PurgeList

    Select * from acz_users

    Select * from @WorkingList -- These are the ones that need to be resolved

    Select * from @CleanList -- These will be left alone

    Select * from @RenumList -- Should be empty

    Select * from @Xref -- Shows who is being renumbered to what

    Select * from @PurgeList -- Shows who is getting deleted

    Select * from @Output -- Contains the SQL Statements to clean up the files

  • thnx dude will give this a bash and let u know very much appreciated

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply