Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

merging user records into one record Expand / Collapse
Author
Message
Posted Monday, May 26, 2008 11:32 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, December 02, 2010 9:22 PM
Points: 5, Visits: 7
in our database we have permenant staff with 00141459 permanent number, tempory staff with ts number TS-12345 and contractors with TS-23456 Number, what i need to do is find the records for all the permanent staff members look for them in ts numbers then consolidate the name id email address and other filed into the permannet user name that was last used to login as it would be the most recent user account. we also have a field in the user table called dtlastlogin.

thnx guys for all the help in advance..
Post #506670
Posted Tuesday, May 27, 2008 10:48 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, July 05, 2011 7:37 PM
Points: 58, Visits: 149
Am I understanding this:
You have one table with a employee id, and in that id you two possible cases: straight number and number with a TS prefix.

You have some employees who are both in there as a straight number, and as a TS prefix with the same number after the prefix as their employee number.

- OR -
you have some employees who have different numbers, but the same name, and you want to find them, and consolidate them to the straight number and get rid of the other one and set the dtlastlogin to the most recent login date.

Can you also provide a couple of examples of employee numbers that are hits as well as some misses?



Post #507087
Posted Tuesday, May 27, 2008 11:54 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, December 02, 2010 9:22 PM
Points: 5, Visits: 7
okay let me describe it the following way:

in my user table tbllmsuser i have the following fields:

3 pkiUser numeric 9 0
0 varUserID nvarchar 50 1
0 varName nvarchar 100 1
0 varSurname nvarchar 100 1
0 nBUID numeric 9 1
0 varEmail nvarchar 100 1
0 varTel nvarchar 50 1
0 varPassword nvarchar 50 1
0 varTitle nvarchar 100 1
0 nLMSGroup decimal 9 1
0 tiUserActive tinyint 1 1
0 tiUserAdmin tinyint 1 1
0 varFax nvarchar 50 1
0 BUReporter tinyint 1 1
0 nBUDepartmentId numeric 9 1
0 nBUDivisionId numeric 9 1
0 nSecurityProfile numeric 9 1
0 bYesNoActive bit 1 1
0 dtExpDate datetime 8 1
0 iCounter int 4 1
0 varSource varchar 50 1
0 fkiLineManager numeric 9 1
0 dtLastLogin datetime 8 1
0 nLoginCount numeric 9 1
0 dtCurrentLogin datetime 8 1
0 varCellNumber varchar 50 1
0 varEmploymentStatus varchar 50 1
0 varInitials varchar 50 1
0 npassword numeric 9 1
0 tiOnline tinyint 1 1
0 fkiLocation numeric 9 1
0 tiRegister tinyint 1 1
0 nPersonnelNumber varchar 50 1
0 tiTemporary tinyint 1 1
0 dtTemporaryAccessExpiration datetime 8 1

this table is also linked to the userdetail table tbllmsuserdetail through pkiuser and fkiuser:

3 pkiLMSUserDetail numeric 9 0
0 fkiUser numeric 9 0
0 varAddressType varchar 50 1
0 varAddress1 varchar 255 1
0 varAddress2 varchar 255 1
0 varAddress3 varchar 255 1
0 varCity varchar 255 1
0 varPostalCode varchar 50 1
0 tiSAOther tinyint 1 1
0 varNationality varchar 255 1
0 varIDNo varchar 50 1
0 varIDType varchar 255 1
0 dtDateOfBirth datetime 8 1
0 varCellNumber varchar 50 1
0 varPreferredCommunication varchar 20 1
0 varIBTCostCenter varchar 50 1
0 varConsultantNumber varchar 50 1
0 varRace varchar 50 1
0 tiGender tinyint 1 1
0 tiDisabled tinyint 1 1
0 varPhysicalLocation varchar 1000 1
0 varOrganisationalGrouping varchar 50 1
0 fkiLMSOccupational numeric 9 1
0 varJFLgrade varchar 255 1
0 tiPrimarySecondary tinyint 1 1
0 tiSuperUser tinyint 1 0
0 varEmploymentStatus varchar 50 1
0 tiAssessorModerator tinyint 1 1
0 varInitials varchar 10 1
0 varHomeLanguage varchar 255 1
0 varCitizenResidenceStatus varchar 50 1
0 tiInternalExternal tinyint 1 0
0 varProvincialRegion varchar 50 1
0 tiRegisteredStatus tinyint 1 0
0 fkiRegistration varchar 255 1
0 dtDateRegistrationStatus datetime 8 1
0 dtDateOfTraining datetime 8 1
0 varTrainedBy varchar 255 1
0 tiDocumentationFiled tinyint 1 0
0 tiDocumentationSent tinyint 1 0
0 varRegistrationReason varchar 255 1
0 varCountry varchar 50 1
0 fkiVenue numeric 9 1
0 fkiCountry numeric 5 1

and the user table is also linked to tbllmsuserprofile through the pkiuser and fkiuser:

2 pkiProfile numeric 9 0
0 varProfileName varchar 50 0
0 bActive bit 1 0
0 bFacilitator bit 1 0
0 bAssessor bit 1 0
0 bModerator bit 1 0
1 bDeveloper bit 1 0

there is other tables i have to update after merging the records so we dont loose the users learning objects he completed:

from the following tables :
tblUserLearningObjectSession

3 pkiUserLearningObjectSession numeric 9 0
0 fkiLearningObjectSession numeric 9 1
0 fkiUser numeric 9 1
0 nMark numeric 9 1
0 nAttempts numeric 9 1
0 fkiMovedTo numeric 9 1
0 bitCompetent bit 1 1
0 dtAttempted datetime 8 1

tblUserSessionMap
3 pkiUserSessionMap numeric 9 0
0 fkiLearningObjectSession numeric 9 0
0 fkiUser numeric 9 0
0 nScore numeric 9 1
0 tiCompetent tinyint 1 1
0 nAttempts numeric 9 0
0 dtLastAttempted datetime 8 1
0 tiActive tinyint 1 0
0 tiCompleted tinyint 1 0
0 GUIDLastSession varchar 255 1
0 fkiSessionVenueMap numeric 9 1
0 [cmi.core.lesson_location] varchar 255 1
0 [cmi.core.lesson_status] varchar 255 1
0 [cmi.core.score.raw] varchar 50 1
0 [cmi.core.score.max] varchar 50 1
0 [cmi.core.score.min] varchar 50 1
0 [cmi.core.exit] varchar 50 1
0 [cmi.core.session_time] varchar 50 1
0 [cmi.suspend_data] varchar 5000 1
0 [cmi.core.entry] varchar 50 1
0 [cmi.core.credit] varchar 50 1
0 [cmi.core.total_time] varchar 50 1
0 varLessonMode varchar 20 1
0 nTimesPassed numeric 9 1

i have to merge user records so that the user that remains keeps the user name that last logged in and update the fields in the 3 user tables that reference for that user, secondly update the users learning objects.

thanks guys for the speedy response and help
Post #507438
Posted Wednesday, May 28, 2008 6:56 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, July 05, 2011 7:37 PM
Points: 58, Visits: 149
I see what you are going after here; is the employee number you listed in your first post an example of one that would need to be merged?

Do you have a straight one-to-one merge, or can an employee have multiple contractor id's ( merge many to one )?



Post #507633
Posted Wednesday, May 28, 2008 7:02 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, July 05, 2011 7:37 PM
Points: 58, Visits: 149
Also, what can you use to identify the records that should be merged? Is the base employee number part of the key, or do you have to match based on first name, last name, or some other criteria?


Post #507640
Posted Wednesday, May 28, 2008 7:34 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, December 02, 2010 9:22 PM
Points: 5, Visits: 7
Grasshopper the field we would use is the varuserid which will be te employee number and unique further if need be we can filter on varname and varsurname and also the pkiuser as these fields should all be unique

thanx once again
Post #507665
Posted Wednesday, May 28, 2008 8:04 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, July 05, 2011 7:37 PM
Points: 58, Visits: 149
rhcomputer:
1. Grasshopper is the rank on the site; the username is metawizard2 :D
2. Other way around:
What is common among the user records that you can hook on?
In your first post you said that you have internal users with scheme:
12345
and external users with scheme
TS-3456

So, lets take an actual user, Bob Noodle.

Bob is an internal employee with ID 12345. He is also an extrnal employee with ID TS-34567.

Is that a good representation?



Post #507685
Posted Wednesday, May 28, 2008 8:36 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, December 02, 2010 9:22 PM
Points: 5, Visits: 7
metawizard2

sorry my bad

the representation is correct the only problem that i for see is where the user has more than one user :
fkiuser - varuserid - varname - varsurname - varemail - dtlastlogin
234560 - 012345 - roy - heaney - - 25052008

fkiuser - varuserid - varname - varsurname - varemail - dtlastlogin
TS-34560 - 02345 - - heaney - rh@rh.co.za - 10012008

fkiuser - varuserid - varname - varsurname - varemail - dtlastlogin
TS-4560 - 01765 - roy - - - 05032008

the fields listed above are the only fields unique enough to hook on, now we have to merge the data missing from permanaent record from the others keep pki for search to update the learning objects.
we also have to then update the userdetail table to remove mappings to users merged and remove the unnecsesarry userdetails, aswell as the userprofile table.

i hope this is clear enough

thnx a million

What is common among the user records that you can hook on?
In your first post you said that you have internal users with scheme:
12345
and external users with scheme
TS-3456

So, lets take an actual user, Bob Noodle.

Bob is an internal employee with ID 12345. He is also an extrnal employee with ID TS-34567.

Is that a good representation?
Post #507716
Posted Wednesday, May 28, 2008 11:06 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, July 05, 2011 7:37 PM
Points: 58, Visits: 149
Where you have that exception some logic can be run to find possible matches and a human will have to interceed for those; with those exceptions, the names will match between the records? And how many records do you have good versus bad ( determines best approach ) - ballpark?


Post #507861
Posted Friday, May 30, 2008 10:12 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, July 05, 2011 7:37 PM
Points: 58, Visits: 149
OK, I have a solution for you. :D

This script will automatically handle generating the SQL statements to clean up your tables and foreign keys.

There is a table variable called @WorkingList that will contain all the unresolved users that might be duplicates.

To fix them, in the source data, fix the names to be the same. In the case of roy heaney, if all three rows are in fact the same user, set the first and last name to be roy heaney in all three rows.

Run the script below in your development database ( or anywhere you want to play with it ) to see exactly how it would work.

If you have any questions or need help, feel free to ask.

/*
This will create a table called acz_users, which has a userid, first and last name,
and a last login date.

There are duplicate users that we want to merge in this table, based on this criteria:
When a user is unique, leave them alone.
When a user is clearly a duplicate, keep the row that has the most recent last login.
When a user is not unique, and possibly a duplicate, set them aside for someone
to determine what should be done.

This script will generate the SQL statements to clean up the a user table as provided,
and also generate some statements to keep other files foreign keys in sync
*/

IF EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[acz_users]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)
DROP TABLE [dbo].[acz_users]

CREATE TABLE [dbo].[acz_users](
[pkiuser] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[FirstName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[LastName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[LastLogin] [datetime] NULL
) ON [PRIMARY]

Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( '123','roy','heaney','01/01/00')
Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS345','','heaney','08/01/00')
Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS456','roy','','03/01/00')
Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( '564','Bob','Noodle','05/01/00')
Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( '789','Andy','Griffith','06/01/00')
Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS358','Andy','Griffith','04/01/00')
Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS765','','Outsideguy','07/01/00')
Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS234','Andy','','09/01/00')
Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( '901','Roy','Fokker','04/01/00')
Insert into acz_users( pkiuser, FirstName, LastName, LastLogin ) values ( 'TS735','Roy','Fokker','07/01/00')

-- Set Up Tables

-- @WorkingList holds all and has them removed as they are cleared
declare @WorkingList table(
userid varchar(50),
WorkingFirstName varchar(50),
WorkingLastName varchar(50),
WorkingLastLogin datetime,
WorkingFullName varchar(100) ,
CountFirst numeric,
CountLast numeric,
CountFull numeric,
CountTotal numeric )

-- @RenumList holds rows that are going to be renumbered
declare @RenumList table(
userid varchar(50),
RenumFirstName varchar(50),
RenumLastName varchar(50),
RenumLastLogin datetime,
RenumFullName varchar(100) )

-- @RenumList holds rows that are going to be renumbered
declare @Renum2List table(
userid2 varchar(50),
Renum2FirstName varchar(50),
Renum2LastName varchar(50),
Renum2LastLogin datetime,
Renum2FullName varchar(100) )

-- @CleanList holds records that will not be purged
declare @CleanList table(
userid varchar(50),
CleanFirstName varchar(50),
CleanLastName varchar(50),
CleanLastLogin datetime,
CleanFullName varchar(100) )

-- @PurgeList holds the records that will be deleted from production
declare @PurgeList table(
userid varchar(50),
PurgeFirstName varchar(50),
PurgeLastName varchar(50),
PurgeLastLogin datetime,
PurgeFullName varchar(100) )

-- @Output holds the sql statements to update fk tables
declare @Output table(
ResolveFkLearningSQL varchar(1000),
ResolveFkSessionSQL varchar(1000),
CleanUserTableSQL varchar(1000),
CleanUserDetailTableSQL varchar(1000)

)

-- @Xref holds old key to new key translation
declare @Xref table(
OldUserId varchar(50),
OldFullName varchar(100),
NewUserId varchar(50),
NewFullName varchar(100) )

-- Load the working list
Insert into @WorkingList( userid, WorkingFirstName, WorkingLastName, WorkingLastLogin, WorkingFullName )
Select pkiuser, FirstName, LastName, LastLogin, IsNull( FirstName, '' ) + ' ' + IsNull( LastName, '' )
From acz_users

-- Determine what automatically goes to the CleanList because it is unique
Update @WorkingList
Set CountFirst = ( Select Count( FirstName ) from acz_users where WorkingFirstName = FirstName and FirstName <> '' )

Update @WorkingList
Set CountLast = ( Select Count( LastName ) from acz_users where WorkingLastName = LastName and LastName <> '')

Update @WorkingList
Set CountFull = ( Select Count( FirstName + ' ' + LastName ) from acz_users where FirstName + ' ' + LastName = WorkingFullName )


-- Good full name that is unique
Update @WorkingList
set CountTotal = 1 where
CountFirst = 1 and CountLast = 1

-- Unique first name with no last name ( would never be able to resolve )
Update @WorkingList
set CountTotal = 1 where
CountFirst = 1 and CountLast = 0

-- Unique last name with no first name ( would never be able to resolve )
Update @WorkingList
set CountTotal = 1 where
CountFirst = 0 and CountLast = 1

Update @WorkingList
Set CountTotal = 0 where CountTotal is null

-- Move all the resolved ones to the ClearList and clear out the WorkingList
Insert into @CleanList ( userid, CleanFirstName, CleanLastName, CleanLastLogin, CleanFullName )
Select userid, WorkingFirstName, WorkingLastName, WorkingLastLogin, WorkingFullName from @WorkingList
where CountTotal = 1

Delete from @WorkingList where CountTotal = 1

-- Handle Matching Full Names
Insert into @RenumList( userid, RenumFirstName, RenumLastName, RenumLastLogin, RenumFullName )
Select userid, WorkingFirstName, WorkingLastName, WorkingLastLogin, WorkingFullName from @WorkingList
where CountFull = 2

-- Setup the duplicate renumlist
Insert into @Renum2List
Select * from @RenumList

Delete from @WorkingList where CountFull = 2

-- For the duplicate full names, get the max last login date and move to clean list
Insert into @CleanList ( userid, CleanFirstName, CleanLastName, CleanLastLogin, CleanFullName )
Select userid, RenumFirstName, RenumLastName, RenumLastLogin, RenumFullName from @RenumList
where RenumLastLogin = ( select Renum2LastLogin from @Renum2List
where Renum2LastLogin = ( select max( Renum2LastLogin ) from @Renum2List where
Renum2FullName = RenumFullName ) )

delete from @RenumList where userid in ( select userid from @CleanList )
delete from @Renum2List where userid2 in ( select userid from @CleanList )

-- Build the Xref file for the renumbers
Insert into @Xref( OldUserId, OldFullName )
select userid, RenumFullName from @RenumList

Update @Xref
set NewUserId = ( select userid from @CleanList where CleanFullName = OldFullName )

Update @Xref
set NewFullName = ( select CleanFullName from @CleanList where CleanFullName = OldFullName )

-- Now Move the @RenumList to the @PurgeList
Insert into @PurgeList( userid, PurgeFirstName, PurgeLastName, PurgeLastLogin, PurgeFullName )
Select userid, RenumFirstName, RenumLastName, RenumLastLogin, RenumFullName from @RenumList

Delete from @RenumList
Delete from @Renum2List

-- Create the SQL Statements to update the other tables
Insert into @Output( ResolveFkLearningSQL )
Select 'Update tblUserLearningObjectSession set userid = ' + NewUserId + ' where userid = ' + OldUserId
from @Xref

Insert into @Output( ResolveFkSessionSQL )
Select 'Update tblUserSessionMap set userid = ' + NewUserId + ' where userid = ' + OldUserId
from @Xref

Insert into @Output( CleanUserTableSQL )
Select 'Delete from tbllmsuser where userid = ' + userid
from @PurgeList

Insert into @Output( CleanUserDetailTableSQL )
Select 'Delete from tbllmsuserdetail where userid = ' + userid
from @PurgeList

Select * from acz_users
Select * from @WorkingList -- These are the ones that need to be resolved
Select * from @CleanList -- These will be left alone
Select * from @RenumList -- Should be empty
Select * from @Xref -- Shows who is being renumbered to what
Select * from @PurgeList -- Shows who is getting deleted
Select * from @Output -- Contains the SQL Statements to clean up the files



Post #509238
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse