Data Modeling Exercise - To Help Both Performance and Concurrency

  • Hello All,

    I am looking for your thoughts/corrections/modifications on the following:

    I am currently working on re-designing an existing data model. Currently, there is a "wide" user_master table holding various attributes for a user such as first_name, last_name etc. One such attribute is the organization the user belongs to and there are bunch of other attributes that are not frequently queried or used. And "user_id" (identity int) is the primary key for this table. Also, there are other tables (like addresses, phone numbers etc) tied to the "user_master" table through the "user_id".

    A person using the web app could be creating multiple user records (a user record consists of all the aforementioned attributes, phone numbers etc) at the same time or we could be loading multiple user records in the back end. And in such a situation I want to deal with as many records as possible at one time rather than dealing one user record at a time by looping through all the user records. Currently the procedures are written to deal with one user record at a time.

    My first task is to split the user_master table vertically and move some of the attributes to other tables of their own and then they will be linked by the "user_id" in the "user_master" table. The main reason here being, I want to keep the main table as narrow as possible and for that I want to remove the rarely used columns from taking up the space in this table.

    So a lean table is good for reporting (more data in lesser number of pages etc etc); but then vertically partitioning the table introduces a problem when saving (inserting and updating) multiple records at the same time. Since the user_id is a identity column, the issue is to grab the generated identity values for all the user records in a batch which are then used to populate the child tables. When doing one record at a time I can use SCOPE_IDENTITY to grab the latest identity value and use that to populate the other tables.

    So the way I wrote the multi user save proc is that the input is a XML string (it can be a Table Valued Param also) which contains all the required information with one line for each user. I then convert this XML into a table variable (lets call it TableA) and then do an upsert using this table variable (TableA). And I am using the OUTPUT clause to get all the identity values generated to populate another table variable (lets call this TableB). Then I can use the data in this second table variable (TableB) to help populate the user_id in the child tables.

    So this introduced another issue; correlating back the generated user_id to a user record row in the input. So to solve this, I modified the input to contain a row identifier (a number starting with 1 and incremented by 1) for each user record. Then I can do an insert by physically ordering the input data set (TableA) based on this row identifier. Then I can grab the user_ids that are generated and do a row_number() over them by sorting them which results in a series of numbers starting at 1 and incrementing by 1 (and put this info into into TableB). Then I can take this row_number() value (from TableB) and row identifier value from the input set (TableA) and do a JOIN on them, thus linking both the data sets together.

    But then I have another issue. The proc is designed to do both insert and update. So if a user_id is passed in the input XML, it is supposed to do a update of that user's data. If no user_id is mentioned in the XML it is supposed to do a insert. And since the input data is mixed, there will be lesser number of inserts than the total records in the input set. And it follows then that lesser number of new user_ids are generated into the table variable using OUTPUT clause and thus using a row_number() over them will not be the right way to join back to the input data set.

    So to overcome that I split the input data set into two table variables, one for inserts and one for updates. Then I can correlate back the data in a correct way.

    My goals are as follows:

    1. Make the main user_master table as lean as possible (to account for Performance)

    2. When creating a new user_master record, I want to free-up the main user_master table as soon as I can. That is, I don't want it to be in a transaction where all the child tables are also being updated (to account for Concurrency on the main user_master table). I am thinking I can live with the child table modifications failing but the user_master modifications succeeding.

    Having said all of this,

    1. Is it always true that when using INSERT statement by physically ordering the records, the records are inserted in that specific order (and identity values that are generated are assigned to the rows in that order)? I did not go with MERGE because MERGE does not allow me to physically order the records when doing the INSERT. So I am sticking to INSERT/UPDATE If..ELSE scenario.

    2. Is there a better way to do this? Am I just over-complicating things? The issue I am trying to tackle can be summarized as follows: How to do multiple upserts at same time on a set of parent-child tables linked by a identity value primary key from the parent table. What are the various approaches in these scenarios?

    The following code might be helpful in setting up my scenario. Sorry for a long post

    USE Master

    GO

    IF EXISTS(SELECT NULL FROM sys.databases WHERE name = 'DummyKolli')

    BEGIN

    ALTER DATABASE DummyKolli SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

    END

    CREATE DATABASE DummyKolli

    GO

    USE DummyKolli

    GO

    IF EXISTS(SELECT NULL FROM sys.objects WHERE name = 'user_organization_xref' AND type = 'U')

    DROP TABLE usrOwner.user_organization_xref

    GO

    IF EXISTS(SELECT NULL FROM sys.objects WHERE name = 'user_master' AND type = 'U')

    DROP TABLE usrOwner.user_master

    GO

    IF EXISTS(SELECT NULL FROM sys.objects WHERE name = 'user_type' AND type = 'U')

    DROP TABLE usrOwner.user_type

    GO

    IF EXISTS(SELECT NULL FROM sys.objects WHERE name = 'organization' AND type = 'U')

    DROP TABLE orgOwner.organization

    GO

    IF EXISTS(SELECT NULL FROM sys.schemas WHERE name = 'usrOwner')

    DROP SCHEMA usrOwner

    GO

    CREATE SCHEMA usrOwner

    GO

    IF EXISTS(SELECT NULL FROM sys.objects WHERE name = 'organization_save' AND type = 'P')

    DROP PROCEDURE orgOwner.organization_save

    GO

    IF EXISTS(SELECT NULL FROM sys.schemas WHERE name = 'orgOwner')

    DROP SCHEMA orgOwner

    GO

    CREATE SCHEMA orgOwner

    GO

    CREATE TABLE usrOwner.user_type(user_type_id TINYINT IDENTITY(1, 1) NOT NULL,

    user_type_descr VARCHAR(16) NOT NULL,

    CONSTRAINT pk__user_type__user_type_id PRIMARY KEY (user_type_id),

    CONSTRAINT uq__user_type__user_type_descr UNIQUE (user_type_descr)

    )

    GO

    INSERT INTO usrOwner.user_type(user_type_descr) VALUES ('Contact'), ('Owner'), ('Super Admin'), ('Team Member')

    GO

    IF EXISTS(SELECT NULL FROM sys.objects WHERE name = 'user_master' AND type = 'U')

    DROP TABLE usrOwner.user_master

    GO

    CREATE TABLE usrOwner.user_master(user_id INT IDENTITY(-2147483648, 1) NOT NULL,

    last_name NVARCHAR(40) NOT NULL,

    first_name NVARCHAR(40) NOT NULL,

    can_login BIT NOT NULL,

    user_type_id TINYINT NOT NULL,

    --internal_apppage_row_id TINYINT NOT NULL,

    CONSTRAINT pk__user_master__user_id PRIMARY KEY (user_id),

    CONSTRAINT fk__user_master__user_type__user_type_id FOREIGN KEY (user_type_id) REFERENCES usrOwner.user_type(user_type_id)

    )

    GO

    INSERT INTO usrOwner.user_master(last_name, first_name, can_login, user_type_id) VALUES ('Cognosan', 'Sapena', 1, 3) -- insert one user for testing

    IF EXISTS(SELECT NULL FROM sys.objects WHERE name = 'organization' AND type = 'U')

    DROP TABLE orgOwner.organization

    GO

    CREATE TABLE orgOwner.organization(organization_id SMALLINT IDENTITY(-32768, 1) NOT NULL,

    organization_name NVARCHAR(40) NOT NULL,

    last_edit_date DATETIME NOT NULL CONSTRAINT df__organization__last_edit_date DEFAULT (GETDATE()),

    CONSTRAINT pk__organization__organization_id PRIMARY KEY (organization_id)

    )

    GO

    INSERT INTO orgOwner.organization(organization_name) VALUES ('HardMacro'), ('Pythia'), ('UrNoSequel')

    IF EXISTS(SELECT NULL FROM sys.objects WHERE name = 'user_organization_xref' AND type = 'U')

    DROP TABLE usrOwner.user_organization_xref

    GO

    CREATE TABLE usrOwner.user_organization_xref(user_org_id INT IDENTITY(-2147483648, 1) NOT NULL,

    user_id INT NOT NULL,

    organization_id SMALLINT NOT NULL,

    is_current BIT NOT NULL,

    CONSTRAINT pk__user_organization_xref__user_org_id PRIMARY KEY (user_org_id),

    CONSTRAINT fk__user_organization_xref__user_master__user_id FOREIGN KEY (user_id) REFERENCES usrOwner.user_master(user_id),

    CONSTRAINT fk__user_organization_xref__organization__organization_id FOREIGN KEY (organization_id) REFERENCES orgOwner.organization(organization_id)

    )

    GO

    And this code is the proc code trying to do multi user save

    USE DummyKolli

    GO

    DECLARE @user_base_details_xml XML, @user_phone_xml XML, @user_email_xml XML, @user_address_xml XML

    SET @user_base_details_xml = '<Root>

    <ListItem apppage_row_id = "1" user_id = "-2147483648" last_name = "Cognosan" first_name = "Sapena" can_login = "1" user_type_id = "3" organization_id = "-32766" is_current = "1"/>

    <ListItem apppage_row_id = "2" last_name = "Avaj" first_name = "Sequrity" can_login = "1" user_type_id = "3" organization_id = "-32767" is_current = "1"/>

    <ListItem apppage_row_id = "3" last_name = "Dickie" first_name = "Skript" can_login = "1" user_type_id = "3" organization_id = "-32768" is_current = "1"/>

    <ListItem apppage_row_id = "4" last_name = "Centdulkar" first_name = "Sachin" can_login = "1" user_type_id = "2" organization_id = "-32766" is_current = "1"/>

    </Root>'

    --Note: apppage_row_id uniquely identifies each row received in the input

    ----------Work with User Base Details First

    DECLARE @user_base_details_upd TABLE (row_id TINYINT IDENTITY(1, 1), apppage_row_id TINYINT, user_id INT, last_name NVARCHAR(40), first_name NVARCHAR(40), can_login BIT, user_type_id TINYINT, organization_id SMALLINT, is_current BIT)

    DECLARE @user_base_details_ins TABLE (row_id TINYINT IDENTITY(1, 1), apppage_row_id TINYINT, user_id INT, last_name NVARCHAR(40), first_name NVARCHAR(40), can_login BIT, user_type_id TINYINT, organization_id SMALLINT, is_current BIT)

    INSERT INTO @user_base_details_ins(apppage_row_id, user_id, last_name, first_name, can_login, user_type_id, organization_id, is_current)

    SELECT sl.rows.value('@apppage_row_id', 'TINYINT'),

    sl.rows.value('@user_id', 'INT'),

    sl.rows.value('@last_name', 'NVARCHAR(40)'),

    sl.rows.value('@first_name', 'NVARCHAR(40)'),

    sl.rows.value('@can_login', 'BIT'),

    sl.rows.value('@user_type_id', 'TINYINT'),

    sl.rows.value('@organization_id', 'SMALLINT'),

    sl.rows.value('@is_current', 'BIT')

    FROM @user_base_details_xml.nodes('/Root/ListItem') AS sl(rows)

    WHERE sl.rows.value('@user_id', 'INT') IS NULL

    INSERT INTO @user_base_details_upd(apppage_row_id, user_id, last_name, first_name, can_login, user_type_id, organization_id, is_current)

    SELECT sl.rows.value('@apppage_row_id', 'TINYINT'),

    sl.rows.value('@user_id', 'INT'),

    sl.rows.value('@last_name', 'NVARCHAR(40)'),

    sl.rows.value('@first_name', 'NVARCHAR(40)'),

    sl.rows.value('@can_login', 'BIT'),

    sl.rows.value('@user_type_id', 'TINYINT'),

    sl.rows.value('@organization_id', 'SMALLINT'),

    sl.rows.value('@is_current', 'BIT')

    FROM @user_base_details_xml.nodes('/Root/ListItem') AS sl(rows)

    WHERE sl.rows.value('@user_id', 'INT') IS NOT NULL

    DECLARE @output TABLE (user_id INT)

    BEGIN TRY

    BEGIN TRANSACTION user_base_details

    -----Update if exists

    UPDATE UM SET UM.last_name = UBD.last_name, UM.first_name = UBD.first_name, UM.can_login = UBD.can_login, UM.user_type_id = UBD.user_type_id

    --SELECT

    FROM @user_base_details_upd UBD

    INNER JOIN usrOwner.user_master UM ON (UBD.user_id = UM.user_id)

    -----Insert if not exists

    INSERT INTO usrOwner.user_master(last_name, first_name, can_login, user_type_id) OUTPUT inserted.user_id INTO @output

    SELECT UBD.last_name, UBD.first_name, UBD.can_login, UBD.user_type_id

    FROM @user_base_details_ins UBD

    WHERE UBD.user_id IS NULL

    ORDER BY UBD.row_id

    COMMIT TRANSACTION user_base_details;

    END TRY

    BEGIN CATCH

    IF @@TRANCOUNT > 0

    ROLLBACK TRANSACTION;

    GOTO Exit_Proc

    END CATCH

    DECLARE @user_id_correlation_data TABLE(apppage_row_id TINYINT, user_id INT)

    ; WITH UCD (user_id, row_id) AS

    (

    SELECT O.user_id, ROW_NUMBER() OVER (ORDER BY O.user_id) AS row_id

    FROM @output O

    )

    INSERT INTO @user_id_correlation_data(apppage_row_id, user_id)

    SELECT apppage_row_id, user_id

    FROM @user_base_details_upd

    UNION

    SELECT I.apppage_row_id, U.user_id

    FROM UCD U

    INNER JOIN @user_base_details_ins I ON (U.row_id = I.row_id)

    ------------Work with User Organization Details

    DECLARE @user_base_details TABLE (apppage_row_id TINYINT, user_id INT, last_name NVARCHAR(40), first_name NVARCHAR(40), can_login BIT, user_type_id TINYINT, organization_id SMALLINT, is_current BIT)

    INSERT INTO @user_base_details (apppage_row_id, user_id, last_name, first_name, can_login, user_type_id, organization_id, is_current)

    SELECT apppage_row_id, user_id, last_name, first_name, can_login, user_type_id, organization_id, is_current FROM @user_base_details_upd

    UNION

    SELECT apppage_row_id, user_id, last_name, first_name, can_login, user_type_id, organization_id, is_current FROM @user_base_details_ins

    BEGIN TRY

    BEGIN TRANSACTION user_org

    ----Update if exist

    UPDATE UOX SET UOX.is_current = UBD.is_current

    --SELECT UBD.user_id, UBD.organization_id, UBD.is_current

    FROM @user_base_details UBD

    INNER JOIN @user_id_correlation_data UCD ON (UBD.apppage_row_id = UCD.apppage_row_id)

    INNER JOIN usrOwner.user_organization_xref UOX ON (UCD.user_id = UOX.user_id AND UBD.organization_id = UOX.organization_id)

    ----Insert if not

    INSERT INTO usrOwner.user_organization_xref(user_id, organization_id, is_current)

    SELECT O.user_id, UBD.organization_id, UBD.is_current

    FROM @user_base_details UBD

    INNER JOIN @user_id_correlation_data O ON (UBD.apppage_row_id = O.apppage_row_id)

    LEFT JOIN usrOwner.user_organization_xref UOX ON (O.user_id = UOX.user_id AND UBD.organization_id = UOX.organization_id)

    WHERE UOX.user_org_id IS NULL

    ORDER BY UBD.organization_id, O.user_id

    COMMIT TRANSACTION user_org

    END TRY

    BEGIN CATCH

    IF @@TRANCOUNT > 0

    ROLLBACK TRANSACTION;

    GOTO Exit_Proc

    END CATCH

    Exit_Proc:

    IF @@TRANCOUNT > 0

    ROLLBACK TRANSACTION;

    Quick script to test your upserts

    USE DummyKolli

    GO

    SELECT O.organization_name, UT.user_type_descr, UM.*

    FROM usrOwner.user_master UM

    INNER JOIN usrOwner.user_type UT ON (UM.user_type_id = UT.user_type_id)

    LEFT JOIN usrOwner.user_organization_xref UOX ON (UM.user_id = UOX.user_id)

    LEFT JOIN orgOwner.organization O ON (UOX.organization_id = O.organization_id)

    --WHERE UM.user_id IN (-2147483648, -2147483636)

    ORDER BY UM.user_id

Viewing 0 posts

You must be logged in to reply to this topic. Login to reply