Redesign help

  • matak (10/21/2012)


    I have considered a consultant, yes, and will talk to the appropriate people about getting one in.

    I thought I'd give the forums a go just in case it was something not too hard (I'm sure I've mentioned my lack of design skills!).

    Unfortunately, there is no definitive solution that can be suggested in the context of this forum without seeing the whole picture. If your company is willing to pay for a consultant, that would probably be the best thing for you: you would have somebody to help you out every step of the way. Of course, you might have to become a salesman to justify the expense. Make a case, and try to use current issues to help you with that: are you experiencing performance problems, and so on?

    Good Luck!


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • matak (10/17/2012)


    Hi,

    I've finally been allowed to redesign some tables which are "wide" (think Field1, Field2, Field3 ... FieldN). There are multiple fields like this, and columns get added whenever more fields are needed.

    To make it even better, Field1 for one project may not contain the same information as Field1 for a different project; there is a "mapping" table that lets people know what each field actually means.

    What I'm thinking of is to split these columns out into a separate table, something like the below:

    IF OBJECT_ID('TempDB..#Field','U') IS NOT NULL
        DROP TABLE #Field

    CREATE TABLE #Field
    (
        fieldID int NOT NULL,
        pkFromOrigTable int NOT NULL,
        fieldName varchar(200) NOT NULL,
        fieldValue decimal(30,20)
    )

    Any advice or suggestions are more than welcome. I want to avoid creating something just as bad.

    Below I've re-created in simplified form some tables and data that allow unlimited creation of custom columns that can be keyed to a sample users table. This is just a pared-down example of a method used by one of the largest open-source development platforms in existence.

    /* Pivot Query to display Profile Properties */
    /* Note: Scripts to create and populate test tables below */

    DECLARE @UserID AS INT
    SET @UserID = 1

    SELECT
        *
    FROM
        (
        SELECT
            pd.Name AS PropertyName
            ,up.PropertyValue AS PropertyValue
        FROM
            dbo.TEST_Users AS u
            INNER JOIN dbo.TEST_UserProfile AS up
                ON up.UserID = u.UserID
            INNER JOIN dbo.TEST_ProfileDefinition AS pd
                ON up.DefinitionID = pd.DefinitionID
        WHERE
            u.UserID = @UserID
        ) AS Result PIVOT
        (MAX(PropertyValue) FOR PropertyName IN
        ([FirstName],[LastName],[StreetAddress],[City],[State],[Zip],[FavoriteColor],[PIN]))
        AS UserProfile

    /* Run Once */

    CREATE TABLE [dbo].[TEST_Users](
        [UserID] [int] IDENTITY(1,1) NOT NULL,
        [Username] [nvarchar](100) NOT NULL,
        CONSTRAINT [PK_Test_Users] PRIMARY KEY CLUSTERED
        (
            [UserID] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    SET IDENTITY_INSERT [dbo].[TEST_Users] ON
    INSERT [dbo].[TEST_Users] ([UserID], [Username]) VALUES (1, N'George Washington')
    INSERT [dbo].[TEST_Users] ([UserID], [Username]) VALUES (2, N'John Adams')
    INSERT [dbo].[TEST_Users] ([UserID], [Username]) VALUES (3, N'Thomas Jefferson')
    INSERT [dbo].[TEST_Users] ([UserID], [Username]) VALUES (4, N'James Madison')
    SET IDENTITY_INSERT [dbo].[TEST_Users] OFF

    CREATE TABLE [dbo].[TEST_UserProfile](
        [ProfileID] [int] IDENTITY(1,1) NOT NULL,
        [UserID] [int] NOT NULL,
        [DefinitionID] [int] NOT NULL,
        [PropertyValue] [nvarchar](max) NULL,
        CONSTRAINT [PK_TEST_UserProfile] PRIMARY KEY CLUSTERED
        (
            [ProfileID] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    SET IDENTITY_INSERT [dbo].[TEST_UserProfile] ON
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (1, 1, 1, N'George')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (2, 1, 2, N'Washington')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (3, 1, 3, N'123 Main St')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (4, 1, 4, N'Mt Vernon')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (5, 1, 5, N'VA')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (6, 1, 6, N'34567')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (7, 1, 7, N'Blue')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (8, 1, 8, N'1776')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (10, 2, 1, N'John')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (11, 2, 2, N'Adams')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (12, 2, 3, N'456 Elm Ln')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (13, 2, 4, N'Boston')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (14, 2, 5, N'MA')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (15, 2, 6, N'01234')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (16, 2, 7, N'Red')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (17, 2, 8, N'1799')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (18, 3, 1, N'Thomas')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (19, 3, 2, N'Jefferson')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (20, 3, 3, N'RR 567')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (21, 3, 4, N'Charlottesville')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (22, 3, 5, N'VA')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (23, 3, 6, N'23456')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (24, 3, 7, N'Green')
    INSERT [dbo].[TEST_UserProfile] ([ProfileID], [UserID], [DefinitionID], [PropertyValue]) VALUES (25, 3, 8, N'1801')
    SET IDENTITY_INSERT [dbo].[TEST_UserProfile] OFF

    CREATE TABLE [dbo].[TEST_ProfileDefinition](
        [DefinitionID] [int] IDENTITY(1,1) NOT NULL,
        [Category] [nvarchar](50) NOT NULL,
        [Name] [nvarchar](50) NOT NULL,
        [DataType] [nvarchar](50) NOT NULL,
        [Length] [int] NOT NULL,
        [ViewOrder] [int] NOT NULL,
        CONSTRAINT [PK_TEST_ProfileDefinition] PRIMARY KEY CLUSTERED
        (
            [DefinitionID] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    SET IDENTITY_INSERT [dbo].[TEST_ProfileDefinition] ON
    INSERT [dbo].[TEST_ProfileDefinition] ([DefinitionID], [Category], [Name], [DataType], [Length], [ViewOrder]) VALUES (1, N'Personal', N'FirstName', N'varchar', 50, 10)
    INSERT [dbo].[TEST_ProfileDefinition] ([DefinitionID], [Category], [Name], [DataType], [Length], [ViewOrder]) VALUES (2, N'Personal', N'LastName', N'varchar', 50, 20)
    INSERT [dbo].[TEST_ProfileDefinition] ([DefinitionID], [Category], [Name], [DataType], [Length], [ViewOrder]) VALUES (3, N'Address', N'StreetAddress', N'varchar', 50, 10)
    INSERT [dbo].[TEST_ProfileDefinition] ([DefinitionID], [Category], [Name], [DataType], [Length], [ViewOrder]) VALUES (4, N'Address', N'City', N'varchar', 20, 20)
    INSERT [dbo].[TEST_ProfileDefinition] ([DefinitionID], [Category], [Name], [DataType], [Length], [ViewOrder]) VALUES (5, N'Address', N'State', N'char', 2, 30)
    INSERT [dbo].[TEST_ProfileDefinition] ([DefinitionID], [Category], [Name], [DataType], [Length], [ViewOrder]) VALUES (6, N'Address', N'Zip', N'varchar', 20, 40)
    INSERT [dbo].[TEST_ProfileDefinition] ([DefinitionID], [Category], [Name], [DataType], [Length], [ViewOrder]) VALUES (7, N'User', N'FavoriteColor', N'varchar', 50, 10)
    INSERT [dbo].[TEST_ProfileDefinition] ([DefinitionID], [Category], [Name], [DataType], [Length], [ViewOrder]) VALUES (8, N'User', N'PIN', N'int', 4, 20)
    SET IDENTITY_INSERT [dbo].[TEST_ProfileDefinition] OFF

  • This is just a pared-down example of a method used by one of the largest open-source development platforms in existence.

    And that platform would be...care to share?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Instead of separate denormalized columns, I'd rather keep user-defined attributes in an XML column, preferably in a separate extended-type table. In the example below, each extended type (regular, contractor, temp, etc.) would have a separate extype_id, and an employee could be of multiple types. Users can create their own types and dump their junk in them without cluttering up and fragmenting an otherwise normalized primary table. You can even create a view (or views) that joins the two tables together, providing the users with a denormalized flat resultset that's easy for them to query and visualize like an Excel sheet; a sketch of such a view follows the table definitions below.

    create table employee
    (
        employee_id smallint not null primary key,
        ssn char(12) not null,
        hire_date datetime not null,
        reportsto_id smallint null
    );

    create table employee_extype
    (
        employee_id smallint not null,
        extype_id smallint not null,
        exttype_xml XML null,
        primary key (employee_id, extype_id)
    );
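
    Here's a minimal sketch of the kind of view described above. The XML shape is entirely up to the application, so the element names (badge_no, agency) and the paths below are assumptions for illustration only:

    -- Sketch: flatten assumed XML properties into columns with the xml .value() method.
    -- Element names below are hypothetical; adjust them to the real XML shape.
    create view employee_flat
    as
    select
        e.employee_id,
        e.hire_date,
        x.extype_id,
        x.exttype_xml.value('(/props/badge_no)[1]', 'varchar(20)') as badge_no,
        x.exttype_xml.value('(/props/agency)[1]', 'varchar(50)') as agency
    from employee as e
    left join employee_extype as x
        on x.employee_id = e.employee_id;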

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • opc.three (10/22/2012)


    This is just a pared-down example of a method used by one of the largest open-source development platforms in existence.

    And that platform would be...care to share?

    DotNetNuke

    DotNetNuke Users List

    ...and no, I don't work for DNN. But it is our primary development platform.

     

  • Steven Willis (10/23/2012)


    opc.three (10/22/2012)


    This is just a pared-down example of a method used by one of the largest open-source development platforms in existence.

    And that platform would be...care to share?

    DotNetNuke

    DotNetNuke Users List

    ...and no, I don't work for DNN. But it is our primary development platform.

     

    Thanks for sharing. That is not surprising. By nature, DNN is built to handle structured, semi-structured, and unstructured data, so using an EAV may make sense there. I would place that in the niche I mentioned. Most of what they store is likely unstructured, loosely typed metadata or document/site artifacts, not necessarily data-domain elements. In my opinion, when you have command over your data domain and need to store strongly typed data elements, you should do everything in your power to avoid employing an EAV model (a non-relational concept) in a relational database. What is thought to be saved initially by implementing an EAV is paid for dearly when it comes time to work with that data in a tabular way, which is the definition of an anti-pattern. Just my two cents.

    edit: spelling

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Eric M Russell (10/23/2012)


    Instead of separate denormalized columns, I'd rather keep user-defined attributes in an XML column, preferably in a separate extended-type table. In the example below, each extended type (regular, contractor, temp, etc.) would have a separate extype_id, and an employee could be of multiple types. Users can create their own types and dump their junk in them without cluttering up and fragmenting an otherwise normalized primary table. You can even create a view (or views) that joins the two tables together, providing the users with a denormalized flat resultset that's easy for them to query and visualize like an Excel sheet.

    create table employee
    (
        employee_id smallint not null primary key,
        ssn char(12) not null,
        hire_date datetime not null,
        reportsto_id smallint null
    );

    create table employee_extype
    (
        employee_id smallint not null,
        extype_id smallint not null,
        exttype_xml XML null,
        primary key (employee_id, extype_id)
    );

    I have neither tried nor studied this technique in any great detail, but I have heard from people I trust that it does not scale particularly well.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (10/23/2012)


    Eric M Russell (10/23/2012)


    Instead of separate denormalized columns, I'd rather keep user-defined attributes in an XML column, preferably in a separate extended-type table. In the example below, each extended type (regular, contractor, temp, etc.) would have a separate extype_id, and an employee could be of multiple types. Users can create their own types and dump their junk in them without cluttering up and fragmenting an otherwise normalized primary table. You can even create a view (or views) that joins the two tables together, providing the users with a denormalized flat resultset that's easy for them to query and visualize like an Excel sheet.

    create table employee
    (
        employee_id smallint not null primary key,
        ssn char(12) not null,
        hire_date datetime not null,
        reportsto_id smallint null
    );

    create table employee_extype
    (
        employee_id smallint not null,
        extype_id smallint not null,
        exttype_xml XML null,
        primary key (employee_id, extype_id)
    );

    I have neither tried nor studied this technique in any great detail, but I have heard from people I trust that it does not scale particularly well.

    Of course the ideal solution is to have all tables normalized, and I've never actually designed a table like this in production.

    What's described above would be one form of a "property bag" technique. I have been tasked with building data extraction and reporting from a large ISV database (PegaSystems) on Oracle that used this technique. From what I've been told, it's efficient from the application's perspective, because it can pull up a single patient's chart with only a few page reads. For aggregate reporting, however, the performance was bad: full table scans. The developers actually had to replicate normalized versions of the tables just to facilitate the reporting queries.

    In later versions of both Oracle and SQL Server, it's possible to index XML columns, which would make this approach more scalable for querying (a sketch of the syntax follows), but I still prefer a fully normalized table design. Fortunately, I don't get many requests for tables that allow users to create their own attributes or "fields" on the fly.
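
    For reference, a minimal sketch of the SQL Server syntax (2005 and later), run against the employee_extype table from earlier in the thread; the index names here are made up. A primary XML index requires a clustered primary key on the base table, which employee_extype has:

    -- Primary XML index: persists a shredded representation of the XML column.
    CREATE PRIMARY XML INDEX PXML_employee_extype
    ON employee_extype (exttype_xml);

    -- Optional secondary index to speed up path-based queries (.value(), .exist()).
    CREATE XML INDEX SXML_employee_extype_path
    ON employee_extype (exttype_xml)
    USING XML INDEX PXML_employee_extype FOR PATH;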

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (10/23/2012)


    What's described above would be one form of a "property bag" technique. I have been tasked with building data extraction and reporting from a large ISV database (PegaSystems) on Oracle that used this technique. From what I've been told, it's efficient from the application's perspective, because it can pull up a single patient's chart with only a few page reads. For aggregate reporting, however, the performance was bad: full table scans. The developers actually had to replicate normalized versions of the tables just to facilitate the reporting queries.

    Yep, totally agree. That's exactly where EAVs and these types of property-bag models fall down. They seem like a great idea at first because the developer making the choice is building an OLTP app: "it's so elegant, it can handle any type of property anyone could ever imagine, and it performs well; what could be better than this?" That holds until some poor soul actually has to access more than one row at a time (reporting, batch processing, searching, etc.). I have been that poor soul, hence the soapbox 😀 On the other hand, I have been paid well to undo messes like this :satisfied:

    In later versions of both Oracle and SQL Server, it's possible to index XML columns, which would make this approach more scalable for querying, but I still prefer a fully normalized table design. Fortunately, I don't get many requests for tables that allow users to create their own attributes or "fields" on the fly.

    I have not worked with an XML property-bag implementation as described, but I have worked with XML indexes a bit, and they come with their own set of baggage to consider.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (10/23/2012)


    Steven Willis (10/23/2012)


    opc.three (10/22/2012)


    This is just a pared-down example of a method used by one of the largest open-source development platforms in existence.

    And that platform would be...care to share?

    DotNetNuke

    DotNetNuke Users List

    ...and no, I don't work for DNN. But it is our primary development platform.

     

    Thanks for sharing. That is not surprising. By nature, DNN is built to handle structured, semi-structured, and unstructured data, so using an EAV may make sense there. I would place that in the niche I mentioned. Most of what they store is likely unstructured, loosely typed metadata or document/site artifacts, not necessarily data-domain elements. In my opinion, when you have command over your data domain and need to store strongly typed data elements, you should do everything in your power to avoid employing an EAV model (a non-relational concept) in a relational database. What is thought to be saved initially by implementing an EAV is paid for dearly when it comes time to work with that data in a tabular way, which is the definition of an anti-pattern. Just my two cents.

    edit: spelling

    OH, YES! I strongly agree! That little bit of "flexibility" established in DNN 1.0 from the start has caused me many headaches all the way up to the current v6.x. When doing custom development within the DNN framework, I avoid the ProfilePropertyDefinition table if at all possible and just create my own custom table(s) for the particular requirement.

    My example above actually shows one of the major problems with the approach, in that I often have to use a PIVOT operator to get the necessary results. But very often I get handed a pre-existing site in which the previous admin has used the native definition tables, and it would be more work to change it than to just apply the procedures I've collected over the years, like my pivot query above. A pivot seems simple once you see a good example, but TOO MANY TIMES TO COUNT I've seen something like this:

    SELECT
        *
    FROM
        (
        SELECT up.Value
        FROM User u
        INNER JOIN UserProfile up ON u.UserID = up.UserID
        INNER JOIN ProfilePropertyDefinition ppd ON up.PropertyID = ppd.PropertyID
        WHERE u.UserID = 12345 AND ppd.PropertyName = 'FName'
        ) AS FName
        ,(
        SELECT up.Value FROM User u INNER JOIN ..... etc
        ) AS LName
        ,(
        SELECT up.Value FROM User u INNER JOIN ..... etc
        ) AS City
        ,(
        SELECT up.Value FROM User u INNER JOIN ..... etc
        ) AS State
        /*** repeated dozens of times for all the profile items...(!) ***/
    ) AS Result
    etc

    Then, to make things even worse, code like this will sometimes be wrapped in a cursor looping through all of the UserIDs! :blink:
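
    For what it's worth, the cursor isn't needed either: the same pivot pattern works set-based for all users at once. A sketch against the TEST_ tables from earlier in the thread (carry UserID through the derived table and drop the WHERE clause):

    SELECT
        UserID, [FirstName], [LastName], [StreetAddress], [City], [State], [Zip], [FavoriteColor], [PIN]
    FROM
        (
        SELECT u.UserID, pd.Name AS PropertyName, up.PropertyValue
        FROM dbo.TEST_Users AS u
        INNER JOIN dbo.TEST_UserProfile AS up ON up.UserID = u.UserID
        INNER JOIN dbo.TEST_ProfileDefinition AS pd ON up.DefinitionID = pd.DefinitionID
        ) AS Result
    PIVOT
        (MAX(PropertyValue) FOR PropertyName IN
        ([FirstName],[LastName],[StreetAddress],[City],[State],[Zip],[FavoriteColor],[PIN])) AS UserProfile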

    Ultimately, I don't think such a construction is necessarily always bad (an EAV, that is, not the bad query above); sometimes they can be very useful. But it's a different kind of data relationship, and it takes some thought to build efficient queries against it.

    Another construct I'd put in this category is something like a hierarchical Product table with multiple parent-child levels. That has its pros and cons as well, but writing EFFICIENT queries to build proper hierarchies can be a real PIA. 😉
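
    For the hierarchy case, a recursive CTE is usually the starting point. A sketch against a hypothetical Product table (the ProductID, ParentID, and ProductName columns are assumed names for illustration):

    -- Hypothetical adjacency-list table: ParentID is NULL at the root.
    WITH ProductTree AS
    (
        SELECT ProductID, ParentID, ProductName, 0 AS TreeLevel
        FROM dbo.Product
        WHERE ParentID IS NULL
        UNION ALL
        SELECT p.ProductID, p.ParentID, p.ProductName, t.TreeLevel + 1
        FROM dbo.Product AS p
        INNER JOIN ProductTree AS t ON p.ParentID = t.ProductID
    )
    SELECT * FROM ProductTree;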

     

  • opc.three (10/23/2012)


    Eric M Russell (10/23/2012)


    What's described above would be one form of a "property bag" technique. I have been tasked with building data extraction and reporting from a large ISV database (PegaSystems) on Oracle that used this technique. From what I've been told, it's efficient from the application's perspective, because it can pull up a single patient's chart with only a few page reads. For aggregate reporting, however, the performance was bad: full table scans. The developers actually had to replicate normalized versions of the tables just to facilitate the reporting queries.

    Yep, totally agree. That's exactly where EAVs and these types of property-bag models fall down. They seem like a great idea at first because the developer making the choice is building an OLTP app: "it's so elegant, it can handle any type of property anyone could ever imagine, and it performs well; what could be better than this?" That holds until some poor soul actually has to access more than one row at a time (reporting, batch processing, searching, etc.). I have been that poor soul, hence the soapbox 😀 On the other hand, I have been paid well to undo messes like this :satisfied:

    In later versions of both Oracle and SQL Server, it's possible to index XML columns, which would make this approach more scalable for querying, but I still prefer a fully normalized table design. Fortunately, I don't get many requests for tables that allow users to create their own attributes or "fields" on the fly.

    I have not worked with an XML property-bag implementation as described, but I have worked with XML indexes a bit, and they come with their own set of baggage to consider.

    Exactly! I had to support a system like this for patient case records. It worked fantastically when a nurse only had to pull data for one patient. As soon as they wanted reports, the issues started to show. I always had to tell folks to keep the requested fields from those particular tables to a minimum because of the table scans. Unfortunately, it wasn't our app, so I couldn't make any changes to the schema. Otherwise I could have alleviated some of the issues with some well-placed indexes, though it still wouldn't have been ideal.
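
    As an illustration of the kind of index that can help, here's a sketch against the TEST_UserProfile table from earlier in the thread (not the vendor's actual schema). Putting the attribute key first lets a "property X for many users" report seek instead of scan, and the INCLUDE keeps the index covering:

    -- Sketch: covering index for attribute-first reporting queries.
    -- nvarchar(max) columns are allowed in an INCLUDE list.
    CREATE NONCLUSTERED INDEX IX_TEST_UserProfile_Definition_User
    ON dbo.TEST_UserProfile (DefinitionID, UserID)
    INCLUDE (PropertyValue);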


    SELECT quote FROM brain WHERE original = 1
    0 rows returned
