|
|
|
Say Hey Kid
      
Group: General Forum Members
Last Login: Yesterday @ 12:28 PM
Points: 675,
Visits: 2,031
|
|
It's been a long time since I've dealt with complex database design, and in this case, I'd like to ask the opinions of others who are more current than I am.
I need to design a good structure to store and enforce the rules, if practical on two types of information.
Constraint: CLR is disallowed. Constraint: SQL Server 2012 SP1 is the DB being targeted.
Physical object No physical object relates to any other physical row. Physical objects have at most one descendent virtual object row. Physical objects have "Physical" type data, which is unrelated to the data virtual objects have (mostly non-overlapping)
Virtual object All virtual objects have a parent of EITHER a physical object, OR a virtual object, never both. No virtual object is its own parent. All virtual objects relate either directly or indirectly (through ancestor virtual objects) to a physical object. Virtual objects have zero or more descendent virtual object row(s). Virtual objects have "Virtual" type data, which is unrelated to the data physical objects have (mostly non-overlapping)
I.e. example objects: Phys1 Phys2 Phys3
Virt1 - parent Phys1 Virt2 - parent Phys2 Virt3 - parent Virt2 Virt4 - parent Virt3 Virt5 - parent Virt3
My initial idea is one I really don't like, as I don't like conditional field choice (use field A except when you use field B), but I hate fields that join to more than one table even more, which resulted in: DB init
CREATE DATABASE [PhysVirtTest] CONTAINMENT = NONE ON PRIMARY ( NAME = N'PhysVirtTest', FILENAME = N'YourPath\PhysVirtTest.mdf' , SIZE = 8MB , FILEGROWTH = 4MB ) LOG ON ( NAME = N'PhysVirtTest_log', FILENAME = N'YourPath\PhysVirtTest_log.ldf' , SIZE = 4MB , FILEGROWTH = 4MB ) GO ALTER DATABASE [PhysVirtTest] SET COMPATIBILITY_LEVEL = 110 GO ALTER DATABASE [PhysVirtTest] SET ANSI_NULL_DEFAULT OFF GO ALTER DATABASE [PhysVirtTest] SET ANSI_NULLS OFF GO ALTER DATABASE [PhysVirtTest] SET ANSI_PADDING OFF GO ALTER DATABASE [PhysVirtTest] SET ANSI_WARNINGS OFF GO ALTER DATABASE [PhysVirtTest] SET ARITHABORT OFF GO ALTER DATABASE [PhysVirtTest] SET AUTO_CLOSE OFF GO ALTER DATABASE [PhysVirtTest] SET AUTO_CREATE_STATISTICS ON GO ALTER DATABASE [PhysVirtTest] SET AUTO_SHRINK OFF GO ALTER DATABASE [PhysVirtTest] SET AUTO_UPDATE_STATISTICS ON GO ALTER DATABASE [PhysVirtTest] SET CURSOR_CLOSE_ON_COMMIT OFF GO ALTER DATABASE [PhysVirtTest] SET CURSOR_DEFAULT GLOBAL GO ALTER DATABASE [PhysVirtTest] SET CONCAT_NULL_YIELDS_NULL OFF GO ALTER DATABASE [PhysVirtTest] SET NUMERIC_ROUNDABORT OFF GO ALTER DATABASE [PhysVirtTest] SET QUOTED_IDENTIFIER OFF GO ALTER DATABASE [PhysVirtTest] SET RECURSIVE_TRIGGERS OFF GO ALTER DATABASE [PhysVirtTest] SET DISABLE_BROKER GO ALTER DATABASE [PhysVirtTest] SET AUTO_UPDATE_STATISTICS_ASYNC OFF GO ALTER DATABASE [PhysVirtTest] SET DATE_CORRELATION_OPTIMIZATION OFF GO ALTER DATABASE [PhysVirtTest] SET PARAMETERIZATION SIMPLE GO ALTER DATABASE [PhysVirtTest] SET READ_COMMITTED_SNAPSHOT OFF GO ALTER DATABASE [PhysVirtTest] SET READ_WRITE GO ALTER DATABASE [PhysVirtTest] SET RECOVERY SIMPLE GO ALTER DATABASE [PhysVirtTest] SET MULTI_USER GO ALTER DATABASE [PhysVirtTest] SET PAGE_VERIFY CHECKSUM GO ALTER DATABASE [PhysVirtTest] SET TARGET_RECOVERY_TIME = 0 SECONDS GO USE [PhysVirtTest] GO IF NOT EXISTS (SELECT name FROM sys.filegroups WHERE is_default=1 AND name = N'PRIMARY') ALTER DATABASE [PhysVirtTest] MODIFY FILEGROUP [PRIMARY] DEFAULT GO
Old-fashioned way
-- The first try, without using HierarchyID - output still needs to be consolidated with FOR XML or other tricks USE [PhysVirtTest];
DROP TABLE dbo.VirtEx1; DROP TABLE dbo.PhysEx1;
CREATE TABLE dbo.PhysEx1 ( ID INT IDENTITY(1,1) NOT NULL ,Data VARCHAR(32) ,CONSTRAINT PKP1 PRIMARY KEY CLUSTERED (ID) );
CREATE TABLE dbo.VirtEx1 ( ID INT IDENTITY(1,1) NOT NULL ,VirtParentID INT NULL CONSTRAINT FKVV1 FOREIGN KEY REFERENCES PhysVirtTest.dbo.VirtEx1(ID) ,PhysParentID INT NULL CONSTRAINT FKVP1 FOREIGN KEY REFERENCES PhysVirtTest.dbo.PhysEx1(ID) ,Data VARCHAR(32) ,CONSTRAINT CKOnePar1 CHECK ((VirtParentID IS NOT NULL AND PhysParentID IS NULL) OR (VirtParentID IS NULL AND PhysParentID IS NOT NULL)) ,CONSTRAINT CKVDup1 CHECK (VirtParentID <> ID) ,CONSTRAINT PKV1 PRIMARY KEY CLUSTERED (ID) );
CREATE UNIQUE NONCLUSTERED INDEX UNIVP1 ON VirtEx1 (PhysParentID) WHERE PhysParentID IS NOT NULL;
-- Insert Good Data INSERT INTO PhysEx1 (Data) VALUES ('Phys1'), ('Phys2'), ('Phys3'); INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(NULL,1,'Virt1'), (NULL,2,'Virt2'), (2,NULL,'Virt3'),(3,NULL,'Virt4'),(3,NULL,'Virt5')
-- Basic data display SELECT * FROM PhysEx1 SELECT * FROM VirtEx1
-- Virtual object parental display -- THIS TECHNIQUE ALLOWS STRICTLY LIMITED NESTING (but doesn't loop) -- If you don't have ALL NULLs in the right-most ancestor column, -- you may be missing some ancestors! SELECT Vdesc.* ,COALESCE(Vpar4.Data, Ppar4.DATA, Vpar3.Data, Ppar3.DATA, Vpar2.Data, Ppar2.DATA, Vpar1.Data, Ppar1.DATA) AS PhysHost ,COALESCE(Vpar1.Data, Ppar1.DATA) AS Parent ,COALESCE(Vpar2.Data, Ppar2.DATA) AS GrandParent ,COALESCE(Vpar3.Data, Ppar3.DATA) AS GreatGrandParent ,COALESCE(Vpar4.Data, Ppar4.DATA) AS GreatGreatGrandParent FROM VirtEx1 Vdesc LEFT OUTER JOIN VirtEx1 Vpar1 ON Vpar1.ID = Vdesc.VirtParentID LEFT OUTER JOIN PhysEx1 Ppar1 ON Ppar1.ID = Vdesc.PhysParentID LEFT OUTER JOIN VirtEx1 Vpar2 ON Vpar2.ID = Vpar1.VirtParentID LEFT OUTER JOIN PhysEx1 Ppar2 ON Ppar2.ID = Vpar1.PhysParentID LEFT OUTER JOIN VirtEx1 Vpar3 ON Vpar3.ID = Vpar2.VirtParentID LEFT OUTER JOIN PhysEx1 Ppar3 ON Ppar3.ID = Vpar2.PhysParentID LEFT OUTER JOIN VirtEx1 Vpar4 ON Vpar4.ID = Vpar3.VirtParentID LEFT OUTER JOIN PhysEx1 Ppar4 ON Ppar4.ID = Vpar3.PhysParentID
-- Test Bad Data prevention INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(NULL,1,'DuplicatePhysParent') INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(NULL,NULL,'NoParent') INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(1,1,'TooManyParent') INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(111,NULL,'VirtParentNotExist') INSERT INTO VirtEx1 (VirtParentID, PhysParentID, Data) VALUES(NULL,111,'PhysParentNotExist')
Possible hierarchyid based, but this is my very first foray into hierarchyid, and trying to get the Data values for ancestors isn't as simple as I'd hoped.
-- The second try, using HierarchyID - output is still not coded at all USE [PhysVirtTest];
DROP TABLE dbo.VirtEx2; DROP TABLE dbo.PhysEx2;
CREATE TABLE dbo.PhysEx2 ( ID INT IDENTITY(1,1) NOT NULL ,Data VARCHAR(32) ,CONSTRAINT PKP2 PRIMARY KEY CLUSTERED (ID) );
CREATE TABLE dbo.VirtEx2 ( ID HIERARCHYID NOT NULL ,PhysParentID INT NULL CONSTRAINT FKVP2 FOREIGN KEY REFERENCES PhysVirtTest.dbo.PhysEx2(ID) ,Data VARCHAR(32) --,CONSTRAINT CKOnePar2 CHECK ((VirtParentID IS NOT NULL AND PhysParentID IS NULL) OR (VirtParentID IS NULL AND PhysParentID IS NOT NULL)) ,CONSTRAINT PKV2 PRIMARY KEY CLUSTERED (ID) );
CREATE UNIQUE NONCLUSTERED INDEX UNIVP2 ON VirtEx2 (PhysParentID) WHERE PhysParentID IS NOT NULL;
-- Insert Good Data INSERT INTO PhysEx2 (Data) VALUES ('Phys1'), ('Phys2'), ('Phys3'); -- Hierarchyid guideance from http://stackoverflow.com/questions/1352778/sql-2008-hierarchyid-with-multiple-root-nodes INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(hierarchyid::GetRoot().GetDescendant(NULL,NULL),1,'Virt1') INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(hierarchyid::GetRoot().GetDescendant((select MAX(ID) from VirtEx2 where ID.GetAncestor(1) = hierarchyid::GetRoot()),NULL),2,'Virt2')
DECLARE @VirtParent hierarchyid DECLARE @VirtNew hierarchyid
SELECT @VirtParent = ID FROM VirtEx2 WHERE Data = 'Virt2'; INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(@VirtParent.GetDescendant(NULL,NULL),NULL,'Virt3') SELECT @VirtParent = ID FROM VirtEx2 WHERE Data = 'Virt3'; SELECT @VirtNew = MAX(ID) FROM VirtEx2 WHERE ID.GetAncestor(1) = @VirtParent; -- Method that works only for the first node: INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(@VirtParent.GetDescendant(NULL,NULL),NULL,'Virt4') INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(@VirtParent.GetDescendant(@VirtNew,NULL),NULL,'Virt4') SELECT @VirtNew = MAX(ID) FROM VirtEx2 WHERE ID.GetAncestor(1) = @VirtParent; INSERT INTO VirtEx2 (ID, PhysParentID, Data) VALUES(@VirtParent.GetDescendant(@VirtNew,NULL),NULL,'Virt5')
-- Basic data display SELECT * FROM PhysEx2 SELECT ID.ToString() AS IDString, ID.GetLevel() AS IDLevel, * FROM VirtEx2
-- ??? Advanced data display - parents and grandparents.
-- Test Bad Data prevention --'DuplicatePhysParent' --'NoParent' --'TooManyParent' --'VirtParentNotExist' --'PhysParentNotExist'
Thank you all for your time in reading and/or replying.
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 5:48 PM
Points: 7,088,
Visits: 7,143
|
|
I haven't done any database design that really matters for a few years, but I would do something very different from your idea.
Lets start with fundamental basics.
I want the design to be fairly proof against new releases of the RDBMS software, in this case SQL Server; and I want not to have it preclude the use of recently added (eight years ago?) features by using bizarre configuration. So I would change this block of code to have ON in every case where tyou have OFF.ALTER DATABASE [PhysVirtTest] SET ANSI_NULL_DEFAULT OFF GO ALTER DATABASE [PhysVirtTest] SET ANSI_NULLS OFF GO ALTER DATABASE [PhysVirtTest] SET ANSI_PADDING OFF GO ALTER DATABASE [PhysVirtTest] SET ANSI_WARNINGS OFF GO ALTER DATABASE [PhysVirtTest] SET ARITHABORT OFF GO ALTER DATABASE [PhysVirtTest] SET CURSOR_CLOSE_ON_COMMIT OFF GO ALTER DATABASE [PhysVirtTest] SET CONCAT_NULL_YIELDS_NULL OFF GO ALTER DATABASE [PhysVirtTest] SET NUMERIC_ROUNDABORT OFF GO ALTER DATABASE [PhysVirtTest] SET QUOTED_IDENTIFIER OFF GO I'm not sure that I haven't included a single ine that could sensibly have OFF, but certainly the vast majority should have ON.
Then why are there two tables (virtual objects and physical objects) instead of one? It doesn't help at all, and it does hinder! Also, what is the attitude to NULL? It seems to me that avoiding NULL is the only imaginable reason for having two tables instead of one, but your schema has NULLs anyway. Depending on how NULL-averse you are you might have more than one table, but the extra tables over and above the basic one table would be link tables, not base object tables.
So I would end up with something like CREATE TABLE dbo.PhysorVirtEx ( ID INT IDENTITY(1,1) NOT NULL Primary key Clustered ,ParentID INT NULL REFERENCES PhysorVirtEx(ID), ,Data VARCHAR(32) ,IsVirtual BIT NOT NULL, ,CONSTRAINT CKOnePar1 CHECK (ParentID is not NULL or Isvirtual = 0) ,CONSTRAINT CKVDup1 CHECK (VirtParentID <> ID) ); As you can see, I'm not so null-averse as to be silly about it.
Tom Que conclure à la fin de tous mes longs propos? C'est que les préjugés sont la raison des sots. (Voltaire, 1756)
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Today @ 11:14 PM
Points: 6,706,
Visits: 11,738
|
|
Physical objects have at most one descendent virtual object row.
Virtual objects have zero or more descendent virtual object row(s).
Which might be more common, someone asking to which physical object does this virtual object relate?? or which virtual objects refer to this physical object?.
One more question, if a physical object has 10 virtual descendants which would be more common, a separation of 10 levels between the physical object and the leaf-level virtual object (i.e. each virtual object has only one descendant) or a separation of far less that 10, maybe 2 (e.g. physical object has one virtual child and that virtual object has 9 virtual object children).
__________________________________________________________________________________________________ There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Believe you can and you're halfway there. --Theodore Roosevelt
Everything Should Be Made as Simple as Possible, But Not Simpler --Albert Einstein
The significant problems we face cannot be solved at the same level of thinking we were at when we created them. --Albert Einstein
1 apple is not exactly 1/8 of 8 apples. Because there are no absolutely identical apples. --Giordy
|
|
|
|
|
Say Hey Kid
      
Group: General Forum Members
Last Login: Yesterday @ 12:28 PM
Points: 675,
Visits: 2,031
|
|
L' Eomot Inversé (1/11/2013)
I haven't done any database design that really matters for a few years, but I would do something very different from your idea. Lets start with fundamental basics. I want the design to be fairly proof against new releases of the RDBMS software, in this case SQL Server; and I want not to have it preclude the use of recently added (eight years ago?) features by using bizarre configuration. So I would change this block of code to have ON in every case where tyou have OFF. ALTER DATABASE [PhysVirtTest] SET ANSI_NULL_DEFAULT OFF ... I'm not sure that I haven't included a single ine that could sensibly have OFF, but certainly the vast majority should have ON. Then why are there two tables (virtual objects and physical objects) instead of one? It doesn't help at all, and it does hinder! Also, what is the attitude to NULL? It seems to me that avoiding NULL is the only imaginable reason for having two tables instead of one, but your schema has NULLs anyway. Depending on how NULL-averse you are you might have more than one table, but the extra tables over and above the basic one table would be link tables, not base object tables. So I would end up with something like CREATE TABLE dbo.PhysorVirtEx ( ID INT IDENTITY(1,1) NOT NULL Primary key Clustered ,ParentID INT NULL REFERENCES PhysorVirtEx(ID), ,Data VARCHAR(32) ,IsVirtual BIT NOT NULL, ,CONSTRAINT CKOnePar1 CHECK (ParentID is not NULL or Isvirtual = 0) ,CONSTRAINT CKVDup1 CHECK (VirtParentID <> ID) ); As you can see, I'm not so null-averse as to be silly about it.
First, thank you very much for responding. The ANSI settings were 100% the defaults SQL 2012 gave because I didn't care about the sample database, but I wanted one for your ease in cleaning up after helping me.
I'll edit the first post to add this information about why two tables: Physical and Virtual objects have mostly very different data, which in a real table will be sets of columns - the "Data" field here is merely a placeholder for simplicity. This is why I have two tables, so that non-key fields are facts about the key.
I apologize for not including that information originally.
opc.three (1/13/2013)
Physical objects have at most one descendent virtual object row. Virtual objects have zero or more descendent virtual object row(s). Which might be more common, someone asking to which physical object does this virtual object relate?? or which virtual objects refer to this physical object?. One more question, if a physical object has 10 virtual descendants which would be more common, a separation of 10 levels between the physical object and the leaf-level virtual object (i.e. each virtual object has only one descendant) or a separation of far less that 10, maybe 2 (e.g. physical object has one virtual child and that virtual object has 9 virtual object children).
Thank you for responding as well!
I expect that going from physical to virtual, and from virtual to physical, will be more or less equally common. Both will convey valuable and necessary data for certain tasks, and all of those tasks will be required on a day to day basis.
Most often, a physical object will have from 0 to 6 direct virtual descendants at a leaf level (i.e. the first level down). Rarely, a physical object will have a completely unknown tree structure going 2-N levels down, where N is in actuality unbounded, but almost certainly less than 5. These cases cannot be easily predicted in advance - perhaps they're wide and deep, perhaps narrow and deep. I'd guess they'd be narrow and deep more often, but I cannot guarantee that.
Alas, the rare/exception cases are always a pain to deal with.
|
|
|
|