Index and or constraint

  • I have this table
    CREATE TABLE [dbo].[LaserData](
        [Id] [int] IDENTITY(1,1) NOT NULL,
        [MachineId} [nvarchar](50) NULL,
        [F1] [nvarchar](50) NOT NULL,
        [Kant] [char](1) NULL,
    -- number of columns removed
    )

    Id is the primary key (needed for Entity Framework). The columns MachineId + F1 + Kant must be unique! The default sorting is on these three columns. Searching will usually be on F1 and all of them. also query for a specific MachineId
    Should I create a index
    CREATE UNIQUE NONCLUSTERED INDEX table ON table([MachineId] ASC, [F1] ASC, [Kant] ASC)
    or make two or more indexes for the columns and create an constraint for the uniqueness of the three columns?

    // Anders
    PS! I'm using SQL Server 2014 (but I couldn't find any Newbie subforum under the 2014 forum

  • I'd do a constraint on the 3 columns for sure since you are saying they need to be unique.  But when you say they need to be unique, do you mean that you could have the following 2 values:
    MachineId = 'bob'
    F1 = 'bob'
    Kant = 'b'
    and
    MachineId = 'bob'
    F1 = 'bob'
    Kant = 'c'
     or can there be no duplicates in any of those columns?  This would change how I would design this.  My interpretation of what you said is that the above should be allowed since the 3 columns added together would give you unique value.

    It might be easier to normalize this and use foreign key lookups.  It would make the data lookup faster and will result in smaller tables presuming some of those values are allowed to replicate in the table.  Data lookup could be faster based on how you defined your indexes.

    As for indexes, it depends.  If the table has a lot of insert/update/delete statements run against it, multiple indexes may not be the way to go.  If it is mostly selects being run against it, having more focused indexes will result in faster queries.
    Also, if there are a lot of inserts happening, your identity column may not be set up quite right.  You can double the number of ints you can use by setting it to start at negative 2,147,483,647 (-2,147,483,647) instead of starting at 1.  This will double the number of inserts you can do before you run out of ints.  The other option is to use bigint instead of int.
    But I think normalizing it will give you the best performance increase and you can index the other tables for fast lookups.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • If you were to make 2 indexes for the columns, what would the other one be?  Creating a unique constraint does create an index so that isn't any different.  One index should be fine, having the first column being the one that's queried the most.  When you say "Searching will usually be on F1 and all of them" does that mean each search will include all 3 columns?

  • andis59 - Tuesday, April 4, 2017 2:02 PM

    I have this table
    CREATE TABLE [dbo].[LaserData](
        [Id] [int] IDENTITY(1,1) NOT NULL,
        [MachineId} [nvarchar](50) NULL,
        [F1] [nvarchar](50) NOT NULL,
        [Kant] [char](1) NULL,
    -- number of columns removed
    )

    Id is the primary key (needed for Entity Framework). The columns MachineId + F1 + Kant must be unique! The default sorting is on these three columns. Searching will usually be on F1 and all of them. also query for a specific MachineId
    Should I create a index
    CREATE UNIQUE NONCLUSTERED INDEX table ON table([MachineId] ASC, [F1] ASC, [Kant] ASC)
    or make two or more indexes for the columns and create an constraint for the uniqueness of the three columns?

    // Anders
    PS! I'm using SQL Server 2014 (but I couldn't find any Newbie subforum under the 2014 forum

    Unique constraints are enforced by an index.

    What kind of values will be contained in the F1 column and what is the average row length of the table?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here is some more information.
    The data is used for creating a marking of a product. There already is a database and we now need to add an identification of the marking machine, since the same article will have different layouts. Due to the fact that this database was originally an Excel document we have column names like F1, F2, .... The table have 26 columns that all can be used in the layout.

    The columns for identification are
    MachineId  The Id of the machine
    F1 the Article number
    Kant The side of the product that the marking should be on

    The MachineId will be used as a 'grouping' where each group may have the same F1 + Kant.
    MachineId + F1 + Kant must be unique

    This is the current table (from MS SQL Server Management Studio) to which I need to add MachineId.

    CREATE TABLE [dbo].[LaserData](
        [Id] [int] IDENTITY(1,1) NOT NULL,,
        [F1] [nvarchar](50) NOT NULL,
        [Kant] [char](1) NULL,
        [Avdelning] [nchar](50) NULL,
        [F2] [nvarchar](50) NULL,
        [F3] [nvarchar](50) NULL,
        [F4] [nvarchar](50) NULL,
        [F5] [nvarchar](50) NULL,
        [F6] [nvarchar](50) NULL,
        [F7] [nvarchar](50) NULL,
        [F8] [nvarchar](50) NULL,
        [F9] [nvarchar](50) NULL,
        [F10] [nvarchar](50) NULL,
        [BC1] [nvarchar](50) NULL,
        [BC2] [nvarchar](50) NULL,
        [Template] [nvarchar](50) NULL,
        [P1] [nvarchar](50) NULL,
        [P2] [nvarchar](50) NULL,
        [P3] [nvarchar](50) NULL,
        [P4] [nvarchar](50) NULL,
        [P5] [nvarchar](50) NULL,
        [P6] [nvarchar](50) NULL,
        [FixtureId] [nvarchar](50) NULL,
        [ExternTest] [bit] NULL,
        [EnableTO] [bit] NULL,
        [TOnr] [nvarchar](50) NULL,
    CONSTRAINT [PK_LaserData] PRIMARY KEY CLUSTERED
    (
        [Id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
    CONSTRAINT [IX_laserdata_F1_Kant] UNIQUE NONCLUSTERED
    (
        [F1] ASC,
        [Kant] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

  • bmg002 - Tuesday, April 4, 2017 3:05 PM

    I'd do a constraint on the 3 columns for sure since you are saying they need to be unique.  But when you say they need to be unique, do you mean that you could have the following 2 values:
    MachineId = 'bob'
    F1 = 'bob'
    Kant = 'b'
    and
    MachineId = 'bob'
    F1 = 'bob'
    Kant = 'c'
     or can there be no duplicates in any of those columns?  This would change how I would design this.  My interpretation of what you said is that the above should be allowed since the 3 columns added together would give you unique value.

    It might be easier to normalize this and use foreign key lookups.  It would make the data lookup faster and will result in smaller tables presuming some of those values are allowed to replicate in the table.  Data lookup could be faster based on how you defined your indexes.

    As for indexes, it depends.  If the table has a lot of insert/update/delete statements run against it, multiple indexes may not be the way to go.  If it is mostly selects being run against it, having more focused indexes will result in faster queries.
    Also, if there are a lot of inserts happening, your identity column may not be set up quite right.  You can double the number of ints you can use by setting it to start at negative 2,147,483,647 (-2,147,483,647) instead of starting at 1.  This will double the number of inserts you can do before you run out of ints.  The other option is to use bigint instead of int.
    But I think normalizing it will give you the best performance increase and you can index the other tables for fast lookups.

    I have added some info which hopefully will answer the first question you have, but the example you give would be correct. There may also be
    MachineId = 'ann'
    F1 = 'bob'
    Kant = 'b'
    and
    MachineId = 'ann'
    F1 = 'bob'
    Kant = 'c'

    I.e. The same F1+kant may be used with another MachineId.

    The table is mainly used for querying. The only time it's used for inserting/editing is when there is a new article or some of the marking data changes. When querying all three columns will be used in the where clause.
    The table contains about 1500 rows at the moment an we estimate that we might reach 5000 in a couple of years. So int will probably be sufficiant 😉

    Being a beginner at databases and SQL; How would I normalize this table?

    Thank you for your answer!

  • andis59 - Wednesday, April 5, 2017 12:35 AM

    Here is some more information.
    The data is used for creating a marking of a product. There already is a database and we now need to add an identification of the marking machine, since the same article will have different layouts. Due to the fact that this database was originally an Excel document we have column names like F1, F2, .... The table have 26 columns that all can be used in the layout.

    The columns for identification are
    MachineId  The Id of the machine
    F1 the Article number
    Kant The side of the product that the marking should be on

    The MachineId will be used as a 'grouping' where each group may have the same F1 + Kant.
    MachineId + F1 + Kant must be unique

    This is the current table (from MS SQL Server Management Studio) to which I need to add MachineId.

    CREATE TABLE [dbo].[LaserData](
        [Id] [int] IDENTITY(1,1) NOT NULL,,
        [F1] [nvarchar](50) NOT NULL,
        [Kant] [char](1) NULL,
        [Avdelning] [nchar](50) NULL,
        [F2] [nvarchar](50) NULL,
        [F3] [nvarchar](50) NULL,
        [F4] [nvarchar](50) NULL,
        [F5] [nvarchar](50) NULL,
        [F6] [nvarchar](50) NULL,
        [F7] [nvarchar](50) NULL,
        [F8] [nvarchar](50) NULL,
        [F9] [nvarchar](50) NULL,
        [F10] [nvarchar](50) NULL,
        [BC1] [nvarchar](50) NULL,
        [BC2] [nvarchar](50) NULL,
        [Template] [nvarchar](50) NULL,
        [P1] [nvarchar](50) NULL,
        [P2] [nvarchar](50) NULL,
        [P3] [nvarchar](50) NULL,
        [P4] [nvarchar](50) NULL,
        [P5] [nvarchar](50) NULL,
        [P6] [nvarchar](50) NULL,
        [FixtureId] [nvarchar](50) NULL,
        [ExternTest] [bit] NULL,
        [EnableTO] [bit] NULL,
        [TOnr] [nvarchar](50) NULL,
    CONSTRAINT [PK_LaserData] PRIMARY KEY CLUSTERED
    (
        [Id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
    CONSTRAINT [IX_laserdata_F1_Kant] UNIQUE NONCLUSTERED
    (
        [F1] ASC,
        [Kant] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    I'm not seeing a "MachineID" in that table.  And... wow... just WOW!  Apparently the original designer of the table had no concept in the usefulness of meaningful names regardless of what the source was.  To be honest, that's the first thing I'd fix so that a Developer doesn't need to rely on tribal knowledge or external documentation to know what's in a given column.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Wednesday, April 5, 2017 7:50 AM

    andis59 - Wednesday, April 5, 2017 12:35 AM

    This is the current table (from MS SQL Server Management Studio) to which I need to add MachineId.

    I'm not seeing a "MachineID" in that table.  And... wow... just WOW!  Apparently the original designer of the table had no concept in the usefulness of meaningful names regardless of what the source was.  To be honest, that's the first thing I'd fix so that a Developer doesn't need to rely on tribal knowledge or external documentation to know what's in a given column.

    No, as I said I need to add this column.
    The first thought that went through my mind when I saw the tables, was to change the column names. But the names are used in many, many layouts (that define how and what should be marked), so it wasn't realistic to make this change. Also the columns are used for product specific data so the same column can have different meaning for different products...

  • My rule of thumb for creating a constraint vs an index is the purpose of the object.  If the main purpose is to enforce a business key (uniqueness) I create a constraint, if it is mainly for querying then it I create an index.  Since both do create an index, it really is semantics.

    As has already been said, if you can, now is the time to fix the naming and make the column names more descriptive.

  • andis59 - Wednesday, April 5, 2017 12:47 AM

    bmg002 - Tuesday, April 4, 2017 3:05 PM

    I'd do a constraint on the 3 columns for sure since you are saying they need to be unique.  But when you say they need to be unique, do you mean that you could have the following 2 values:
    MachineId = 'bob'
    F1 = 'bob'
    Kant = 'b'
    and
    MachineId = 'bob'
    F1 = 'bob'
    Kant = 'c'
     or can there be no duplicates in any of those columns?  This would change how I would design this.  My interpretation of what you said is that the above should be allowed since the 3 columns added together would give you unique value.

    It might be easier to normalize this and use foreign key lookups.  It would make the data lookup faster and will result in smaller tables presuming some of those values are allowed to replicate in the table.  Data lookup could be faster based on how you defined your indexes.

    As for indexes, it depends.  If the table has a lot of insert/update/delete statements run against it, multiple indexes may not be the way to go.  If it is mostly selects being run against it, having more focused indexes will result in faster queries.
    Also, if there are a lot of inserts happening, your identity column may not be set up quite right.  You can double the number of ints you can use by setting it to start at negative 2,147,483,647 (-2,147,483,647) instead of starting at 1.  This will double the number of inserts you can do before you run out of ints.  The other option is to use bigint instead of int.
    But I think normalizing it will give you the best performance increase and you can index the other tables for fast lookups.

    I have added some info which hopefully will answer the first question you have, but the example you give would be correct. There may also be
    MachineId = 'ann'
    F1 = 'bob'
    Kant = 'b'
    and
    MachineId = 'ann'
    F1 = 'bob'
    Kant = 'c'

    I.e. The same F1+kant may be used with another MachineId.

    The table is mainly used for querying. The only time it's used for inserting/editing is when there is a new article or some of the marking data changes. When querying all three columns will be used in the where clause.
    The table contains about 1500 rows at the moment an we estimate that we might reach 5000 in a couple of years. So int will probably be sufficiant 😉

    Being a beginner at databases and SQL; How would I normalize this table?

    Thank you for your answer!

    I agree with Jeff Moden; do those column names mean anything to you and/or your software?  If you have done any software development, you likely know the pitfalls of calling a variable "A".  Might work fine, but support of the system gets more difficult the longer it is that nobody looks at it.

    As for normalizing the table, if it was me, I'd pull the columns that would contain multiple reused values and store them in their own table.  Things like machineid could be put into their own "MachineID" table.  But with the small expected dataset, I'm not sure I'd spend a lot of time normalizing the database.
    There are multiple advantages to normalizing including reduced database size and reusability.  You get reduced size due to de-duplication of values as you are now only using ints (or smallints depending on the number of MachineId's you expect) in this table and only 1 MachineId plus an int (or smallint) in the MachineID table and then you can just join the tables.  You get reusability as now any table that requires MachineId can pull it from the MachineID table and if you need to add a new MachineID, you just have 1 table to update.

    As for fixing the column names, there are solutions if your software relies on the names and it is painful to change all of the software - rename the table and create a view that pulls the data from the table with the column names that the software expects.  Then rename the columns on the table.  It is a bit more work, but it will make maintaining that table a lot easier if you can give the columns meaningful names.  Although, if the columns can have multiple meanings based on product, then renaming the columns doesn't serve a good purpose.  In that case, I think re-writing the software to write to different tables for different products so you can give them meaningful names is likely a better solution.  But that is also a long term project.  I have a similar project to do where the columns were called "userDefined1" through "userDefined10" and depending on the product determines which of those fields are used and what they are used for.  Sometimes UserDefined2 is a serial number, sometimes it is a customer number, sometimes it is a test result.  I don't control the software unfortunately and I just inherited the database so there isn't a lot that I can do.  The business doesn't see the benefit in changing the software when things currently work.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • andis59 - Wednesday, April 5, 2017 8:15 AM

    Jeff Moden - Wednesday, April 5, 2017 7:50 AM

    andis59 - Wednesday, April 5, 2017 12:35 AM

    This is the current table (from MS SQL Server Management Studio) to which I need to add MachineId.

    I'm not seeing a "MachineID" in that table.  And... wow... just WOW!  Apparently the original designer of the table had no concept in the usefulness of meaningful names regardless of what the source was.  To be honest, that's the first thing I'd fix so that a Developer doesn't need to rely on tribal knowledge or external documentation to know what's in a given column.

    No, as I said I need to add this column.
    The first thought that went through my mind when I saw the tables, was to change the column names. But the names are used in many, many layouts (that define how and what should be marked), so it wasn't realistic to make this change. Also the columns are used for product specific data so the same column can have different meaning for different products...

    So, name the table something else and fix it.  Create a view with the original table names and columns and the existing apps will never know the difference.  Just make sure to follow the rules for creating an updateable view.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Wednesday, April 5, 2017 12:09 PM

    So, name the table something else and fix it.  Create a view with the original table names and columns and the existing apps will never know the difference.  Just make sure to follow the rules for creating an updateable view.

    The problem is that most of the columns can be used for anything, it's product specific. The only column name that I know could be changed is F1, which is Article Number.

  • andis59 - Wednesday, April 5, 2017 12:44 PM

    Jeff Moden - Wednesday, April 5, 2017 12:09 PM

    So, name the table something else and fix it.  Create a view with the original table names and columns and the existing apps will never know the difference.  Just make sure to follow the rules for creating an updateable view.

    The problem is that most of the columns can be used for anything, it's product specific. The only column name that I know could be changed is F1, which is Article Number.

    Heh... <headdesk><headdesk><headdesk><facepalm> 😉  Add an "Oh my" to my previous "Wow!" 😉  They sure did make things un-normalizable (to coin a phrase) for you.

    I'm thinking that your plan of using MachineID+F1+Kant for the PK and then indexing the other two columns as however you see fit is about the only thing that you can do to this table.  Other than those 3 columns, there's probably no chance at DRI in the form of FKs, either.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Wednesday, April 5, 2017 4:54 PM

    I'm thinking that your plan of using MachineID+F1+Kant for the PK //snip//

    He can't - 2 of those columns are nullable.
    Which also makes a unique constraint harder to implement, but not impossible.

  • It might not be a bad idea to turn this project of "add a column" into a full re-write of all of the software to utilize a better database design?  Keep the old table for historical purposes (if needed) and migrate to a completely new setup.

    It will be more work short term, but the long term benefits will be worth it in the end.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply