Introduction
Many of us have experience on more than one SQL-DBMS, but it is almost inevitable that you will become a specialist on one product or another. All SQL-DBMSs are, of course, based on the relational model as originally defined by Edgar Codd (how well SQL follows the relational model is another issue, which we won’t go into here). Though the basic principles of all products are the same, sometimes the details of particular implementations can lead to difficulties when you make the move from one product to another. I have worked with Sybase and Oracle in the past, but today I make my living with Microsoft SQL Server, so I am most up to date with that product and have the deepest knowledge of it.
It can be interesting to look at the way the different products solve the same problems, and to observe what features may be implemented in one and not in the other. One of these differences is the way that Oracle and SQL Server handle the reporting of multiple constraint violations from a single statement. The Oracle approach is elegant and straightforward, in SQL Server we have to put a lot of effort into devising a workaround.
Oracle doesn’t always do better in this sort of comparison; SQL Server specialists are often puzzled by the dire warning issued against using triggers by many Oracle experts. To those of you are familiar with SQL Server but not with Oracle, triggers in Oracle can be used effectively but it is considerably more challenging to use them correctly [1].
On the other hand Oracle specialists may be equally puzzled by what I am about to say about SQL Server.
The Problem
For the most part SQL Server deals fairly well with set based operations, but with constraints in general, some serious shortcomings quickly become apparent. Any SQL insert, update or delete command will cause an exception condition to be raised that returns one constraint violation message, no matter how many constraints are violated, how many different types of constraint are violated, or how many of the affected rows violate the constraint.
Before we go any further, let’s have a closer look at the kind of problems that SQL Server’s standard constraint violation reporting leads to.
We will use the following small database design for the examples.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 |
CREATE TABLE dbo.customer ( customer_id int NOT NULL, customer_name varchar(100) NOT NULL, CONSTRAINT "customer id must be unique" PRIMARY KEY ( customer_id ) ); INSERT INTO dbo.customer ( customer_id, customer_name ) VALUES ( 100, 'Siemens' ); CREATE TABLE dbo.sales_order( customer_id int NOT NULL, sales_order_number int NOT NULL, sales_order_date date NOT NULL CONSTRAINT sales_order_date_default DEFAULT getdate(), CONSTRAINT "sales order number must be unique" PRIMARY KEY CLUSTERED ( sales_order_number ), CONSTRAINT "sales order must have belong to valid customer" FOREIGN KEY(customer_id) REFERENCES dbo.customer (customer_id) ); CREATE TABLE dbo.product( product_name varchar(100) NOT NULL, CONSTRAINT "product name must be unique" PRIMARY KEY CLUSTERED ( product_name ) ); CREATE TABLE sales_order_line ( sales_order_number integer NOT NULL, product_name varchar(100) NOT NULL, quantity integer NOT NULL CONSTRAINT quantity_default DEFAULT 1, CONSTRAINT "order number and product name must be unique" PRIMARY KEY (sales_order_number, product_name), CONSTRAINT "sales order line must belong to valid order" FOREIGN KEY (sales_order_number) REFERENCES sales_order (sales_order_number), CONSTRAINT "sales order line must be for valid product" FOREIGN KEY (product_name) REFERENCES product (product_name) ); |
We now attempt the following insert:
1 2 3 4 5 6 7 8 9 10 11 12 |
insert into sales_order_Line ( sales_order_number, product_name, quantity ) values ( 100, 'Bolt', 100 ) |
Resulting in the following message:
1 2 3 |
Msg 547, Level 16, State 0, Line 3 The INSERT statement conflicted with the FOREIGN KEY constraint "sales order line must belong to valid order". The conflict occurred in database "playground", table "dbo.sales_order", column 'sales_order_number'. The statement has been terminated. |
Let’s say we now add the primary key reference to fix this problem:
1 2 3 4 5 6 7 8 9 10 |
INSERT INTO sales_order ( customer_id, sales_order_number ) VALUES ( 100, 100 ) |
Attempting the insert on sales_order_line again, we once again get a foreign key violation constraint message, but for a different foreign key.
1 2 3 4 5 6 7 8 9 10 11 12 |
insert into sales_order_Line ( sales_order_number, product_name, quantity ) values ( 100, 'Bolt', 100 ) |
1 2 3 |
Msg 547, Level 16, State 0, Line 3 The INSERT statement conflicted with the FOREIGN KEY constraint "sales order line must be for valid product". The conflict occurred in database "playground", table "dbo.product", column 'product_name'. The statement has been terminated. |
Only one key violation is reported at a time. The situation gets even worse if we are inserting multiple rows. First let’s fix the foreign key violation on product:
1 |
INSERT INTO product (product_name) VALUES ('bolt') |
Now we attempt to insert several rows into sales_product_line, some of which are valid and some of which will violate foreign key constraints:
1 2 3 4 5 6 7 8 9 10 |
insert into sales_order_Line ( sales_order_number, product_name, quantity ) values (100,'Bolt', 100), (100, 'Bolt', 20), (200, 'Bolt', 1), (100, 'screw', 4) |
We observe that the insert values contain not just foreign key violations, but also primary key violations.
When issuing the insert command, as usual, only one of the constraint violations is reported:
1 2 3 |
Msg 2627, Level 14, State 1, Line 1 Violation of PRIMARY KEY constraint 'order number and product name must be unique'. Cannot insert duplicate key in object 'dbo.sales_order_line'. The duplicate key value is (100, Bolt). The statement has been terminated. |
If your database is at the size of this demonstration, it would seem absurd to consider that this is a difficulty. However, with the enormous amounts of data being routinely imported into, or captured by, databases, a constraint violation exception can lead to a great deal of subsequent forensic work to find the rows that did it.
Our Oracle user friends would probably say at this point that they don’t understand what the problem is. We just use the LOG ERRORS clause on the insert. This feature has been available since Oracle 10g Version 2.
I created our test database in Oracle and issued the following commands:
1 |
EXECUTE DBMS_ERRLOG.CREATE_ERROR_LOG('sales_order_line', 'err$_sales_order_line'); |
This creates a table to hold the errors with the same columns as sales_order_line and some additional columns to hold the error information.
We then issue the insert with the LOG ERRORS clause added at the end (the syntax has been adapted as Oracle does not support multiple value lines in the insert statement, also I had to shorten the constraint names as Oracle will only accept 30 characters):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
insert into sales_order_Line ( sales_order_number, product_name, quantity ) select 100,'Bolt', 100 from dual union all select 100, 'Bolt', 20 from dual union all select 200, 'Bolt', 1 from dual union all select 100, 'screw', 4 from dual LOG ERRORS INTO err$_sales_order_line ('INSERT') REJECT LIMIT UNLIMITED; |
This gives us the following result:
1 |
ORA ERR NUMBERSORA ERR MESG$ORA ERR ROWIDSORA ERR OPTYPSORA ERR TAGSSALES ORDER NUMBERPRODUCT NAMEQUANTITY1ORA-00001: Unique Constraint (SYSTEM.order nr: prod name not unique) verletzt|INSERT100Bolt1001ORA-00001: Unique Constraint (SYSTEM.order nr: prod name not unique) verletzt|INSERT100Bolt202291ORA-02291: Integritats-Constraint (SYSTEM.sales ord line not valid order) verletzt - ubergeordneter Schlussel nicht gefunden|INSERT200Bolt12291ORA-02291: Integritats-Constraint (SYSTEM.sales ord line not valid prod) verletzt - ubergeordneter Schlussel nicht gefunden|INSERT100screw4 |
A Solution
There are ways of avoiding this problem in SQL Server. Some workarounds are:
- Using Stored procedures or Table-valued functions. Inserts and updates will be done by the application using a stored procedure. This would allow us to do whatever validation and checking is appropriate before the insertion is done.
- Doing inserts a row at a time in a try-catch block This solution has been described here on Simple Talk in Handling Constraint Violations and Errors in SQL Server
- The use of an INSTEAD OF trigger would allow us to find a way of executing this INSERT statement so that we get a report of all the constraint violations before we actually apply it.
The first two solutions will work well if you are in control of the import process. However, this isn’t always the case. By using the INSTEAD OF trigger, users can use their existing ways of inserting, deleting or updating data. In addition we can get much closer to the behaviour of the elegant Oracle solution. Clearly this cannot be achieved in a standard trigger on the table, because the database update has already taken place before the trigger has run and, if any constraints are violated, then the trigger code will never be executed.
Here is the INSTEAD OF insert trigger to report all the constraint violations for the primary key and the two foreign key constraints in a global temporary table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
CREATE TRIGGER dbo_sales_order_line_constraint_check_insert ON dbo.sales_order_line INSTEAD OF INSERT AS DECLARE @error bit = 0 IF object_id('tempdb..##err_dbo_sales_order_line') IS NULL BEGIN CREATE TABLE ##err_dbo_sales_order_line( sales_order_number int NOT NULL ,product_name varchar(100) NOT NULL ,quantity int NULL ,spid integer NOT NULL ,constraint_name sysname NOT NULL) END DELETE ##err_dbo_sales_order_line WHERE spid = @@spid INSERT INTO ##err_dbo_sales_order_line(sales_order_number,product_name,quantity, spid, constraint_name) SELECT sales_order_number,product_name,quantity,@@spid,'order number and product name must be unique' FROM inserted WHERE EXISTS ( SELECT * FROM dbo.sales_order_line WHERE inserted.product_name = dbo.sales_order_line.product_name AND inserted.sales_order_number = dbo.sales_order_line.sales_order_number ) OR EXISTS (SELECT * FROM inserted sub WHERE inserted.sales_order_number = sub.sales_order_number AND inserted.product_name = sub.product_name AND (inserted.quantity != sub.quantity OR (inserted.quantity IS NULL AND sub.quantity IS NOT NULL) OR (inserted.quantity IS NOT NULL and sub.quantity IS NULL))) IF @@rowcount > 0 SET @error = 1 INSERT INTO ##err_dbo_sales_order_line(sales_order_number,product_name,quantity, spid, constraint_name) SELECT sales_order_number,product_name,quantity,@@spid,'sales order line must belong to valid order' FROM inserted WHERE NOT EXISTS (SELECT * FROM dbo.sales_order WHERE dbo.sales_order.sales_order_number = inserted.sales_order_number) IF @@rowcount > 0 SET @error = 1 INSERT INTO ##err_dbo_sales_order_line(sales_order_number,product_name,quantity, spid, constraint_name) SELECT sales_order_number,product_name,quantity,@@spid,'sales order line must be for valid product' FROM inserted WHERE NOT EXISTS (SELECT * FROM dbo.product WHERE dbo.product.product_name = inserted.product_name) IF @@rowcount > 0 SET @error = 1 IF @error = 0 BEGIN INSERT INTO dbo.sales_order_line SELECT DISTINCT * FROM inserted END ELSE BEGIN RAISERROR('Constraint violation(s) - SELECT * FROM ##err_dbo_sales_order_line WHERE spid = @@spid for details', 16, 1) END |
We now do the insert again:
1 2 3 4 5 6 7 8 9 10 |
insert into sales_order_Line ( sales_order_number, product_name, quantity ) values (100,'Bolt', 100), (100, 'Bolt', 20), (200, 'Bolt', 1), (100, 'screw', 4) |
We now get this result:
1 2 3 4 |
Msg 50000, Level 16, State 1, Procedure dbo_sales_order_line_constraint_check_insert, Line 43 Constraint violation(s) - SELECT * FROM ##err_dbo_sales_order_line WHERE spid = @@spid for details Msg 3609, Level 16, State 1, Line 1 The transaction ended in the trigger. The batch has been aborted. |
We can now execute the query from the RAISERROR message:
1 |
SELECT * FROM ##err_dbo_sales_order_line WHERE spid = @@spid |
Which now gives us all the rows that violated constraints and the corresponding constraint name.
1 2 3 4 5 |
sales_order_number product_name quantity spid constraint_name 100 Bolt 100 53 order number and product name must be unique 100 Bolt 20 53 order number and product name must be unique 200 Bolt 1 53 sales order line must belong to valid order 100 screw 4 53 sales order line must be for valid product |
I have deliberately used temporary tables here as the we are probably only concerned with the results during the current session and we can save ourselves a lot of housekeeping by not holding the data permanently. Of course triggers for update and delete would also be required.
Now I can think of a number of objections to this approach.
- Maintenance First of all, maintaining these triggers looks like a lot of work. We can ease the burden by generating the triggers automatically. The trigger code above was generated by a query (you can find the code for the query at the end of the article). Furthermore, the trigger could be regenerated in a DDL trigger whenever an ALTER TABLE is issued against the table. This way the trigger will always be in step with the constraints defined in the database.
- Scalability. If we do a mass update and every row violates every constraint on the table then we could be faced with a massive number of output rows from the constraint reporting. It would be useful to have some way we could switch off or limit the number of rows returned. We could hard code a limit into the triggers but this seems a little arbitrary and may lead to misleading results.
Another refinement would be to have the INSTEAD OF triggers on views rather than on the tables themselves. If the view was updated then all the constraint violations would be reported, a direct update on the table would only report the first constraint on the first row that violated that constraint. Of course this would mean maintaining a whole set of views with the same structure as the underlying tables, but here again DDL triggers could come to our aid to modify the views automatically whenever the table structure changes (we would need a completely consistent naming convention for the views to achieve this).
Conclusion
As we have seen, different SQL-DBMS products can vary considerably in how they handle particular issues. Though it can be hard enough to keep up to date on the details of our “home” product, it can be very interesting to have a heads up from time to time and look at how the other half lives. Sometimes the grass on the other side of the fence is greener, but sometimes the concrete is grayer.
I think we can be fairly hopeful that Microsoft will eventually bring a similar kind of constraint reporting into the core product, thus matching the Oracle functionality, but until that time I hope that this article has provided some ideas about how a workaround could be developed.
References
[1] Oracle: Triggers Considered Harmful, Considered Harmful
Acknowledgments
Though I have been thinking about this problem for some time, I was prompted into action by the following article on Fabian Pascal’s dbdebunk website :
Weak Entities, Referential Constraints & Normalization
Appendix: Code to Generate Insert Trigger
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 |
ALTER VIEW generate_insert_constraint_reporting_trigger AS SELECT table_schema, table_name, /******************************************************** * Generate the create or alter trigger statement ********************************************************/ CASE WHEN EXISTS ( SELECT * FROM sys.triggers WHERE name = tables.table_schema + '_' + tables.table_name + '_constraint_check_insert' ) THEN 'ALTER' ELSE 'CREATE' END + ' TRIGGER ' + tables.table_schema + '_' + tables.table_name + '_constraint_check_insert ON ' + tables.table_schema + '.' + tables.table_name + ' INSTEAD OF INSERT ' + CHAR(10) + 'AS' + CHAR(10) + 'DECLARE @error bit = 0' + CHAR(10) + /*************************************************** * Create global temporary table for results ***************************************************/ 'IF object_id(''tempdb..##err_' + tables.table_schema + '_' + tables.table_name + ''') IS NULL' + CHAR(10) + 'BEGIN' + CHAR(10) + CHAR(9) + 'CREATE TABLE ##err_' + tables.table_schema + '_' + tables.table_name + '(' + CHAR(10) + CHAR(9) + ( STUFF( ( SELECT CHAR(10) + CHAR(9) + ',' + column_name + char(9) + data_type + CASE WHEN data_type in ('varchar', 'nvarchar', 'char', 'nchar', 'binary', 'varbinary') THEN CASE WHEN CHARACTER_MAXIMUM_LENGTH = -1 THEN '(max)' ELSE '(' + CAST(CHARACTER_MAXIMUM_LENGTH as varchar) + ')' END WHEN data_type in ('numeric', 'decimal') THEN '(' + CAST(numeric_precision as varchar) + ',' + CAST(numeric_scale as varchar) + ')' WHEN data_type = 'float' THEN '(' + CAST(numeric_precision as varchar) + ')' ELSE '' END + CASE WHEN IS_NULLABLE = 'YES' THEN ' NULL ' ELSE ' NOT NULL' END FROM INFORMATION_SCHEMA.columns WHERE columns.TABLE_NAME = tables.table_name FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)' ), 1, 3, '') ) + CHAR(10) + CHAR(9) + ',spid integer NOT NULL' + CHAR(10) + CHAR(9) + ',constraint_name sysname NOT NULL' + ')' + CHAR(10) + 'END' + CHAR(10) + 'DELETE ##err_' + tables.table_schema + '_' + tables.table_name + ' WHERE spid = @@spid' + CHAR(10) + /*************************************************** * Generate the primary key constraint check ***************************************************/ (SELECT 'INSERT INTO ##err_' + tables.table_schema + '_' + tables.table_name + '(' + STUFF ( ( SELECT ',' + column_name FROM INFORMATION_SCHEMA.columns WHERE table_schema = tables.table_schema AND table_name = tables.table_name FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)' ), 1, 1, '' ) + ', spid, constraint_name' + ')' + CHAR(10) + 'SELECT ' + STUFF ( ( SELECT ',' + column_name FROM INFORMATION_SCHEMA.columns WHERE table_schema = tables.table_schema AND table_name = tables.table_name FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)' ), 1, 1, '' ) + ',@@spid,''' + constraint_name + ''' FROM inserted WHERE EXISTS ( SELECT * FROM ' + tables.table_schema + '.' + tables.table_name + ' WHERE ' + STUFF(( SELECT ' AND inserted.' + column_name + ' = ' + tables.table_schema + '.' + tables.table_name + '.' + column_name FROM INFORMATION_SCHEMA.key_column_usage WHERE table_constraints.constraint_name = key_column_usage.constraint_name AND table_constraints.table_name = tables.table_name AND table_constraints.table_schema = tables.table_schema FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)') ,1,4,'') + CHAR(10) + ')' + CHAR(10) + /********************************************************** * There may be duplicates not just in the existing data * but in the inserted data too. **********************************************************/ ' OR EXISTS (SELECT * FROM inserted sub WHERE' + STUFF(( SELECT CASE WHEN KEY_COLUMN_USAGE.column_name IS NULL AND columns.IS_NULLABLE = 'YES' THEN ' AND (inserted.' + columns.column_name + ' != sub.' + columns.column_name + CHAR(10) + ' OR (inserted.' + columns.column_name + ' IS NULL AND sub.' + columns.column_name + ' IS NOT NULL)' + ' OR (inserted.' + columns.column_name + ' IS NOT NULL and sub.' + columns.column_name + ' IS NULL))' WHEN KEY_COLUMN_USAGE.column_name IS NULL AND columns.IS_NULLABLE = 'NO' THEN ' AND inserted.' + columns.column_name + ' != sub.' + columns.column_name + CHAR(10) ELSE ' AND inserted.' + columns.column_name + ' = sub.' + columns.column_name + CHAR(10) END FROM INFORMATION_SCHEMA.columns LEFT JOIN INFORMATION_SCHEMA.key_column_usage ON columns.column_name = key_column_usage.COLUMN_NAME AND table_constraints.constraint_name = key_column_usage.constraint_name WHERE table_constraints.table_name = tables.table_name AND table_constraints.table_schema = tables.table_schema AND columns.table_schema = tables.table_schema AND columns.table_name = tables.table_name FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)') ,1,4,'') + ')' + CHAR(10) + 'IF @@rowcount > 0 SET @error = 1' + CHAR(10) FROM INFORMATION_SCHEMA.table_constraints WHERE tables.TABLE_SCHEMA = table_constraints.table_schema AND tables.table_name = table_constraints.table_name AND table_constraints.constraint_type = 'primary key' ) + CHAR(10) + /****************************************************** * Generate the foreign key constraint check ******************************************************/ COALESCE(( SELECT 'INSERT INTO ##err_' + tables.table_schema + '_' + tables.table_name + '(' + STUFF ( ( SELECT ',' + column_name FROM INFORMATION_SCHEMA.columns WHERE table_schema = tables.table_schema AND table_name = tables.table_name FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)' ), 1, 1, '' ) + ', spid, constraint_name' + ')' + CHAR(10) + 'SELECT ' + STUFF ( ( SELECT ',' + column_name FROM INFORMATION_SCHEMA.columns WHERE table_schema = tables.table_schema AND table_name = tables.table_name FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)' ), 1, 1, '' ) + ',@@spid,''' + name + ''' FROM inserted ' + CHAR(10) + ' WHERE NOT EXISTS (SELECT * FROM ' + object_schema_name(referenced_object_id) + '.' + object_name(referenced_object_id) + ' WHERE ' + CHAR(10) + STUFF(( SELECT ' AND ' + object_schema_name(referenced_object_id) + '.' + object_name(referenced_object_id) + '.' + col_name(referenced_object_id, referenced_column_id) + ' = ' + 'inserted.' + col_name(parent_object_id, parent_column_id) FROM sys.foreign_key_columns WHERE foreign_keys.object_id = foreign_key_columns.constraint_object_id FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)' ),1,4,'') + ')' + CHAR(10) + 'IF @@rowcount > 0 SET @error = 1' + CHAR(10) FROM sys.foreign_keys WHERE OBJECT_SCHEMA_NAME(parent_object_id) = tables.table_schema AND object_name(parent_object_id) = tables.table_name FOR XML PATH(''),TYPE).value('.','VARCHAR(MAX)' ), '') + CHAR(10) + /********************************************** * If not constraint violations found then * do the insert, otherwise raise an error **********************************************/ 'IF @error = 0 ' + CHAR(10) + 'BEGIN' + CHAR(10) + CHAR(9) + 'INSERT INTO ' + tables.table_schema + '.' + tables.table_name + CHAR(10) + CHAR(9) + 'SELECT DISTINCT * FROM inserted' + CHAR(10) + 'END' + CHAR(10) + 'ELSE' + CHAR(10) + 'BEGIN' + CHAR(10) + CHAR(9) + 'RAISERROR(''Constraint violation(s) - SELECT * FROM ##err_' + tables.table_schema + '_' + tables.table_name + ' WHERE spid = @@spid for details'', 16, 1)' + CHAR(10) + CHAR(9) + 'ROLLBACK' + CHAR(10) + 'END' as trigger_sql FROM information_schema.tables |
Here is a test example:
1 2 3 |
select trigger_sql from generate_insert_constraint_reporting_trigger where table_name = 'sales_order_line' |