27 October 2016

21735 views

155 0

Triggers: Threat or Menace?

Triggers are generally over-used in SQL Server. They are only rarely necessary, can cause performance issues, and are tricky to maintain If you use them, it is best to keep them simple, and have only one operation per trigger. Joe Celko describes a feature of SQL that 'gets complicated fast'.

The original SQL-86 standard did not have TRIGGERs. In fact it did not have much of anything! My favorite omission to the standard was that you could grant privileges to a user, but you could not revoke them. Talk to security officers about this and see what they’d have to say about that!

We were trying to get a skeleton standard into the public domain so as to make sure that SQL did not suffer the fate of BASIC. This early popular programming language quickly mutated into a host of dialects, and never really converged back to a single ANSI/ISO Standard BASIC, so it was important to get something out there.

Trigger is a schema-level construct that has a body of procedural code. It is attached to one and only one table (the ‘subject table’ in SQL standard lingo). This body of code is executed when a ‘schema event’ occurs. These events are insertion, deletion, and updating; things that change the content of the subject table. Queries do not change the contents of the database, so they cannot fire a TRIGGER without an event.

… ‘My advice is to avoid using TRIGGERs’

Triggers could not have been added to the language initially. We did not have the SQL/PSM (Persistent Stored Modules, the official procedural language in the standards). Vendors, however, each had their own local proprietary 4GLs. Oracle uses PL/SQL, DB2 uses SQL/PSM and PL/SQL, Informix had Informix 4GL and, of course, SQL Server has T-SQL.

When we were setting up the SQL-92 standard, we deferred some things and TRIGGERs was one of them. We really did not think the TRIGGERs would come to the forefront so soon and we wound up having to add this in the SQL:1999 standard, based on what vendors had been doing for the past several years.

A stored procedure is relatively easy to invoke. In the PSM, you use “CALL <procedure name> (<parameter list>)”, pretty much like any other programming language. But TRIGGERs get complicated fast. When is the TRIGGER executed, relative to the event? Is it before or after the event (BEFORE and AFTER are SQL keywords)? If I have more than one TRIGGER on the same table, in what order they executed? The model used in SQL Server is fairly simple, but the full ANSI/ISO Standard is elaborate.

Triggers have to follow one of the basic principles of RDBMS, namely that when the transaction is finished, all of the constraints are true. Oh; for the record, all constraints are always TRUE on an empty table. But this principle does not prevent an intermediate step in the execution of TRIGGERs from violating constraints.

Finally, is the procedural code applied to the whole table or row by row? The ANSI/ISO Standard lets you pick how the code is executed. If you want more control, you can also restrict the columns in an UPDATE FOR <column list>” within the TRIGGER. SQL Server, however, has only table level TRIGGERs. This model often confuses new SQL programmers, whose mindset is still back in procedural code and a record-at-a-time processing from their old languages.

I am not going to get into the details of this, but this procedural code does not even have to be in SQL. That can be a total mess. Since an external procedure is not in the scope of the database session, you cannot get any kind of error message or feedback.

But even if you do not leave SQL, one TRIGGER can change its subject table, which in turn will cascade to a second table via DRI actions and/or a second TRIGGER. External procedures can pretty much do it anything in the application. Those external events can shut down the entire application, move it to another computer, etc. in short, all bets are off.

It is quite possible to build an elaborate, tangled web of TRIGGERs. It is also probably an incredibly bad idea! The SQL engine will have to store all of the information it needs to do rollbacks, so performance is never going to be good. Then you have to understand all of the possible execution paths that can occur depending on which subject table you touch.

You can do all of the usual things with TRIGGERs that you do with other schema objects and constraints.

DISABLE TRIGGER
ENABLE TRIGGER

Triggers can be enabled and disabled, like named constraints, but this has the same problems of other constraints. If, by doing so, you bring the system to an invalid configuration, then you can destroy the integrity of your schema and bring it to a state that does not match the constraints you wanted to enforce.

CREATE TRIGGER
ALTER TRIGGER
DROP TRIGGER

You can create, alter and drop TRIGGERs. However, remember the order of execution problem. If you drop a TRIGGER then re-create it, you might change the order of execution. I am not sure if the ALTER statement has the same problem to be honest.

DRI Actions

While we were looking at TRIGGERs, we got proposals for DRI (Declarative Referential Integrity) actions. You should be very familiar with this feature, by now. Here is some BNF:

<column_constraint> ::=

[ CONSTRAINT constraint_name ]

{ PRIMARY KEY | UNIQUE }

| [ FOREIGN KEY ]

REFERENCES [ schema_name. ] referenced_table_name [ ( ref_column ) ]

[ ON DELETE { NO ACTION | CASCADE | SET NULL | SET DEFAULT } ]

[ ON UPDATE { NO ACTION | CASCADE | SET NULL | SET DEFAULT } ]

The FOREIGN KEY references can be named and can be made up of one or more columns in the referencing table at have to match to the UNIQUE or PRIMARY KEY columns in the referenced table.

The only two database events that DRI actions track are in deletion and update. If you think about it for a minute, having an “ON INSERT” just would not work. When there is a change in the referenced table, then a referencing table event must change the referenced table. That means the that you would be able to place an order on your website without being required to put anything into the (order details) basket beforehand.

NO ACTION. This is the default if you did not have an ON clause. You get an error message and a rollback from the system. In the example I’ve just mentioned of orders and order detail, this transaction would roll back if you try to put an item into the details of an order that does not exist. Use it in the way that we do with the “ELSE NULL END” in a case expression; these empty markers are good programming technique.
CASCADE. If this option is used with an ON DELETE clause, then all of the matching rows in the referencing table are also deleted. In the example of orders and order details, if you try to put an item into the details of an order that does not longer exist, then this transaction would roll back. This is how you enforce strong and weak entity relationships.
SET NULL. The corresponding rows in the referencing table have the matching columns set to NULL. This obviously requires that those columns be NULL-able. Then the question is, “does it make sense?” It depends on the context.
SET DEFAULT. The corresponding rows of referenced table have the matching columns set to their default values. This obviously requires those those columns have a default. As an aside, the data types in both the referenced and referencing tables should match exactly; the compiler has to go through a lot of trouble to do casting.

The full ANSI/ISO version of this has more elaborate options for doing matching between the two tables. But I found that in practice, a PARTIAL match is rarely used.

Because the TRIGGER is procedural code, there is no real way to optimize it. However, DRI actions have a fixed structure and we can build it into the SQL engine. This also gives information to the optimizer; if I know there is a reference, then I know something exists in both the referenced and referencing tables. I do not have to do any extra work to validate this.

Avoid Using Triggers

My advice is to avoid using TRIGGERs. Here is a quick list of reasons why:

The dialects and options vary so much from SQL to SQL that your TRIGGERs will not port or even readable to new programmers in your shop.
Triggers are easy to mess up. When you look at DRI actions, there is no large elaborate body of procedural code in a proprietary dialect to understand. If data is corrupted in some way, then disabling TRIGGERs can be difficult or dangerous.
If your TRIGGER is incorrectly written to handle only single-row operations, or MERGE statements correctly, then finding the problem is very difficult in T-SQL.
Triggers have a negative impact on performance. If you do not need a TRIGGER, then do not write one, obviously. Are you doing something with a TRIGGER that should have been done in another tier of your application? In particular, you will see people put things it should have been in an input procedure into a TRIGGER; they are basically doing the editing and data scrubbing that should have been done before the data got to the database. In T-SQL, you have to fire the TRIGGER on the whole table. When it failed you have to do a rollback or lock the database while you apply those corrections. This can be a huge performance hit.
The order of execution of multiple TRIGGERs is not obvious. You need to get to schema information data, and most SQLs default to creation order. In T-SQL you can use sp_SetTrigOrder to determine the first and the last TRIGGERs to fire. There is cute trick if you have only three TRIGGERs in which you set the first and last, then let the third TRIGGER be forced to the second position.
Do not use a TRIGGER to replace a constraint. If a business rule can be implemented with a constraint, then that is always the better option. A CHECK () constraint is checked at the row level at insertion time, not the table level over and over (in full ANSI/ISO SQL, we do have schema level constraints, the CREATE ASSERTION statement, but that is another topic).

INSTEAD OF Triggers

In SQL Server, all of the triggers occur after the event. Other products have triggers that occurred before the event. In addition to the after triggers. However there something else you need to be aware of called an INSTEAD OF trigger.

But let us go back a bit, to college database classes and one of the big debates. The view is a virtual table and one of our goals was to ensure that they are treated as much like base tables as possible. This also means that views should be updateable.

Originally, we defined an updateable view in the ANSI/ISO standards as being on one and only one base table, with keys that allow us to resolve any reference to a row, to one and only one unique row in the base table. This is very minimal, and everybody could implement it. If a vendor wanted to give you more, this was fine with us.

Achieving a general VIEW updateability is known to be an NP complete problem at best and impossible in many general cases. In plain English, it means the problem of setting up rules that everybody would agree on for views is impossible even in theory.

Let us make that a little easier to see. Let us create some skeleton Personnel tables like this:

CREATE TABLE Personnel

(emp_id CHAR(15) NOT NULL PRIMARY KEY,

gender_code CHAR (6) NOT NULL

CHECK (gender_code IN ('male', 'female'))

);

And, yes, this is really a terrible design, but bear with me. I can now create two views like this:

CREATE VIEW Male_Personnel

SELECT emp_id FROM Personnel WHERE gender_code = 'male';

CREATE VIEW Female_Personnel

SELECT emp_id FROM Personnel WHERE gender_code = 'female';

Obviously I can assemble the original Personnel table with a UNION or UNION ALL the two views.

In fact, if I have any two of the three tables/views involved, I can reconstruct the third. All I need is to use an EXCEPT operator on Personnel. This design error is called “attribute splitting” and it refers to taking the values of an attribute and making them into multiple columns or even tables, like we have done here.

But what if I started off not with views, but with improperly designed base tables? Then I could make “Personnel” into a view with the union. But there are problems if I try to update base tables through the union to view. It gets even worse because they could create a view using an IN() predicate that is logically equivalent to a union.

CREATE VIEW Texas_Oklahoma_1 -- not updateable

SELECT X.*

FROM (SELECT * FROM Customers WHERE state_code = 'TX'

UNION ALL

SELECT * FROM Customers WHERE state_code = 'OK')

AS X;

CREATE VIEW Texas_Oklahoma_2 -- updateable

SELECT X.*

FROM Customers

WHERE state_code IN ('TX', 'OK');

There are also restrictions on the nature of the columns in an updateable view. Deterministic and non-deterministic columns, certain data types and so forth present problems. You just need to read the books online for SQL Server and learn what to avoid. It may get better later, so look again a new release comes out.

The other problem with INSTEAD OF triggers is complexity. The view is a virtual table and has no physical existence. That means you have to be able to get down to the base tables and change them in some way; those changes will then be reflected when the view is invoked again. Well, at least we hope so. As you can imagine, there is a lot of overhead with this.

Programming Tips

When I work with TRIGGERs, which is rare, I normally limit them to at most one per operation per TRIGGER (so one for INSERTs, one for UPDATEs, and one for DELETEs), so instead of adding an extra TRIGGER I would add extra logic to the existing TRIGGER.

Do not use TRIGGERs for auditing. One of the principles of security and audit work is that such processes are to be kept external to the thing being audited. This is done for the same reason that you do not create backups on the same hard disk as the data. What happens when the database is destroyed?

This is why you need an external third-party audit package. It will keep you out of jail and maintain its independent audit trails. It also makes the third-party vendor liable for any problems that you encounter

Triggers: Threat or Menace?

DRI Actions

Avoid Using Triggers

INSTEAD OF Triggers

Programming Tips

Subscribe for more articles

Rate this article

Joe Celko

Related articles

Snake draft sorting in SQL Server, part 2