Stairway to Database Design

Stairway to Database Design Level 7: Triggers

,

In levels one to four, we built the tables, base and virtual, of a schema. Levels five and six dealt with stored procedures. This level deals with a feature you need to avoid as much as possible. This is article is on Triggers.

You are usually told that a trigger is a "special kind" of stored procedure, but that is not quite right. It has no parameters, you cannot invoke it and it has local pseudo-tables that exist nowhere else. Before going any further, let's stop and discuss the SQL model for changing data in a base table.

INSERT, UPDATE and DELETE

SQL is a set-oriented language, so it changes tables with sets of data, all at once. It is not a record-oriented language which would handle data changes one record at a time, in sequence (think about magnetic tapes and punch cards). The three basic operations that change the contents of a table are INSERT, UPDATE and DELETE.

Insertion is the easiest of them to understand. This is the conceptual model and not the actual implementation. You put a set rows into an existing table. But there is more to it than that. The set goes in as a whole unit at one time, so any function calls with constant arguments such as CURRENT_TIMESTAMP have the same value for all the new rows. The IDENTITY table property (property, not column) violates this model by queuing the rows and attempting to write them while incrementing a table-level counter. The IDENTITY values are non-deterministic, depending on the indexing and physical state of the machinery at insertion time.

The new rows are kept in a pseudo-table named NEW in the ANSI/ISO Standards and INSERTED in T-SQL. If the insertion would put the table in a state that violates any constraints, then the transaction is rolled back by the system and you get an exception message. That means the new rows never got into the table (however, the IDENTITY property increments).

Deletion is similar. All the rows in the base table that match the WHERE clause of the DELETE FROM statement are put into a pseudo-table named OLD in the ANSI/ISO Standards and DELETED in T-SQL. If the deletion would put the table in a state that violates any constraints, then the transaction is rolled back by the system and you get an exception message. That means the old rows never left the table.

In practice, most SQL engines mark the old rows in one pass, check the constraints and remove them in a final pass.

Updating (updation? we need a word) is a combination of delete and insert. The conceptual model is that we look at the WHERE clause and build the DELETED pseudo-table, just like a DELETE FROM. Then we look at the SET clause build a new INSERTED pseudo-table.

Tables are made up of rows, rows are made up of columns. Each assignment in the SET clause is done at the same time. If a column is missing from the SET clause, the SQL engine effectively creates a "SET <column name> = <column name>" -- a do-nothing assignment. This is not like procedural languages that process from left to right. This statement will swap the values in columns a and b because it happens all at once.

UPDATE Foobar

SET a = b,

b = a;

This procedural statement will leave columns a and b set to the same value, because it executes from left to right.

BEGIN

SET a = b;

SET b = a;

END;

The DELETED rows are removed and the INSERTED rows are persisted in the table, all at once. If the update would put the table in a state that violates any constraints, then the transaction is rolled back by the system and you get an exception message. That means the table never changed.

Database Events

A TRIGGER is a body of code attached to one and only one table. But a table can have multiple triggers. There is a CREATE TRIGGER statement because it is a persisted schema object. The basic T-SQL syntax is pretty straight forward:

CREATE TRIGGER <trigger name>

ON <table or view name>

{FOR | AFTER | INSTEAD OF} {[INSERT] [,] [UPDATE] [,] [DELETE]}

AS

<trigger body>;

The trigger name and the table or view name explain themselves. The keyword AFTER and FOR are equivalent, but AFTER is more descriptive (other SQLs have BEFORE triggers). We will get to the INSTEAD OF option in a little while.

An INSERT, UPDATE or DELETE is called a database event or action. SELECT statements, DROP and ALTER are not database events. When one of these operations are done on the table, the trigger is fired, after the successful completion of the database action. The code in the trigger body is executed once at the table-level for each database event.

Watch out. Your application updates table T1, which fires trigger TR1, which updates table T1 itself. Because table T1 was updated, trigger TR1 fires again, and so on. Perhaps endlessly. But it does not have to stop there. Your trigger can cause triggers on other tables to fire. Triggers can also be recursive. Imagine an application updates table T1, which fires trigger TR1, which updates table T2. A trigger on T2 then fires and updates table T1 again. This pattern could be extended to (n) tables, never leaving the cycle.

The SQL engine has to track all of these changes so it can do a rollback if anything goes wrong. Avoid writing code like this; it is expensive, slow and can lock up the database. It is also a pain to maintain.

Trigger Body

The trigger body is a block of T-SQL procedural statements. But there are subtle differences. You cannot pass arguments to the trigger body, like a stored procedure. The trigger body has access to the INSERTED and DELETED pseudo-tables. You can alias these pseudo-tables, if you wish.

There are also special logical functions of the form: UPDATE (<column name>) and COLUMNS_UPDATED(). These test to see if an UPDATE FROM or INSERT INTO statement changed one or more columns in their argument list. This is a proprietary feature, so I will not discuss it in detail, but quick example might be:

CREATE TRIGGER No_Embezzlement_Trigger

ON Payroll

AFTER UPDATE

AS

IF UPDATE(payroll_amt)

BEGIN

RAISEERROR ('You cannot give yourself a raise');

ROLLBACK TRANSACTION;

END;

INSTEAD OF Triggers

Updating a view is known to be an NP-complete problem. In English that means we know that there is no general way to do it in a reasonable amount of time as teh problem gets larger.

INSTEAD OF triggers are the way we update VIEWs and get around the mathematical limits of RDBMS. An INSTEAD OF trigger performs the trigger body instead the database action that fired it. This might be easier to explain with an example. Let assume we have a VIEW defined as a join and aggregate functions on two tables:

CREATE VIEW SalesSummary (order_nbr, order_amt_tot)

AS

SELECT O.order_nbr, O.customer_name, SUM(D.unit_price * D.order_qty)

FROM Orders AS O, Order_Details AS D

WHERE O.order_nbr = D.order_nbr

GROUP BY O.order_nbr;

If I have a trigger for a database event against SalesSummary, it will fail without an INSTEAD OF trigger. This VIEW has a join, computation, and an aggregation to make sure it is not updatable. But an INSTEAD OF TRIGGER is not a BEFORE trigger! A true BEFORE trigger would execute its code, then try to complete the database event. The INSTEAD OF trigger would execute its code and it is done.

Remember that a VIEW is virtual; it has no physical, persisted existence. That means you do not get a DELETED and INSERTED pseudo-table to use. Furthermore, you cannot get a parameter from the INSERT INTO, DELETE FROM or UPDATE statements. All the code has to work on base tables.

The INSTEAD OF trigger can also be used on base tables as well as views. It is not used this way very often. The T-SQL idiom is to do an AFTER trigger, check the results and perhaps do a correction or ROLLBACK if there are problems.

Triggers for Auditing

A common trick is to use triggers to collect audit data in one or more tables. This is generally not a good idea. It slows down the performance of the application by adding extra disk writes. There is no trigger for a SELECT statement, so you cannot track who is looking at the data. HIPAA (Health Insurance Portability and Accountability Act) and many other laws require that there is a record of every viewing of the data.

But more than that, a basic principle of auditing is that the audit is separated from the thing being audited. For example, when a purchase order creates a shipment, the person who does the shipping is not the same person who approves purchase orders. The temptation to ship packages to yourself is avoided this way.

Imagine a table with a column for the date of the event and the user id of the employee updating a row, which is completed by a trigger. This sound good until you think about it. The trigger is always there and cannot be subverted. Opps! If you delete the row, the audit data goes with it. If someone has access to the audit columns, they can be updated to any value.

Third party audit tools use the transaction log file which the server is creating for recovery and the network traffic to capture all the actions that need to be audited and keeps that data secure and physically separated from the rest of the database. This will pass legal tests. Remember that ROI means "Risk of Incarceration" and not "Re urn on Investment" in America today.

Triggers for Data and Referential Integrity

If you have old SQL code that has been ported to a newer release of SQL, it is worth looking to see if there are triggers that can be replaced with declarative referential integrity (DRI) constraints and actions. This was the original purpose  of triggers. For example, when an order was deleted in the Orders table, the trigger would also delete all of its details rows in the Order_Details table. Ask an old-timer how much time was spent trying to catch orphaned rows when we forgot the triggers or got them wrong.

Today, most (not all) of those integrity triggers can be replaced by declarative DRI Actions. They perform a simple action for DELETE and UPDATE action. The actions are option clauses on the DDL. The full syntax is:

FOREIGN KEY (<referencing table column list>)

REFERENCES <referenced table name> (<referenced table column list>)

[ON UPDATE | ON DELETE][NO ACTION | CASCADE | SET NULL | SET DEFAULT]

NO ACTION: An error message tells the user that the action is not allowed and we get a ROLLBACK.

CASCADE: Deletes or updates all rows containing data involved in the foreign key relationship.

SET NULL: Sets the referencing columns to NULL. This assumes that all foreign key columns for the table can accept NULLs.

SET DEFAULT: Sets the referencing table columns to the default value defined for them. This assumes all columns for the table have defaults defined for them.

Use a trigger only if the integrity rule is complicated. One example I can think of involved re-distribution of values over groups when a member was inserted or deleted. Even then, consider putting it into a stored procedure.

T-SQL also has extensions for (DDL Triggers. These are fired by DDL events rather then DML events --  CREATE, ALTER, DROP, GRANT, DENY, REVOKE or UPDATE STATISTICS statements.

Multiple Triggers for the Same Event

It is legal to have more than one trigger for the same database event. It is not a good idea, but it is legal in T-SQL. Try to do the job in one trigger so it is easy to maintain.

By default, multiple triggers in SQL Server table for the same action are non-deterministic. However, it's possible to declare the firing order for two AFTER triggers using system stored procedure sp_settriggerorder. This procedure does not work with INSTEAD OF triggers.

The syntax is pretty straight forward:

EXEC sp_settriggerorder

@triggername = <explains itself>,

@order = [FIRST|LAST|NONE], -- firing order

@stmttype = [INSERT|UPDATE|DELETE], --trigger type

@namespace = [DATABASE|SERVER|NULL] ; -- explains itself

the parameter @order indicates whether the trigger should fire FIRST or LAST. If NONE is specified, then no order is enforced and we are back to the default behavior. It is also obvious that you cannot have two FIRST or two LAST triggers.

However, if you have a third trigger which is neither FIRST nor LAST, then he is forced into the middle of the firing sequence.

Conclusions

Everyone, not just T-SQL, has proprietary implementations of triggers. So they do not port. Some of their behavior can be non-deterministic, so they are hard to debug. They don't tell the optimizer anything it can use, like DRI actions do. And yet you may well find that they are safest way to assure complicated data integrity rule. Oh, darn!

This article is part of the parent stairway Stairway to Database Design

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating