SQLServerCentral Article

Overview of Performance


There are many articles and books about database, application, and other kinds of performance. In this article I would like to review some interesting aspects of performance, specifically Transact-SQL behavior in batch processes and stored procedures.

Let's start with a simple question. One stored procedure runs for 120 seconds and a modified version runs for 90 seconds. Is the second one performing better?

This is not a simple question and can’t be answered based on run time alone. Other factors affect the answer, especially if the environment where the stored procedure runs is heavily transactional.

Let's look at some examples that demonstrate how to analyze performance. Assume that both procedures must change the same table(s) after some data verification. Very often I see procedures that start a transaction at the very beginning of the procedure and commit or roll it back at the very end.

For example:

Create procedure proc_1
As
Begin
  Begin transaction
    -- step 1 – verify the data
    -- step 2 – perform calculations
    -- step 3 – get default variable values (date, user info…)
    -- step 4 – update/insert records
  commit
End

Is there a problem? Yes: the procedure keeps the transaction open, and therefore holds locks on the tables, for longer than it should.
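To see the effect for yourself, you can simulate the long-held transaction in one session and inspect the locks from another. This is only a sketch: dbo.SomeTable is a hypothetical table, and sp_lock and sp_who2 are the standard SQL Server procedures for viewing held locks and blocked sessions.

```sql
-- Session 1: simulate a procedure that opens the transaction too early.
-- dbo.SomeTable is hypothetical; the WAITFOR stands in for the
-- verification and calculation steps that run inside the transaction.
begin transaction
  update dbo.SomeTable set Col1 = Col1   -- acquires and holds locks
  waitfor delay '00:00:30'               -- "steps 1-3" happening here
commit

-- Session 2 (run while session 1 is waiting): observe the held locks
-- and any sessions blocked behind them.
exec sp_lock
exec sp_who2
```

While session 1 is inside the WAITFOR, any session that touches dbo.SomeTable will show up as blocked in sp_who2's BlkBy column.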

The second procedure has slightly modified logic. It runs for the same amount of time, but it interferes less with other users because the transaction, and therefore the locking time, is shorter.

Create procedure proc_2
As
Begin
  -- step 1 – verify the data
  -- step 2 – perform calculations
  -- step 3 – get default variable values (date, user info…)
  Begin transaction
    -- step 4 – update/insert records
  commit
End

Now, the first approach may be justified at times, for example when the data read during verification must not change while the procedure runs. But in most situations this is not the case: for instance, when the last statement writes an entry into a process-history table, or when the verification is made against tables that are not part of the modification statements.
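As an illustration of the case where the transaction genuinely must cover the verification step, here is a minimal sketch. The table and columns (dbo.Accounts, ACCT_ID, BALANCE) are hypothetical; the point is that when the check reads the same rows the update will modify, both must sit inside one transaction, or another session can change the data between the two steps.

```sql
-- Hypothetical example: the balance check and the debit must be atomic,
-- so here the transaction has to start before the verification step.
Create procedure proc_debit
  @acct_id int,
  @amount  money
As
Begin
  Begin transaction
    -- the verification reads the same row that will be updated;
    -- UPDLOCK keeps another session from changing it in between
    if exists (select 1 from dbo.Accounts with (updlock)
                where ACCT_ID = @acct_id and BALANCE >= @amount)
      update dbo.Accounts
         set BALANCE = BALANCE - @amount
       where ACCT_ID = @acct_id
  commit
End
```

Contrast this with the history-table case from the text, where nothing the transaction later modifies is read during verification, so the verification can safely run before BEGIN TRANSACTION.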

Let me give you another example. One of my clients asked me to check a process they were having a problem with. The process was blocking other processes and sometimes ended in a deadlock. It had the following logic for a load of about 2,000,000 rows per week:

  1. Load data from a file into an intermediate table.
  2. Truncate the permanent temporary table.
  3. Load the data from the intermediate table into the permanent temporary table.
  4. Based on the join conditions, insert records from the permanent temporary table into the permanent table.
  5. Using the permanent table, insert records into multiple tables.

Step 4 was running for about 15 minutes. The permanent table had only one index, based on the primary key. But for all 15 minutes the other tables in the joins (most joins were based on non-primary-key columns) stayed locked. I added steps to the process to drop all indexes on the permanent temporary table before the load and to recreate about 12 indexes after it.

The process started to look as follows:

  1. Load data from a file into an intermediate table.
  2. Truncate the permanent temporary table.
  3. Drop all indexes from the permanent temporary table.
  4. Load the data from the intermediate table into the permanent temporary table.
  5. Create about 12 indexes on the permanent temporary table.
  6. Based on the join conditions, insert records from the permanent temporary table into the permanent table.
  7. Using the permanent table, insert records into multiple tables.
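Steps 3 through 5 above can be sketched as follows. All table, column, and index names here are hypothetical (the article does not give the real schema); the original process recreated about 12 indexes.

```sql
-- Step 3: drop the existing indexes so the bulk load pays no
-- index-maintenance cost (names are hypothetical)
drop index perm_tmp.in_tmp_1
drop index perm_tmp.in_tmp_2

-- Step 4: load from the intermediate table; with no indexes in place
-- this is a plain append
insert into dbo.perm_tmp (E_ID, REL_TYPE, LOAD_DT)
  select E_ID, REL_TYPE, getdate()
    from dbo.intermediate

-- Step 5: rebuild the indexes in one pass each over the loaded data;
-- this locks only perm_tmp, which no other process uses
create index in_tmp_1 on dbo.perm_tmp (E_ID)
create index in_tmp_2 on dbo.perm_tmp (REL_TYPE, E_ID)
```

Building each index once over the full table is generally cheaper than maintaining all of them row by row during the insert, which is why the time shifts rather than grows.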

The new steps did not change the total time of the process, but step 6 (old step 4) started finishing in about 3 minutes instead of the previous 15. What was done? Time was shifted from one step of the process to another: from the insert statement to the index-creation step. This added time for index creation and locking time on the permanent temporary table (remember, that table was not used by any other process or application), but it significantly decreased the locking time on the permanent table. This method keeps production tables locked for a shorter period of time.

And this is one of the main points of process optimization. You cannot always make the total process time shorter. Instead, it is often possible to shift running time from one part of the process to another so that the process takes fewer locks and performs better in a multi-user environment. Sometimes a process that runs longer can be considered the better-performing one.

One more example of how to change performance. Let's look at an update statement from one of the processes. Each table used in the update had between 4,000,000 and 10,000,000 rows.

update dbo.ERS_1
  set TYPE = 309,
  TYPE_DT = getdate() 
 from dbo.ERS_1 ers
  inner join dbo.T_ETP etp
     ON ers.E_ID = etp.E_ID
     and etp.REL_TYPE = 120 and etp.IND = 'Y'
  inner join dbo.T_FTE fte
     ON ers.E_ID = fte.E_ID
     and fte.F_ID = 111
     and fte.REL_TYPE = 100
  inner join dbo.T_P p
     ON etp.P_ID = p.P_ID
  inner join T_SAF saf
     ON p.P_CD = saf.P_CD
     and p.P_NO = saf.P_NO 
 where ers.TYPE is null

This update ran for about 15 minutes even though all the indexes were properly created and the statement had been tuned. And the update was creating a bad locking situation for some other applications, because the updated table happened to be heavily used. The problem, obviously, was with the update statement itself.

When I changed the update statement, it finished in about 25 seconds. How did I do it?

I ran a SELECT statement first to place into a temporary table the primary keys (column E_ID) of the records that would be updated in the table ERS_1. Then a clustered index was created on the temporary table. And, finally, I performed the update based on a join with the temporary table.

Create table #tmp_1 (E_ID int)
 
Insert into #tmp_1(E_ID)
  Select ers.E_ID 
   from dbo.ERS_1 ers
     inner join dbo.T_ETP etp
       ON ers.E_ID = etp.E_ID
       and etp.REL_TYPE = 120 and etp.IND = 'Y'
     inner join dbo.T_FTE fte
       ON ers.E_ID = fte.E_ID
       and fte.F_ID = 111
       and fte.REL_TYPE = 100
     inner join dbo.T_P p
       ON etp.P_ID = p.P_ID
     inner join T_SAF saf
       ON p.P_CD = saf.P_CD
       and p.P_NO = saf.P_NO 
  where ers.TYPE is null
  
create clustered index in_1 on #tmp_1(E_ID)
update dbo.ERS_1
  set TYPE = 309, 
  TYPE_DT = getdate() 
 from dbo.ERS_1 ers
   inner join #tmp_1 t1
     ON ers.E_ID = t1.E_ID

Conclusion

Performance is not only a function of the elapsed time a process (stored procedure) takes; in many cases it is also a function of how the process interacts with other processes in a multi-user environment. This is especially true for batch processes, data loads, and reports, which can run for a long time with multiple processes in parallel. Different techniques can be used to minimize time and to improve the way processes affect users and other processes.
