Blog Post

How to Find Expensive Queries in Amazon Redshift

,

Slow-running queries can degrade your Redshift cluster’s performance and lead to increased costs. Identifying the most expensive queries is crucial to optimize resource usage and improve overall system efficiency.

Step-by-Step Guide

  1. Connect to your Redshift cluster.

    Use your preferred SQL client or the Redshift Query Editor to establish a connection with your cluster.

  2. Query the stl_query system table for the most resource-intensive queries.

    The stl_query table logs all completed queries, including their runtime metrics. Use the following SQL to retrieve the top 10 queries with the longest execution time over the last 24 hours:

    SELECT query, userid, label, starttime, endtime,
           DATEDIFF(seconds, starttime, endtime) AS elapsed_seconds
    FROM stl_query
    WHERE starttime >= GETDATE() - INTERVAL '1 day'
    ORDER BY elapsed_seconds DESC
    LIMIT 10;
        
  3. Retrieve the SQL text of the expensive queries.

    Use the stl_querytext table to get the text of the queries identified:

    SELECT text
    FROM stl_querytext
    WHERE query = <query_id>
    ORDER BY sequence;
        

    Replace <query_id> with the actual query ID from the previous step to analyze the query text for possible optimizations.

  4. Focus your optimization efforts.

    Look for joins without indexes, large data scans, or missing filters and consider rewriting or adjusting these queries for better performance.

Conclusion

Tracking and analyzing expensive queries using Redshift’s system tables is a simple yet powerful way to maintain a healthy and efficient data warehouse environment. Regular monitoring helps reduce costs and speeds up analytics workflows.

For further reading, visit the official Amazon Redshift system tables documentation: Amazon Redshift System Tables

Original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating