Get list of all the tables used in multiple SQL scripts

  • Hi All,

    I am trying to figure out an automated way to get a list of all the tables used in a set of 275 T-SQL scripts. i do not want to go through each script manually, look at all the from clauses and write down the tables used.

    I have managed to get all the 275 scripts into a table on SQL Server 2008 R2 database. one column per each script, so there are 275 rows in the table.

    FYI: each query can have multiple result sets, so multiple from clauses can exist.

    For Example: Query 1 = SELECT * FROM Table1 a

    INNER JOIN Table2 b ON b.Field = a.Field

    LEFT OUTER JOIN Table3 c ON c.Filed = b.Field

    UNION ALL

    SELECT * FROM Table4 a

    INNER JOIN Table5 b ON b.Field = a.Field

    LEFT OUTER JOIN Table6 c ON c.Filed = b.Field

    Desired Output = Table1,Table2,Table3,Table4,Table5,Table6

    Please let me know if there is an automated way. i do not want to be doing this manually. takes up too much time and effort. any help is highly appreciated.

    Many Thanks,

    Alvin

  • This can help you, but it could give you false positives. I'm using sys.sql_modules, but you can use your table.

    SELECT *

    FROM sys.tables t

    WHERE EXISTS( SELECT * FROM sys.sql_modules m WHERE m.definition + ' ' LIKE '%[^A-Za-z0-9]' + t.name + '[^A-Za-z0-9]%')

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I actually prefer using sys.dm_sql_referencing_entities. I found that I was getting false positives using the LIKE operator, because people had commented out sections of code, so the table name was still in the definition, but as a comment.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Unfortunately, there is not one perfect answer because of dynamic sql.

    I could build a query that dynamically builds the name of a table and no amount of string searches or dependency lists are going to know that.

    Even straight-forward referenced entities can be out-of-date.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • drew.allen (8/28/2015)


    I actually prefer using sys.dm_sql_referencing_entities. I found that I was getting false positives using the LIKE operator, because people had commented out sections of code, so the table name was still in the definition, but as a comment.

    Drew

    This will only work for T-SQL coded objects (views, stored procedures, etc.)

    If you have access to a SQL Server instance where you can ensure that nothing is running on it, then what I'm thinking is to run DBCC FREEPROCCACHE to drop all queries from the plan cache. Then run all of the scripts, and finally query the plan cache to get the tables. A little bit of XQuery will get the tables involved in the queries. However, this will also get queries that SQL runs behind the scenes. You might be able to filter out by a database.

    Do you need to have this list of tables per script? Or just an overall all encompassing list of unique tables used by all of the scripts.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Create a new, work db. Create 275 procs from the existing code. Hopefully that's as as adding "CREATE PROCEDURE proc_NNN AS " to the beginning of the code and running it. Then use view sys.dm_sql_referencing_entities to find all objects (it doesn't seem to always list all columns, but hopefully in this case it will list them all).

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (8/29/2015)


    Create a new, work db. Create 275 procs from the existing code. Hopefully that's as as adding "CREATE PROCEDURE proc_NNN AS " to the beginning of the code and running it. Then use view sys.dm_sql_referencing_entities to find all objects (it doesn't seem to always list all columns, but hopefully in this case it will list them all).

    I think you are confusing this with sys.dm_sql_referenced_entities. Referencing entities don't list any columns. Referenced entities only list the columns that are actually being referenced in the procedure/function.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Maybe create a database with a few synonyms and run the sql statements. You'll get failures until you have added synonyms for all references.

  • Thanks for all the responses.

    Creating Stored Procedures for each query and use sys.dm_sql_referencing_entities is a good idea. but the problem is some queries will have variables used with out declaration (because most of the queries belong to reports and the variables get values from reports parameters). when i try to create procs, i will get errors for missing variables. I can try and write dynamic SQL to create those missing variables but it is not an easy task.

    So i am planning to have a list of all the 300+ tables stored in a temp table and then a create a second temp tables for the 274 scripts and write a cursor to take each table and do a wildcard search in the query column of the 2nd temp table. and if it finds a match then insert it into a new temp table. i know this is going to take a lot of time to run, but that is least of my problems. i can leave the query running overnight if needed. how can this be done in T-SQL

    following is an example scenario. 3rd table is the desired output

    create table #ListOfTables (id int, tablename varchar(250))

    insert into #ListOfTables values (1,'table1')

    insert into #ListOfTables values (2,'table2')

    insert into #ListOfTables values (3,'table3')

    insert into #ListOfTables values (4,'table4')

    insert into #ListOfTables values (5,'table5')

    insert into #ListOfTables values (6,'table6')

    insert into #ListOfTables values (7,'table7')

    insert into #ListOfTables values (8,'table8')

    ----------------------------------------------------------

    create table #ListOfQueries (Name varchar(70), Queries varchar(max))

    insert into #ListOfQueries values ('Query1','SELECT *

    FROM table1 INNER JOIN

    table2 ON table2.id = table1.id INNER JOIN

    table3 ON table3.id = table2.id INNER JOIN

    table4 ON table4.id = table3.id')

    insert into #ListOfQueries values ('Query2','SELECT *

    FROM table1 INNER JOIN

    table4 ON table4.id = table1.id ')

    insert into #ListOfQueries values ('Query3','SELECT *

    FROM table6 INNER JOIN

    table8 ON table8.id = table6.id ')

    insert into #ListOfQueries values ('Query4','SELECT *

    FROM table1 ')

    ---------------------------------------------------------

    create table #DesiredOutput (Name varchar(70), TableName varchar(250))

    insert into #DesiredOutput values ('Query1','table1')

    insert into #DesiredOutput values ('Query1','table2')

    insert into #DesiredOutput values ('Query1','table3')

    insert into #DesiredOutput values ('Query1','table4')

    insert into #DesiredOutput values ('Query2','table1')

    insert into #DesiredOutput values ('Query2','table4')

    insert into #DesiredOutput values ('Query3','table6')

    insert into #DesiredOutput values ('Query3','table8')

    insert into #DesiredOutput values ('Query4','table1')

    -----------------------------------------------------------

    select * from #ListOfTables

    select * from #ListOfQueries

    select * from #DesiredOutput

    ----------------------------------------------------------

    drop table #ListOfTables

    drop table #ListOfQueries

    drop table #DesiredOutput

    Many Thanks.

  • myjobsinus (8/31/2015)


    Thanks for all the responses.

    Creating Stored Procedures for each query and use sys.dm_sql_referencing_entities is a good idea. but the problem is some queries will have variables used with out declaration (because most of the queries belong to reports and the variables get values from reports parameters). when i try to create procs, i will get errors for missing variables. I can try and write dynamic SQL to create those missing variables but it is not an easy task.

    So i am planning to have a list of all the 300+ tables stored in a temp table and then a create a second temp tables for the 274 scripts and write a cursor to take each table and do a wildcard search in the query column of the 2nd temp table. and if it finds a match then insert it into a new temp table. i know this is going to take a lot of time to run, but that is least of my problems. i can leave the query running overnight if needed. how can this be done in T-SQL

    following is an example scenario. 3rd table is the desired output

    create table #ListOfTables (id int, tablename varchar(250))

    insert into #ListOfTables values (1,'table1')

    insert into #ListOfTables values (2,'table2')

    insert into #ListOfTables values (3,'table3')

    insert into #ListOfTables values (4,'table4')

    insert into #ListOfTables values (5,'table5')

    insert into #ListOfTables values (6,'table6')

    insert into #ListOfTables values (7,'table7')

    insert into #ListOfTables values (8,'table8')

    ----------------------------------------------------------

    create table #ListOfQueries (Name varchar(70), Queries varchar(max))

    insert into #ListOfQueries values ('Query1','SELECT *

    FROM table1 INNER JOIN

    table2 ON table2.id = table1.id INNER JOIN

    table3 ON table3.id = table2.id INNER JOIN

    table4 ON table4.id = table3.id')

    insert into #ListOfQueries values ('Query2','SELECT *

    FROM table1 INNER JOIN

    table4 ON table4.id = table1.id ')

    insert into #ListOfQueries values ('Query3','SELECT *

    FROM table6 INNER JOIN

    table8 ON table8.id = table6.id ')

    insert into #ListOfQueries values ('Query4','SELECT *

    FROM table1 ')

    ---------------------------------------------------------

    create table #DesiredOutput (Name varchar(70), TableName varchar(250))

    insert into #DesiredOutput values ('Query1','table1')

    insert into #DesiredOutput values ('Query1','table2')

    insert into #DesiredOutput values ('Query1','table3')

    insert into #DesiredOutput values ('Query1','table4')

    insert into #DesiredOutput values ('Query2','table1')

    insert into #DesiredOutput values ('Query2','table4')

    insert into #DesiredOutput values ('Query3','table6')

    insert into #DesiredOutput values ('Query3','table8')

    insert into #DesiredOutput values ('Query4','table1')

    -----------------------------------------------------------

    select * from #ListOfTables

    select * from #ListOfQueries

    select * from #DesiredOutput

    ----------------------------------------------------------

    drop table #ListOfTables

    drop table #ListOfQueries

    drop table #DesiredOutput

    Many Thanks.

    A cursor isn't necessary. Not even sure that a table of the table names is necessary. But, based upon loading the scripts into a table, you could do this (works against the current database... sample data is designed for the master database). If the tables are in multiple databases, then you would need the table of table names to query against, but you should be able to modify this query easily to handle that.

    DECLARE @Scripts TABLE (ScriptID INTEGER IDENTITY, ScriptText VARCHAR(MAX));

    INSERT INTO @Scripts (ScriptText)

    VALUES ('SELECT * FROM spt_fallback_db'), -- one table in script

    ('SELECT * FROM spt_fallback_dev, spt_fallback_db'), -- two tables in the script

    ('SELECT * FROM spt_fallback_usg'),

    ('SELECT * FROM spt_monitor'),

    ('SELECT * FROM spt_values, spt_monitor'), -- two tables in the script

    ('SELECT * FROM MSreplication_options, spt_monitor, spt_values'); -- three tables in the script

    WITH cte AS

    (

    SELECT so.schema_id, so.name, so.object_id, t1.*

    FROM @Scripts t1

    JOIN sys.objects so

    ON t1.ScriptText LIKE '%' + so.name + '%'

    )

    SELECT t2.*,

    ca.Tables

    FROM @Scripts t2

    CROSS APPLY (SELECT STUFF((SELECT ',' + name FROM cte WHERE cte.ScriptID = t2.ScriptID ORDER BY name FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),1,1,'')) ca(Tables);

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • I found another solution which works exactly as i wanted. following is the query(try it on adventure works sample DB). Now the problem with this query is that it gives tables only if the query starts with 'Select or a DB obleject like 'dbo.uspGetEmployeeManagers','EXECUTE [dbo].[uspGetEmployeeManagers] @EmployeeID=50'. it does not work for complex SQL like 'Declate @start int select * from address where contact id = @start'

    SET NOCOUNT ON;

    GO

    ----Create a sample table to simulate a table with TSQL Commands

    --drop table #TSQL_Commands

    CREATE TABLE #TSQL_Commands(Id INT identity,ObjName VARCHAR(100), tsql_stmt NVARCHAR(MAX));

    INSERT INTO #TSQL_Commands VALUES (NULL,'SELECT *

    FROM person.contact c

    INNER JOIN person.[Address] ON [c].[rowguid] = [Address].[rowguid]

    WHERE [ContactID] < 50 ');

    INSERT INTO #TSQL_Commands VALUES (NULL,'SELECT *

    FROM person.contact c

    INNER JOIN person.[Address] ON [c].[rowguid] = [Address].[rowguid]');

    INSERT INTO #TSQL_Commands VALUES (NULL,'SELECT *

    FROM [Sales].[SalesOrderHeader] sh

    INNER JOIN sales.[SalesOrderDetail] sd ON [sh].[SalesOrderID] = [sd].[SalesOrderID]

    INNER JOIN sales.[SalesOrderHeaderSalesReason] ON [sd].[SalesOrderID] = [SalesOrderHeaderSalesReason].[SalesOrderID]')

    INSERT INTO #TSQL_Commands VALUES ('dbo.uspGetEmployeeManagers','EXECUTE [dbo].[uspGetEmployeeManagers] @EmployeeID=50');

    INSERT INTO #TSQL_Commands VALUES ('[dbo].[uspGetBillOfMaterials]','EXECUTE [dbo].[uspGetBillOfMaterials] @StartProductID=500,@CheckDate=''20090203''')

    --INSERT INTO #TSQL_Commands VALUES (6,'dbo.vw_test','select * from vw_test')

    GO

    --Cursor to execute dynamic sql with fmtonly, so the stmt is not actually executed, but a query plan is built

    DECLARE @sql NVARCHAR(MAX)

    DECLARE curExecDynSQL CURSOR LOCAL STATIC FOR

    SELECT tsql_stmt

    FROM #TSQL_Commands

    OPEN curExecDynSQL

    FETCH NEXT FROM curExecDynSQL INTO @sql

    WHILE @@FETCH_STATUS = 0

    BEGIN

    --PRINT @sql

    SET FMTONLY ON;

    EXEC sp_executesql @sql

    SET FMTONLY OFF;

    FETCH NEXT FROM curExecDynSQL INTO @sql

    END

    CLOSE curExecDynSQL

    DEALLOCATE curExecDynSQL

    GO

    --Cache query to extract tables names from the cached execution plan

    ;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')

    SELECT

    [id],

    (SELECT [processing-instruction(x)]=[Stmt] FOR XML PATH(''),TYPE) AS [Stmt],

    [db],

    [Schema],

    [Tbl],

    [Alias]

    FROM(

    SELECT

    tsql_st.id,

    SUBSTRING(

    st.text,

    (qs.statement_start_offset/2)+1,

    (

    (

    CASE qs.statement_end_offset

    WHEN -1 THEN DATALENGTH(st.text)

    ELSE qs.statement_end_offset

    END - qs.statement_start_offset)/2

    )

    + 1) AS Stmt,

    x.i.value('(OutputList/ColumnReference/@Database)[1]', 'VARCHAR(100)')AS [db],

    x.i.value('(OutputList/ColumnReference/@Schema)[1]', 'VARCHAR(100)')AS [Schema],

    x.i.value('(OutputList/ColumnReference/@Table)[1]', 'VARCHAR(100)')AS [Tbl],

    x.i.value('(OutputList/ColumnReference/@Alias)[1]', 'VARCHAR(100)')AS [Alias]

    FROM sys.dm_exec_query_stats qs

    CROSS APPLY sys.dm_exec_sql_text([sql_handle]) st

    CROSS APPLY sys.dm_exec_query_plan([plan_handle]) qp

    CROSS APPLY qp.query_plan.nodes('//RelOp') x(i)

    INNER JOIN #TSQL_Commands tsql_st

    ON tsql_st.[tsql_stmt] COLLATE SQL_Latin1_General_CP1_CI_AS =

    SUBSTRING(st.text,(qs.statement_start_offset/2)+1,((

    CASE qs.statement_end_offset

    WHEN -1 THEN DATALENGTH(st.text)

    ELSE qs.statement_end_offset

    END - qs.statement_start_offset)/2)+ 1)

    OR st.[objectid] = object_id(tsql_st.ObjName)

    WHERE

    x.i.value('@PhysicalOp', 'NVARCHAR(200)') IN('Table Scan','Index Scan','Clustered Index Scan','Index Seek','Clustered Index Seek')

    AND EXISTS(

    SELECT 1

    FROM sys.tables t

    WHERE

    t.name = REPLACE(REPLACE(x.i.value('(OutputList/ColumnReference/@Table)[1]', 'VARCHAR(100)'),'[',''),']','')

    AND T.TYPE = 'U'

    AND T.is_ms_shipped = 0

    )

    ) AS x

    GROUP BY

    [id],

    [db],

    [Schema],

    [Tbl],

    [Alias],

    [Stmt]

    ORDER BY [id] ASC

    drop table #TSQL_Commands

  • Viewing 11 posts - 1 through 10 (of 10 total)

    You must be logged in to reply to this topic. Login to reply