Pulling Filename from Windows directory into table

  • Michael Valentine Jones (7/22/2009)


    This is a farily simple method:

    declare @files table (cmdout nvarchar(100) )

    insert into @files ( cmdout )

    -- Get filenames excluding directories

    exec master.dbo.xp_cmdshell 'dir C:\ /b /a:-d'

    select cmdout from @files where cmdout is not null order by cmdout

    Results:

    cmdout

    -------------------

    AUTOEXEC.BAT

    BOOT.BAK

    boot.ini

    cmldr

    CONFIG.SYS

    IO.SYS

    license.txt

    MSDOS.SYS

    ntdetect.com

    ntldr

    pagefile.sys

    Simpler still is EXEC Master.dbo.xp_Dirtree 'path/filename',1,1 but that isn't what the OP asked for either. He wants the dates for each file, as well.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • tjm (8/7/2009)


    Jeff,

    Just wanted to thank you again. That procedure has really come in handy in multiple scenarios.

    -tjm

    That's absolutlely awesome. Thanks for the feedback! 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • How do I email the list of filenames from the table? I'm using SQL 2005 and have configured Database Mail.

  • Jeff,

    This procedure works great but when I use it to read a directory that has more than 255 files in it the file names and details for the row numbers >=256 are the same as row number 255. The row number increases but the name, path , short name etc read the same as row 255. Any ideas?

    Thanks,

    E

  • Could you post or attach rows 250 to 260 so I can see what you mean and try to duplicate the problem?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • fm447k (8/10/2009)


    How do I email the list of filenames from the table? I'm using SQL 2005 and have configured Database Mail.

    Apologies here because I don't know. They have Lotus Notes and SQL Server 2000 and have the email system setup where I can't even do automatic DBA notifications where I'm currently working. I had to use CDONTS to setup any kind of automated email and that differs a whole lot from system to system. Because it's operating on SS2k, it has an 8k limit (unless I used the TEXT data type and I'm not going there) so I end up writing such things out to a file and then identify the file as an attachment. It's a real pain.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Attached file shows results 250 to 260. Note that after row 255 all the file details are the same yet the row number changes.

  • erouse (9/18/2009)


    Attached file shows results 250 to 260. Note that after row 255 all the file details are the same yet the row number changes.

    That looks like a bug in the sp_OA* canned sprocs. I guess I've never gotten there because I always move processed files to a different directory. Thanks for posting the text file. I'll see if I can recreate the problem tonight and maybe even come up with an alternate.

    With that in mind, can you use xp_CmdShell on your server(s)? There's a very fast method we can use there but it does require a dip into the xp_CmdShell world.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yes I can use xp_cmdshell

  • Apologies for the delay. This will do the trick so long as you understand that if MS ever changes the spacing on the output of the DOS DIR command, this proc will need to be tweaked.

    [font="Courier New"]CREATE PROCEDURE dbo.GetDirectoryFileInfo

    /****************************************************************************************

    PURPOSE:

    This utility stored procedure returns the long filename, CreateDate, LastModifiedDate,

    and the file size in Bytes from any given directory or UNC.

    INPUT PARAMETERS:

    The unquoted \\MachineName\Path or d:\Path where "d:" is the drive letter. Wildcards

    may be used for file names and extensions.  Only path information is allowed. Inclusion

    of anything not interpreted as a valid path will cause an empty result set to be

    returned for security reasons.

    OUTPUTS:

    Column name      DataType     Description

    -----------      --------     ----------------------------------------------------------

    RowNum           INTEGER      Sequential number

    FileName         VARCHAR(256) Filename and extension from the DIR command

    CreateDate       DATETIME     Date the file was created on

    LastModifiedDate DATETIME     Date the file was last modified

    Bytes            BIGINT       The number of bytes the file contains

    If the path is not found, is empty, the parameter passed was not an actual path, or

    the permissions to access a legitimate path does not exist for MS-SQL Server, the stored

    procedure will return an empty result set.  This is partially for security reasons...

    if a hacker gets no return, they don't know if they're on the right track or not.

    REVISION HISTORY:

    Rev 00 - Jeff Moden - Initial creation and unit test

    ****************************************************************************************/

    --===== Declare I/O parameters

    @pPath VARCHAR(512) --The path info and wildcards to be used with a DIR command

         AS

    --=======================================================================================

    --===== Presets

    --=======================================================================================

    --===== Supress the autodisplay of rowcounts for appearance and speed

        SET NOCOUNT ON

    --===== Declare local variables

    DECLARE @Command VARCHAR (300) --Holds the dynamic DOS command for the DIR command

    --===== If the temp table that holds the Directory output is not null, drop the table

         IF OBJECT_ID('TempDB..#DosOutput') IS NOT NULL

            DROP TABLE #DosOutput

    --===== Create the temp table that holds the Directory output

    CREATE TABLE #DosOutput

            (

             RowNum INT IDENTITY(1,1),

             Data VARCHAR(300)

            )

    --===== If the temp table that holds the file information is not null, drop the table

         IF OBJECT_ID('TempDB..#FileInfo') IS NOT NULL

            DROP TABLE #FileInfo

    --=======================================================================================

    --===== Get the directory information and the LastModifiedDate for lines with files only.

    --=======================================================================================

    --===== Setup to do a "DIR" with the following switches

         -- /TW  = Date/Time file was last written to (LastModifiedDate)

         -- /-C  = List number of bytes without commas

         -- Enclose the @pPath variable in quotes to all for paths with spaces.

        SET @Command = 'DIR "' + @pPath + '" /TW /-C'

    --===== Execute the "DIR" command and save the output in #DosOutput

         -- (order preserved by the Primary Key)

    INSERT INTO #DosOutput (Data)

       EXEC Master.dbo.xp_CmdShell @Command

    --===== Parse the Dos output into the file info table.

         -- The criteria in the WHERE clause ensures only file info is returned

    SELECT

            IDENTITY(INT,1,1) AS RowNum,

            SUBSTRING(Data,40,256) AS [FileName],

            CAST(NULL AS DATETIME) AS CreateDate, --Populated on next step

            CONVERT(DATETIME,SUBSTRING(Data,1,23)) AS LastModifiedDate,

            CAST(SUBSTRING(Data,22,17) AS BIGINT) AS Bytes

       INTO #FileInfo

       FROM #DosOutput

      WHERE SUBSTRING(Data,15,1) = ':' --Row has a date/time on it

        AND Data NOT LIKE '%<DIR>%'    ---Row is not a directory listing

    --=======================================================================================

    --===== Update each file's info with the CreateDate

    --=======================================================================================

    --===== Setup to do a "DIR" with the following switches

         -- /TC  = Date/Time File was created (CreateDate)

         -- Enclose the @pPath variable in quotes to prevent SQL Injection attacks

        SET @Command = 'DIR "' + @pPath + '" /TC'

    --===== Clear the #DosOutput table

    TRUNCATE TABLE #DosOutput

    --===== Execute the "DIR" command and save the output in #DosOutput

         -- (order preservation not important here)

    INSERT INTO #DosOutput (Data)

       EXEC Master.dbo.xp_CmdShell @Command

    --===== Parse the DOS output table for the CreateDate and add it to the

         -- file info table.

    UPDATE #FileInfo

        SET CreateDate = CONVERT(DATETIME,SUBSTRING(do.Data,1,23))

       FROM #FileInfo fi,

            #DosOutput do

      WHERE fi.FileName = SUBSTRING(do.Data,40,256) --Filenames match

        AND SUBSTRING(do.Data,15,1) = ':' --Row has a date/time on it

        AND do.Data NOT LIKE '%<DIR>%'    --Row is not a directory listing

    --=======================================================================================

    --===== Return a result set to the calling object

    --=======================================================================================

    SELECT * FROM #FileInfo

    --===== Exit the proc with no error reporting for security reasons

    RETURN

    GO

    [/font]

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • So far as why the filenames begin duplication at the 256th file, I have no idea. It must be a limit in the sp_AO* sprocs. I can't imagine that the file system object would have such a limit.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Your SP is exactly what I've been looking for.

    I just have one problem. I can't seem to find the temp table it created.

    Is the table somehow dropped after running the SP?

    when I try to execute the following:

    SELECT * FROM #FileDetails

    Results

    Msg 208, Level 16, State 0, Line 1

    Invalid object name '#FileDetails'.

    I'm using SQL Server Express 2005 and I'm kinda new to SQL

    I'm looking to get the filenames and attributes into a permanent table to automate alot of manual work.

    I also do not want to delete the filename from the table, just append to it if the filename doesn't already exist.

    Thanx in advance Jeff and hope you can help

  • brianparow (9/18/2010)


    Jeff,

    Your SP is exactly what I've been looking for.

    I just have one problem. I can't seem to find the temp table it created.

    Is the table somehow dropped after running the SP?

    when I try to execute the following:

    SELECT * FROM #FileDetails

    Results

    Msg 208, Level 16, State 0, Line 1

    Invalid object name '#FileDetails'.

    I'm using SQL Server Express 2005 and I'm kinda new to SQL

    I'm looking to get the filenames and attributes into a permanent table to automate alot of manual work.

    I also do not want to delete the filename from the table, just append to it if the filename doesn't already exist.

    Thanx in advance Jeff and hope you can help

    Let me ask you... do you really need the attributes or do the names of the files contain enough information (such as a date)?

    So far as your question goes, a "#" at the beginning of a file signifies a "temporary table" which is scope sensitive. It automatically drops when the stored procedure completes. I suppose you could change it a bit to keep the data active in a real table so that you could log process dates and the like.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanx Jeff,

    I already altered the SP to do just that.

    I do not really need the attributes except for the size.

    and I guess I probably do not need to keep a permanent table.

    What I'm trying to do is grab the filenames from a directory and see if there are 2 files with the same name and different extensions..

    ie:

    123.txt and 123.pdf

    I want it to check and find all txt files that do not have a corresponding pdf file and output those txt filenames to the results of the query.

    I have been trying to work on a query using:

    left(FileDetails.[Name],len(FileDetails.[Name])-4)

    This will give me the filenames without the extension and I've been trying for a while to get my results to no avial...

    Can anybody help me?

  • brianparow (9/19/2010)


    What I'm trying to do is grab the filenames from a directory and see if there are 2 files with the same name and different extensions..

    ie:

    123.txt and 123.pdf

    I want it to check and find all txt files that do not have a corresponding pdf file and output those txt filenames to the results of the query.

    Heh... jeez Brian... why didn't you just say so at the beginning??? Problem solved... read the comments in the code that follows

    --===== Conditionally drop temp tables to make reruns easier

    IF OBJECT_ID('TempDB..#FileList','U') IS NOT NULL

    DROP TABLE #FileList

    --===== Create the working temp table with a couple of calculated columns to isolate the extention

    -- and just the file name.

    CREATE TABLE #FileList

    (

    FullFileName SYSNAME,

    Level TINYINT,

    IsFile TINYINT,

    FileName AS SUBSTRING(FullFileName,1,LEN(FullFileName)-CHARINDEX('.',REVERSE(FullFileName))),

    Extension AS RIGHT(FullFileName,CHARINDEX('.',REVERSE(FullFileName)))

    )

    --===== Populate the table with file names from the given directory or UNC

    -- (YES, you CAN convert this section to dynamic SQL if you need to)

    INSERT INTO #FileList

    (FullFileName, Level, IsFile)

    EXEC xp_DirTree 'C:\Temp',1,1 --<<<<Change to the desired directory here

    --===== Ok... let see what that has in it. Not part of the problem but thought you should see it.

    SELECT * FROM #FileList

    --===== And finally, list all filenames that have a .txt extension but DON'T have a .PDF extension

    SELECT FileName FROM #FileList WHERE Extension = '.txt' AND IsFile = 1

    EXCEPT

    SELECT FileName FROM #FileList WHERE Extension = '.pdf' AND IsFile = 1

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 33 total)

You must be logged in to reply to this topic. Login to reply