Import All files is directory performance

  • Using the technique for "Looping, Importing and Archiving" at http://www.sqldts.com/default.aspx?246.

    This works fine when the number of files is the directory is relatively small (a few hundred) but, in this case, the directory contains over 150 thousand files and the run time is extremely high (in 20 hours, about 1000 files have been processed). In Windows Explorer, it takes about 20 seconds to get a list of the files.

    Is there a better solution ? Below is the function to get the next file to process. Is the command "counter = fold.files.count" the cause of the performance problem ?

    Function ShouldILoop

    Dim fso

    Dim fil

    Dim fold

    Dim pkg

    Dim counter

    set pkg = DTSGlobalVariables.Parent

    set fso = CREATEOBJECT("Scripting.FileSystemObject")

    set fold = fso.GetFolder(DTSGlobalVariables("gv_FileLocation").Value)

    counter = fold.files.count

    if counter >= 1 then

    for each fil in fold.Files

    DTSGlobalVariables("gv_FileFullName").Value = fil.path

    ShouldILoop = CBool(True)

    Next

    else

    ShouldILoop = CBool(False)

    End if

    End Function

    SQL = Scarcely Qualifies as a Language

  • I think the big problem here is that, like the old "GWBasic" and "BasicA", VBS is "interpretive" rather than compiled code... makes it pretty slow and that's another reason why I don't use DTS for imports, simple or not.

    I've not tried it with as many files as you have, but I've had GREAT success with code similar to the following snippet of T-SQL... yeah, I know... undocumented feature... but it is available in 2000, 2005, and I've been told, 2008.

    CREATE TABLE #FileInfo

    (

    RowNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

    ObjectName VARCHAR(256),

    Depth INT,

    IsFile INT

    )

    INSERT INTO #FileInfo

    (ObjectName,Depth,IsFile)

    EXEC Master.dbo.xp_DirTree 'C:\WINDOWS\system32',1,1

    SELECT *

    FROM #FileInfo

    WHERE IsFile = 1

    Of course, you would need to change "'C:\WINDOWS\system32'" to the proper path and it can be a UNC.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The reason it's taking so long is that the for loop is executed on each file and only returns after the last file from the collection has been looked at.

    Exiting the function after you assign the return value should speed things up quite a bit.

    for each fil in fold.Files

    DTSGlobalVariables("gv_FileFullName").Value = fil.path

    ShouldILoop = CBool(True)

    Exit Function

    Next

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply