Loading All CSV Files with PowerShell

Steve Jones, 2017-07-04 (first published: 2017-06-23)

I ran across Denis Gobo’s post about working with names from Social Security cards and wanted to play with the dataset. However, rather than use xp_cmdshell, which I might have, I decided to use PowerShell. I need to repeat this with Python, but for now, this worked.

There is a file you can download that has a lot of .txt files, each one with a CSV. You can read about this in more detail in Denis’ post. Essentially there were three fields, and the year of the file in the name. I downloaded and unzipped the files into a folder.

Here was my process:

  1. Loop through the files
  2. Extract the year from the file name
  3. bcp in the file into a staging table
  4. call a proc that takes the year as a parameter and moves all data

With that in mind, here’s how I set things up.

The SQL Server Side

I started by creating two tables.

CREATE TABLE Names

(   FirstName  VARCHAR(500),
     Gender     CHAR(1),
     NameCount  INT,
     YearInFile INT

);

CREATE TABLE Names2

(   FirstName VARCHAR(500),
     Gender    CHAR(1),
     NameCount INT

);

With these in place, I created a procedure:

CREATE PROCEDURE MoveNames @year INT

AS

INSERT Names SELECT FirstName, Gender, NameCount, @year FROM Names2;

TRUNCATE TABLE Names2;

GO

I tested these and they worked fine for my process. I want to load into Names2, then to get the year, I move the data to the other table, adding the year. There are probably better ways, but this worked fine for a relatively small load (few million rows).

PowerShell

With PoSh, I started with a simple loop. I tend to set variables first since this makes this easier to turn into a function.

$sourcefolder = “E:\Downloads\names”

$sourcefiles = Get-ChildItem $sourcefolder -Filter “*.txt”

foreach($file in $sourcefiles ){
   $yearinfile = $file.Name.Substring(3, 4)
   write-host “Loading File:” $file.Name ” year: ” $yearinfile

}

Running this gets me a list of all files along with the  years. That’s what I want, since I’m going to process each file and load it in with bcp. Thanks to Mike Fal for the outline of how I’ll do this.

Next, I set more variables at the top:

$InstanceName = “.\SQL2016”

$DatabaseName = “Sandbox”

$StagingTableName = “Names2”

$StagingProc = “MoveNames”

I’ll use these in this code. I call Invoke-Expression to run bcp and then Invoke-Sqlcmd to run my proc.

$cmd = “bcp ‘$DatabaseName.dbo.[$StagingTableName]’ in ‘$file’ -S’$InstanceName’ -T -c -t’,’”

Invoke-Expression $cmd

Invoke-Sqlcmd -ServerInstance $InstanceName -Database $DatabaseName -Query “$StagingProc $yearinfile”

From here, I put this in a file and ran it from my download location. The result:

2017-06-16 12_41_08-Windows PowerShell ISE

and from SQL Server:

2017-06-16 12_41_50-SQLQuery7.sql - (local)_SQL2016.sandbox (PLATO_Steve (56))_ - Microsoft SQL Serv

Now I can run some queries, and find out where I stand. #26, it turns out.

2017-06-16 12_43_35-SQLQuery7.sql - (local)_SQL2016.sandbox (PLATO_Steve (56))_ - Microsoft SQL Serv

Filed under: Blog Tagged: ETL, powershell, syndicated

Rate

Share

Share

Rate

Related content

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.

Robert Davis

2009-02-23

1,567 reads

Networking – Part 4

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I’d like to talk about social networking. We’ll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let…

Andy Warren

2009-02-17

1,530 reads

Speaking at Community Events – More Thoughts

Last week I posted Speaking at Community Events – Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I’ve got a few more thoughts on the topic this week, and I look forward to your comments.

Andy Warren

2009-02-13

360 reads