Exporting Data with PowerShell

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 719974

    Comments posted to this topic are about the item Exporting Data with PowerShell

  • Jeff Moden

    SSC Guru

    Points: 996661

    This is a great introduction to exporting files with PoSh, Steve. I gave it a 5. That, and the mention of how to do it with dbatools, is great. I also appreciate the note that PoSh can suffer from slower performance than some of the other methods.

    One of the things I'd really like to see is how to do this from a stored procedure that has been scheduled as a job in SQL Server. I know how I've done it in the past (xp_CmdShell), but a lot of people poo-poo that idea, usually because of bad information and a resulting visceral fear of using xp_CmdShell. Can PoSh be dynamically executed via a job that can have variable file names as targets?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • frederico_fonseca

    SSChampion

    Points: 14686

    ARGHHHH.. Sorry Jeff - hit report button by mistake.

     

    To your question - can PowerShell use variables? Not exactly the way you do it in a stored proc, but you can query the database to retrieve the required variable values using a standard recordset, and then use those values as input to the remaining functions.

    I do this a lot in my jobs and will give an example shortly.

  • ZZartin

    SSC-Dedicated

    Points: 30414

    Jeff Moden wrote:

    I know how I've done it in the past (xp_CmdShell) but a lot of people poo-poo that idea even if they do poo-poo it because of bad information and resulting visceral fear of using xp_CmdShell.  Can PoSh be dynamically executed via a job that can have variable file names as targets?

    Well, PowerShell has a lot of powerful tools to interrogate a file system and create file names, if you want to put all the file-name logic in PowerShell. Unfortunately, while PowerShell scripts accept parameters easily, without using xp_CmdShell there's no great way to actually call a parameterized script from SQL Server.
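    For what it's worth, the usual workaround I've seen is to keep the job step static and do the variable part inside the script: an Agent job step of type "Operating system (CmdExec)" runs powershell.exe -File with fixed parameters, and the script computes the date-stamped file name itself. A minimal sketch - the script path, parameter names, and export step are all made up for illustration, not from the article:

```powershell
# ExportData.ps1 - hypothetical parameterized export script
# (all names here are illustrative)
param(
    [Parameter(Mandatory = $true)][string]$ServerInstance,
    [Parameter(Mandatory = $true)][string]$OutDir
)

# Build the variable file name inside the script, so the Agent job step
# itself never has to change
$outFile = Join-Path $OutDir ("Sales_{0:yyyyMMdd}.csv" -f (Get-Date))

# ... run the query and export to $outFile here,
#     e.g. Invoke-Sqlcmd ... | Export-Csv $outFile ...

# A SQL Agent job step of type "Operating system (CmdExec)" can then run:
#   powershell.exe -NoProfile -ExecutionPolicy Bypass -File C:\Jobs\ExportData.ps1 -ServerInstance MyServer -OutDir \\share\exports
```

    The file-name logic can be as elaborate as you like (sequence numbers, source-system folders, etc.) without the job definition ever changing.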

  • frederico_fonseca

    SSChampion

    Points: 14686

    Sample PowerShell script that I use to copy/move files from a drop folder to dev/tst/prod destinations - dev/tst are a "copy" while prod is a "move".

    Some of the functions called are in-house, so you won't find references to them on the net - if anyone wishes to have this fully functional, I can supply them; they have no classified or copyright dependencies.

    The script does the following:

    • connects to our ETL DB and retrieves the list of active files and their locations, along with the other details required for the processing

    • for each record retrieved, lists the candidate files in its location, checks (through another SQL call to the database) whether each has already been processed, and, if not, copies/moves the file to its desired processing location

    #
    # CopyAndMoveFiles.ps1
    #

    param(
        [Parameter(Mandatory=$true, Position=0)][string]$connectionstring,
        [Parameter(Mandatory=$true, Position=1)][string]$servername,
        [Parameter(Mandatory=$true, Position=2)][string]$dbname,
        [Parameter(Mandatory=$true, Position=3)][string]$logserver
    )

    #$connectionstring = "Data Source=servername;Initial Catalog=dbname;Provider=SQLNCLI10.1;Integrated Security=True;Connect Timeout=120;Packet Size=32768"
    #$servername = "servername"
    #$logserver = "servername"
    #$dbname = "dbname"

    $scriptPath = $(Split-Path -Parent $MyInvocation.MyCommand.Path)
    $logpath = ("\\$logserver\DataInput\ExecutionLogs")

    # Import generic PowerShell utilities module (in-house: Set-Log, Write-Log, Get-SqlProvider, Invoke-OleDbCmd)
    Import-Module (Join-Path $scriptPath "PowershellUtilities.psm1")

    $rundate = Get-Date

    $global:logfilename = (Join-Path $logpath ("$($logserver)_CopyAndMoveFiles_{0:yyyyMMddHHmmss}.log" -f $rundate))
    Set-Log -logFileFullName "$logfilename"

    $Record = @{
        "SourceSystem"    = ""
        "SourcePath"      = ""
        "DestinationPath" = ""
        "Operation"       = ""
        "FileType"        = ""
        "Pattern"         = ""
        "IsMultiPath"     = ""
    }

    $sqlprovider = Get-SqlProvider
    if ($sqlprovider -eq "")
    {
        Write-Log "SQL Provider not available"
        exit 3
    }

    # Check the server is reachable before trying to connect
    $online = $false
    $reply = $null
    $a = ping -n 10 $servername
    $a | ForEach-Object {
        if ($_ -match 'reply \s*(.*)$')
        {
            $reply += $matches[1]
        }
    }
    if ($reply)
    {
        $online = $true
    }

    $valid = $false
    if ($online)
    {
        $connectionString = "Data Source=$servername;Initial Catalog=$dbname;Provider=$sqlprovider;PacketSize=32767;Integrated Security=SSPI;Auto Translate=False;"
        Write-Log "Connecting to: $connectionString"

        try
        {
            $Pathresults = (Invoke-OleDbCmd -ConnectionString $connectionString -Query "select SourceSystem, SourcePath, DestinationPath, Operation, FileType, Pattern, IsMultiPath from Config.PathFiles where IsActive = 1")
            if ($Pathresults)
            {
                foreach ($result in $Pathresults)
                {
                    $Record."SourceSystem"    = $result.SourceSystem
                    $Record."SourcePath"      = $result.SourcePath
                    $Record."DestinationPath" = $result.DestinationPath
                    $Record."Operation"       = $result.Operation
                    $Record."FileType"        = $result.FileType
                    $Record."Pattern"         = $result.Pattern
                    $Record."IsMultiPath"     = $result.IsMultiPath
                    $valid = $true
                    $filetype = @(($Record."FileType").split(",").Trim())
                    $sourcePath = $Record."SourcePath"

                    $ChildItems = Get-ChildItem $sourcePath\* -Include $filetype
                    $mask = $Record."Pattern"
                    $SourceSystemFolder = Split-Path $sourcePath -Leaf
                    $NameOfFile = $ChildItems.Name
                    $MultiPath = $Record."IsMultiPath"

                    Write-Log "********Start processing folder: $($Record."SourcePath")"

                    if ($ChildItems.Length -gt 0) {
                        $ChildItems | ForEach-Object {
                            $ModifiedDestination = "$(Join-Path $Record.DestinationPath $_.Name)"
                            $FileName = Split-Path $ModifiedDestination -Leaf

                            Write-Log "Start processing file: $FileName"

                            if ((Test-Path $ModifiedDestination) -eq $False) {
                                $RegisterCount = (Invoke-OleDbCmd -ConnectionString $connectionString -Query "select count(1)
                                    from Logging.ProcessRegister r
                                    inner join Config.SourceSystem s
                                      on r.ConfigurationID = s.ID
                                    where (case
                                             when s.IsDateInFileName = 0
                                             then replace(SourceTable, reverse(substring(reverse(SourceTable), charindex('.', reverse(SourceTable)) + 1, 20)), '')
                                             else SourceTable
                                           end) = '$($FileName)'
                                      and ProcessStatus = 'Success'")
                                if ($RegisterCount.Item(0) -eq 0 -and $MultiPath -eq "1")
                                {
                                    if ($_.Name -match $mask) {
                                        # Operation is the cmdlet name read from the database (Copy-Item or Move-Item)
                                        & $Record."Operation" -Path $_.FullName -Include $filetype -Destination "$($Record.DestinationPath)"
                                        Write-Log "File $FileName was successfully copied >$($Record."Operation")< from folder $($Record."SourcePath") to folder $($Record."DestinationPath")."
                                        Write-Log "End processing file: $FileName"
                                    }
                                    elseif ($_.Name -notmatch $mask) {
                                        Write-Log "file is unknown: $($_.Name)"
                                    }
                                }
                                elseif ($RegisterCount.Item(0) -eq 0 -and $MultiPath -eq "0")
                                {
                                    & $Record."Operation" -Path $_.FullName -Include $filetype -Destination "$($Record.DestinationPath)"
                                    Write-Log "File $FileName was successfully copied >$($Record."Operation")< from folder $($Record."SourcePath") to folder $($Record."DestinationPath")."
                                    Write-Log "End processing file: $FileName"
                                }
                                else {
                                    Write-Log "File $FileName was previously loaded."
                                    Write-Log "End processing file: $FileName"
                                }
                            }
                            else {
                                Write-Log "File $FileName already exists in the folder."
                                Write-Log "End processing file: $FileName"
                            }
                        }
                    }
                    else { Write-Log "### No Files to Process ###." }
                    Write-Log "********End processing folder: $($Record."SourcePath")"
                }
            }
        }
        catch
        {
            $msg = $_.Exception.Message
            $Record."Message" = $msg.Replace("`r", " ").Replace("`n", " ")

            Write-Log $_ -Level Error

            throw $_
        }
        Write-Log "Finishing moving files"
    }
  • duncfair

    Old Hand

    Points: 360

    Seems like there should be a switch on Get-DbaDbTable for scripting the creation of the target table instead of just the insert statements.

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 719974

    Feel free to add that. dbatools is an open project.

  • Jeff Moden

    SSC Guru

    Points: 996661

    Steve Jones - SSC Editor wrote:

    Feel free to add that. dbatools is an open project.

    Heh... or just do it in T-SQL. 😀


  • frederico_fonseca

    SSChampion

    Points: 14686

    duncfair wrote:

    Seems like there should be a switch for Get-DbaDBTable for scripting the creation of the target table instead of just the insert statements.

    I assume you meant Export-DbaDbTableData.

    It could be done - and it can be proposed to the team.

    But I think you can do it with a combination of other dbatools commands:

    Get-DbaDbTable ... | Export-DbaScript ... onto a file

    followed by Get-DbaDbTable ... | Export-DbaDbTableData ... appended onto the same file as the first block.

    In both cases the batch separator needs to be specified, and it needs to be the same in both.
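    Something like the following, for example - the instance, database, table, and file names are placeholders, and the parameter names are from recent dbatools versions (check Get-Help Export-DbaScript and Get-Help Export-DbaDbTableData against your installed version):

```powershell
# Sketch: script out the CREATE TABLE first, then append the INSERTs
# to the same file, using the same batch separator in both steps.
Import-Module dbatools

$tbl = Get-DbaDbTable -SqlInstance MyServer -Database MyDb -Table dbo.MyTable

# 1) the table definition
$tbl | Export-DbaScript -FilePath C:\temp\MyTable.sql -BatchSeparator "GO"

# 2) the data, appended onto the same file
$tbl | Export-DbaDbTableData -FilePath C:\temp\MyTable.sql -Append -BatchSeparator "GO"
```

    The result is one .sql file that both creates and populates the table.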

     

     

  • kast218

    SSCertifiable

    Points: 6467

    Hi, Steve. Thanks for the article.

    But the first variant, with the pure Export-Csv approach, is of limited use because it quotes all values for all fields (in PowerShell 7 you have the -QuoteFields parameter, but you need to explicitly list which fields to quote in the output: https://github.com/PowerShell/PowerShell/pull/9132). You cannot import the resulting CSV file anywhere (for example, into a PostgreSQL database) without a lot of quote-stripping for numeric fields. Also, the bit data type is exported as False and True (not 0 and 1). To work around these limitations I added some extra handling; see my GitHub script: https://github.com/ktaranov/sqlserver-kit/blob/master/PowerShell/Export_Query_To_Csv.ps1
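    (For anyone on PowerShell 7.0 or later: besides -QuoteFields, Export-Csv also gained a -UseQuotes parameter, so you can ask for quoting only where it's actually needed rather than enumerating fields. A small sketch:)

```powershell
# One sample row: a numeric field, a string containing the delimiter, a boolean
$rows = [pscustomobject]@{ Id = 1; Name = 'Widget, large'; Active = $true }

# Default behaviour: every field is quoted, and the boolean serializes as "True"
$rows | Export-Csv -Path .\quoted.csv -NoTypeInformation

# PowerShell 7.0+: quote only fields that need it (embedded delimiter, quote, or newline)
$rows | Export-Csv -Path .\asneeded.csv -UseQuotes AsNeeded
```

    Note this only fixes the quoting; bit columns still come out as True/False, so a cast in the query (or post-processing like in your script) is still needed.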

  • Jeff Moden

    SSC Guru

    Points: 996661

    Teaching a man to fish...

    frederico_fonseca wrote:

    sample powershell script that I use to copy/move files from a drop folder to dev/tst/prod destinations - dev/tst are "copy" while prod is "move"

    some of the functions called are in house so you won't find reference to them on the net - if anyone wishes to have this fully functional I can supply them - no classified or copyright dependencies on them.

    script does the following

    connects to our etl db and retrieves the list of active files and their location with other details required for the processing

    for each record retrieved lists the possible files on its location and checks to see if it has already been processed (through another SQL call to the database) and if not it copies/moves the file to its desired processing location

    [script snipped - see frederico's post above]

    That's pretty cool, Frederico.  Thanks for taking the time to post that.

    If you don't mind, I have one more question... what do you use to schedule it to run?


  • frederico_fonseca

    SSChampion

    Points: 14686

    Jeff Moden wrote:

    Teaching a man to fish... That's pretty cool, Frederico. Thanks for taking the time to post that.

    If you don't mind, I have one more question... what do you use to schedule it to run?

    SQL Server Agent - in this case we run this on a weekly basis, so it's enough to schedule it once.

    We could also have it as a service or under a file watcher, but there's no need - the script is part of a wider set of ETL scripts.

    The whole ETL for this particular data warehouse is comprised of:

    • PowerShell scripts (to do some extracts, or to copy files as per the script I gave)
    • C# (for some file copies and for pre-processing of some files - converting formats, rows to columns, and similar)
    • SSIS - to control process flow and data loads, plus stored procs to process the loaded data and generate the required reporting tables

     

  • Jeff Moden

    SSC Guru

    Points: 996661

    Thanks, Frederico.  I appreciate it.


Viewing 13 posts - 1 through 13 (of 13 total)
