SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Powershell and Excel and SQL Server


Powershell and Excel and SQL Server

Author
Message
schleep
schleep
SSCrazy
SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)

Group: General Forum Members
Points: 2801 Visits: 1406
Is there a way to read an entire Excel sheet into a Powershell variable, including the values as well as the cell formatting (specifically, Interior.ColourIndex), as opposed to the RBAR-CBAC (Cell By Agonizing Cell - to extend Jeff's nomenclature), method?

Thanks!

P



AndyK-565224
AndyK-565224
SSC-Enthusiastic
SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)

Group: General Forum Members
Points: 106 Visits: 272
You'll have to clarify a little.

My most simpleminded reading of your request sounds like you want all the contents of all the cells (including some description of the cell formats) all strung together and stored into a single Powershell variable?

Perhaps you mean some sort of array or hashtable?

Or do you mean into a singe Object of some sort that you can later manipulate?

If the last, and your source file is in .xlsx format, and you are comfortable looking at C# examples and translate into powershell, then look at
http://closedxml.codeplex.com/
which I have used to import excel tables.

A simpler one to use is
http://exceldatareader.codeplex.com/discussions/432523
schleep
schleep
SSCrazy
SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)

Group: General Forum Members
Points: 2801 Visits: 1406
I'm doing this in a loop: it's taking about .5 seconds/row to access these cells, + another .5 seconds for processing.

$nId = $UserWorksheet.Cells.Item($intRow, 1)
$cE_Description = $UserWorksheet.Cells.Item($intRow, 2)
$cF_Description = $UserWorksheet.Cells.Item($intRow, 3)
$nParent_Id = $UserWorksheet.Cells.Item($intRow, 4)
$cOther = $UserWorksheet.Cells.Item($intRow, 5)
$cActive = $UserWorksheet.Cells.Item($intRow, 6)
$nExclusive = $UserWorksheet.Cells.Item($intRow, 7)
$cDefaultChecked = $UserWorksheet.Cells.Item($intRow, 8)

Not great performance on 100 rows, downright painful when the number of rows starts to get up there.

So what I was hoping for is a way to get the entire sheet's data read into an array -- or an object -- all at once, I'm thinking that should speed up processing.



AndyK-565224
AndyK-565224
SSC-Enthusiastic
SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)

Group: General Forum Members
Points: 106 Visits: 272
Yes. If you can guarantee that the spreadsheet will be in .xlsx form it makes things a lot simpler. Remember, an .xlsx document is really just a suite of XML files. Change the .xlsx extension to .zip on some spreadsheet, and you can explore how the data is arranged.

Microsoft has an SDK that lets you manipulate these files with a .net object model.

Welcome to the Open XML SDK 2.5 for Office
http://msdn.microsoft.com/en-us/library/bb448854(v=office.15).aspx

But most people find it very difficult to use.

Someone has written a "wrapper" for this that makes it more like the the Excel Object Model that you used in your example
ClosedXML
http://closedxml.codeplex.com/

Most of the examples are in C# or VB -- but you should be able to translate it.

I used this successfully at a prior employer and I can't track down any sample code of what I did - but it does work.

And since you are just reading text files that are XML formatted - it is pretty fast - AND you don't need Excel installed on that machine.

Also Microsoft really discourages use of the Excel Object Model in a server environment. Every time you do something it spawns an instance of Excel - and try as you might you can't guarantee that the instance is always terminated - so you wind up with dozens or hundreds of Excel instances gumming up memory.
schleep
schleep
SSCrazy
SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)

Group: General Forum Members
Points: 2801 Visits: 1406
Thanks Andy.

I had a look at the closed XML thing, seems rather complex for my needs.

Let me get down to specifics: I have a worksheet object, got the range I in question defined.
I want a list of the rows where any of the cells on that row have cell.Interior.ColorIndex <> -4142.



AndyK-565224
AndyK-565224
SSC-Enthusiastic
SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)SSC-Enthusiastic (106 reputation)

Group: General Forum Members
Points: 106 Visits: 272
AFAIK all the strategies to speed up reading Excel involve finding ways to avoid using the COM model.

You can cause excel to save the data as .cvs and then read the .cvs file.

If the spreadsheet is laid out as a nice table, you can use ODBC to read the file.

But I don't know of a way to get at the color data using these methods.

If you control the layout of the spreadsheets, you could write a Function in VBA in the spreadsheet. Said function would return a number indicating what color was found in the referenced cell. Insert a "helper" column next to the column you are interested in. Refer to the function in the helper column - then when you read the file with ODBC or save it as .cvs, you will have a number value you can inspect.

An example is here:
http://stackoverflow.com/questions/5110963/excel-get-cell-color
schleep
schleep
SSCrazy
SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)

Group: General Forum Members
Points: 2801 Visits: 1406
Thanks again Andy,

I had arrived at the same solution: a VBA function to, er, highlight the rows of interest.

Fortunately this is an ad-hoc type job that only needs to be done every couple of months, so I do have the luxury of the RBAR solution, in that I can set it to run and fuhgetabouddit.

Cheers!



schleep
schleep
SSCrazy
SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)SSCrazy (2.8K reputation)

Group: General Forum Members
Points: 2801 Visits: 1406
Here's some feedback:

I've found that the code below is considerably faster than my previous RBAR-CBAC solution -- although I think that's what's happening in the background of the where clause.

$Rows = $UserWorksheet.Range("A2:H$LastRow").Cells | ?{$_.Interior.ColorIndex -ne -4142} | Sort-Object -Property row -Unique | select row

I have 8 columns in my table, 7 of which need to be searched for ColorIndex, and as soon as I find one -ne -4142 field on a row, I can stop looking at that row. So I tried to iterate through the columns where rowid not in changed rows. Bad idea: took 7 times longer to go through 51 rows of data than the above code.



Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search