Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Powershell and Excel and SQL Server Expand / Collapse
Author
Message
Posted Tuesday, December 24, 2013 9:15 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Tuesday, December 16, 2014 5:34 AM
Points: 439, Visits: 1,014
Is there a way to read an entire Excel sheet into a Powershell variable, including the values as well as the cell formatting (specifically, Interior.ColourIndex), as opposed to the RBAR-CBAC (Cell By Agonizing Cell - to extend Jeff's nomenclature), method?

Thanks!

P





Post #1525774
Posted Thursday, December 26, 2013 10:11 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, October 14, 2014 10:24 AM
Points: 72, Visits: 271
You'll have to clarify a little.

My most simpleminded reading of your request sounds like you want all the contents of all the cells (including some description of the cell formats) all strung together and stored into a single Powershell variable?

Perhaps you mean some sort of array or hashtable?

Or do you mean into a singe Object of some sort that you can later manipulate?

If the last, and your source file is in .xlsx format, and you are comfortable looking at C# examples and translate into powershell, then look at
http://closedxml.codeplex.com/
which I have used to import excel tables.

A simpler one to use is
http://exceldatareader.codeplex.com/discussions/432523
Post #1525999
Posted Friday, December 27, 2013 6:53 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Tuesday, December 16, 2014 5:34 AM
Points: 439, Visits: 1,014
I'm doing this in a loop: it's taking about .5 seconds/row to access these cells, + another .5 seconds for processing.

$nId = $UserWorksheet.Cells.Item($intRow, 1)
$cE_Description = $UserWorksheet.Cells.Item($intRow, 2)
$cF_Description = $UserWorksheet.Cells.Item($intRow, 3)
$nParent_Id = $UserWorksheet.Cells.Item($intRow, 4)
$cOther = $UserWorksheet.Cells.Item($intRow, 5)
$cActive = $UserWorksheet.Cells.Item($intRow, 6)
$nExclusive = $UserWorksheet.Cells.Item($intRow, 7)
$cDefaultChecked = $UserWorksheet.Cells.Item($intRow, 8)

Not great performance on 100 rows, downright painful when the number of rows starts to get up there.

So what I was hoping for is a way to get the entire sheet's data read into an array -- or an object -- all at once, I'm thinking that should speed up processing.



Post #1526124
Posted Friday, December 27, 2013 8:29 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, October 14, 2014 10:24 AM
Points: 72, Visits: 271
Yes. If you can guarantee that the spreadsheet will be in .xlsx form it makes things a lot simpler. Remember, an .xlsx document is really just a suite of XML files. Change the .xlsx extension to .zip on some spreadsheet, and you can explore how the data is arranged.

Microsoft has an SDK that lets you manipulate these files with a .net object model.

Welcome to the Open XML SDK 2.5 for Office
http://msdn.microsoft.com/en-us/library/bb448854(v=office.15).aspx

But most people find it very difficult to use.

Someone has written a "wrapper" for this that makes it more like the the Excel Object Model that you used in your example
ClosedXML
http://closedxml.codeplex.com/

Most of the examples are in C# or VB -- but you should be able to translate it.

I used this successfully at a prior employer and I can't track down any sample code of what I did - but it does work.

And since you are just reading text files that are XML formatted - it is pretty fast - AND you don't need Excel installed on that machine.

Also Microsoft really discourages use of the Excel Object Model in a server environment. Every time you do something it spawns an instance of Excel - and try as you might you can't guarantee that the instance is always terminated - so you wind up with dozens or hundreds of Excel instances gumming up memory.
Post #1526167
Posted Friday, December 27, 2013 12:00 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Tuesday, December 16, 2014 5:34 AM
Points: 439, Visits: 1,014
Thanks Andy.

I had a look at the closed XML thing, seems rather complex for my needs.

Let me get down to specifics: I have a worksheet object, got the range I in question defined.
I want a list of the rows where any of the cells on that row have cell.Interior.ColorIndex <> -4142.








Post #1526236
Posted Friday, December 27, 2013 1:01 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, October 14, 2014 10:24 AM
Points: 72, Visits: 271
AFAIK all the strategies to speed up reading Excel involve finding ways to avoid using the COM model.

You can cause excel to save the data as .cvs and then read the .cvs file.

If the spreadsheet is laid out as a nice table, you can use ODBC to read the file.

But I don't know of a way to get at the color data using these methods.

If you control the layout of the spreadsheets, you could write a Function in VBA in the spreadsheet. Said function would return a number indicating what color was found in the referenced cell. Insert a "helper" column next to the column you are interested in. Refer to the function in the helper column - then when you read the file with ODBC or save it as .cvs, you will have a number value you can inspect.

An example is here:
http://stackoverflow.com/questions/5110963/excel-get-cell-color
Post #1526248
Posted Monday, December 30, 2013 5:27 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Tuesday, December 16, 2014 5:34 AM
Points: 439, Visits: 1,014
Thanks again Andy,

I had arrived at the same solution: a VBA function to, er, highlight the rows of interest.

Fortunately this is an ad-hoc type job that only needs to be done every couple of months, so I do have the luxury of the RBAR solution, in that I can set it to run and fuhgetabouddit.

Cheers!




Post #1526463
Posted Monday, December 30, 2013 7:58 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Tuesday, December 16, 2014 5:34 AM
Points: 439, Visits: 1,014
Here's some feedback:

I've found that the code below is considerably faster than my previous RBAR-CBAC solution -- although I think that's what's happening in the background of the where clause.

$Rows = $UserWorksheet.Range("A2:H$LastRow").Cells | ?{$_.Interior.ColorIndex -ne -4142} | Sort-Object -Property row -Unique | select row

I have 8 columns in my table, 7 of which need to be searched for ColorIndex, and as soon as I find one -ne -4142 field on a row, I can stop looking at that row. So I tried to iterate through the columns where rowid not in changed rows. Bad idea: took 7 times longer to go through 51 rows of data than the above code.




Post #1526507
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse