June 30, 2009 at 8:55 am
Ok everyone....we've been fighting this issue at my client for a while now and these are all of the options I've come up with....anyone have anything else.
The situation: We have a couple of summarized reports that our clients want as data sources within their respective tactical data marts that we are creating. We're creating this on the assumption that there won't be any extra work on their part, so already, throwing the numbers we need into a Sharepoint list is out. Here's what we've briefly come up with, then shot down:
1) Using the SSIS Script Task to automate the Excel sheets using the Interop libraries. The catch is that Microsoft strongly advises against this on the server side, so that's enough for our server group to say we can't do it (BTW, we don't have many privileges at all in our environments, so we often need permission from others)
2) Using Excel as an ADO recordset. The catch here is that most of the top sections of the reports we receive are all titles and such, so when Excel uses the first 8 rows to determine datatypes, it sets them all wrong, and the numbers we try to pull come out null. I know there's a registry edit to change that, but again, no from the server team....
3) Using Open XML and Office 2007. A couple of issues here. Most things I see require a 3rd party unzip library, which will be a no. I also saw the ExcelPackage class on CodePlex. Here though, don't you have to add those components to the GAC since the script task won't pick them up in its references? That will also be a no no for us.
Are there any other solutions I'm missing? Does anyone else have the same issues? What would you do?
June 30, 2009 at 10:53 am
Are the summarized reports sourcing their data from another system, or are they populated by manual data entry?
If the former, can you get access to that system/systems and query the data directly (ie avoid Excel). If the latter, if you provided a Sharepoint list for the dtaa entry and then you generated the report, the net work (on their part) is no different (ie still data entry, just different entry point) but you can then query the list for the data.
Further to your point '2' - have/did you try modifying the OpenRowset property of the Excel Source? Normally, this is populated with the range name or simply sheet name when you set up the source. When using the dialog (ie dbl-click the item from the canvas) you're limited to the named ranges and sheet names, but using the properties pane, you can manually set this to be what you want/need - for e.g. $D$50:G350 . This may let/get you around your headers on the report. You can also set this easily through the Advanced Editor for the Excel Source.
HTH,
Steve.
June 30, 2009 at 12:03 pm
Use Excel. Export it to text files, and use the XML file format to tell OpenRowset what the column definitions are and how many rows to skip at the beginning of the file. You'll need to use the Bulk option for OpenRowset to accomplish this.
Setting up OLE Automation to get Excel to export to .txt is pretty straightforward. And it bypasses Excel deciding what the data types are.
No new software. No registry edits. Should be okay.
Does that idea pass muster with the people telling you how it (needs to/cannot) be done?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
July 1, 2009 at 6:04 am
StevefromOZ - Yes, one of the sheets is manual entry and the other is a report that comes from a different federal gov't organization, so I can't speak as to how that is put together. The issue would then be that people responsible would have to have access to our dashboard to use their data, which is also a no no as per our project sponsor. As far as the OpenRowSet you mentioned, I've set that to be a certain range, but even then, it seems as though the datatypes are still based on the first 8 rows in the spreadsheet regardless of the range.
GSquared - I'm not sure I had thought of the solution this way. Converting to a textfile would be a good way to get around the datatype issue. I'm not very familiar with the Bulk option you mentioned, but I'm sure I can look that up. Can you elaborate on using XML once you have it converted to a flat file?
July 1, 2009 at 7:16 am
Look up OpenRowset in Books Online or MSDN. One of the options with it is a "Bulk" flag. You use that for text/csv/etc., files. It allows you to select from them as if they were tables. To do so, you need a format file, that tells the server where columns begin and end and so on.
There are two types of format files, text and XML. Has nothing to do with what type of data you're using OpenRowset on, it's the type of format file. The text version isn't as intuitive, to me at least, as the XML version, which I why I recommended that.
BOL/MSDN have details and examples of how to use either. Just look up OpenRowset, and check the links at the bottom for the details on the format file types as part of your research.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
July 1, 2009 at 7:44 am
Thanks so much for the info. I'll be doing some research into that stuff and I'll try to post back later once we've established the solution.
July 16, 2009 at 10:40 am
Ok, so after looking at the OpenRowSet Bulk option and creating an XML format file, that does seem like a very feasible option. I want to post the report I have to confirm with someone that I'm not overlooking something simpler. Attached is the report with customer names removed. The data I'm trying to grab is just the data in the rows for "My Division". I was thinking of putting those into a pre staging area that I could code against, but getting them directly into the staging db that the cube is based off of is fine. Based on earlier comments of solutions that weren't possible for me, does anyone see something I'm missing?
Viewing 7 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply