ETL Security Holes

  • Comments posted to this topic are about the item ETL Security Holes

  • Yes, security is a major concern in a data platform. But to my knowledge, SSIS has no built-in security features for controlling data flow or protecting data, other than the package protection property.

    Can we also discuss the security features that other ETL tools have?

  • Data security starts at data access.

    If an unwanted user can access the data, it is not secure. PowerPivot does not change this.

    In some cases it might remove the step of requiring user A to export data to Excel before showing it to user B.

  • Regarding the security of the ETL (or reporting) tools themselves, the biggest concern is to avoid embedding login credentials in the package. The package should log in using Windows Authentication and a trusted connection. That way, if someone copies the package from source control to their desktop or to another unsecured environment, it will fail to connect when run outside the proper context.
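As a rough illustration of the point above, here is a minimal Python sketch that checks whether a connection string relies on a trusted connection and carries no embedded credentials. The function names, the key lists, and the server name in the usage note are assumptions made for this example; it is not part of SSIS or any library, and it does not handle values that themselves contain semicolons.

```python
# Keys that indicate credentials stored in the connection string (assumed list).
CREDENTIAL_KEYS = {"password", "pwd", "user id", "uid"}
# Keys and values that indicate Windows Authentication (assumed list).
TRUSTED_KEYS = {"integrated security", "trusted_connection"}
TRUSTED_VALUES = {"sspi", "true", "yes"}


def parse_connection_string(conn_str):
    """Split a 'key=value;key=value' string into a dict with lowercase keys."""
    pairs = {}
    for part in conn_str.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            pairs[key.strip().lower()] = value.strip()
    return pairs


def uses_trusted_connection_only(conn_str):
    """Return True when the string asks for a trusted connection
    and contains no user name or password entries."""
    pairs = parse_connection_string(conn_str)
    has_credentials = any(key in pairs for key in CREDENTIAL_KEYS)
    is_trusted = any(
        pairs.get(key, "").lower() in TRUSTED_VALUES for key in TRUSTED_KEYS
    )
    return is_trusted and not has_credentials
```

A check like this could run over the connection strings in source control before deployment; for example, `uses_trusted_connection_only("Data Source=PRODSQL01;Initial Catalog=Sales;Integrated Security=SSPI;")` passes, while a string containing `User ID=` and `Password=` does not. `PRODSQL01` is a made-up server name.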

    Regarding the issue of copying data for analysis or reporting purposes from the production environment into unsecured environments, it should be considered a best practice to provide the BI team with a restricted account that does not allow ad-hoc SELECT on tables. The BI account should only be allowed access to views or stored procedures created for the specific purpose of reporting, and these datasets should not contain any personally identifying items. Instead of a customer account number or customer name, they should return a surrogate key, like an identity column. I understand that the BI team often needs to analyze data at the most granular level. For this they need a unique identifier for each customer, but that identifier doesn't need to be the actual account number or name. If a laptop gets stolen or the reporting datamart gets hacked, the data will be of much less use to the hacker without actual account numbers and names.
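The reporting-view idea can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module with an in-memory database; the table, column, and view names (`customer`, `sale`, `rpt_sales`) and the sample row are hypothetical. SQLite has no account permissions, so the step of restricting the BI account to the view alone is assumed to happen on the real database server and is not shown here.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with a reporting view that exposes
    a surrogate key instead of the account number or customer name."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_key   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            account_number TEXT NOT NULL,                      -- sensitive
            customer_name  TEXT NOT NULL                       -- sensitive
        );
        CREATE TABLE sale (
            sale_id      INTEGER PRIMARY KEY,
            customer_key INTEGER NOT NULL REFERENCES customer(customer_key),
            amount       REAL NOT NULL
        );
        -- The view is the only object the BI account would be granted.
        CREATE VIEW rpt_sales AS
            SELECT s.sale_id      AS sale_id,
                   c.customer_key AS customer_key,
                   s.amount       AS amount
            FROM sale AS s
            JOIN customer AS c ON c.customer_key = s.customer_key;
        """
    )
    # Made-up sample data, for illustration only.
    conn.execute(
        "INSERT INTO customer (account_number, customer_name) VALUES (?, ?)",
        ("ACC-0001", "Example Customer"),
    )
    conn.execute(
        "INSERT INTO sale (sale_id, customer_key, amount) VALUES (?, ?, ?)",
        (1, 1, 250.0),
    )
    conn.commit()
    return conn


def view_columns(conn, view_name):
    """Return the column names a view exposes, without fetching any rows."""
    cursor = conn.execute(f"SELECT * FROM {view_name} LIMIT 0")
    return [column[0] for column in cursor.description]
```

Analysts can still group and join on `customer_key`, but an extract taken from `rpt_sales` carries no account numbers or names.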

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • @Eric, so what if the exported data doesn't contain the names of customers, but DOES contain the company's actual sales figures? Surely a hacker can make use of that in the stock market (just an example). It doesn't matter how granular the data is: ANY data poses a security risk, since it can be useful for some purpose (otherwise the data would not be analyzed in the first place).

  • marco-870908 (1/17/2011)


    @Eric, so what if the exported data doesn't contain the names of customers, but DOES contain the company's actual sales figures? Surely a hacker can make use of that in the stock market (just an example). It doesn't matter how granular the data is: ANY data poses a security risk, since it can be useful for some purpose (otherwise the data would not be analyzed in the first place).

    Of course any data used for internal reporting is not something the company wants released to the public. I'm just saying that removing personal identifiers minimizes the damage, and that more columns than are needed are often brought across into reporting datamarts or Excel sheets. By granting access only to views and stored procedures, more constraints can be placed on which tables, columns, and filter criteria are included in external data extracts.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric Russell 13013 (1/17/2011)


    Regarding the issue of copying data for analysis or reporting purposes from the production environment into unsecured environments, it should be considered a best practice to provide the BI team with a restricted account that does not allow ad-hoc SELECT on tables. The BI account should only be allowed access to views or stored procedures created for the specific purpose of reporting, and these datasets should not contain any personally identifying items.

    Eric, I agree with you, but this does not match any BI project I have seen that accessed sensitive data.

    Short story: All company security starts in the boardroom. You cannot enforce anything that the people you work for won't follow.

    I have seen this happen twice. A non-IT executive trying to make a name for himself gets the okay to give a presentation to the boardroom detailing how a good data warehouse can improve returns. The executive gets approval and a budget to hire a third party and outsourcers to build the data warehouse and reporting that he wants. No money or time is budgeted for removing customers' private data, including addresses and Social Security numbers. The executive gets access to the nightly full backups of the database by calling the Data Center Server support hotline and sends those to the outsource partner. All ability to secure that data is lost. Since no NDA was signed, the outsourcer can sell this data to anyone with no legal recourse.

    Both times the IT Manager was fired for allowing the Executive access to sensitive data.

  • SanDroid (1/17/2011)


    ...

    Eric, I agree with you, but this does not match any BI project I have seen that accessed sensitive data.

    Short story: All company security starts in the boardroom. You cannot enforce anything that the people you work for won't follow.

    ...

    Of course there are times when it's necessary to perform ad-hoc querying against the source transaction tables, but that should be rare and narrow in scope. When we define the requirements of the reports and dashboards up front and then design a set of views to fulfill those requirements, that not only improves security but also standardizes the queries and reduces the BI development time going forward. How wonderful it is when the developer is handed a list of a dozen or so views to report on instead of an entity diagram with 100 tables. As for saving money, the company may discover that it can get by with fewer full-time BI developers and less disk storage space for the datamart when the most granular source data needed for reporting is cut and dried ahead of time.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • As far as security goes for ETL, this is another reason to understand how to limit the SSIS package's account. If it's running under a sysadmin account, it may very well have sysadmin access to data, which may very well be data that really shouldn't go into whatever the package's target is.

    There's a lot to keep track of in securing a system. This is just another facet to it.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Something else that came up during the Wikileaks bit was the idea of volume confidentiality differences, which has become a more severe issue only in recent years.

    To simplify the problem: Knowing that one tank was refueled via one voucher at xyz coordinate, while important to some gas-jockey somewhere, is not a confidential item. Having a list of every tank refueled in the last week, and where, will show troop movements, deployment strengths, and possible holes in the defensive lines.

    The ability to use a volume of data to discern things not intended by the source data is one of the things data analysts do. So, while we might allow our users access to x data, they are expected to be working one record at a time. This can also open unexpected holes. The average DBA doesn't have time to dig into every DB (especially at large shops) to determine these kinds of holes. The average developer is too low on the totem pole to fight that battle with the executives. They usually don't have the clout, if they even have enough knowledge of the big picture of what's outbound from their project.

    Without anyone usually having the holistic view of everything occurring with any piece of data, we're spiraling out of intelligent data access control. We need to find a different solution to the eventual result of data tyranny or anarchy.
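To make the volume point a little more concrete, here is a minimal Python sketch of one possible guard: totaling how many rows each user has read and flagging anyone far beyond the "one record at a time" pattern. The function name, the shape of the access log, and the threshold are all assumptions for illustration; a real system would have to draw these numbers from whatever auditing the database actually provides.

```python
from collections import Counter


def flag_bulk_readers(access_log, max_rows_per_user):
    """Return the users whose total rows read exceed the threshold.

    access_log is an iterable of (user, rows_read) pairs, for example
    one entry per query taken from an audit trail (assumed format).
    """
    totals = Counter()
    for user, rows_read in access_log:
        totals[user] += rows_read
    return sorted(
        user for user, total in totals.items() if total > max_rows_per_user
    )
```

A threshold like this does not prevent anything by itself; it only surfaces the access patterns that someone with the big picture should then review.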


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • I'm curious: we are calling these security holes in SSIS. Do any other packages handle it better? And is it the duty of the ETL tool to enforce security, or is its job to extract, transform, and load the data according to the business requirements within the access allowed to it?

    I'm thinking we are putting duties and obligations onto SSIS and other ETL packages that are outside their function or responsibility. I expect developers to follow our internal security protocols and standards, and their code is periodically reviewed to be sure it does. This includes making sure that no PII, or only minimal PII, is extracted unless it MUST be. For example, I don't want to see SSN in a cube, since there is no reason for it, but I might see employee ID and name. But if I were doing an extract for a health insurance company, I would be VERY concerned if they wanted us to pull in individual customer information. Perhaps group information would be OK, but not much more granular. Even that might be OK if we were extracting data to cut checks. I guess I'm back to it not being the ETL tool's job.

    CEWII

  • GSquared (1/18/2011)


    As far as security goes for ETL, this is another reason to understand how to limit the SSIS package's account. If it's running under a sysadmin account, it may very well have sysadmin access to data, which may very well be data that really shouldn't go into whatever the package's target is.

    There's a lot to keep track of in securing a system. This is just another facet to it.

    Agreed. A lot of what security is comes down to attention to detail across all the systems involved.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Craig Farrell (1/18/2011)


    ...

    Without anyone usually having the holistic view of everything occurring with any piece of data, we're spiraling out of intelligent data access control. We need to find a different solution to the eventual result of data tyranny or anarchy.

    In my universe, there are Data Analysts and Data Stewards who look at the big picture regarding what data is allowed to be stored or extracted and by whom. The DBA comes into the picture when decisions are made, and the app / db developers also need to keep everyone in the loop regarding what potentially sensitive data is in the database, because in some organizations the DBA just looks at the database in terms of LUNs, indexes, bandwidth, and backup schedules.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Elliott Whitlow (1/18/2011)


    I'm thinking we are putting duties and obligations onto SSIS and other ETL packages that are outside their function or responsibility. I expect developers to follow our internal security protocols and standards and their code is periodically reviewed to be sure it does.

    100% agree. The reporting platform must be able to serve all kinds of reports. HR may need reports that DO show detailed employee information (for example, to identify who should be urged to take some time off because their annual leave balances are too high). Sales may need customer names and contact info for the biggest customers, or for those with the longest history of no product license renewals. And how about a report for submitting money-laundering suspects to the appropriate government agency? I have no idea which agencies those are in the US, but here in AU every institution doing big volumes of monetary transactions must report suspicious transactions. Those can only be found with a good analysis tool for finding particular patterns, and the report should include tax file numbers (the equivalent of Social Security numbers) and other highly privacy-sensitive data.

    Reporting platforms are used for far more than BI reports. A lot of operational reports can also be efficiently implemented with the same reporting engine/technology. So it is not the tool that must enforce security; it's the report developer who must ensure that a particular report does not show unnecessary sensitive information and that the report can only be retrieved by appropriate staff.

    The next thing is: even a user with a very legitimate reason for extracting reports on particular data (the finance staff responsible for the above-mentioned anti-money-laundering reporting, for example) has the data on his/her workstation and can do with it what (s)he wants, such as dumping it on a memory stick and selling it on the street. Platforms such as SSIS cannot do a thing about that, except enforce that Joe the sales rep has no access to the anti-money-laundering report.

  • Eric Russell 13013 (1/18/2011)


    Craig Farrell (1/18/2011)


    ...

    Without anyone usually having the holistic view of everything occurring with any piece of data, we're spiraling out of intelligent data access control. We need to find a different solution to the eventual result of data tyranny or anarchy.

    In my universe, there are Data Analysts and Data Stewards who look at the big picture regarding what data is allowed to be stored or extracted and by whom. The DBA comes into the picture when decisions are made, and the app / db developers also need to keep everyone in the loop regarding what potentially sensitive data is in the database, because in some organizations the DBA just looks at the database in terms of LUNs, indexes, bandwidth, and backup schedules.

    Your department head and data integrity are miles above the average company I've worked in. That includes financial firms and healthcare providers. No, I won't mention names.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA
