Scrubbing Data - Good idea or bad idea?

  • I'm wondering what the pitfalls and benefits are of scrubbing data within a report as part of developing it.

    Is this a good idea, or should it be handled in other ways, such as cleaning it up in the tables or having operations change it to sound values through the software?

    I can see that where operations is loose and doesn't follow standard practices for entering data correctly, especially in large data implementations, trying to program around every possible variation would be virtually impossible and would add complexity to deployed reports, making their management and upkeep very difficult. Is there industry-standard documentation that specifically addresses the correct way to handle this?

    Thanks,

    Sean

    A clever person solves a problem. A wise person avoids it. ~ Einstein
    select cast (0x5365616E204465596F756E67 as varchar(128))

  • One option might be to use SSIS to load the data into a new database. Then you can clean it in between, in the SSIS pipeline.

  • In reply to TeraByteMe (2/21/2016):

    If the data is coming from the database, then you shouldn't have to "scrub" any data because, ostensibly, you have validation rules both at the database level and at the GUI level that should not allow any questionable data into your database.
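
    As a rough illustration of a validation rule at the database level, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the real database server, and the table name, column name, and list of allowed values are all made up for the example. The idea is only that a CHECK constraint refuses a bad value at insert time, so the report never has to scrub it.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table: the CHECK constraint only allows known street suffixes.
conn.execute(
    """
    CREATE TABLE customer_address (
        id INTEGER PRIMARY KEY,
        street_suffix TEXT NOT NULL
            CHECK (street_suffix IN ('Street', 'Ave', 'Road'))
    )
    """
)


def try_insert(suffix):
    """Return True if the row was stored, False if the constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customer_address (street_suffix) VALUES (?)",
                (suffix,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("Street")   # valid value
rejected = not try_insert("Stret")  # misspelling is refused by the CHECK constraint
row_count = conn.execute("SELECT COUNT(*) FROM customer_address").fetchone()[0]
```

    A real system would more likely use a lookup table and a foreign key than a hard-coded list, but the effect on the report side is the same.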

    Of course, real life dictates that people aren't so good at design and control. I NEVER change original data by scrubbing it. If it comes from a file, I make sure I keep the file so that I can prove the original condition of the data and that the scrubbing has not altered the meaning of the data. Same holds true for tables and things like reports. The original data is always available for audit or whatever proof is necessary.

    Protect yourself and the company. If data must be scrubbed, always make sure that you can easily get back to the original for audit and "lawyer" purposes. Changing "Stret" to "Street" or "Av" to "Ave" in an address column probably won't cost you, but even with such an obvious scrubbing, make sure the original data is somewhere.
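
    One way to picture "scrub, but keep the original" is the sketch below. It is only an illustration under stated assumptions: the correction map, the ScrubbedValue type, and the scrub_address function are hypothetical names invented for this example, and real correction rules would come from whatever the business has decided and documented.

```python
from dataclasses import dataclass

# Hypothetical correction map, taken from the two examples in the post.
ADDRESS_FIXES = {"Stret": "Street", "Av": "Ave"}


@dataclass(frozen=True)
class ScrubbedValue:
    original: str         # untouched source value, kept for audit
    cleaned: str          # value after scrubbing
    rules_applied: tuple  # record of every change that was made


def scrub_address(raw: str) -> ScrubbedValue:
    """Return the cleaned address together with the original and an audit trail."""
    applied = []
    cleaned_words = []
    for word in raw.split():
        if word in ADDRESS_FIXES:
            applied.append(f"{word} -> {ADDRESS_FIXES[word]}")
            cleaned_words.append(ADDRESS_FIXES[word])
        else:
            cleaned_words.append(word)
    # Joining on single spaces also normalizes whitespace in the cleaned copy;
    # the original string is stored exactly as it arrived.
    return ScrubbedValue(raw, " ".join(cleaned_words), tuple(applied))


result = scrub_address("12 Main Stret")
```

    Here result.cleaned holds the corrected address while result.original still holds the value exactly as it was entered, and result.rules_applied lists what was changed. In a database the same idea would usually be a separate cleaned column or table next to the untouched source.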

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Is there industry-standard documentation that specifically addresses the correct way to handle this?

    The only approach I know is to gather requirements from your team or client on how to handle these data inconsistency issues. Those decisions, if any, should be documented somewhere. I don't want to be making executive decisions. I do the investigation and present the situation to the people who are supposed to make the decision.

    ----------------------------------------------------

  • In reply to MMartin1 (2/22/2016):

    Heh... and if I think they came to the wrong decision, I make darned sure that I have it all documented so that I'm not the goat they decide to sacrifice.

    --Jeff Moden



  • In reply to Jeff Moden (2/22/2016):

    Very true. You may still encounter unscrupulous people who, even in light of documented evidence, would downplay that evidence to save face. They may say things like "you did not emphasise it enough" or "you did not communicate it well." That's why I try to explain my thoughts on a matter in no uncertain terms, and make clear that it is their responsibility to make sure they comprehend the situation. Still, you can't please everyone.

    ----------------------------------------------------

  • Thanks for all the advice. It would all work well if there were any form of procedure and accountability in the company I work with.

    I started the job almost 6 weeks ago. Two people had held the position over the previous year. The environment has some 1900 base reports and about 1600 linked reports. When I came in, there was a backlog of requests that had built up over time, especially during the month when no one was in the position before I came on: I think about 60 tickets for report changes, new reports, ad hoc reports, and data validation. There is no formal process, just an email to a ticketing system. The backend is very complex, with thousands of tables and data coming from dozens of sources into the tables. There are about 25 databases on the main server and a dozen linked servers. If that isn't bad enough, there are pushy/aggressive type women who enjoy having someone to push around, and I am the easy target. These business users email, chat, and phone me all day long with one emergency or another. I can't sit uninterrupted for more than half an hour. They just want to dictate, not listen or respond. I am constantly being pulled off my tasks. And always, after I have deployed a report to their specification, they come back and ask me to change this or change that. I do, and deploy, and then they come again. Meanwhile the tickets and emails keep pouring in. Co-workers in my group pass work off to me. Everybody is pushing work off on me.

    I am resigning. I have been working some 70 hours a week, spending my weekends working remotely, and waking up in the middle of the night too anxious to fall back asleep.

    A clever person solves a problem. A wise person avoids it. ~ Einstein
    select cast (0x5365616E204465596F756E67 as varchar(128))

  • In reply to TeraByteMe (2/24/2016):

    If you have other opportunities, I think that resigning is probably the right thing to do in this case. No one needs to work in that kind of unappreciative, totally out-of-control sweat shop. I have no problems working with women (or anyone, for that matter) but I won't take a load of hooey or unnecessary aggressiveness from anyone, male or female. I'd be tempted to make that clear if there's an exit interview to be had, but it won't do the company any good. Like Ron White says, you can fix ugly but you can't fix stupid. Find a company that appreciates your talents and has reasonable processes and expectations in place, and count your blessings that you now know what a "dirt bag" company looks like, so that you know what to "sense" for during an interview when you're looking for a good company.

    One of the keys to your interviews after this episode will be to rehearse the answer for the question "why did you leave that company"? Remember that they may focus on that in the form of many questions that you'll have to field on the fly and do so in a polite and elegant fashion. Be real careful to not sound incredibly negative even if there's nothing positive and be careful to not say you had a better opportunity as it will identify you as a flight risk.

    Still, you must tell the truth. I went through the same thing and here's how I answer that particular question and, to emphasize, it IS the absolute truth.

    Interviewer: I notice that you only worked 7 weeks for company XYZ. What happened there?

    Me: That was a tough one for me. I've been in IT for a long time and I realize that all-night death marches can and do happen, and I certainly step up to the plate when such urgencies happen. None of the tasks they gave me were individually unreasonable but, like anyone else, I find it really difficult to work 70 to 80 hours a week with no end in sight and still be effective. After 5 straight weeks, it started to affect my health and I knew I had to make a change. Since I couldn't change their processes to be more effective, I gave them my two weeks' notice and continued supporting their goals during those two weeks as I had the previous 5. I even worked overtime on my last day to try to leave them in the best condition I could.

    --Jeff Moden



  • In reply to TeraByteMe (2/24/2016):

    Where is your manager in all of this? There needs to be a more formal process to quell the madness and, at the same time, separate the must-haves from the nice-to-haves. You have raised this as an issue with management, have you not?

    I am sure hiring managers are well aware that sweat shops like this exist, and I imagine they would not hold your escape against you. If they do, I have to wonder about their system and their qualifications as managers when it comes to supporting their employees. So it also helps filter out places of this sort in your new search. I.e., an interviewer who sympathises with you more than judges you is probably someone you want to work for.

    ----------------------------------------------------
