Confused about how to design a new process (third week into new role)

  • Good afternoon,

    I'm not sure if I'm posting this in the right section but I could really do with some advice on how to progress this project. Any suggestions/ideas would be greatly appreciated!

    I have been tasked with automating the following process:

    Existing setup:

    Our data services department offer various services such as:

    1) Batch duplicate processing

    2) Data cleansing, address standardisation / validation.

    3) Matching routines (i.e. Individual, Business, Family & Household)

    ...etc

    The way the current process works is like this:

    * Company ABC will provide us with a mailing list and a job spec (e.g. cleanse names, validate addresses and merge Household level)

    * A network folder is created for the job which contains the source data files, documentation and also a SQL folder

    * A new database is created for the customer (if one doesn't already exist) which will store the staging and output tables.

    The SQL folder for the job will contain a dozen scripts to perform various data cleansing/matching routines, for example:

    1) Data Hygiene script

    2) Hygiene Quality Check script

    3) merge script

    4) Post merge script

    5) Final Hygiene script

    6) Output script

    7) Hygiene Quality Check output script

    There are 7 data processing staff members using 4 SQL server instances with each instance holding over 30 databases. The database for a job could be on any of these 4 SQL servers! The 4 SQL Servers are 2005 but we also have SQL Server 2008 instances used for other projects.

    What I have found so far is that the scripts in each folder are all kind of similar but each one is specific to the job it was used for. So for example a merge script in one folder will contain various update statements to perform further data cleansing for that job while another have less etc...

    The same applies to the other scripts. All the folders contain their own versions of these scripts and they're all similar but not the same.

    What I am tasked with is automate the following process:

    1) Data processing staff imports the raw file into SQL (this bit I don't have to deal with as we already have a data loader program implemented which does the job well.

    2) They gather the required scripts from various folders and manually run each one (adding further updates as they go along)

    3) Another staff member performs the Quality Check after each step

    4) If another job comes from the same customer then the same database can be used and the same scripts get reused (again they get modified if further data cleansing is required)

    I'm sure you guys get the picture by now!

    So what we have is 4 SQL Servers each with 30+ databases in each one (some customers will have multiple databases whilst others will have just one database for recurring jobs)

    40+ SQL folders and each folder will have its own version of the merge, data hygiene scripts etc...

    I am now in my third week in this job and I can't make head no tail of this mess! I've spent hours going through scripts, then I started looking at the merge process last week and now I'm back to the beginning.

    Now I am beginning to stress big time! I'm really keen to impress but I've not seen anything like this and I'm beginning to doubt myself. I've spent the last 4 years working in a Service Desk as a SQL Support Analyst so I have strong problem solving skills but I'm struggling to design a process to automate this mess.

    Basically the ultimate goal is to automate the process so a source table goes through:

    1) Data Cleansing.

    2) Data Standardisation

    3) Data Merge

    Nothing is committed at this stage but a rich report should be provided to the data processing staff member so they get counts of:

    1) Count of record with missing title/forename/surname etc.

    2) Count of cleansed forenames/surnames/addresses etc...

    3) Count of individual records/Business records or Family & Household.

    4) Sample of matches and probably matches based on a match score

    ..etc ...etc...

    Once the user is happy with the result they can then commit the result which would ultimately merge it with the Single Customer View database for that customer.

    Thanks for taking the time to read this and if anyone can offer help/advice I would be eternally grateful!

    Regards.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • It sounds, personally, like there's actually two primary stages to this process. Standard and Custom. Right now they're rolled into one.

    What I would recommend you do is create a standard 'cleansing' protocol. This would be a primary load location (or a series of scripts deployed to the database that you've just built via Model) that would take care of all standard cleanup components.

    Then, customs would be ran after the standard, hopefully keeping what's standard from now on and what's custom per client/data component separate.

    Also, you might think about giving your users DDLAdmin on the databases so they can deploy their scripts as procs so you can more easily keep track of what belongs to whom. My guess is this is the way it's easiest for business because they've never been shown an alternative.

    However, really, that's a big deal you're dealing with over there and it sounds like that's their primary source of revenue, at least for the department you're supporting. I'd tread lightly and first learn their process before you go and start making major alterations.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I agree with Craig.

    Break the process down into smaller pieces first. Conquer each piece, one at a time.

    In this kind of confusing situation, don't try to get a grasp on all of it all at once. Pick one piece, get a solid grip on that, then work out to other pieces from there.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • This would be a primary load location (or a series of scripts deployed to the database that you've just built via Model) that would take care of all standard cleanup components.

    Firstly let me say thank you for your time reading my long post. I really appreciate your opinion.

    You are right and I really like your suggestion.

    Also, you might think about giving your users DDLAdmin on the databases so they can deploy their scripts as procs so you can more easily keep track of what belongs to whom. My guess is this is the way it's easiest for business because they've never been shown an alternative.

    They already have and they create many staging tables and do all sorts of inserts and deletions. I've seen this in a lot of scripts.

    However, really, that's a big deal you're dealing with over there and it sounds like that's their primary source of revenue, at least for the department you're supporting. I'd tread lightly and first learn their process before you go and start making major alterations.

    You're spot on once again. I had a meeting with the Director of Software development and she made a point about the data services team being afraid of change. This explains why the the Director of Data Services is so keen to know what it is that I'm doing! She's asked for two updates this week already!

    So I have some decisions to make.....

    a) Do I start with small changes like enhance existing functions/sprocs?

    For example yesterday I rewrote two data cleansing functions. The original version took 27 seconds to run on a data set of 250k record. My new version took 5 seconds! But this is a small cog in a large data hygiene/merge machine!

    b) Do as you suggest and start by creating a generic data cleanse script?

    I think I bit off more than I could chew taking on this role! :doze:

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • agree with Craig.

    Break the process down into smaller pieces first. Conquer each piece, one at a time.

    Once again thanks for your input G2 your time is much appreciated.

    Please see my reply to Craig. What do you think?

    In this kind of confusing situation, don't try to get a grasp on all of it all at once. Pick one piece, get a solid grip on that, then work out to other pieces from there.

    Yes that makes sense. But the Director of Software Development also made a point about me not losing site of the big picture. What she meant was that I should design an over all system that works which doesn't have to be perfect. I don't think she realizes how complex this project is or am I the one making a big deal out of this project?!

    I was nosing around the other day looking for some SQL scripts when I stumbled across folders containing SQL scripts and crude documentation of various failed attempts at automating the data hygiene process. Weird.... I've no idea why these were not used. Maybe these guys are scared of change!

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Abu Dina (8/8/2012)


    You're spot on once again. I had a meeting with the Director of Software development and she made a point about the data services team being afraid of change. This explains why the the Director of Data Services is so keen to know what it is that I'm doing! She's asked for two updates this week already!

    That's not horribly uncommon anyway, particularly for a new employee when there's no direct peers (it doesn't sound like you have any, anyways).

    a) Do I start with small changes like enhance existing functions/sprocs?

    Absolutely! Ask to sit with one person through their day too, so you can see all of what they're doing end to end. What you want to do is understand the business workflow from the eyes of the current users. Find their pain points. Whatever you can do to remove those will give you more 'bank' for later when you start introducing things they're not comfortable with.

    b) Do as you suggest and start by creating a generic data cleanse script?

    Perhaps. This will require more buy-in from the business users (Data Services) but could be a long-term goal to make sure new 'common' changes are included from now on (as you'll update the model scripts) but customs can stay custom without affecting any other clients. You'll definately want to find some way of having a common build script that everyone's working from so that fixes you make today don't get lost tomorrow when someone copies from a different folder.

    You're going to spend a lot of time here just playing catchup with the business, and understanding what the intention behind all of the seeming traditions are. If you can stay true to the intention without making anyone feel silly for not 'thinking of that before', particularly in an entrenched business process, you'll be ready to start making strides into making that into a well oiled machine.

    Just be careful. Too much, too fast, and you'll end up with 'pushback' because of people being afraid of losing their jobs if you automate too much. This is something you'll want to speak with directly to the director about, and I recommend it be a face to face. What you're looking for, in the immediate sense, is ways to make them more productive without taking away their job entirely. That starts with the 'pain points' from earlier.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Evil Kraig F (8/8/2012)


    Abu Dina (8/8/2012)


    You're spot on once again. I had a meeting with the Director of Software development and she made a point about the data services team being afraid of change. This explains why the the Director of Data Services is so keen to know what it is that I'm doing! She's asked for two updates this week already!

    That's not horribly uncommon anyway, particularly for a new employee when there's no direct peers (it doesn't sound like you have any, anyways).

    Yes, I'm working on my own on this project :pinch:

    Absolutely! Ask to sit with one person through their day too, so you can see all of what they're doing end to end. What you want to do is understand the business workflow from the eyes of the current users. Find their pain points. Whatever you can do to remove those will give you more 'bank' for later when you start introducing things they're not comfortable with.

    I spent two weeks with the data services team (in fact I asked for the second week) so I can get a good understanding of the process. Maybe my original post sounds confused but I do have developed a decent knowledge of some of the processes but I'm no genius to be honest. Maybe I'm' just slow but there is still a lot for me to learn. You're making me feel better already. It sounds like I have made some right decisions but I'm just panicking at what I consider lack of progress. I can tell you straightaway that one of the problem they have are slow running queries. Some merge scripts take hours to run. I've already posted this n a different thread and another SSC member identified some bad coding in there so there's already things that I can do to existing scripts to make them run faster etc..

    b) Do as you suggest and start by creating a generic data cleanse script?

    Perhaps. This will require more buy-in from the business users (Data Services) but could be a long-term goal to make sure new 'common' changes are included from now on (as you'll update the model scripts) but customs can stay custom without affecting any other clients. You'll definately want to find some way of having a common build script that everyone's working from so that fixes you make today don't get lost tomorrow when someone copies from a different folder.

    You're going to spend a lot of time here just playing catchup with the business, and understanding what the intention behind all of the seeming traditions are. If you can stay true to the intention without making anyone feel silly for not 'thinking of that before', particularly in an entrenched business process, you'll be ready to start making strides into making that into a well oiled machine.

    Just be careful. Too much, too fast, and you'll end up with 'pushback' because of people being afraid of losing their jobs if you automate too much. This is something you'll want to speak with directly to the director about, and I recommend it be a face to face. What you're looking for, in the immediate sense, is ways to make them more productive without taking away their job entirely. That starts with the 'pain points' from earlier.

    Wise words.. wow, I have so much to learn. You know at first I wasn't sure if I should post my problem on this site but I'm so glad I did.

    Thank you.

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • I can relate but in a different way. One of the bloodest projects I had was automating an invoicing process at a former employer. I was very familiar with system and it still killed me while trying to automate the process. We are talking a COBOL application (spagetti COBOL to be precise). I thought it would take 3 months, took 6 months before it went into production.

  • Lynn Pettis (8/8/2012)


    I can relate but in a different way. One of the bloodest projects I had was automating an invoicing process at a former employer. I was very familiar with system and it still killed me while trying to automate the process. We are talking a COBOL application (spagetti COBOL to be precise). I thought it would take 3 months, took 6 months before it went into production.

    I have no idea what you're talking about Lynn... My current contract was a 3 month 'write up some simplistic SSIS tasks to move data around'.

    1.5 years later we deployed a warehouse that actually delivered the required specifications. THEN the users started testing... *facepalm*

    Basically we're sharing a horror story or two so you realize they're out there. Business looks at problems very differently than we do and expectations can run wildly offcenter to the reality of the requirements. The trick is presenting them effectively to the personalities involved.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • On the point of "don't lose sight of the big picture", I'll just say that this is one of the biggest mistakes people make, assuming they can't do anything with the details till the big picture has been fully "grocked". This leads to more decision-deadlock than any other factor in management.

    Especially when you consider that "the big picture" is frequently the equivalent of a bunch of random paint thrown through a jet engine blast onto a canvas, and called "art": http://www.youtube.com/watch?v=SFG1LOy0ehk

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (8/9/2012)


    On the point of "don't lose sight of the big picture", I'll just say that this is one of the biggest mistakes people make, assuming they can't do anything with the details till the big picture has been fully "grocked". This leads to more decision-deadlock than any other factor in management.

    Well, the opposite can be just as big of a problem. I've had all sort of headaches caused when a developer would change a piece of functionality in one module, but b/c he had no grasp of the "big picture" he didn't anticipate how this would change things in other areas of the app - then we have to hope that the broken use case isn't to "edgy" to be caught by our regression tests before it gets into production.

    So, in my humble opinion, saying that fostering "forest" vision is the "biggest mistake" is an overstatement; but, I would agree that there's no way for a new developer to grasp the forest unless you let him/her build a tree house or two along the way. It's up to the team leads and managers to put the quality controls in place to allow this to happen in a way that ensures the app's integrity.

    I'll sometimes pair two developers together: one's a good forest guy, one's better with the trees.

    I'd say that the OP needs to, as you suggest, solve small sections of the puzzle to get started, but he should make sure he tests these changes along the whole app cycle.

  • Steve Thompson-454462 (8/10/2012)


    GSquared (8/9/2012)


    On the point of "don't lose sight of the big picture", I'll just say that this is one of the biggest mistakes people make, assuming they can't do anything with the details till the big picture has been fully "grocked". This leads to more decision-deadlock than any other factor in management.

    Well, the opposite can be just as big of a problem. I've had all sort of headaches caused when a developer would change a piece of functionality in one module, but b/c he had no grasp of the "big picture" he didn't anticipate how this would change things in other areas of the app - then we have to hope that the broken use case isn't to "edgy" to be caught by our regression tests before it gets into production.

    So, in my humble opinion, saying that fostering "forest" vision is the "biggest mistake" is an overstatement; but, I would agree that there's no way for a new developer to grasp the forest unless you let him/her build a tree house or two along the way. It's up to the team leads and managers to put the quality controls in place to allow this to happen in a way that ensures the app's integrity.

    I'll sometimes pair two developers together: one's a good forest guy, one's better with the trees.

    I'd say that the OP needs to, as you suggest, solve small sections of the puzzle to get started, but he should make sure he tests these changes along the whole app cycle.

    I'm talking from the viewpoint of organizational analysis.

    Before I turned into a DBA (by accident), my education, training, and experience, was in analyzing organizations for management patterns, strengths, weaknesses, etc. There are patterns that groups of people tend to fall into, some complex, some simple, some subtle and some obvious, that tend to create either effective groups or ineffective groups, depending on the complex of patterns.

    It's a complex subject.

    One of the most common complexes is groups that try to bite off more than the human mind can easily chew. "Big picture" is seen as more important than it actually is. This is a common group-killer, because it has a lot of hype telling people that "the big picture" is what smart, competent, etc., people will see, and everyone wants to see themselves as "smart, competent, etc.". So people will try to convince themselves that they are "big picture people", and will end up operating on an innappropriate decision-scope.

    Think of it this way:

    Imagine you are trying to decide whether to have chicken or fish for dinner. In the big picture, you need to calculate total annual caloric intake, work out a ballanced diet so you get adequate protein, vitamins, trace nutrients, and don't get too much of anything that might create a health risk (like cholesterol, triglycerides, sugars). You need a balance of the various amino acids that go into protein synthesis, you need a certain amount of fat in your diet to maintain immune and nervous tissues. And so on and so on and so on. Analyzing dinner in this manner is likely to end up taking at least a weak, by which time you're worn out from hunger and thirst and your spouse (who was going to cook for you) has probably left home to find a less obsessive person in his/her life.

    Ignoring the big picture in diet will kill you. It can cut decades off your life. So, does every decision about fish or chicken need to be analyzed as a complex, life or death, big-picture decision?

    Of course not. You think, briefly, "I haven't had a lot of omega-3 fatty acids recently, so I'll have the fish", or "I could use a more complete protein, so I'll have chicken", or just "chicken tastes better", and you don't spend a ton of time on analysis, calculations, etc.

    Nobody would ever do the week-long analysis marathon on that decision about dinner. But "big picture" obsessed organizations will do the equivalent of that on business matters, and end up with "decision paralysis". Long chains of e-mails, multiple meetings, decision-escalation to C-level for day-to-day operations, these are all signs of decision paralysis. It's how big, inefficient bureaucracies are born.

    It's a very common situation. Its mirror, which could be called "snap judgement obsession", shows up when tactical levels of management don't want to know about the day-to-day or even week-to-week operations of their areas. They'll rationalize this with "we don't want to micro-manage". Yes, micro-management will kill a business (or other group), but when "not micro-managing" becomes tied to their ego-reflection (asserted self-image), then it's just as destructive, but in a different way.

    So, it's a complex subject, there's a lot to it.

    That's what I'm talking about when I say obsessing about the big picture is one of the biggest mistakes people make. It's easy to rationalize, it spreads like wildfire in the summer, and it makes people feel good about themselves while they strangle the very organization they think they're helping.

    Does that clarify what I meant?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply