SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
Search:  
 
 

Devin Knight

Add to Technorati Favorites Add to Google
Author Bio
Devin is a BI consultant at Pragmatic Works Consulting. Previously, he has tech edited the book Professional Microsoft SQL Server 2008 Integration Services and was an author in the book Knight's 24-Hour Trainer: Microsoft SQL Server 2008 Integration Services. Devin has spoken at past conferences like PASS, Code Camps and several SQL Saturday events. He is a contributing member to the Business Intelligence Special Interest Group (SIG) for PASS as a leader in the SSIS Focus Group. Making his home in Jacksonville, FL, Devin is a participating member of the local users’ group (JSSUG).
More Posts Next page »
All Posts

SSRS - Setting Default Parameter Values for Analysis Services Datasets

By knight_devin@hotmail.com in Devin Knight 10-23-2009 7:54 PM | Categories: Filed under: ,
Rating: |  Discuss | 1,192 Reads | 1192 Reads in Last 30 Days |2 comment(s)

When developing Reporting Services reports that use Analysis Services as a data source you may find that it is difficult to dynamically set default parameter that are passed into the dataset being used.  It turns out this is a lot simpler to do then many people think.  In this post I’ll walk you through the steps of setting a default date value in a dataset’s parameter. 

Step One

When creating a dataset that uses Analysis Services you will select Query Designer and then drag over the measures and dimension attributes that are needed for the report where it says “Drag levels or measures here to add to the query” .  To add a parameter to the dataset you need to add a filter above where you just dragged over your report fields.  In my example I am filtering by the year 2008.  If I want to make this a parameter I just check Parameter box.  Now when you hit OK and then OK again to return to your report you will have a new parameter added.  When I made 2008 the Filter Expression it automatically sets that value as my default parameter value.

image

Step Two

What I really want to do is setup this parameter so that it dynamically changes.  In my report I want it to always show the current year by default in the parameter.  So instead of 2008 the report should display 2009 and automatically change to 2010 on January 1st.  You can do this using the SSRS expression language in the default value of the expression.  The tricky part about it is that the default value of the parameter is using a MDX value.  If you open the report parameter and select the Default Values tab you can change the value used for this parameter to use a SSRS expression so it dynamically changes to the current year no matter what the date is.  The expression used for my example was ="[Year].[Year].&["+CSTR(Year(Today))+"]".  This is using the Today function and only returning back the year using the Year function.  Lastly it converts the value into a string. 

This is a very basic example but I could use a similar expression if I had used a date hierarchy instead of just the year attribute.  The key is to find out how the MDX that is being passed into the value looks like and then you can duplicate them using the SSRS expression language.  Now every time I run the report It will have the current year populated by default in the parameter.

image


Post SQL Saturday #21 Orlando

By knight_devin@hotmail.com in Devin Knight 10-17-2009 7:50 PM | Categories: Filed under:
Rating: (not yet rated) Rate this |  Discuss | 458 Reads | 458 Reads in Last 30 Days |4 comment(s)

Today I gave a two part session at SQL Saturday in Orlando.  When originally planning this years event Andy Warren said many people who have attended previous events were interested in sessions that were split into two parts.  So I decided to submit an Introduction to SSIS that was split into a part 1 and 2. 

I was a little unsure on how to split an Intro to SSIS session into two parts but it turned out I actually preferred it this way.  I decided to have the first session cover general understanding of the development tool, Control Flow, and Event Handlers.  The second session was on the Data Flow, Variables, and deployment.  I think this worked out well and it made me not feel as rushed on the topics. 

Andy knows how to run these events and I think it’s interesting to see how the event has transformed over the years now.  This event as a whole was stacked with talented speakers.  I think this may have been one of the most talented group of speakers I’ve seen to date at a SQL Saturday.  Not only does SQL Saturday have seasoned speakers but it is an opportunity for new and local speakers to share with others the knowledge they’ve gained from their experience.  I think a lot of people forget about how much great information they’re getting out of this free event.  Make sure you make it to the next event near you.  www.sqlsaturday.com


SQL Saturday #21 Orlando

By knight_devin@hotmail.com in Devin Knight 10-07-2009 7:44 PM | Categories: Filed under:
Rating: (not yet rated) Rate this |  Discuss | 550 Reads | 508 Reads in Last 30 Days |2 comment(s)

It’s that time again!  SQL Saturday has returned to Orlando with an all star group of speakers http://www.sqlsaturday.com/schedule.aspx.  The event will be October 17th and you can find all the important information at www.sqlsaturday.com.  If you’ve never attended one of these free events make sure you make this one.  It’s well worth attending even with a couple hours of a drive like me!  The topic I will be speaking on is introduction to SSIS in a two part session.  Hope to see everyone there!


SQL Server 2008 Change Data Capture

By knight_devin@hotmail.com in Devin Knight 10-05-2009 11:37 PM | Categories:
Rating: (not yet rated) Rate this |  Discuss | 609 Reads | 484 Reads in Last 30 Days |no comments

SQL Server 2008 has a new Change Data Capture feature that allows you to track Inserts, Updates, and Deletes on a table. A lot of the features I’ve used in the past to detect changes are generally very expensive operations so I’m always looking for alternatives.  There are several caveats you need to know about before attempting to use CDC.  View this blog with screenshots at my regular blog http://blogs.pragmaticworks.com/devin_knight/.

1.  You must be using either Enterprise, Developer, or Evaluation editions of SQL Server 2008

2.  You must have a sysadmin fixed server role

3.  You must have a db_owner fixed database role

4.  CDC can only be applied to user databases

5.  SQL Server Agent must be running

All of the stored procedures used for CDC can be found on msdn http://msdn.microsoft.com/en-us/library/bb500244.aspx.

Step One

CDC must be enabled on each database you decide to check for changes.  To enable CDC on a database using the following script:

Use AdventureWorks2008
exec sys.sp_cdc_enable_db
select name, is_cdc_enabled from sys.databases

The select statement shows which databases have Change Data Capture enabled.

Step Two

Create a table to test CDC:

Create Table dbo.CDC_Test
(ID int,
Description varchar (50))

Step Three

CDC must also be enable on each individual table:

exec sys.sp_cdc_enable_table 
    @source_schema = 'dbo', 
    @source_name = 'CDC_Test',
    @role_name = 'CDCRole'
select name, is_tracked_by_cdc from sys.tables

Here I’ve created a new role called “CDCrole” that didn’t previously exist.  This stored procedure has many parameters but I’ve only used the required ones here.  View all the other available parameters here http://msdn.microsoft.com/en-us/library/bb522475.aspx.  The select statement displays all the tables that have Change Data Capture enabled.  You may have also noticed that two SQL Agent Jobs were created (cdc.AdventureWorks2008.capture – watches for changes in transaction logs, cdc.AdventureWorks2008.cleanup – automates cleanup). 

Step Four

Insert a new record into the test table:

INSERT INTO AdventureWorks2008.dbo.CDC_Test
           (ID
           ,Description)
    Values(1, 'CDC Test' )

Step Five

Update the record in the table:

UPDATE AdventureWorks2008.dbo.CDC_Test
   SET ID = 3
      ,Description = 'NewTest'
 WHERE ID = 1

Step Six

Use the function created by CDC to return which columns are Inserted, Deleted, or Updated

DECLARE @from_lsn binary(10), @to_lsn binary(10)
SET @from_lsn =
   sys.fn_cdc_get_min_lsn('dbo_CDC_Test')
SET @to_lsn   = sys.fn_cdc_get_max_lsn()
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_CDC_Test
  (@from_lsn, @to_lsn, N'all')

How is this useful?  The _$operation column is used to indicate the change (1 = Delete, 2 = Insert, 3 = Update with row filter option set to “all update old”, 4 = More typical Update).  So my example shows one insert and one update.

Step Seven

Use the following procedure to view all tables with CDC enabled.

exec sys.sp_cdc_help_change_data_capture

Step Eight

Disable Change Data Capture on table and database.  To disable just the individual table run the following:

exec sys.sp_cdc_disable_table 
  @source_schema = 'dbo', 
  @source_name = 'CDC_Test',
  @capture_instance='dbo_CDC_Test'

Run the following to disable CDC for the database:

exec sys.sp_cdc_disable_db

You don’t need to do both if you want to disable the table and database just run the disable database stored procedure.  You will also notice the SQL Agent Jobs are removed when the last procedure is ran.


Adding a folder to the SSIS Package Store

By knight_devin@hotmail.com in Devin Knight 09-28-2009 10:56 PM | Categories: Filed under:
Rating: (not yet rated) Rate this |  Discuss | 893 Reads | 624 Reads in Last 30 Days |no comments

The SSIS Package Store is a nice option available for storing SSIS packages.  The Package Store lists packages stored on the server in the MSDB folder and all packages stored on the file system as long as they are in the default folder (C:\Program Files\Microsoft SQL Server\100\DTS\Packages for SQL 2008) in the File System folder.  View my regular blog to see screenshots included http://blogs.pragmaticworks.com/devin_knight/.

It is possible to adjust the Package Store to look for packages in different file system folders or different server instances.  To do this you must change the MsDtsSrvr.ini.xml file in the C:\Program Files\Microsoft SQL Server\100\DTS\Binn (folder is in SQL 2008 only). 

It’s a pretty simple change that you can even use notepad to do.  Here are the steps to follow to add a new File System folder to the Package Store.

1. Navigate to the file C:\Program Files\Microsoft SQL Server\100\DTS\Binn\MsDtsSrvr.ini.xml and open it notepad or any program that can edit an xml file.

2. To add a second File System folder Package Store add the following underlined code with the file path you want between <StorePath> and </StorePath> (Be careful when making these changes.  You may want to make a copy of the file before making changes):

<?xml version="1.0" encoding="utf-8"?>
<DtsServiceConfiguration xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <StopExecutingPackagesOnShutdown>true</StopExecutingPackagesOnShutdown>
  <TopLevelFolders>
    <Folder xsi:type="SqlServerFolder">
      <Name>MSDB</Name>
      <ServerName>.</ServerName>
    </Folder>
    <Folder xsi:type="FileSystemFolder">
      <Name>File System</Name>
      <StorePath>..\Packages</StorePath>
    </Folder>
    <Folder xsi:type="FileSystemFolder">
      <Name>My Packages</Name>
      <StorePath>C:\SSIS</StorePath>
    </Folder>

  </TopLevelFolders>
</DtsServiceConfiguration>

3. If you want to change to a different server instance then type the new server instance between <ServerName> and </ServerName>.

4. After you have made these changes and saved the file restart the SQL Server Integration Services in the SQL Server Configuration Manager.

Your new folder will not appear until you restart the service.  You can connect to Integration Services in Management Studio and find your new folder.


SSIS - Data Provider Does Not Allow Parameters in OLE DB Source

By knight_devin@hotmail.com in Devin Knight 09-23-2009 10:38 PM | Categories: Filed under: , ,
Rating: (not yet rated) Rate this |  Discuss | 1,078 Reads | 650 Reads in Last 30 Days |no comments

I recently worked on a project that involved loading a Data Warehouse from a DB2 source.  In this project we used the Microsoft OLE DB Provider for DB2 Data Provider for the ETL process in SSIS. 

Unfortunately, one of the limitations to using this Data Provider is the the OLE DB Source does not allow you to pass in parameters which I do frequently in Data Warehouses for incremental loading.  The work around I did for this was to store the query I wanted to use in my source in a table on the SQL Server side with a place holder for for my parameter, something like:

Select *

From DimEmployee

Where RowStartDate > %parameter% 

Then I used a stored procedure with a REPLACE function to change my placeholder with the value I want in SSIS. 

After doing the prep work of storing the query and writing the stored procedure I used SSIS to execute the stored procedure in an Execute SQL Task and pass in the parameter I want using the task.  Store the results in a variable and then you can use that variable as your source query in the OLE DB Source.


Error Importing DTS package in SQL Server 2008

By knight_devin@hotmail.com in Devin Knight 08-26-2009 9:01 AM | Categories: Filed under:
Rating: (not yet rated) Rate this |  Discuss | 1,174 Reads | 483 Reads in Last 30 Days |1 comment(s)

Opening DTS packages on a 2008 SQL Server requires several steps to be able to view the packages in a DTS designer.  Here’s a link to these steps on a 32-bit machine or on a 64-bit machine.  After following these step you will be able to open your DTS packages in the DTS Designer.

These steps are great for opening DTS packages but I ran into a problem recently where I was not able to import packages on my server which shouldn’t require as many prerequisites like opening packages does.

When I attempted to import a package I received the following error:

[DBNETLIB][ConnectionOpen (Connect()).]SQL Server does not exist or access denied. (Microsoft OLE DB Provider for SQL Server)

After doing some research I found the problem in a blog by Xinwei Hong.  He has several possible problems listed, which luckily was solved by his first solution.  It is was a simple fix and here are the steps I followed to fix my problem:

1. Open the SQL Server Configuration Manager

2.  Expand on SQL Server Network Configuration and select Protocols for MSSQLSERVER

3.  Enable TCP/IP.  Not sure why this was disabled to begin with but that was my problem


Does File Exist Check in SSIS

By knight_devin@hotmail.com in Devin Knight 08-11-2009 9:11 AM | Categories: Filed under:
Rating: |  Discuss | 3,398 Reads | 709 Reads in Last 30 Days |5 comment(s)

A very common need in SSIS is to check to see if a file exist before you run what could be a very long process in your package.  There are no native tasks inside SSIS that can do this check but you can accomplish this using a Script Task.  Here are the steps to check to see if a file exist.  To view this blog with screenshots visit my regular blog at http://blogs.pragmaticworks.com/devin_knight/.

1.  Setup two variables.  The variable named strFileLocation has a string data type with the value being the location of the file I want to check for.  The variable bolFileExists has a boolean data type with the value changing based on whether the file exist or not.  If the file is found the value will be changed to True otherwise it stays False.

2.  Use a Script Task in the Control Flow and set the ReadOnlyVariables to use the strFileLocation variable and the ReadWriteVariables to use the bolFileExists variable then select Edit Script.

3.  There are two methods for writing this script.  The first method you must first add the namespace System.IO.  The second method does not require this.

4.  Scroll down the editor until you find the green commented out text that says Add your code here.  Replace that with the following code then save and close the editor :

Dts.Variables("bolFileExists").Value = File.Exists(Dts.Variables("strFileLocation").Value)

The second option for the script is (Remember this option is not using the Imports System.IO step) :

Dts.Variables("bolFileExists").Value = My.Computer.FileSystem.FileExists(Dts.Variables("strFileLocation").Value) 

5.  Now that you have the script done you can use things more familiar to you in SSIS like precedence constraints with expressions to get the desired results.  Connect the completed Script Task to a path that you want the package to move if the file does exist.  Open the precedence constraints editor by double clicking on the line. change the Evaluation operation to Expression and Constraint and add the expression @bolFileExists==True.  Remember the double equals is a comparison.  If it was a single equal sign it would be trying to set the value of the variable to True.  Click OK once complete.  If you want a path for if the file is not found then follow the same steps with the precedence constraint but the expression should be @bolFileExists==False.

I’m using script tasks just as placeholders here for how the rest of the package may look.  If my file did exist it would go down the left path.  If it did not exist my package flow would go right.


SSIS – Send Errors to Message Box

By knight_devin@hotmail.com in Devin Knight 08-10-2009 12:36 AM | Categories: Filed under:
Rating: |  Discuss | 1,809 Reads | 559 Reads in Last 30 Days |no comments

While in the process of developing SSIS packages it’s likely you will deal with errors occasionally unless you’re the Superstar developer that never makes mistakes.  Of course your first steps to deciphering the problem is to look at either the progress/execution results tab in BIDS.  The problem with viewing error messages here is that more than likely the error you receive does not fit on the single line and goes off the screen.  So what do you do?  Right-click select Copy Message Text and paste it in something like the notepad is what most people do.  This is an quick way to view a full error message, but I’m going to share with you another option that can be used during development to view your full error message.  To read this blog post with screenshots visit my regular blog at http://blogs.pragmaticworks.com/devin_knight/.

This method involves using a little bit of scripting and taking advantage of Event Handlers.

1.  Create an event handler on the package level that will kick off a Script Task when an OnError event occurs.

2.  Drag a Script Task from the Toolbox onto the Event Handler design surface then double-click to open it’s editor.

3.  Select the System::ErrorDescription Variable From the ReadOnlyVariables list.  (This variable is only available in the Event Handler tab)  Then click Edit Script.

4.  Once the Editor opens scroll down to where it says Add your code here and replace that line with the following code:

MsgBox(Dts.Variables("ErrorDescription").Value)

This code is looking waiting for this system variable ErrorDescription to be populated and when it is a message box will appear with it’s content.

Now when your package is run if it encounters an error that causes the package to fail you will receive a message box with the same error you would find in the progress/execution results tab.  If you have multiple errors you will receive multiple message boxes.  This message can easily be copied from the message box by using the CTRL+C if you will wish to copy this to a notepad.


SQL Saturday South Florida

By knight_devin@hotmail.com in Devin Knight 08-09-2009 11:47 PM | Categories: Filed under: ,
Rating: (not yet rated) Rate this |  Discuss | 1,650 Reads | 579 Reads in Last 30 Days |no comments

Saturday I had the great opportunity to speak at SQL Saturday in South Florida.  The event was great.  It was very well organized and the facility was perfect for the amount of people.  There was a total of about 350 people who made it.   

I spoke on Performance Tuning SSIS, which is one of my favorite topics.  Due to an absent speaker I also took over another session after lunch which was an Introduction to SSIS.  I wasn’t really prepared to do this session but I think it still went well.  If you were in my Performance Tuning session here is a link to download my slide deck and packages used in demos.

http://www.pragmaticworks.com/DevinKnight/PerformanceTuning.zip


HTTP 404 Not Found with Reporting Services and Windows 7

By knight_devin@hotmail.com in Devin Knight 07-24-2009 3:30 PM | Categories: Filed under:
Rating: (not yet rated) Rate this |  Discuss | 2,806 Reads | 736 Reads in Last 30 Days |4 comment(s)

Here are some steps to fixing an issue I ran into with Window 7 and Reporting Services 2008.  The problem was when I attempted to access the Report Server or Report Manager from my browser I received the error HTTP 404 Not Found.  I didn’t find a direct answer on the web for this but I did see a lot of people being directed to change the port used to 8080 without any explanation why.  I don’t think this is an issue exclusive to Windows 7 but it happened immediately after installing it (maybe someone could shed some light on this part for me).  I didn’t have this same problem while using Vista.  I did the following steps to solve the problem.

1.  Open Reporting Services Configuration Manager

2.  Changed the Web Service URL to use TCP Port 8080 instead of the default 80.

3.  Changed the Report Manager URL Advanced properties to use TCP Port 8080 instead of the default 80.

4.  Restarted the service and I was good to go.

I’m not sure why this fixed the problem maybe someone can leave me a comment to help me out on that.


Changing Startup News Feed in Visual Studio

By knight_devin@hotmail.com in Devin Knight 07-24-2009 2:49 PM | Categories: Filed under:
Rating: (not yet rated) Rate this |  Discuss | 1,453 Reads | 427 Reads in Last 30 Days |no comments

Are you tired of the same old Microsoft news feed when you open Visual Studio (BIDS)?  Why not go ahead and update it so you every time you load VS you get news that you’re actually interested in.  I’ve changed my news feed to display all the SQL Server Central blogs so now instead of ignoring the news feed I actually read through each.  I’m sure many people have blogged about this before but I want to give credit to Scott Gleason for showing me this nice feature several months ago.  Read my regular blog to see these steps with screenshots. http://blogs.pragmaticworks.com/devin_knight/

1.  Open Visual Studio (BIDS)

2.  Use the Tools Menu and select Options

3.  Under the Environment list select Startup

4.  Type in the news feed of your choice.  I used SQL Server Central blogs http://www.sqlservercentral.com/blogs/MainFeed.aspx 


SSIS Expressions Basics

By knight_devin@hotmail.com in Devin Knight 07-23-2009 10:36 AM | Categories: Filed under: ,
Rating: (not yet rated) Rate this |  Discuss | 2,066 Reads | 588 Reads in Last 30 Days |no comments

For those that are new to SSIS, using expressions may seem very foreign and unpleasant.  Not only is it a new language for you to learn but new data types to understand as well.  If you can get past this unwelcoming feelings you’ll realize using expressions is a great method for making your SSIS packages dynamic.  Read my regular blog if you would like to see screenshots with this post http://blogs.pragmaticworks.com/devin_knight/.

I will tell you about the basics you need to know before starting to use expressions.  If you’re interested in a crash course on writing expressions read this White Paper by Brian Knight.  http://www.pragmaticworks.com/downloads/ssisexpressionswhitepaper/

Using expressions isn’t nearly as difficult as many believe.  Provided to you is the Expression Builder that will save many hours of research on the web trying to determine which function you need for what you’re trying to accomplish.  Luckily you can skip this research step because all the possible functions you can use are located in the Expressions Builder.

There are a variety of places in SSIS where you have the ability to make a package dynamic using expressions.  Connection managers and tasks have options for building expressions to dynamically change the value of a property.  For example, a very common use for expression in connection managers involves using a Foreach Loop Container with the Foreach File Enumerator.  This involves looping through a collection of files and performing the same tasks (usually loading to a table) to each file.  You would use expressions in this case to change the value of the connection string property in the connection manager that stores the file location so after each iteration of the loop you’re loading a new file.

Expressions are not exclusive to connection managers and tasks.  Variables also have the ability to use expression to change how they will evaluate.  Selecting a variable and changing the EvaluateAsExpression property of the variable to True will all for this variable to dynamically change at runtime.  This opens up many possibilities to any object in your package that may want to use variables with.

Of course it must be mentioned that there are Data Flow transforms that require basic knowledge of expressions.  Derived Column and Conditional Split are just a couple of the transforms that require the expressions language to do business.  Using expressions in transforms like these are not necessarily making your package dynamic but they could make your data dynamic.  Expression in transforms can restrict what rows are loaded, change the value in rows, or even replace the value in rows.

I will provide future blogs with more details on expressons but it's always nice to have a little understanding before jumping right in.


Better Know A SSIS Transform - Conditional Split

By knight_devin@hotmail.com in Devin Knight 07-13-2009 11:40 AM | Categories: Filed under: , ,
Rating: (not yet rated) Rate this |  Discuss | 3,077 Reads | 754 Reads in Last 30 Days |1 comment(s)

This is part 3 of my 29 part series called Better Know A SSIS Transform.  Hopefully you will find the series informative.  I will tell you a little about each transform and follow it up with a demo basic you can do on your own.  Screenshots that go with examples can be viewed from my regular blog site.  http://blogs.pragmaticworks.com/devin_knight/

The Conditional Split provides a way to evaluate incoming rows and separate those rows by an expression your design.  After these rows are separated they are sent to different outputs so they can either be cleansed, loaded separately, or detect changing data (a good substitute for the slowly changing dimension).  I will provide you some scenarios where you may have to use the conditional split for these reasons and how you would use it.  There are of course other possible reasons you may use the Condition Split but these what I typically use it for.

Cleansing Data Example

The scenario is I have a package that loads Company A customers.  The data that we receive is not always complete though.  Often I will have a zip code for a customer but no city or state.  Because this is a known issue the IT department has purchased a zip code extract that list all zip codes and their associated cites and states.

  • Add Flat File Source pointing to incoming customer data
  • Ensure all zip codes are standardized with a Derived Column Transform
  • Use Conditional Split to separate data that does not have a city and state
  • Send rows without city and state to Lookup Transform that will match zip codes and return missing city and states.  If it doesn’t find a match send the output to a table so the rows can be corrected by hand.
  • Use a Union All to combine original good data with corrected data from the Lookup Transform.
  • Send to Destination Table

Conditional Split Configuration

  • The condition is trimming any blank spaces in the columns and checking to see if the City and State columns are empty.  If they are empty those rows are sent to a Bad Data output.
  • All rows that don’t meet this condition are sent to the Default output Good Data.
  • Another method could be to convert these blank spaces to null before the Conditional Split then just check for null in the Bad Data condition.

Load Data Separate Example

The scenario is I have a package that loads customer mailing lists.  Company B sends out promotions and wants to separate those mailing list depending on a customers education level.  Those with some college and high school or less education will more likely receive my promotion to attend a career college. 

  • Add a OLE DB Source to bring in data from my customer table
  • Use a Conditional Split to separate customers by education level
  • Connect outputs to Flat File Destinations to create mailing lists.

Conditional Split Configuration

  • The Completed College output is checking the EnglishEducation column for either a string value of Bachelors or Graduate Degree
  • The Some College output is checking the EnglishEducation column for a string value of Partial College
  • All other rows are sent to the default output named High School Education or Less

Detecting Changing Data Example

This common scenario is using an alternative method to using the Slowly Changing Dimension.  I have incoming records from Company C’s ecommerce system that need to be loaded to my data warehouse.  Before these records get loaded I need to check to see if they are either new, updated or duplicate records.

  • Add a OLE DB Source pointing to ecommerce database
  • Use a Lookup Transform on the destination table joining by the table primary key and rename all Output columns Target_(column_name).  Tell the transform to ignore failure when no matches are found.  A better method is to use either Checksum a Hash byte column for comparison, but this is a good starting method. The Checksum or Hash byte method creates a unique identifying number for each row so instead of comparing each column of a row you can compare just one column to detect a change.
  • Use Conditional Split to determine which records are new, updates or duplicates.
  • Send New records to final destination table
  • Send Updates to a staging table
  • Use an Execute SQL task in the Control Flow to process the updated rows into the destination table.  (This method is much faster than using OLE DB Command)

Conditional Split Configuration

  • The New Record output is checking to see if the Target_(primary_key) is null.  If it is null then we know it’s a new record.
  • If the Target_(primary_key) is not null then the Update output will compare each column to the destination table to see if there are any differences so we know that it needs to be updated.  Again the best method for doing this would be to use either Checksum or Hash byte to create a unique number that represents a row.  Then just compare that one column instead of all columns.
  • Anything that doesn’t meet these conditions are duplicates and we do not want to load.  Just don’t connect the Duplicate output to anything and these rows will not be loaded.

Upcoming News and Events

By knight_devin@hotmail.com in Devin Knight 07-07-2009 4:49 PM | Categories: Filed under: , , , ,
Rating: (not yet rated) Rate this |  Discuss | 2,785 Reads | 710 Reads in Last 30 Days |1 comment(s)

There are a lot of exciting events happening in the next 2 months that you should be involved in so I want to give you a quick run down of each event that I will either be presenting at or just attending.

Today – The release of my first book!  Knight's 24-Hour Trainer: Microsoft SQL Server 2008 Integration Services

July 13 – 16 – Teaching online SSIS Class (Monday – Thursday from 2 to 5 pm EST)

July 14 – Attending PASS BI SIG (SSIS Focus Group) meeting online 12 pm EST.  Mike Davis presenting on Developing Dynamic SSIS Packages.

In this session you will learn how to make your SSIS packages dynamic, by using Configuration Files, Configuration Tables, Parent to Child Package Relationships. You will see how to use variables and expression together in all areas of a package and understand how update multiple packages with a single change without having to redeploy packages multiple times.

July 20 – 23 – Teaching online SSIS Class (Monday – Thursday from 2 to 5 pm EST)

July 21 – Attending PASS BI SIG (Data Warehousing Focus Group) meeting online 12 pm EST.  Javier Loria presenting on Contemporary alternatives for BI architectures.

August 1 – SQL Saturday Baton Rouge.  Not attending but have to plug this.  If you’ve never been to one and you live in the area or even if you don’t live in the area make this event!

August 8 – SQL Saturday South Florida.  Submitted session to speak on SSIS performance tuning.

August 29 – Jax Code Camp in Jacksonville, FL.  Great event for a great cause.  All proceeds not used for the production of the event are donated to the Wolfson Children’s Hospital.  There will be a SQL Track that I will be presenting a beginner’s SSIS session.

 

So now you have no reason not to get involved in something at least online!

More Posts Next page »