What is a Data Governance Policy?
A data governance policy defines how an organization collects, stores, accesses, and maintains its data. Because data is now a core enterprise asset, ensuring it is properly maintained and controlled is critical.
When creating a data governance program there are at least 4… Read more
Text Mining with RapidMiner for Loch Ness Monster Sightings
Text mining involves pulling root words from text in a system. In this example, I pulled all of the Loch Ness Monster sightings from 2000 to 2015 from the Official Loch Ness Monster Website into an Excel spreadsheet. Then using the… Read more
If you are searching for a data mining solution, be sure to look into RapidMiner. RapidMiner is open source predictive analytics software that provides great out-of-the-box support for getting started with data mining in your organization. They offer a free desktop software version to… Read more
Azure Data Catalog
The Azure Data Catalog is now available in public preview. The Data Catalog provides an enterprise data repository that enables self-service data discovery for end users. It assists IT and business users by providing a collaborative solution for publishing documented data sets.
Users can access… Read more
SQL Server 2016 Stretch Database
One of the features in SQL Server 2016 that you will want to explore is the ability to store your historical data in the Azure cloud. By leveraging the Stretch Database feature, your applications can silently migrate data to the cloud without having to change… Read more
Machine Learning Data Sets
When starting out with data mining and machine learning, you will need access to sample data in order to learn the technology. Leveraging data outside of your corporate data sets often makes it easier to learn. There is a large collection of sample… Read more
What is SQL Server PolyBase?
Coming in SQL Server 2016 is the ability to query relational tables as well as data stored in Hadoop or Azure Blob storage. This technology will allow you to continue to leverage SQL Server for your data warehousing and access data stored in different formats… Read more
I presented on how to become a BI developer at the Houston Area SQL Server User Group (HASSUG). This session resulted from feedback we received in the user group on differences between database developers and BI developers. If you are interested in seeing the presentation content it is uploaded here: Read more
Leveraging SQL Server Database Schema
I am presenting the lunch keynote at the Toronto Chief Data Officer Summit on June 4th. My session is on how to leverage predictive analytics to reduce customer churn. I will be going over how I leveraged a decision tree model to create a customer risk score for active customers.… Read more
HOST_ID and HOST_NAME
T-SQL for SQL Server includes functions that return information about the workstation that executes a SQL statement. You can use these to record which workstation a transaction came from in your application or business intelligence processes.
HOST_ID will return the Process ID from the client… Read more
SQL Server 2014 Deprecated Features – Database Engine
Everyone looks at the new features when trying to determine when to upgrade from an older version of SQL Server to a newer release. There are many factors to consider and for every environment certain features have a higher priority and need… Read more
Call for speakers is now open for SQL Saturday #408 in Houston, TX. If you are interested in presenting please fill out the form – https://www.sqlsaturday.com/408/callforspeakers.aspx. The call for speakers ends on 4/14/2015.
For complete details on the event see – https://www.sqlsaturday.com/408/eventhome.aspx.
SQLSaturday is a training event for SQL… Read more
What is the Azure Data Factory?
The Azure Data Factory is a managed service for data storage and processing. It allows you to build cloud-based solutions to move and store your data in a centralized managed environment. Source data can be pulled from on-premises or cloud environments consisting… Read more
T-SQL EXCEPT and INTERSECT
T-SQL EXCEPT and INTERSECT are both set-based operators that combine the results of two queries into a single result set. EXCEPT returns the distinct rows from the query on the left that are not found in the query on the right. INTERSECT returns the distinct rows that are in… Read more
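To make the difference between the two operators concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server (SQLite supports the same two operators), and the table names left_q and right_q are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; left_q and right_q are made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_q (id INTEGER);
    CREATE TABLE right_q (id INTEGER);
    INSERT INTO left_q VALUES (1), (2), (2), (3);
    INSERT INTO right_q VALUES (2), (3), (4);
""")

# EXCEPT: distinct rows from the left query that do not appear in the right query.
except_rows = [row[0] for row in conn.execute(
    "SELECT id FROM left_q EXCEPT SELECT id FROM right_q ORDER BY id")]

# INTERSECT: distinct rows that appear in both queries.
intersect_rows = [row[0] for row in conn.execute(
    "SELECT id FROM left_q INTERSECT SELECT id FROM right_q ORDER BY id")]

print(except_rows)     # [1]
print(intersect_rows)  # [2, 3]
```

Note that the duplicate value 2 in left_q appears only once in the INTERSECT result, because both operators return distinct rows.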
SSRS Farm Overview
Building an SSRS farm requires the Enterprise or Business Intelligence edition of SQL Server for versions 2012 and 2014. SQL Server allows you to deploy two or more servers running SQL Server Reporting Services to increase the performance of your reporting environment.
SSRS stores its data… Read more
T-SQL Random Numbers using RAND()
SQL Server includes the T-SQL RAND() function, which returns a random value of the float data type between 0 and 1. To create a random number, execute SELECT RAND(); in my example it returned 0.0131039082850364. If I wanted to always return the same number I can… Read more
The SQL Server T-SQL COALESCE function simplifies the use of a CASE expression to find the first non-null value among your expressions. For example, I want to return the products in the AdventureWorks sample database and show the SellEndDate if it exists or the SellStartDate if the SellEndDate is null. Read more
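The fallback described above can be sketched as follows. This is an illustration only, not the AdventureWorks query itself: it assumes Python's built-in sqlite3 module in place of SQL Server, and a hypothetical two-row Product table with made-up product names and dates. Only the SellStartDate and SellEndDate column names come from the example above.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the Product rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Name TEXT, SellStartDate TEXT, SellEndDate TEXT);
    INSERT INTO Product VALUES
        ('Widget A', '2011-05-31', '2012-05-29'),
        ('Widget B', '2013-05-30', NULL);
""")

# COALESCE returns the first non-null argument, reading left to right.
coalesce_rows = conn.execute("""
    SELECT Name, COALESCE(SellEndDate, SellStartDate) AS EffectiveDate
    FROM Product
    ORDER BY Name
""").fetchall()

# The equivalent, longer CASE expression that COALESCE replaces.
case_rows = conn.execute("""
    SELECT Name,
           CASE WHEN SellEndDate IS NOT NULL THEN SellEndDate
                ELSE SellStartDate
           END AS EffectiveDate
    FROM Product
    ORDER BY Name
""").fetchall()

print(coalesce_rows)
print(coalesce_rows == case_rows)  # True
```

Widget B has no SellEndDate, so its SellStartDate is returned instead, and both queries produce identical rows.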
Row_Number SQL Server
SQL Server includes several Ranking Functions that can be called in T-SQL. One of these is the Row_Number() function. You can use this function to return a sequential number in your result set that begins at 1.
There are 2 arguments that can be passed into the… Read more
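As a rough illustration of the sequential numbering, the sketch below runs ROW_NUMBER() with an OVER clause that uses PARTITION BY and ORDER BY. It assumes Python's built-in sqlite3 module in place of SQL Server (window functions need SQLite 3.25 or later), and the Sales table and its values are hypothetical.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for SQL Server; Sales is invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (Region TEXT, Amount INTEGER);
    INSERT INTO Sales VALUES
        ('East', 300), ('East', 100), ('West', 200), ('West', 500);
""")

# Numbering restarts at 1 for each Region and follows Amount from largest to smallest.
numbered_rows = conn.execute("""
    SELECT Region, Amount,
           ROW_NUMBER() OVER (PARTITION BY Region ORDER BY Amount DESC) AS RowNum
    FROM Sales
    ORDER BY Region, RowNum
""").fetchall()

for row in numbered_rows:
    print(row)
# ('East', 300, 1)
# ('East', 100, 2)
# ('West', 500, 1)
# ('West', 200, 2)
```

Dropping the PARTITION BY clause would number all four rows 1 through 4 in a single sequence.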