Stay Tuned…for a SQL Server Tutorial Series Juggling Act

by Steve Bolton

…………If all goes according to plan, my blog will return in a few weeks with two brand new series, Using Other Data Mining Tools with SQL Server and Information Measurement with SQL Server. Yes, I will be attempting what amounts to a circus act among SQL Server bloggers, maintaining two separate tutorial series at the same time. Like my series A Rickety Stairway to SQL Server Data Mining, it might be more apt to call them mistutorials; once again I’ll be tackling subject matter I’m not yet qualified to write about, in the hopes of learning as I go. The first series will compare and contrast the capabilities of SSDM against other tools in the market, with the proviso that it is possible to run them on data stored in SQL Server. That eliminates a lot of applications that run only on Linux or otherwise cannot access SQL Server databases right off the bat. The software packages I survey may include such recognizable names as RapidMiner and WEKA, plus a few I promised others to review long ago, like Autobox and Predixion Software. The latter is the brainchild of Jamie MacLennan and Bogdan Crivat, two former members of Microsoft’s Data Mining Team who were instrumental in the development of SSDM.
…………The second series of amateur tutorials may be updated more frequently because those posts won’t require lengthy periods of time to evaluate unfamiliar software. This will expand my minuscule knowledge of data mining in a different direction, by figuring out how to code some of the building blocks used routinely in the field, like Shannon’s Entropy, Bayes factors and the Akaike Information Criterion. Not only can such metrics be used in the development of new mining algorithms, but they can be applied out-of-the-box to answer myriad basic questions about the type of information stored in our SQL Server databases – such as how random, complex, ordered and compressed it might be. No mining company would ever excavate a new quarry without first performing a geological survey of what precious metals might be beneath the surface; likewise, it may be helpful for data miners to have a surface impression of how much useful information might be stored in our tables and cubes, before digging into them with sophisticated algorithms that can have formidable performance costs. Scores of such measures are scattered throughout the mathematical literature that underpins data mining applications, so it may take quite awhile to slog through them all; while haphazardly researching the topic, I ran across a couple of quite interesting measures of information that seem to have been forgotten. I will try to make these exercises useful to SQL Server users by providing T-SQL, MDX, Common Language Runtime (CLR) functions in Visual Basic and perhaps even short SSDM plug-in algorithms, as well as use cases for when they might be appropriate. To illustrate these uses, I may test them on freely available databases of interest to me, like the Higgs Boson dataset provided by the Univeristy of California at Irvine’s Machine Learning Repository. I may also make use of a tiny 9 kilobyte dataset on Duchenne’s form of muscular dystrophy, which Vanderbilt University’s Department of Biostatistics has made publicly available, and transcriptions of the Voynich Manuscript, an enigmatic medieval manuscript with an encryption scheme so obscure that even the National Security Agency (NSA) can’t crack it. Both tutorial series will use the same datasets in order to cut down on overhead and make it easier for readers to follow both. When and if I manage to complete both series, the next distant item on this blog’s roadmap will be a tutorial series on how to use various types of neural nets with SQL Server, which is a topic I had some intriguing experiences with many years ago.

Book Review: Big Red - Voyage of a Trident Submarine

by Andy Warren

SQLServerCentral.com

Blogs

I've grown up reading Tom Clancy and probably most of you have at least seen Red October, so this book caught my eye when browsing used books for a recent trip. It's a fairly human look at what's involved in sailing on a Trident missile submarine...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-03-10

1,439 reads

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

by Robert Davis

SQLServerCentral.com

Blogs

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-23

1,567 reads

Inserting Markup into a String with SQL

by Phil Factor

SQLServerCentral.com

T-SQL

In which Phil illustrates an old trick using STUFF to intert a number of substrings from a table into a string, and explains why the technique might speed up your code...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-18

1,631 reads

Networking - Part 4

by Andy Warren

SQLServerCentral.com

Blogs

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I'd like to talk about social networking. We'll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-17

1,530 reads

Speaking at Community Events - More Thoughts

by Andy Warren

SQLServerCentral.com

Blogs

Last week I posted Speaking at Community Events - Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I've got a few more thoughts on the topic this week, and I look forward to your comments.

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-13

360 reads

Stay Tuned…for a SQL Server Tutorial Series Juggling Act

Rate

Share

Share

Rate

Stay Tuned…for a SQL Server Tutorial Series Juggling Act

Rate

Share

Share

Rate

Related content

Book Review: Big Red - Voyage of a Trident Submarine

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Inserting Markup into a String with SQL

Networking - Part 4

Speaking at Community Events - More Thoughts