Technical Article

Data Mining Part 30: System Views

,

Introduction

Sometimes we need to recover information from the Mining Model, the Mining structure or other information related to Data Mining. We also need to monitor who is using our database, what applications are being used, what machine is accessing to our data mining models. For this purpose, there are system views used to retrieve this information.

Requirements

1. For this chapter, we are using the Adventureworks Multidimensional model and the AdventureworksDW database.

2. The SQL Server (in this example we are using the SQL 2014, but you can use earlier versions).

3. The SQL Server Analysis Services installed.

Getting Started

1. In order to start, open the SQL Server Management Studio and connect to the Analysis Services.

2. Right click on the database and select New Query. You can query the Data Mining system views using DMX or MDX.

3. Let's start with a nice and simple system view the $System.DMSCHEMA_MINING_COLUMNS:

SELECT *
 FROM $System.DMSCHEMA_MINING_COLUMNS

4. The query will return all the columns from all the Data Mining models.

5. There are several columns to highlight about this view:

  • MODEL_CATALOG contains the SSAS Database name.
  • MODEL_NAME contains the Data Mining Model Name.
  • COLUMN_NAME contains the name of the columns using in the models. See the step 7,8 of the chapter 1 for more information about columns.
  • CONTENT_TYPE is the type of content of the data like discretized, discrete, continuous, key, key sequences, etc.
  • The Data_type is similar to the concept in the SQL Server. It indicates to the model, if it is text, Long, Boolean, etc.

6. Let's run a simple example. The following example shows the column name, Data mining model, and the column predictable:

SELECT COLUMN_NAME,MODEL_NAME,IS_PREDICTABLE
 FROM $System.DMSCHEMA_MINING_COLUMNS
 WHERE IS_PREDICTABLE
7. As you can see, the MDX is a little bit different than the SQL to handle boolean values. For the column IS_PREDICTABLE, we are using:
WHERE IS_PREDICTABLE to indicate that IS_PREDICTABLE is true which in SQL would be like this:
WHERE IS_PREDICTABLE=1
8. In system view if you want to see the column, model of the non-predictable columns (IS_PREDICTABLE is false), you can use this query: 
SELECT COLUMN_NAME,MODEL_NAME,IS_PREDICTABLE
 FROM $System.DMSCHEMA_MINING_COLUMNS
 WHERE NOT IS_PREDICTABLE
9. Another interesting query is the $System.DMSCHEMA_MINING_FUNCTIONS. This query will show information about different Data Mining Functions:
SELECT *
 FROM $System.DMSCHEMA_MINING_FUNCTIONS

10. The results are the following:

11. This view is important because not all the function applies to any algorithm. The query helps to find which function can be applied.
12. There are several columns to mention:
  • The SERVICE_NAME is the algorithm name,
  • The FUNCTION_NAME is the name of the function. 
  • The FUNCTION_SIGNATURE contains the syntax used to call the function.
  • The RETURNS_TABLE specifies if the function returns a table or not. For example, the PredictHistogram returns a table and the Exist function does not return a table.
  • The DESCRIPTION column describes what the function does.
  • The HELP_FILE and HELP_CONTEXT are always blank.
13. For example, if you want to know the functions available for the Time Series algorithm that return a table you can use this query:
SELECT *
 FROM $System.DMSCHEMA_MINING_FUNCTIONS
 WHERE SERVICE_NAME='Microsoft_Time_Series' 
 and RETURNS_TABLE

14. Another Important view is the   $System.DMSCHEMA_MINING_MODEL_CONTENT. This function shows internal content information about the models. For example it shows information about the Cluster nodes and Sequence cluster and other algorithms.

15. The following query shows all the information:

SELECT * 
 FROM $System.DMSCHEMA_MINING_MODEL_CONTENT

16. The information displayed is similar to the following:

17. You will have information about the nodes, node names, model,  probability of the nodes and the NODE_DISTRIBUTION that contains the attributes distribution in the node.

18. Another important system view is the 

$System.DMSCHEMA_MINING_MODEL_CONTENT_PMML view. This view is supposed to deliver the Predictive Model Markup Language of the Data Mining models. If you run this query, you will have an error:

SELECT *
 FROM $System.DMSCHEMA_MINING_MODEL_CONTENT_PMML

The error displayed is the following:

Executing the query ...

Obtained object of type: Microsoft.AnalysisServices.AdomdClient.AdomdDataReader
Error (Data mining): The algorithm does not support the functionality requested by the '' model.
Execution complete
19. Just the Decision Trees and the Cluster support this query:
SELECT MODEL_CATALOG,MODEL_NAME,MODEL_PMML,SIZE 
 FROM $System.DMSCHEMA_MINING_MODEL_CONTENT_PMML
 WHERE MODEL_NAME='TM Clustering'

20.  The DMSCHEMA_MINING_MODEL_XML is similar to the $System.DMSCHEMA_MINING_MODEL_CONTENT_PMML view.

21. Another view is the $System.DMSCHEMA_MINING_MODELS:

SELECT * 
 FROM $System.DMSCHEMA_MINING_MODELS
22. The view returns information about the last process date, if the model is populated, the creation and modification date and the parameters.

22. The $System.DMSCHEMA_MINING_SERVICE_PARAMETERS contains information about the Data Mining Parameters:

SELECT * 
 FROM $System.DMSCHEMA_MINING_SERVICE_PARAMETERS

23. You will find useful information like the default values, description, types and names.

23. There is also another system view used to get the information about the data mining algorithms:

The DMSCHEMA_MINING_SERVICES.

24. When you run the query:

select *
 from $system.DMSCHEMA_MINING_SERVICES
You will receive information about the algorithm included the supported distribution, supported input content, supported prediction content, etc.
24. The DMSCHEMA_MINING_STRUCTURE_COLUMNS views is similar to the  $System.DMSCHEMA_MINING_COLUMNS of the step 3, but this view is of the columns of the structures deployed.
25. You can get the information with the following query:
select *
 from $system.DMSCHEMA_MINING_STRUCTURE_COLUMNS
26. The results displayed are the following:
27. You can also get the structure information like the date created, modified, last processed, holdout  max percent, etc.
28. The following query can be used to retrieve all the columns:
select *
 from $system.DMSCHEMA_MINING_STRUCTURES
29. If you need to monitor the connections to SSAS including the data mining models, you can run this query:
select *
 from $system.discover_connections

30. The query shows the connection users, the machine that was connected, the application used, the start and end time, etc.

31. You can also check the session's information using this query: 
select *
 from $system.discover_sessions

32. You will be able to see the session SPID, command start time, CPU time and command information.

Conclusion

As you can see, you can monitor and administer your mining models using some queries to the System Views.

There are views to check internal information and other views to monitor the security and the activity of your mining models.

References

For a complete list of system views refer to this link:

Analysis Services Schema Rowsets

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating