Explanatory Analytics
Explanatory analytics procedures can be used to better understand the data in your dataset. Through a variety of advanced algorithms, these procedures let you discover useful patterns and features that are strong indicators of goal outcomes within your data. Explanatory analytics is not required to generate prediction models, but it can provide insights that will improve the models you generate. ThingWorx Analytics provides the explanatory analytics procedures described below.
Signal Detection
When analyzing volumes of data, it is helpful to know which data is actually useful and which data is just noise. Signals are based on a correlation algorithm that examines historical data to identify the strength of a given input in predicting future outcomes. Signals can identify meaningful correlations within the data.
Signals are useful during initial analysis to determine which features you want to curate in a given dataset for predictive model generation. For example, knowing the month of the year is more important to accurately predicting tomorrow’s weather than knowing the day of the week. The month has a much stronger signal than the day of the week for this prediction. However, to predict what traffic will be like tomorrow, the day of the week presents a very strong signal whereas the month may only be marginally useful. So, signal strength can vary greatly depending on the analysis or prediction you are trying to make.
ThingWorx Analytics reports signal strength as a mutual information (MI) score, which indicates how much knowing a given feature helps in predicting the goal variable. Because MI does not assume a linear relationship between the feature and the goal, it can effectively capture non-linear relationships. ThingWorx Analytics evaluates each feature, or combination of features, to identify the top signals.
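To make the idea of an MI score more concrete, the following minimal sketch computes mutual information between one categorical feature and a categorical goal, using the standard textbook definition. It is an illustration only and not the ThingWorx Analytics implementation; the function name `mutual_information` and the tiny sample dataset are assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(feature, goal):
    """Return the mutual information, in bits, between two discrete sequences.

    Illustrative helper (hypothetical name), based on the textbook formula
    MI = sum over (f, g) of p(f, g) * log2(p(f, g) / (p(f) * p(g))).
    """
    if len(feature) != len(goal):
        raise ValueError("feature and goal must have the same length")
    n = len(feature)
    if n == 0:
        return 0.0

    joint_counts = Counter(zip(feature, goal))
    feature_counts = Counter(feature)
    goal_counts = Counter(goal)

    mi = 0.0
    for (f, g), count in joint_counts.items():
        p_joint = count / n
        # p(f, g) / (p(f) * p(g)) simplifies to count * n / (count_f * count_g)
        ratio = count * n / (feature_counts[f] * goal_counts[g])
        mi += p_joint * math.log2(ratio)
    return mi


# Made-up records echoing the traffic example: the day type determines
# the outcome here, while the month carries no information about it.
day_type = ["weekday", "weekday", "weekend", "weekend"]
month = ["jan", "jul", "jan", "jul"]
traffic = ["heavy", "heavy", "light", "light"]

strong_signal = mutual_information(day_type, traffic)
weak_signal = mutual_information(month, traffic)
print(f"day type vs. traffic: {strong_signal:.3f} bits")
print(f"month vs. traffic:    {weak_signal:.3f} bits")
```

In this toy data, the day type scores higher than the month, which mirrors the traffic example above. Continuous features would first need to be binned before a count-based calculation like this one applies.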
Data Profiles
A profile is a distinct subpopulation within your data that shares similar characteristics and differs from other subpopulations in statistically significant ways. ThingWorx Analytics uses advanced search and pattern-matching algorithms, as well as automated decision-path analysis, to identify profiles. This type of information is typically used to focus on the margins of the dataset, to identify populations that are over- or under-performing relative to the goal (high or low outcome records).
Profiles are mutually exclusive, meaning each record can be grouped into only one profile. Because there is no overlap, profiles can be used to target specific populations. The results are easily readable and business-actionable.
A profile must meet a user-defined minimum threshold to be identified as a profile. This threshold prevents finding profiles that describe a single record or a very small number of records, which are likely outliers. Profiles are not required in order to make a prediction, but they contribute to a strategic understanding of the complex factors associated with specific outcomes.
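The two properties described above, mutual exclusivity and a minimum size threshold, can be illustrated with a deliberately simplified sketch. It groups records by their exact combination of feature values and keeps only groups that are large enough. This is not the search or decision-path analysis that ThingWorx Analytics performs, and it omits any statistical significance test; the function name `find_profiles`, the field names, and the sample records are all hypothetical.

```python
from collections import defaultdict


def find_profiles(records, features, goal, min_size):
    """Group records by feature values and keep groups of at least min_size.

    Illustrative only: each record lands in exactly one group, so the
    resulting profiles are mutually exclusive. The goal is assumed to be
    numeric (for example 0 or 1).
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[name] for name in features)
        groups[key].append(record[goal])

    overall_rate = sum(record[goal] for record in records) / len(records)

    profiles = []
    for key, outcomes in groups.items():
        if len(outcomes) < min_size:
            continue  # too few records; likely outliers
        rate = sum(outcomes) / len(outcomes)
        profiles.append({
            "conditions": dict(zip(features, key)),
            "size": len(outcomes),
            "goal_rate": rate,
            "difference_from_overall": rate - overall_rate,
        })

    # Highest-performing populations first, lowest last.
    profiles.sort(key=lambda p: p["difference_from_overall"], reverse=True)
    return profiles


# Made-up records: "failed" is the goal variable.
records = (
    [{"shift": "night", "line": "A", "failed": 1}] * 3
    + [{"shift": "day", "line": "A", "failed": 0}] * 3
    + [{"shift": "day", "line": "B", "failed": 1}]
)

profiles = find_profiles(records, ["shift", "line"], "failed", min_size=2)
for profile in profiles:
    print(profile)
```

The single day/B record is dropped because it falls below the minimum size of 2, which is the kind of outlier the threshold is meant to exclude.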
Cluster Analysis
Cluster analysis categorizes the records in a dataset into groups based on their similarities. Clusters are mutually exclusive, meaning that each record can belong to only one cluster. This analysis is not goal-centered: it performs unsupervised clustering using a k-means algorithm, which separates the dataset into a user-defined number of clusters.
Cluster results are returned in the form of a PMML model.
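For readers unfamiliar with k-means, the following minimal sketch shows the basic iteration: assign each record to its nearest centroid, then move each centroid to the mean of its members, and repeat until assignments stop changing. It is a generic illustration, not the ThingWorx Analytics implementation, and it does not produce PMML. The function names, the deterministic farthest-point initialization, and the sample points are assumptions chosen to keep the example small and reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def initial_centroids(points, k):
    """Pick k starting centroids deterministically (an assumption of this
    sketch): start with the first point, then repeatedly add the point
    farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(farthest)
    return centroids


def k_means(points, k, max_iterations=100):
    """Return (assignments, centroids) for a user-defined number of clusters."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    centroids = initial_centroids(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to exactly one nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignments, centroids


# Made-up two-dimensional records forming two visibly separate groups.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (8.0, 8.0), (8.2, 7.9), (7.9, 8.1),
]
assignments, centroids = k_means(points, k=2)
print(assignments)
print(centroids)
```

Because every point receives exactly one cluster index, the result is mutually exclusive by construction, and no goal variable is involved at any step.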
How to Access Explanatory Analytics Functionality
ThingWorx Analytics explanatory analytics functionality can be accessed via the following methods:
• ThingWorx API – In ThingWorx Foundation, explanatory analytics procedures are accessible through connected Things that represent individual microservices, including Signals, Profiling, and Clustering. These microservices can be used to submit analysis requests, retrieve results, list jobs, and more. Requires installation of both ThingWorx Foundation and ThingWorx Analytics Server.
• Analytics Builder – As part of the ThingWorx Analytics Extension, Analytics Builder provides a user interface for interacting with your data. In addition to generating and scoring predictive models in Analytics Builder, you can also run procedures to generate signals and profiles. (No cluster procedure is available.) Requires installation of both ThingWorx Foundation and ThingWorx Analytics Server.
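As a rough illustration of the ThingWorx API route listed above, the sketch below builds (but does not send) an HTTP request that invokes a service on a Thing through the ThingWorx Foundation REST interface. The general pattern of posting to `/Thingworx/Things/<Thing>/Services/<Service>` with an `appKey` header is used here; the host, Thing name, service name, application key, and request payload are all placeholders invented for this example, so consult the microservice reference for the actual names and parameters.

```python
import json
import urllib.parse
import urllib.request


def build_service_request(base_url, thing_name, service_name, app_key, payload):
    """Build a POST request that would invoke a service on a Thing.

    All argument values used in this example are hypothetical placeholders.
    The request is only constructed here, not sent.
    """
    url = "{}/Thingworx/Things/{}/Services/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(thing_name, safe=""),
        urllib.parse.quote(service_name, safe=""),
    )
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "appKey": app_key,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Placeholder values; none of these names come from the product.
request = build_service_request(
    base_url="https://twx.example.com",
    thing_name="ExampleSignalsThing",
    service_name="ExampleService",
    app_key="REPLACE_WITH_APP_KEY",
    payload={"exampleParameter": "exampleValue"},
)
print(request.get_method(), request.full_url)

# To actually send it, something like the following would be used:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Separating request construction from sending keeps the example safe to run without a server and makes it easier to inspect the URL and body before submitting a job.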