Synchronous Prediction Custom Processor


Overview

The Synchronous Prediction custom processor can be built into a DataFlowML pipeline and used to run real-time scoring jobs against ThingWorx Analytics models. Real-time scoring results are returned directly, without creating a job that can be stored. Because synchronous scoring can affect performance, it is recommended only for small datasets. For larger datasets, use the Asynchronous Prediction custom processor.

The data evaluated by the Synchronous Prediction processor can be streamed from any location but must be in a format that matches the schema of a trained ThingWorx Analytics model. When a pipeline that contains the processor is launched, scoring runs and the prediction results are returned to your pipeline, where they are available for use in a downstream processor.

Uploading and Configuring the Processor

To use the Synchronous Prediction processor, add it to a pipeline in DataFlowML and configure it with parameters as described below.

1. In DataFlowML, select Data Pipeline from the left navigation panel. The Pipeline Definition page opens.

2. In the panel on the right, ensure that the Auto Inspection option is enabled. It is enabled by default.

3. Upload the JAR file containing the custom processors as follows:

◦ Click the Upload option. A file selection dialog box opens.

◦ Select the JAR file that contains the custom processors.

◦ Click OK. The JAR file is uploaded.

This JAR file must be uploaded once for each pipeline that you create.

4. Navigate to the Processors tab and select the Custom processor. Drag it to the pipeline page on the left to add it to the pipeline.

5. When the Custom processor has been added to the pipeline, right-click the processor icon. The Configuration Settings – Custom dialog box opens.

6. On the Configuration tab, enter the following Implementation Class value to identify the Custom processor as the Synchronous Prediction processor:

com.thingworx.analytics.rockwell.processor.SyncPredictionProcessor

7. Click Add Configuration. A parameter row with two columns is added.

8. In the left column, enter a key. In the right column, enter the corresponding value. For a list of the required and optional configuration parameters, see the tables below.

9. Repeat steps 7 and 8 until all necessary parameters are added.

10. After all parameters for the processor are added, click Next. The Add Notes tab is displayed.

11. Add any notes about the configuration and click Save. The processor configuration is saved and the dialog box closes.

Required Configuration Parameters
Key	Value
Implementation Class	com.thingworx.analytics.rockwell.processor.SyncPredictionProcessor
TWA_PREDICTION_IP	The IP address of your ThingWorx Analytics Prediction microservice.
TWA_PREDICTION_PORT	The port on which the ThingWorx Analytics Prediction microservice is running. To locate the port number, navigate to your ThingWorx Analytics Server installation directory and open the config/system-environment-variables.properties file. Port numbers are listed for each microservice.
TWA_USE_PROXY	false = not installed behind a reverse proxy, true = is installed behind a reverse proxy
GOAL_FIELD	The name of the goal data field on which scoring will run.
MODEL_ID	The Model Result ID that is output when a model is trained in ThingWorx Analytics. This parameter cannot be configured until after a training job has been run manually in Analytics Builder. To find this ID, navigate to the Models page in Analytics Builder, select the model, and click View to open the Model Results page.
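
Before typing the key/value pairs into the Configuration tab, it can help to check a draft of them for omissions. The following is a minimal sketch of such a check. The helper `check_required` and every value in `example_config` are hypothetical and are not part of DataFlowML or ThingWorx Analytics. The sketch assumes that the true/false values are entered in lowercase. The Implementation Class is left out because it is entered in its own field in step 6.

```python
# Hypothetical pre-check for the required parameters listed in the table above.
# It only inspects a local dictionary; it does not contact any server.
REQUIRED_KEYS = (
    "TWA_PREDICTION_IP",
    "TWA_PREDICTION_PORT",
    "TWA_USE_PROXY",
    "GOAL_FIELD",
    "MODEL_ID",
)


def check_required(config):
    """Return a list of problems found in the configuration (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        if not str(config.get(key, "")).strip():
            problems.append(f"missing value for {key}")
    port = str(config.get("TWA_PREDICTION_PORT", "")).strip()
    if port and not (port.isdecimal() and 1 <= int(port) <= 65535):
        problems.append("TWA_PREDICTION_PORT must be a port number (1-65535)")
    proxy = str(config.get("TWA_USE_PROXY", "")).strip()
    if proxy and proxy not in ("true", "false"):  # assumption: lowercase values
        problems.append("TWA_USE_PROXY must be true or false")
    return problems


# Placeholder values for illustration only; substitute your own.
example_config = {
    "TWA_PREDICTION_IP": "192.0.2.10",
    "TWA_PREDICTION_PORT": "8080",
    "TWA_USE_PROXY": "false",
    "GOAL_FIELD": "temperature",
    "MODEL_ID": "example-model-id",
}
print(check_required(example_config))
```

An empty list means that every required key has a value in a plausible form. It does not confirm that the server, port, or model actually exists.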

Optional Configuration Parameters
Key	Value
CATEGORICAL_LIMIT	Limits the number of values returned when scoring with a categorical goal field. Scoring with a categorical goal results in a prediction for every value category. Using a goal with many categories (such as postal code) can greatly affect performance time. This parameter allows you to limit a categorical goal to the top N values returned.
CAUSAL_TECHNIQUE	The technique used to measure the influence of each field on the goal in a scored record. Options include: • Full Range – Searches for the fields that, when changed, show the largest overall variation in the prediction values. Measures the distance across the range of values from the minimum to the maximum. • Distance from Max – Searches for the fields that, when changed, increase the value of the prediction the most. Measures the distance from the current value to the maximum value. • Distance from Min – Searches for the fields that, when changed, decrease the value of the prediction the most. Measures the distance from the current value to the minimum value.
IDENTIFIER_FIELDS	Any additional fields you want to include in the scoring job to help identify which row of data a score applies to. Use a comma to separate each field.
IMPORTANT_FIELD_COUNT	The number of important fields you want returned in the results. The most influential fields for each record, up to the selected number, are returned with the scoring job results. A weight of influence is also included for each important field returned.
PREFERRED_CATEGORICAL_VALUES	Limits the number of values returned when scoring with a categorical goal field. Scoring with a categorical goal results in a prediction for every value category. Using a goal with many categories (such as postal code) can greatly affect performance time. This parameter allows you to specify a limited set of categorical goal values to score. Use a comma to separate each value.
TAGS	String values that can be leveraged for search and filter purposes. Use a comma to separate each tag.
TWA_PREDICTION_PROXY_PATH	The URL path for the reverse proxy, if one is in use. The default path is /prediction.
TWA_PREDICTION_USE_SSL	false = running on HTTP, true = running on HTTPS
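
The three CAUSAL_TECHNIQUE options differ only in which distance they measure. The sketch below illustrates those three distances for a single field, given the current prediction and the predictions obtained when that field is changed. It is an illustration of the definitions in the table, not the algorithm that ThingWorx Analytics uses internally. The function `influence_measures` and the sample numbers are hypothetical.

```python
def influence_measures(current_prediction, varied_predictions):
    """Compute the three distances described for CAUSAL_TECHNIQUE.

    varied_predictions holds the prediction values observed when one
    field is changed while the others stay fixed (illustrative input).
    """
    lowest = min(varied_predictions)
    highest = max(varied_predictions)
    return {
        # Full Range: distance from the minimum to the maximum prediction.
        "full_range": highest - lowest,
        # Distance from Max: how far the prediction could increase.
        "distance_from_max": highest - current_prediction,
        # Distance from Min: how far the prediction could decrease.
        "distance_from_min": current_prediction - lowest,
    }


# Made-up numbers: the current prediction is 50, and changing one field
# produced predictions between 20 and 90.
print(influence_measures(50.0, [20.0, 50.0, 90.0]))
```

Under each technique, the fields with the largest value of the corresponding distance would be the ones reported as most influential.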

Processor Input

Input to the Synchronous processor can include any streaming CSV data that matches the schema of a trained ThingWorx Analytics model.
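
A quick way to catch schema mismatches before data reaches the processor is to compare the CSV header with the field names of the trained model. The following is a minimal sketch, assuming that the CSV data has a header row and that you have the model's field names available as a list. The function `missing_fields` and the sample field names are hypothetical. The sketch compares names only and does not check data types.

```python
import csv
import io


def missing_fields(csv_text, schema_fields):
    """Return the schema fields that do not appear in the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip() for name in header}
    return [field for field in schema_fields if field not in present]


# Illustrative data: the model expects a humidity field that the CSV lacks.
sample = "timestamp,pressure,temperature\n2020-01-01T00:00:00,1.2,70.5\n"
print(missing_fields(sample, ["pressure", "temperature", "humidity"]))
```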

Processor Output

Prediction results from the Synchronous processor are output in the form shown below. The results are returned to your pipeline for use in other processors.

{
  "result": [
    {
      "predictiveScores": {
        "score": {
          "GOAL_FIELD": 0.0,
          "GOAL_FIELD_mo": 0.0
        },
        "errorMessage": ""
      },
      "importantFields": {
        "importantFields": [
          {
            "name": "",
            "weight": 0.0
          }
        ],
        "errorMessage": ""
      },
      "identifier": [
        0.0
      ]
    }
  ]
}

For information about how to understand prediction results, see Interpreting Prediction Processor Output.
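
A downstream consumer of the results might reduce each record to its score, important fields, and any error messages. The following is a minimal sketch that follows the output form shown above. The function `summarize` and all sample values are hypothetical. The sample keeps the literal placeholder key GOAL_FIELD from the form above. In real output, that key is presumably the name of the configured goal field, which is an assumption.

```python
import json

# Made-up sample that follows the documented output form.
sample_output = """
{
  "result": [
    {
      "predictiveScores": {
        "score": {"GOAL_FIELD": 0.82, "GOAL_FIELD_mo": 0.8},
        "errorMessage": ""
      },
      "importantFields": {
        "importantFields": [{"name": "pressure", "weight": 0.6}],
        "errorMessage": ""
      },
      "identifier": [17.0]
    }
  ]
}
"""


def summarize(output_text, goal_field):
    """Flatten each result record into a small dictionary."""
    rows = []
    for record in json.loads(output_text)["result"]:
        scores = record["predictiveScores"]
        fields = record["importantFields"]
        messages = (scores.get("errorMessage"), fields.get("errorMessage"))
        rows.append({
            "identifier": record.get("identifier", []),
            "score": scores["score"].get(goal_field),
            "important": [(f["name"], f["weight"])
                          for f in fields.get("importantFields", [])],
            # Keep only non-empty error messages.
            "errors": [m for m in messages if m],
        })
    return rows


for row in summarize(sample_output, "GOAL_FIELD"):
    print(row)
```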
