Troubleshooting the Azure IIoT OPC UA Integration
This topic presents troubleshooting tools as well as some common scenarios and solutions in the following sections.
Tools for Troubleshooting
The following tools can help you determine the cause of an issue so that you can fix it:
Azure IoT Edge check command (iotedge check), which is explained in Troubleshoot your IoT Edge device. Use this tool for configuration, connection, and production readiness checks. For more details about this tool, refer to Built-in troubleshooting functionality.
Visual Studio Code: This tool is useful for verifying that telemetry is being sent to the Azure IoT Hub and for accessing the resources of your Azure Industrial IoT deployment. If telemetry is not being sent, check that no one else is using the same consumer group. This tool is available from Microsoft.
Postman: This tool is useful for diagnosing why telemetry is not flowing. Use the tool to:
Troubleshoot configuration.
Verify that the correct nodes are published.
Verify that a certificate is trusted.
Verify the user name and password credentials for an endpoint are correct.
Setting Up to Troubleshoot with Postman
For Postman to work, you will need to update your “client” application registration to accept https://opcua-<your_resource_group_name>.<selected_region>.cloudapp.azure.com/.auth/login/aad/callback as a redirect URL. You can then download a collection of sample REST calls for testing here.
Note: The Azure Industrial IoT OPC UA integration uses OAuth 2.0 to connect Postman to the Azure Industrial IoT (IIoT) Microservices. The OAuth token expires quickly, so you may find yourself requesting a new token frequently. To work around this, configure secrets so that you can request a new token to access the Azure IIoT Microservices from Postman.
1. From your Resource Group, navigate to your <resource_group_name>-iiot-client application registration.
2. In the navigation panel on the page for the <resource_group_name>-iiot-client application registration, click Authentication and scroll down to Platform configuration.
3. Click Add a platform, and on the right side of the page, under Configure, add the redirect URL. The value is based on what you configured for the ingress controller (NGINX), for example, opcua-<your_azure_login_name>.<selected_region>.cloudapp.azure.com/auth/login/aad/callback. This should be similar to the baseUrl used when configuring the Azure IoT Hub Connector.
4. The Industrial-IoT repository contains a collection of Postman requests that you can add to Postman. In Visual Studio Code, open the tut-use-postman.md file in the Industrial-IoT\docs\tutorials directory of the cloned repository and follow the tutorial for using Postman.
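As a small illustration of the redirect URL pattern given at the start of this section, the following Python sketch fills in the two placeholders. The function name and the sample values are hypothetical; only the URL pattern comes from this topic.

```python
def postman_redirect_url(resource_group: str, region: str) -> str:
    """Build the redirect URL from the pattern shown in this section.

    Both arguments are placeholders for your own deployment values.
    """
    if not resource_group or not region:
        raise ValueError("resource_group and region must both be non-empty")
    return (
        f"https://opcua-{resource_group}.{region}"
        ".cloudapp.azure.com/.auth/login/aad/callback"
    )


# Hypothetical values, for illustration only.
example_url = postman_redirect_url("myresourcegroup", "westus")
print(example_url)
```

Paste the resulting value into the redirect URL field of the application registration and compare it with the baseUrl used for the Azure IoT Hub Connector.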
Microservices Crash After Running the Failover for the IoT Hub
According to Microsoft, recovery from an IoT Hub failover is not currently supported. The workaround is to update the <your_Hub_name>-eventhubendpoint value in the Azure Key Vault resource and restart all the microservices. The new value will be picked up and used.
On the Connector side, the configuration of the Connector needs to be updated, and the Connector must be restarted. In addition, the checkpointing information is no longer valid for the new endpoint and must be reset or deleted manually. You can update the checkpointing information through Azure Storage in the Azure Portal or via a Blob storage template in ThingWorx.
Unable to Get Consistent Scan Rate
You notice that a property is being scanned at a higher rate than the one you set. Nodes can be published at multiple scan rates, and a node publishes at the highest rate mapped to it. As a result, the Azure IoT Hub Connector scans at a faster rate when nodes are publishing at different scan rates. Check whether the node is bound to another property.
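The rule that a node publishes at the highest mapped rate can be sketched in a few lines of Python. This is an illustration only, assuming that scan rates are expressed as intervals in milliseconds (so the highest rate is the smallest interval); the function name and sample values are hypothetical.

```python
def effective_scan_interval_ms(mapped_intervals_ms):
    """Return the interval at which a node is effectively scanned.

    The node follows the fastest mapping, that is, the smallest interval.
    """
    if not mapped_intervals_ms:
        raise ValueError("the node is not mapped to any property")
    return min(mapped_intervals_ms)


# The same node is bound to three properties with different scan intervals.
effective = effective_scan_interval_ms([1000, 250, 5000])
print(effective)
```

In this sketch, a property configured for 1000 ms is updated every 250 ms because another property bound to the same node uses the faster setting.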
Invalid Property Name
The default names for OPC UA tags (properties) contain characters that are not allowed in ThingWorx, which results in messages that the property names are invalid. Consider writing a script for adding properties, especially if you are creating multiple Things with hundreds of tags/properties. The topic, Naming Entities, in the ThingWorx Platform help center provides the naming rules for ThingWorx entities, including characters that are not allowed.
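A script of the kind suggested above might start with a name-sanitizing helper such as the following Python sketch. The allowed character set used here (ASCII letters, digits, and underscores, with a leading letter or underscore) is a conservative assumption for illustration, not the official rule; consult Naming Entities for the actual rules. The function name and sample tag names are hypothetical.

```python
import re


def to_safe_property_name(tag_name: str) -> str:
    """Map an OPC UA tag name to a conservatively safe property name.

    ASSUMPTION: only ASCII letters, digits, and underscores are kept.
    """
    safe = re.sub(r"[^A-Za-z0-9_]", "_", tag_name)
    # Prefix an underscore if the name is empty or starts with a digit.
    if not safe or safe[0].isdigit():
        safe = "_" + safe
    return safe


for tag in ["Channel1.Device1.Tag 1", "2ndTag"]:
    print(tag, "->", to_safe_property_name(tag))
```

Different tag names can map to the same sanitized name, so a real script should also check for collisions before creating properties.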
Note: Endpoint names cannot be changed. To identify an endpoint, edit its description instead.
Unable to Get Consistent Results in Batched Telemetry
To reduce cost, you can batch messages before sending them to the Azure IoT Hub. However, keep in mind that batching can add latency to the data. For example, if the scan rate of the property Temperature is faster than the batching interval, this property receives updates at the batching interval, not at the faster scan rate.
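The effect of batching on update timing can be illustrated with a short Python sketch. It uses a simplified model, assumed for illustration, in which a batch is sent at every multiple of the batching interval; the function name and the intervals are hypothetical.

```python
def delivery_times_ms(scan_interval_ms, batching_interval_ms, duration_ms):
    """Return (sample_time, delivery_time) pairs for one property.

    Simplified model: each sample waits for the next batch boundary.
    """
    pairs = []
    sample_time = scan_interval_ms
    while sample_time <= duration_ms:
        # Integer ceiling division to find the next batch boundary.
        batches = -(-sample_time // batching_interval_ms)
        pairs.append((sample_time, batches * batching_interval_ms))
        sample_time += scan_interval_ms
    return pairs


# Temperature is scanned every 250 ms, but batches go out every 1000 ms.
pairs = delivery_times_ms(250, 1000, 2000)
for sampled, delivered in pairs:
    print(f"sampled at {sampled} ms, delivered at {delivered} ms")
```

In this model the eight samples arrive in only two deliveries, so the property appears to update once per batching interval.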
Unable to see telemetry from the OPC UA Server in ThingWorx
There could be many reasons for this issue. First verify that the node is published:
1. Go to your OPC Twin Postman collection and select the Twin: Get currently published nodes request.
2. In the URL, change twin to publisher and use your registered endpoint. For example:
https://{{OPC-SERVICEURL}}/publisher/v2/publish/<yourRegisteredEndpoint>
Note: If you do not know your endpoint ID, go to the Registry: browse endpoints request in the OPC Twin Postman collection and send the request. In the response you can find your registered server. The endpoint ID is the applicationId value.
3. Execute the request in step 2 to verify that your node is being published. If the returned list is empty or your node does not appear in it, go to the next section for further diagnosis. If the node is in the returned list, the Connector successfully published the node.
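The URL change in step 2 can also be scripted. The following Python sketch only rewrites the path of a request URL, swapping the twin segment for publisher; it does not send any request. The function name, host name, and endpoint ID are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_publisher_url(twin_url: str) -> str:
    """Replace the first 'twin' path segment with 'publisher'."""
    parts = urlsplit(twin_url)
    segments = parts.path.split("/")
    for index, segment in enumerate(segments):
        if segment == "twin":
            segments[index] = "publisher"
            break
    return urlunsplit(parts._replace(path="/".join(segments)))


# Hypothetical service URL and endpoint ID, for illustration only.
publisher_url = to_publisher_url(
    "https://example.invalid/twin/v2/publish/my-endpoint-id"
)
print(publisher_url)
```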
Unable to confirm telemetry was sent to Azure IoT Hub
It is possible that the node has been published but telemetry is not being sent to the Azure IoT Hub. To verify telemetry is being sent to the hub, follow the Microsoft Azure tutorial, Use Azure IoT Tools for Visual Studio Code to send and receive messages between your device and your Azure IoT Hub. If you do not see any telemetry being sent:
1. Use Visual Studio Code to verify that no one else is using the same consumer group. Every consumer group on an IoT Hub receives a copy of each message. However, if two clients are pulling messages from the same consumer group, it is possible that one of them will not receive any of the messages from the queue. If necessary, create a new consumer group and restart the Publisher. Check the Publisher logs.
a. To create a new consumer group from the Azure Portal, go to your IoT Hub > Built-in endpoints > Consumer Groups.
b. To configure a specific consumer group in Visual Studio Code, go to Settings and search for Consumer Group. It will be the first option.
2. On your IoT Edge Runtime device, restart the OPC Publisher module. If that does not trigger telemetry to flow, go to the next step.
3. Look at the Publisher module’s logs: docker logs publisher. Check for a message indicating that something is misconfigured. If you cannot fix the misconfiguration or still do not know why no telemetry is being sent, go to the next step.
4. Create an issue in the Industrial IoT GitHub repository.
If you do see telemetry being sent, it is possible that the telemetry format changed, and the Azure IoT Hub Connector is unable to parse the data. To fix this, verify you have the supported versions of the modules.
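The consumer group behavior described in step 1 can be pictured with a toy model in Python. This is a deliberately simplified sketch, not the Azure SDK and not how partitions and offsets really work: each consumer group keeps its own read position, and readers that share a group share that position. The class name and the group name my-own-group are hypothetical.

```python
from collections import defaultdict


class ToyHub:
    """Toy model: every consumer group sees every message once."""

    def __init__(self):
        self._messages = []
        # Maps a consumer group name to the index of its next unread message.
        self._cursor = defaultdict(int)

    def send(self, message):
        self._messages.append(message)

    def receive(self, consumer_group):
        start = self._cursor[consumer_group]
        self._cursor[consumer_group] = len(self._messages)
        return self._messages[start:]


hub = ToyHub()
for i in range(3):
    hub.send(f"telemetry-{i}")

first_reader = hub.receive("$Default")      # gets all three messages
second_reader = hub.receive("$Default")     # shares the group, gets nothing
dedicated_reader = hub.receive("my-own-group")  # separate group, gets a full copy
print(first_reader, second_reader, dedicated_reader)
```

The second reader on the shared group sees no messages, which is the symptom described above; a reader with its own consumer group is unaffected.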
Unable to get telemetry after saving the Thing
After you save a Thing, it is possible to get data even though no telemetry is being sent. This can happen when the gateway is not on the allow list for access to the device, which is the VM running the OPC UA Server. Saving the Thing triggers an explicit read from the Connector to the VM, which the twin module performs. Telemetry, in contrast, requires a publisher VM that is on the allow list for communications from the VM to the Azure Industrial IoT (IIoT) Microservices.
It is also possible that the certificate for the OPC Publisher is not trusted. To try to resolve this issue, set the Publisher logging to the Debug level and then look for the message, job orchestrator not found. If you see that message, the orchestrator URL is wrong or the VM is not on the allow list. You can confirm that the orchestrator URL is wrong by looking at the Kubernetes secret in the Azure Portal.
To set the Publisher logging level, the module deployment must be created with different Container Create Options. Either create a new layered deployment with the desired logging level set, or disable the layered deployment by setting the "tag" in the device twin to null and then modify the Create Options. However, keep in mind that if you disable and then re-enable the layered deployment by adding the tag back to the device twin, you will lose your changes.
For example, suppose an OPC UA Publisher Module uses the following Container Create Options:

{
  "Hostname": "publisher",
  "Cmd": [
    "--aa"
  ]
}
The logging level can be changed by navigating to your OPC Publisher Module, Module Details page, and modifying the Container Create Options to the following:
{
  "Hostname": "publisher",
  "Cmd": [
    "--aa",
    "--ll=debug"
  ]
}
In this example, the "--ll=debug" setting is the new log level setting.
If you change the log level in the Container Create Options tab, you need to redeploy the OPC Publisher module. The module configuration is almost the same as the default deployment, with the following differences:
The change you just made to the --ll (log level) flag, as shown above.
The priority setting: For this new layered deployment, this setting should be higher than the priority of the default deployment so that the debug deployment is used.
Navigate to the Create Layered Deployments page for your Azure IoT Hub. In the Target Devices tab, change the Priority setting.
OPC Publisher has a command line option for diagnostics, --di, --diagnosticsinterval=VALUE, and an option for the log level, --ll, --loglevel=VALUE. The OPC Publisher Module supports the following log levels:
Log Levels for OPC Publisher Module
| Level | Description |
| --- | --- |
| verbose | The noisiest level, it is rarely enabled for a production application. Useful for troubleshooting an application during development. |
| debug | Debug is used for internal system events that are not necessarily observable externally, but useful when troubleshooting. |
| info | Information events are actions that the system can perform and that can be observed externally. Log messages at this level describe events in the system that are part of its functionality. |
| warn | When service is degraded, endangered, or behaving outside of expectations, this level of event is used. |
| error | This level message is generated when operations are not available or expectations were not met. |
| fatal | The most critical level. Fatal events require immediate action. |
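If you generate Container Create Options from a script, a helper along the following lines can set the log level and reject values outside the levels listed above. This is a sketch under the assumption that the options are handled as a Python dictionary with the same shape as the JSON example in this section; the function name is hypothetical.

```python
import copy
import json

# Log levels listed in the table above.
SUPPORTED_LEVELS = ("verbose", "debug", "info", "warn", "error", "fatal")


def with_log_level(create_options: dict, level: str) -> dict:
    """Return a copy of the create options with the --ll flag set."""
    if level not in SUPPORTED_LEVELS:
        raise ValueError(f"unsupported log level: {level!r}")
    updated = copy.deepcopy(create_options)
    # Drop any existing log level flag before adding the new one.
    cmd = [
        arg for arg in updated.get("Cmd", [])
        if not arg.startswith(("--ll=", "--loglevel="))
    ]
    cmd.append(f"--ll={level}")
    updated["Cmd"] = cmd
    return updated


default_options = {"Hostname": "publisher", "Cmd": ["--aa"]}
debug_options = with_log_level(default_options, "debug")
print(json.dumps(debug_options, indent=2))
```

The helper returns a new dictionary and leaves the original untouched, so the default options remain available for the default deployment.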
Unable to add an application and see it using GetApplications
If everything in the discovery logs looks as though an application was added, but the GetApplications service is not returning anything, it is likely that the IoT Edge Runtime device is not using a layered deployment. The IoT Edge Runtime device is created on the Azure Portal under IoT Hub.
To resolve this issue, go to <your_IoT_Hub> > IoT Edge > <your IoT Edge Runtime device> > Device Twin. On this page, the following JSON object should appear at the root level of the device twin:

"tags": {
  "__type__": "iiotedge",
  "os": <device OS>
}
If these tags are missing, add them and restart the modules of the IoT Edge Runtime device.
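When checking several devices, the tag check described above can be expressed as a small Python helper. This sketch assumes the device twin has already been exported and parsed into a dictionary; the function name and the sample os value are hypothetical.

```python
def missing_iiot_tags(device_twin: dict) -> list:
    """Return the names of the required tags that are missing or wrong."""
    tags = device_twin.get("tags") or {}
    problems = []
    if tags.get("__type__") != "iiotedge":
        problems.append("__type__")
    if not tags.get("os"):
        problems.append("os")
    return problems


# Sample twins for illustration; "Linux" is an assumed os value.
good_twin = {"tags": {"__type__": "iiotedge", "os": "Linux"}}
bad_twin = {"properties": {}}

print(missing_iiot_tags(good_twin))
print(missing_iiot_tags(bad_twin))
```

An empty result means both tags are present; otherwise, add the listed tags to the device twin and restart the modules.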
Unable to upload or download a Kepware project
The upload and download features for projects on a ThingWorx Kepware Server (TKS) are not supported in the ThingWorx Azure IIoT OPC UA integration. If you are running another type of OPC UA Server that supports upload and download of project files, it is likely that it also will not work.
Unable to see Kepware diagnostic properties
The Kepware diagnostics properties are not supported by this integration. If you are using an OPC UA Server other than TKS and have similar properties, it is likely that they will not work.
Errors When Using Microsoft Azure SQL Database as the ThingWorx Persistence Provider
When you use the Microsoft Azure SQL database as the ThingWorx Persistence Provider in the ThingWorx Microsoft Industrial IoT OPC UA integration, property bindings do not work. The cause of this issue is the index size limit in Microsoft Azure SQL.
The following error message in the ApplicationLog indicates that the ThingWorx AzureOpcUaPropertyMapDataTable data table is present and the index exceeds the Microsoft Azure SQL limitation on indexes:

Unable to add data table entry because com.thingworx.common.exceptions.DataAccessException: [1,018]
Data store unknown error: [Error occurred while accessing the data provider.] Unable to dispatch
[ uri = /Things/AzureOpcUaPropertyMapDataTable/Services/AddOrUpdateDataTableEntries/]:
Unable to Invoke Service AddOrUpdateDataTableEntries on AzureOpcUaPropertyMapDataTable :
java.lang.RuntimeException: com.thingworx.common.exceptions.DataAccessException: [1,018]
Data store unknown error: [Error occurred while accessing the data provider.]
WORKAROUND: To avoid this error, follow these steps:
1. Log in to ThingWorx Composer and navigate to the AzureOpcUaPropertyMapDataTable.
2. Click Configuration to display the configuration page.
3. Click Edit to display the editing tools. Click the icon next to each index to delete it, and then click Save.
4. Under Configuration, select the two indexes listed, Thing Property and Endpoint and Note IDs, and click Delete to remove them.
5. For this change to take effect, you must restart the ThingWorx Platform.
Restarting One or More Connectors Disconnects Industrial Connection But Telemetry Still Flows
This issue is specific to the ThingWorx High Availability Clustering environment. When one or more Connectors are restarted, you need to wait for at least two minutes for the Industrial Connection and the Remote Things to reconnect to the ThingWorx Platform instance. It can take this long for the Industrial Connection and Remote Things to return to the connected state in an HA Cluster.