Storing Data in ThingWorx
ThingWorx provides entities and methods to store data. You can store data in data tables, Thing properties, streams, value streams, and configuration tables.
While developing your solution, you need to consider how ThingWorx handles data storage. Choosing the correct data storage is very important since it affects the outcome of the project, its scalability and reliability, and the user experience.
The following section describes the storage options in the ThingWorx model:
Data Table
Use for less than 100,000 rows of data.
Use for static datasets and static lookup tables. For highly dynamic and larger datasets, use a relational database that is connected via a Database Thing Template.
Use a relational database for complex queries and joins.
Use for key-based queries and storage as well as for easily enabling updates and deletes based on the primary key.
For example: You can store information about the inventory of a smart connected vending machine, where each position in the inventory is a primary key. You can also use data tables to store information about the irrigation programs that are available for a crop management device, where each irrigation program is a row with a primary key.
To manipulate or query the data row-by-row, use data tables.
Use indexes when working with data tables.
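As a sketch, a service on a Data Table Thing can add a row and look it up by its primary key. This snippet runs only inside the ThingWorx platform (it uses the Things collection and the me context); the Thing name VendingMachineInventory and its fields are illustrative assumptions, not part of the platform:

```javascript
// Illustrative ThingWorx service script (platform-only, not standalone).
// Assumes a Data Table Thing named "VendingMachineInventory" whose Data Shape
// defines "position" as the primary key plus "product" and "quantity" fields.
var table = Things["VendingMachineInventory"];

// Build a values infotable that matches the table's Data Shape.
var values = table.CreateValues();
values.position = "A1";   // primary key
values.product = "Water";
values.quantity = 12;

// Insert the row (UpdateDataTableEntry would update an existing key instead).
table.AddDataTableEntry({
    values: values,
    sourceType: "Thing",
    source: me.name
});

// Key-based lookup: FindDataTableEntries matches rows against the
// supplied values, which is fast when the match uses the primary key.
var searchValues = table.CreateValues();
searchValues.position = "A1";
var result = table.FindDataTableEntries({
    values: searchValues,
    maxItems: 1
});
```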
Thing Property
Use Thing properties to store data about a Thing in ThingWorx. Properties have the following data storage options:
Read-only - Use the read-only option for static values that should not be changed at runtime. However, if you want, you can change the default value.
Persisted - Use the persisted option if you want the value of the property to be saved even after a ThingWorx server restart, and if the value of the property can be changed at runtime.
Logged - Use the logged option for properties whose values update continuously. This is time series data that can be stored in value streams.
Do not use explicit properties to store historical data. Instead, use streams or value streams.
Stream
Use for logging time-driven process events or activities of your devices.
For example, create a stream to log issues about your device activities, or to record when your device disconnects from and reconnects to the ThingWorx Platform. Streams are optimized for high-speed writing, and they have a configurable cache system.
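For illustration, a service could log such a disconnect event to a Stream Thing. The stream name and its fields here are assumptions; the snippet runs only inside the platform:

```javascript
// Illustrative ThingWorx service script (platform-only, not standalone).
// Assumes a Stream Thing named "DeviceActivityStream" whose Data Shape
// has "eventType" and "message" fields.
var stream = Things["DeviceActivityStream"];

var values = stream.CreateValues();
values.eventType = "Disconnect";
values.message = "Device lost connection to the platform";

// Append one time-stamped entry; streams are optimized for this kind of
// high-speed sequential write.
stream.AddStreamEntry({
    values: values,
    timestamp: new Date(),
    sourceType: "Thing",
    source: me.name
});
```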
Value Stream
Use for storing time series data that is obtained from the properties of a Thing.
Unlike streams, value streams do not create sparsely populated data tables; each property value is written as its own record.
Thing-centric access of data in a value stream provides built-in support for multi-tenancy.
If you are using an RDBMS (PostgreSQL, MSSQL, H2), all records, even from different value streams for different Things, are written to the same table in the database.
If you use PostgreSQL, each row in the value stream table in the PostgreSQL database holds the record for only one property. This means that the value stream tracks the value changes of each property independently. When you call the QueryPropertyHistory service, it checks the data stream for each property of the Thing and collects the latest updates (each with its own update time) into a single infotable result.
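For example, a service running in the context of a Thing with logged properties might read back the last day of values through QueryPropertyHistory. The parameter values below are illustrative, and the snippet runs only inside the platform:

```javascript
// Illustrative ThingWorx service script, run in the context of a Thing ("me")
// whose logged properties are backed by a value stream.
var result = me.QueryPropertyHistory({
    startDate: dateAddDays(new Date(), -1), // limit the query to the last 24 hours
    endDate: new Date(),
    maxItems: 500,
    oldestFirst: false
});
// "result" is a single infotable that combines the latest updates of each
// logged property, with each row carrying its own update time.
```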
The following table provides information about the key differences between streams and value streams. Use this information to decide on the type of entity to use to store time series data in your solution:
| Streams | Value Streams |
| --- | --- |
| Streams can store any type of time series data. | Value streams can store time series data from the properties of a Thing. |
| Streams are standalone entities and are not bound to a specific Thing. | Value streams are bound to the properties of a Thing. |
| You can query data from streams directly by using their own services. The result of the query is the entire row of data. | You cannot query data from value streams directly. Instead, use services defined on the Thing, such as QueryPropertyHistory, to query data from the value stream. |
| To add a row of data to a stream, use the WritePropertiesToStream service. | To add data to a value stream, select the Logged check box for a property. |
| Streams can store contextual data. For example, whenever a specific event is triggered, you can add the values of the other properties. This helps in analysis of data. | Value streams cannot store contextual data. |
When to Use Streams or Value Streams
Use value streams and streams to store and retrieve time series data. Depending on the amount of data that you need to store, choose the correct data storage option.
Use streams when you want to query data only within small time periods.
Split Things across several value streams to improve index performance.
Configuration Table
Use to build customizable solutions that can be safely upgraded on the ThingWorx Platform.
You can add a configuration table on a mashup entity based on a pre-defined Data Shape. This simplifies the process of building extension mashups that can be customized, while also still supporting in-place upgrades, because configuration table values are carried forward during extension upgrades.
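As a sketch, a service inside such an extension can read its configuration values at runtime with the generic GetConfigurationTable entity service. The table and field names below are illustrative assumptions, and the snippet runs only inside the platform:

```javascript
// Illustrative ThingWorx service script (platform-only, not standalone).
// Assumes this entity defines a configuration table named "ConnectionSettings"
// with an "endpointUrl" field.
var config = me.GetConfigurationTable({
    tableName: "ConnectionSettings"
});

// A single-row configuration table exposes its values on the first row.
var endpointUrl = config.rows[0].endpointUrl;
```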
Recovery from Database Outage for Data Providers
If ingested data fails to persist because of a database restart or outage, you can configure the system to retry those operations to avoid data loss. Set the following configurable properties for the required persistence provider in platform-settings.json:
acquireRetryAttempts: Defines how many times ThingWorx tries to acquire a new connection from the database before giving up.
acquireRetryDelay: The time, in milliseconds, that ThingWorx waits between connection attempts.
ThingWorx retries up to acquireRetryAttempts times, waiting acquireRetryDelay milliseconds between attempts. For example, if ThingWorx should wait up to 5 seconds for the database to come back online, set acquireRetryAttempts=5 and acquireRetryDelay=1000.
If all retry attempts fail, the following error message is logged in the Application Log, indicating that entries failed to persist: Failed to connect to persistence provider after retrying 5 times in 10 seconds.
Do not set acquireRetryAttempts to zero or a negative value; the application then retries indefinitely, which can cause the platform to hang during a prolonged database outage.
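A minimal sketch of where these settings live in platform-settings.json, assuming a PostgreSQL persistence provider (the surrounding keys are abbreviated; a real file contains additional connection settings):

```json
{
  "PersistenceProviderPackageConfigs": {
    "PostgresPersistenceProviderPackage": {
      "ConnectionInformation": {
        "acquireRetryAttempts": 5,
        "acquireRetryDelay": 1000
      }
    }
  }
}
```

With these values, ThingWorx retries for up to roughly 5 seconds before logging the failure.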
If you use InfluxDB as an optional persistence provider, configure DatabaseWriteRetryAttempts to set the number of times that database operations are retried.
For frequently changing persisted and logged properties, a very small amount of data loss is unavoidable due to batch-based processing. In such scenarios, the following messages are logged in the Application Log:
For persisted properties: BatchUpdateException error occurred executing batch update of persistent properties.
For logged properties or value stream ingestion: Error executing batch.
Configure queue sizes appropriately so that they can hold the data at your rate of ingestion.
For more information on performance, see the Performance Report.
Best Practices for Handling Data-Centric Modeling
Use the following general best practices to handle data-centric modeling in ThingWorx:
If you have data that will not change, or that is overwritten the next time it is loaded, and the data is associated with a Thing, create an infotable property for that Thing and assign an appropriate Data Shape. This way, you can access the data through the Thing. You can also use configuration tables or, for larger amounts of data, a data table.
Use data caching as much as possible.
For example, instead of querying the database on each DataChange event, implement a cache, as an infotable, that is refreshed at set intervals.
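This refresh-at-intervals idea can be sketched in plain JavaScript outside ThingWorx. The loader function and time-to-live below are illustrative; in a real service the loader would return an infotable from a database query:

```javascript
// Minimal time-based cache: reload the data only when the cached copy is
// stale, instead of querying the database on every read.
function createCache(loadFn, ttlMs) {
    let value = null;
    let loadedAt = 0;
    return {
        get(now = Date.now()) {
            // Reload only on the first read or when the copy is older than the TTL.
            if (value === null || now - loadedAt > ttlMs) {
                value = loadFn();
                loadedAt = now;
            }
            return value;
        }
    };
}

// Usage: count how often the (hypothetical) expensive query actually runs.
let loads = 0;
const cache = createCache(() => { loads += 1; return [1, 2, 3]; }, 60000);
cache.get(0);    // first read triggers a load
cache.get(1000); // within the TTL: served from the cache, no reload
```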
Archive the data that you no longer need.
While designing your solution, decide which data is used frequently. Store this data in the solution database, and move older data as soon as possible to an external server, such as a ThingWorx federated instance or a database server.
Provide start and end date parameters to the query methods to limit the amount of data that the query retrieves. This reduces the processing time and improves performance.
For high-volume data ingestion scenarios (above the ingestion rates outlined in the ThingWorx Platform Sizing Guide), consider creating multiple persistence providers that connect to separate database instances, so that data is written to different databases. If you add databases later and point the persistence providers at them, you need to migrate the existing data.
Ensure that your data tables have less than 100,000 rows.
Querying data from data tables and streams should only take a few seconds. If these data tables and streams have more than 100,000 rows, the queries perform slowly.
Ensure that you plan how you intend to purge your old data. Purging data is important, as it helps improve the performance of a solution.