Data Centric Modeling in ThingWorx

Choosing the Correct Model Elements

PTC recommends becoming familiar with ThingWorx modeling concepts before reading the rest of this section. See ThingWorx Model Definition in Composer for more information. A model in ThingWorx is a logical representation of your physical and solution landscape. This logical representation is realized by creating instances of the built-in model element templates such as Thing, Thing Template, Thing Shape, and Data Shape.

This section provides recommendations on how to design the data storage aspect of a ThingWorx model-based solution. ThingWorx offers several options for storing your data. Understanding each option will help you determine the best storage solution for your data:

• Thing Properties

• Streams

• Value Streams

• Data Tables

Use the Sizing Guide to estimate the amount of processing and memory that ThingWorx may need to meet your requirements.

What Are Base Types and Data Shapes?

ThingWorx base types provide an abstraction layer that isolates the ThingWorx application development environment from the specific data types of the Edge and of the data store. This allows ThingWorx applications to be data store-agnostic and allows base types to be changed at run time without changing the underlying database schema.

A Data Shape is a named set of field definitions and related metadata, where each field has a base type. A Data Shape loosely matches the concept of a relational database table definition, where the base type of each field resembles the data type of a column.
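The idea of a Data Shape as a named set of typed field definitions can be sketched in a few lines of Python. This is only a conceptual model, not ThingWorx code: the class names, the `validate` helper, the `EngineReading` shape, and the small mapping of base type names to Python types are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed, simplified mapping of a few base type names to Python types.
BASE_TYPES = {"STRING": str, "NUMBER": float, "BOOLEAN": bool}


@dataclass(frozen=True)
class FieldDefinition:
    name: str
    base_type: str                 # key into BASE_TYPES
    is_primary_key: bool = False   # metadata carried with the field


@dataclass(frozen=True)
class DataShape:
    name: str
    fields: tuple

    def validate(self, row: dict) -> bool:
        """Return True if row has exactly the shape's fields with matching types."""
        if set(row) != {f.name for f in self.fields}:
            return False
        return all(isinstance(row[f.name], BASE_TYPES[f.base_type]) for f in self.fields)


# Hypothetical shape describing one engine reading.
engine_reading = DataShape(
    name="EngineReading",
    fields=(
        FieldDefinition("serialNumber", "STRING", is_primary_key=True),
        FieldDefinition("engineTemperature", "NUMBER"),
        FieldDefinition("running", "BOOLEAN"),
    ),
)

good_row = {"serialNumber": "T-1", "engineTemperature": 90.5, "running": True}
bad_row = {"serialNumber": "T-1", "engineTemperature": "hot", "running": True}
```

In this sketch the shape plays the role of a table definition: it describes the fields, while the rows themselves live elsewhere.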

Thing Properties

The most prominent entry point for data ingestion into the ThingWorx Platform is Thing properties, where a connected device is modeled as a Thing within ThingWorx. Refer to Thing Properties for a general description.

The following use case illustrates the data storage options available for Thing properties. A tractor Thing has a tractor engine Thing Shape with the following properties: Max RPM, Engine Temperature, and Last Oil Service Date.

The properties have three data storage options: read-only, persisted, and logged. Given the above example, the following is recommended:

• Max RPM – Use the read-only option since this is a static value that should not be changed at run time. However, if the engine is upgraded, this value can be changed by changing the default.

• Last Oil Service Date – Use the persisted option since this property can be changed at run time and you are only interested in the latest date. A persisted value also survives a ThingWorx server restart.

• Engine Temperature – Use the logged option since this is a continuously changing value that is essentially time series data.
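The behavior of the three options in the tractor example can be illustrated with a toy Python model. It is a sketch under stated assumptions rather than ThingWorx code: the `TractorEngine` class, the property identifiers, the default of 2500 for Max RPM, and the rule that a non-persisted value falls back to its default after a restart are all hypothetical simplifications.

```python
from enum import Enum


class Storage(Enum):
    READ_ONLY = "read-only"
    PERSISTED = "persisted"
    LOGGED = "logged"


class TractorEngine:
    """Toy model of the tractor engine properties; not a ThingWorx class."""

    STORAGE = {
        "maxRpm": Storage.READ_ONLY,
        "lastOilServiceDate": Storage.PERSISTED,
        "engineTemperature": Storage.LOGGED,
    }
    # Assumed default values for the sketch.
    DEFAULTS = {"maxRpm": 2500, "lastOilServiceDate": None, "engineTemperature": None}

    def __init__(self, persisted_store: dict, history: list):
        self.persisted_store = persisted_store  # stands in for the database
        self.history = history                  # stands in for time series storage
        # In-memory values start from defaults, then persisted values are restored.
        self.values = {**self.DEFAULTS, **persisted_store}

    def write(self, name: str, value, timestamp: int) -> None:
        option = self.STORAGE[name]
        if option is Storage.READ_ONLY:
            raise PermissionError(f"{name} is read-only at run time")
        self.values[name] = value
        if option is Storage.PERSISTED:
            self.persisted_store[name] = value             # keep only the latest value
        elif option is Storage.LOGGED:
            self.history.append((timestamp, name, value))  # keep every value


store, history = {}, []
engine = TractorEngine(store, history)
engine.write("lastOilServiceDate", "2024-03-01", timestamp=1)
engine.write("engineTemperature", 88.0, timestamp=2)
engine.write("engineTemperature", 91.5, timestamp=3)

# Simulate a server restart: a new in-memory object over the same storage.
restarted = TractorEngine(store, history)
```

After the simulated restart, the last oil service date is restored from the persisted store, every temperature reading is still present in the history, and Max RPM keeps its default because run-time writes to it are rejected.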

Streams

A stream is intended to store a blob of time series data. Each stream entry has a timestamp, source, source type, field values, data tags, and location field. The list of fields is defined in a Data Shape that is associated with the stream. The values of these fields are stored in a single column of the stream as a JSON or text blob. As a result, when a single field value is queried, the entire row or rows that contain matching field values are returned. Data retrieval is therefore quickest when a stream is queried for the field values of a given source over a short period of time. Querying with a condition on a specific field value requires the field value data to be filtered at the application level.

Best practices include:

• Use for arbitrary time-series data that is not directly associated with a Thing in the ThingWorx model (in contrast to value streams)

• Use when the stored data does not need to be queried with extensive filtering based on the field values

• Use when the queries are constrained to short time periods

• Avoid querying across multiple sources over long time spans
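The stream storage model described above, and the reason field-value filtering falls to the application, can be sketched in Python as follows. This is a conceptual illustration only: the in-memory `stream_rows` list, the function names, and the sample tractor data are hypothetical and do not represent the ThingWorx API or its actual database schema.

```python
import json

# Each stream entry is one row; all Data Shape fields share a single blob column.
stream_rows = []


def add_stream_entry(timestamp: int, source: str, fields: dict) -> None:
    stream_rows.append(
        {"timestamp": timestamp, "source": source, "field_values": json.dumps(fields)}
    )


def query_stream(source: str, start: int, end: int, field_filter=None) -> list:
    """Return decoded field values for one source with start <= timestamp <= end."""
    # Step 1: narrow by source and timestamp, which are separate columns.
    candidates = [
        r for r in stream_rows
        if r["source"] == source and start <= r["timestamp"] <= end
    ]
    # Step 2: every candidate blob is decoded in full, even if one field is wanted.
    decoded = [json.loads(r["field_values"]) for r in candidates]
    # Step 3: filtering on a field value happens here, at the application level.
    if field_filter is not None:
        decoded = [d for d in decoded if field_filter(d)]
    return decoded


add_stream_entry(1, "Tractor1", {"rpm": 1800, "temp": 88.0})
add_stream_entry(2, "Tractor1", {"rpm": 2100, "temp": 95.5})
add_stream_entry(3, "Tractor2", {"rpm": 1500, "temp": 80.0})

all_entries = query_stream("Tractor1", 1, 2)
hot_entries = query_stream("Tractor1", 1, 2, field_filter=lambda d: d["temp"] > 90)
```

The cost of steps 2 and 3 grows with the number of candidate rows, which is why the practices above favor a single source and a short time window.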

Value Streams

A Value Stream is intended to store individual properties of a Thing as time series data. For a property of a Thing to be treated as time series data, it must be defined as logged and must use a value stream for data storage. Each value stream entry has a timestamp, source, property type, property name, and property value. This contrasts with the storage model of streams: instead of storing the entire set of field values as a JSON or text blob in the field values column of a single row, a value stream stores each property value in a separate row with the associated source and timestamp. When querying a Thing's property data in a value stream, values are returned for that property only.
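The one-row-per-property layout can be sketched in Python to show why a query for one property returns only that property's values. As before, this is a conceptual model and not ThingWorx code: the `ValueStreamRow` type, the helper functions, and the sample data are assumptions made for illustration.

```python
from typing import NamedTuple


class ValueStreamRow(NamedTuple):
    timestamp: int
    source: str
    property_type: str
    property_name: str
    property_value: object


# Every logged property update becomes its own row.
value_stream_rows = []


def log_property(timestamp, source, property_type, property_name, value) -> None:
    value_stream_rows.append(
        ValueStreamRow(timestamp, source, property_type, property_name, value)
    )


def query_property(source: str, property_name: str) -> list:
    """Return (timestamp, value) pairs for a single property of one Thing."""
    return [
        (r.timestamp, r.property_value)
        for r in value_stream_rows
        if r.source == source and r.property_name == property_name
    ]


log_property(1, "Tractor1", "NUMBER", "engineTemperature", 88.0)
log_property(2, "Tractor1", "NUMBER", "rpm", 1800)
log_property(3, "Tractor1", "NUMBER", "engineTemperature", 91.5)

temperatures = query_property("Tractor1", "engineTemperature")
```

No blob has to be decoded here: the property name is a column of its own, so rows for other properties are simply excluded from the result.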

Value streams are beneficial for Thing-driven models. It is best practice to split Things across several value streams to improve index performance. While not strictly a best practice, certain high-volume data ingestion scenarios (over and above the ingestion rates outlined in the Sizing Guide) can also benefit from multiple persistence providers that point to separate database instances. This ensures that data goes into different tables in the database. If multiple databases are added, the persistence providers can be pointed to specific databases. This scenario would also require data migration.

If using an RDBMS (PostgreSQL, MSSQL, H2), all records, even those from different value streams for different Things, are written to the same table in the database.

If using PostgreSQL, each row of the value stream table in the database holds the record for only one property. This means the value stream tracks the value changes of each property independently. When the QueryPropertyHistory service is called, it checks the data for each property of the Thing and collects all the latest updates (each with a different update time) into one infotable result.

Data Tables

A Data Table is a ThingWorx entity that is essentially an abstraction of a standard relational database table, which you can use to simplify and accelerate ThingWorx application development. Because a data table is a first-class ThingWorx entity, data access functionality is much easier to process at the ThingWorx model level. The Data Shape associated with the data table defines the columns, or fields, of the data table and its primary key. Note, however, that the backend implementation of a data table is not equivalent to a relational database table and does not have its full flexibility. A data table is not meant to replace a standard relational database table in terms of performance and scalability.

See Data Table Best Practices and Sizing Limits of Data Tables for more information.