Partitioning Runtime Data

ThingWorx Model and Data Best Practices > Data Management > Partitioning Runtime Data

Partitioning value_stream table in PostgreSQL (BETA)

A value_stream table with a lot of data can cause time-based queries and purge operations to take significant amounts of time due to the resulting table-scans and add to the overall server load. Having weekly or monthly partitions and matching drop-partition operations would streamline some of these current bottlenecks.

For large-scale data ingestion, use the partition_value_stream_by_range_setting command line to partition the value_stream table to easily control partitioning ranges, such as when to start or end the partitioning and the partitioning intervals. This includes repartitioning old data and changing the start, end, or interval time.

Using the partition_value_stream_by_range_setting command line supports the following:

1. Adding new partitions to an existing table or partition with minimum downtime.

2. Repartitioning old data from a specific time or all data. Doing so can take some time.

3. Repartitioning with a different range setting.

4. Using the DEFAULT partition to store info if the data does not belong to any partition.

To partition your value_stream table, create periodic partitions. Use the database scheduler tool to ensure that all future injected data will enter the next partitions.

The partition_value_stream_by_range_setting command line has the following parameters:

Parameter name

Description

Default value

Comment

host

database host

Mandatory

port

database port

Mandatory

database

database name where the value_stream table exists

Mandatory

username

database user

Mandatory

password

database password

Mandatory

partition_start_date

If the provided data is in the past, this may cause downtime depending on the amount of data that needs to be partitioned. It is advised to first execute this in a test environment and analyze any potential downtime. If the partition_start_dateis left empty, there will be no downtime.

Define from which point of time you want to create partitions. It may be in the past or future.

The last value of the “time” column in the value_stream table. If there is no data, the start date=current..

Optional

Possible formats:

• YYYY-MM-DD

• YYYY-MM-DD HH:MM

• YYY-MM-DD HH:MM:SS

• YYYY-MM-DD HH:MM:SS.MS

partition_end_date

Define from which point of time you want to stop creating partitions.

Mandatory

Possible formats:

• YYYY-MM-DD

• YYYY-MM-DD HH:MM

• YYY-MM-DD HH:MM:SS

• YYYY-MM-DD HH:MM:SS.MS

Must be in the future and at least greater than the last value of the “time” column plus the interval. If there is no data, then it must be greater than current + interval.

partition_interval

Mandatory interval for each partition

Possible formats:

• 1h or hours

• 1d or 1day

• 1 month

• 1y or 1year

repartition_all_data

Mandatory

Possible values:

• yes or no

• if = yes (optional), the script will ignore the start date and repartition all data

After executing the command line, the value_stream table is partitioned according to the parameters and data is pushed to the correct partition.

To use the partition_value_stream_by_range_setting command line, do the following:

1. Download the partition_value_stream_by_range_setting.sh script. Request the download location from support.ptc.com.

2. Copy the script to a server connected to the ThingWorx database.

3. Open a command line and run the command as desired using the above parameters.

For example, ./partition_value_stream_by_range_setting.sh -host localhost -port 5432 -database thingworx -username DBA -password password -partition_start_date '2024-06-17 00:00:00.000' -partition_end_date '2025-06-17 00:00:00.000' -partition_interval '1 months'

To prevent repeatedly providing certain parameters, you can edit the script and set the following parameters: DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD.

4. Maintain your partitions. For example, create a scheduler to add future partitions and archive or delete old ones. However, a large number of partitions can significantly increase query planning time, so avoid using many thousands of partitions.

Was this helpful?