Data sync pipelines

Orchestrate full and incremental data synchronization between Salesforce and Snowflake through Matillion pipelines that load, transform, and validate records automatically.

Overview

Moving data between Salesforce and Snowflake requires more than a simple copy. Records must be extracted, transformed into the PTC Orbit schema, validated against staging rules, and loaded into the correct tables, all without manual intervention at each step. Data sync pipelines in Matillion handle this end-to-end orchestration.

Two pipeline types address different synchronization needs. A full sync pipeline replaces all existing data with a complete reload from the source. An incremental sync pipeline transfers only records that changed since the last run, reducing processing time and system load for ongoing operations.
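The difference between the two pipeline types can be illustrated with a minimal Python sketch. It is not Matillion code: the in-memory dictionaries, the function names, and the use of Id and LastModifiedDate as the key and timestamp fields are assumptions made for illustration, as is the choice to merge changed rows by Id.

```python
from datetime import datetime, timezone

def full_sync(source_rows):
    """Replace strategy: the target becomes exactly the current source."""
    return {row["Id"]: row for row in source_rows}

def incremental_sync(target, source_rows, last_run):
    """Merge only rows modified after the previous run (assumed upsert by Id)."""
    updated = dict(target)  # copy so the caller's target is left untouched
    for row in source_rows:
        if row["LastModifiedDate"] > last_run:
            updated[row["Id"]] = row
    return updated

# Illustrative data: one changed row and one unchanged row.
last_run = datetime(2024, 1, 2, tzinfo=timezone.utc)
source = [
    {"Id": "a", "LastModifiedDate": datetime(2024, 1, 3, tzinfo=timezone.utc)},
    {"Id": "b", "LastModifiedDate": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]
changed_only = incremental_sync({}, source, last_run)  # contains only "a"
everything = full_sync(source)                         # contains "a" and "b"
```

In this sketch, a row deleted at the source stays in the target after an incremental run and disappears only on the next full sync.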

How it works

Each pipeline executes three stages in sequence:

1. Load from Salesforce. A Salesforce Load component extracts records from the source object (for example, SVMXC__Installed_Product__c for assets). Full sync uses Full Load mode with a Replace strategy to overwrite existing data. Incremental sync uses timestamp-based filtering to pull only recently modified records.

2. Transform. A Run Transformation component maps Salesforce fields to the PTC Orbit schema using the Calculator component. This stage sets the IO_DATA_SOURCE expression to the data connector UUID and maps relationship fields to their corresponding shadow fields with the _EXTERNAL_ID suffix.

3. Validate and group. An SQL Script component runs stored procedures that validate synced records and set default status values. For Staging Review objects, a grouping task organizes records for manual review in Data Foundry.
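The first two stages can be approximated outside Matillion with a short Python sketch. Everything named here is hypothetical: the connector UUID, the field maps, and the target column names are placeholders, and the filter assumes the standard Salesforce LastModifiedDate field is the timestamp used for incremental runs. The actual components are configured in the Matillion interface rather than written as code.

```python
from datetime import datetime, timezone

# Hypothetical values for illustration only.
CONNECTOR_UUID = "00000000-0000-0000-0000-000000000000"
FIELD_MAP = {"Id": "EXTERNAL_ID", "Name": "NAME"}    # plain field -> target column
RELATIONSHIP_MAP = {"SVMXC__Company__c": "ACCOUNT"}  # lookup field -> shadow field base

def incremental_filter(last_run: datetime) -> str:
    """Build a SOQL condition selecting rows modified after last_run.

    last_run must be timezone-aware; SOQL datetime literals are unquoted.
    """
    stamp = last_run.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"LastModifiedDate > {stamp}"

def transform(record: dict) -> dict:
    """Map one Salesforce record to a target row."""
    row = {target: record.get(source) for source, target in FIELD_MAP.items()}
    # Relationship fields land in shadow fields with the _EXTERNAL_ID suffix.
    for source, base in RELATIONSHIP_MAP.items():
        row[f"{base}_EXTERNAL_ID"] = record.get(source)
    # Every row is tagged with the data connector UUID.
    row["IO_DATA_SOURCE"] = CONNECTOR_UUID
    return row

where = incremental_filter(datetime(2024, 1, 1, tzinfo=timezone.utc))
row = transform({"Id": "a1", "Name": "Pump", "SVMXC__Company__c": "001"})
```

Keeping the mappings in plain dictionaries mirrors the idea of customizing field mappings per data model without changing the transformation logic itself.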

Implementers configure both pipeline types during initial tenant setup. After configuration, the pipelines run on their defined schedule or on demand from the Matillion interface.

Considerations

• A Matillion account and Snowflake instance are prerequisites. Complete tenant provisioning and Matillion project setup before configuring pipelines.

• Full sync replaces all target table data on every run. Schedule full syncs during off-peak hours to minimize impact on downstream reporting.

• Incremental sync depends on accurate timestamp fields in Salesforce. If source records lack reliable modification timestamps, use full sync instead.

• Prebuilt pipeline templates are available for import. Use them as a starting point and customize field mappings for your data model.
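
The choice between sync modes described in the considerations above can be expressed as a simple pre-check. This is a sketch under stated assumptions: the function name is hypothetical, and a missing or empty timestamp value is used as a stand-in for "unreliable", which is only one of several ways a timestamp field can be untrustworthy.

```python
def choose_sync_mode(records, timestamp_field="LastModifiedDate"):
    """Return "incremental" only if every sampled record carries a timestamp.

    Falls back to "full" when the sample is empty or any value is missing.
    """
    if records and all(r.get(timestamp_field) for r in records):
        return "incremental"
    return "full"

sample = [
    {"Id": "a", "LastModifiedDate": "2024-01-03T00:00:00Z"},
    {"Id": "b", "LastModifiedDate": None},
]
mode = choose_sync_mode(sample)  # "full", because one timestamp is missing
```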