Data Quality: from Ingestion to Clean Records
Follow the data quality pipeline from external source ingestion and connector configuration through matching rule enforcement, duplicate review, and validated production records.
Overview
Clean data underpins every capability in PTC Orbit: scoring models produce meaningful results only when the underlying records are accurate, and work orders resolve the right assets only when duplicates have been eliminated. The data quality workflow spans three personas and four modules, moving records from raw external sources through staged validation into production.
Workflow Stages
1. Data connector configuration. An implementer creates data connectors in Max Designer to identify each external source: Salesforce orgs, AWS S3 buckets, or custom REST APIs. Each connector specifies connection credentials, object mappings, and scheduling intervals. For details, see Configuring Data Connectors for ServiceMax Core Integration.
2. Data sync pipeline setup. The implementer configures full and incremental sync orchestration pipelines in Matillion. Full sync loads all records for initial population; incremental sync captures ongoing changes. Ingestion modes (Staging Review or Pre-DB Sync) control whether records pause for manual validation or commit automatically. For details, see Setting Up Data Sync.
3. Matching rule definition. An end user or administrator defines matching rules in Data Foundry to detect duplicate records across incoming data. Rules specify which fields to compare, the matching logic, and whether flagged records require manual review. Rules can be activated, deactivated, or updated as data sources change. For details, see Data Foundry and Matching Rules.
4. Flagged record review. Records that fail matching rule validation surface on the Matching Data Review page for manual inspection. Reviewers evaluate each flagged record, decide whether to merge, keep, or discard the duplicate, and sync verified data to production. For details, see Matching Data Review.
5. Validated production records. After review, clean records enter the production database. Scoring models now calculate against accurate data, and asset views reflect deduplicated, verified records. For details on how scores consume this data, see Scores.
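The matching and review stages above can be pictured with a small sketch. The following Python is illustrative only and does not represent the Data Foundry API: the MatchingRule class, the apply_rule function, and the sample field names (serial, site) are hypothetical, and the matching logic shown is a simple exact comparison on normalized field values, which is only one possible kind of rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchingRule:
    """Hypothetical rule: names the fields to compare and whether it is active."""
    name: str
    fields: tuple[str, ...]
    active: bool = True


def match_key(record: dict, rule: MatchingRule) -> tuple:
    """Build a comparison key from the rule's fields, ignoring case and outer whitespace."""
    return tuple(str(record.get(f, "")).strip().lower() for f in rule.fields)


def apply_rule(incoming: list[dict], rule: MatchingRule) -> tuple[list[dict], list[dict]]:
    """Split incoming records into (clean, flagged) by exact match on normalized fields."""
    if not rule.active:
        # A deactivated rule flags nothing; every record passes through.
        return list(incoming), []
    seen: set[tuple] = set()
    clean: list[dict] = []
    flagged: list[dict] = []
    for record in incoming:
        key = match_key(record, rule)
        if key in seen:
            flagged.append(record)  # candidate duplicate, held for manual review
        else:
            seen.add(key)
            clean.append(record)
    return clean, flagged


# Sample data (invented for illustration).
records = [
    {"id": 1, "serial": "AB-100", "site": "Plant A"},
    {"id": 2, "serial": "ab-100 ", "site": "plant a"},
    {"id": 3, "serial": "AB-200", "site": "Plant A"},
]
rule = MatchingRule(name="asset-by-serial-and-site", fields=("serial", "site"))
clean, flagged = apply_rule(records, rule)
```

In this sketch, the second record is flagged because its serial and site match the first record after normalization. A reviewer would then decide whether to merge, keep, or discard it before the data reaches production.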
Personas Involved
• Implementer: configures data connectors and sync pipelines.
• Administrator: might define matching rules for organization-wide data governance.
• End user: defines matching rules, reviews flagged records, and validates production data.
What To Do Next