Step Two: Transformation Updates
Background
The transformation process moves the data from the bundle into the DCTM_Output directory. Once in the directory, the data can be used by the load process (Publish to Preview). The DCTM_Output directory contains the following subdirectories:
COL1
The COL1 directory contains the following type-specific subdirectories for a collection:
GRAPHIC
IEXML
IS
PARTSLIST
PDFM
PARTS
During the transformation process, the relevant files are generated under the directory WORK/DCTM_OUTPUT/COLLECTION/BUNDLEID/TYPE, where COLLECTION is the collection with which the bundle is associated, BUNDLEID is the ID of the bundle, and TYPE is the type of object being processed.
Usually the type-specific directory contains the following subdirectories:
Update – Contains the source XML along with metadata that needs to be loaded into the repository
UpdateHierarchy – Contains a listing of all source XML files in the Update folder
Images – Contains items associated with the source that cannot be indexed, such as images.
Deleted – Contains the source files that need to be deleted.
The source file name should be identical to what has been loaded into the system.
Besides these regular files, the COLLECTION or BUNDLEID folder also contains the SourcesList.xml file, which lists the documents that need to be processed and the collections with which they are associated.
The next sections provide the steps required to include the new type-specific transform flow in the main TAL flow. In general, you should develop four sub-flows:
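The directory layout described above can be sketched in a few lines of Python. This is only an illustration of the path convention, not part of the product: the function name and the sample values COL1, BUNDLE001, and PARTSLIST are assumptions chosen for the example.

```python
from pathlib import PurePosixPath

# Subdirectory names taken from the list above.
TYPE_SUBDIRS = ("Update", "UpdateHierarchy", "Images", "Deleted")


def type_output_dirs(work_root, collection, bundle_id, object_type):
    """Return the type-specific directory and its usual subdirectories.

    Hypothetical helper: it only mirrors the documented convention
    WORK/DCTM_OUTPUT/COLLECTION/BUNDLEID/TYPE.
    """
    type_dir = (
        PurePosixPath(work_root) / "DCTM_OUTPUT" / collection / bundle_id / object_type
    )
    return type_dir, [type_dir / name for name in TYPE_SUBDIRS]


# Sample values are placeholders, not real collection or bundle names.
type_dir, subdirs = type_output_dirs("WORK", "COL1", "BUNDLE001", "PARTSLIST")
print(type_dir)
for subdir in subdirs:
    print(subdir)
```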
Initialization sub-flow
Data validation sub-flow
Transformation sub-flow
Post process sub-flow
Initialization Sub-Flow
Follow these steps to develop the initialization sub-flow:
1. Add a new bundle category (optional).
This step is required only if the new type is not going to be loaded as part of an existing bundle category (for example: PH, IS, BOM, and so forth).
This step initializes the transform process for the new bundle category. In this step, the process first identifies the category of the bundle (for example, a PH or IS bundle). It then initializes the types that should be loaded for each bundle type and runs the specific Java transformation classes accordingly.
The following file contains the configuration for the types supported for a bundle type: CONFIG/Applications/ContentManager/Config/Common/Templates/TransformationDriver/transform.properties.xml. Following is an example of that file:
In addition, add an entry in the decideTheFlow node of the sub-flow in the following file: CONFIG/System/Config/Flows/TransformationDriver/transformDriverSelector_PD.xml.
2. Add a new Java transform class.
Develop a new transformer class specific to this object type by inheriting from the default class com.ptc.sce.transformDriver.transform.Transformer, and develop its related factory by inheriting from the class com.ptc.sce.transformDriver.transform.TransformerFactory. Usually the transformer class is initialized through a factory class inherited from the TransformerFactory class. In this factory class, the following methods must be overridden:
getTransformerInstance – Provides the customized transformer component instance
getObjectType – Provides the object type for which the transformer has been developed
When done, register the factory in the configuration file named transform.properties.xml, located at CONFIG/Applications/ContentManager/Config/Common/Templates/TransformationDriver, by making an entry in the TransformerFactory.factories key value. For example:
3. Register the type to a bundle category.
Register the type to a bundle category (for example: PH, IS, or SAP) by updating the transform.properties.xml file with the new TYPE name. For example, the IS and PS bundle types will contain the new data type:
This file contains the types included in each bundle category. This information is present in all the file’s entries, except for the one for transform factory registration, and indicates the bundle categories supported in TAL. The transform factory registration should contain the new transform class that was created in the previous step.
4. Identify the files that need to be processed.
If the new type is a single file in the bundle, like an artifact (for example, referencedParts.xml), this step is not required.
If the new type has multiple files in the bundle (for example, PartsList), you should customize the initialization sub-flow so that, after the sub-flow executes, a file is generated with the following output. The file can have any name, but it is recommended that it be named Type Namefile where Type Name is the name of the new type.
The file must contain the following output format:
<root>
<FileName value="path to the file in bundle" href="[uri]"/>
<FileName value="path to the file in bundle" href="[uri]"/>

</root>
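As a rough illustration of the output format above, the following Python sketch builds such a file list with the standard library. In the product this file is produced by the customized initialization sub-flow; the function name, the sample paths, and the sample URIs here are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_file_list(entries):
    """Build the <root>/<FileName> listing from (path_in_bundle, uri) pairs.

    Hypothetical helper that only reproduces the documented element
    and attribute names (root, FileName, value, href).
    """
    root = ET.Element("root")
    for path_in_bundle, uri in entries:
        ET.SubElement(root, "FileName", {"value": path_in_bundle, "href": uri})
    return ET.tostring(root, encoding="unicode")


# Placeholder entries; real paths and URIs come from the bundle.
sample_entries = [
    ("PartsList/pl_0001.xml", "file:///bundle/PartsList/pl_0001.xml"),
    ("PartsList/pl_0002.xml", "file:///bundle/PartsList/pl_0002.xml"),
]
xml_text = build_file_list(sample_entries)
print(xml_text)
```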
5. Configure identity management (IMAN).
This step is required only if the new type requires timestamp or multilingual support. If this is required, follow these steps to provide an identity in IMAN:
a. Register the new type under the PTC source element in the IMANConfig.xml file located at CONFIG/Applications/DataProcess/Config/Common/Templates/IMANIntegration. For example:
b. Add an entry for the type in the IMANRecordMap.xml file, located at CONFIG/Applications/ContentManager/Config/Common/Templates/TransformationDriver/IMANIntegration. For example:
c. Generate an IMAN records file for each file in the bundle.
To do this, make sure that the createIMANRecords.xsl file, which is located under CONFIG/Applications/ContentManager/Config/Common/Templates/TransformationDriver/IMANIntegration, conforms to the structure of the input file that is configured in the IMANRecordMap.xml file: ${Type Name file}. This stylesheet is applied to the ${Type Name file} input and generates the desired IMAN records file.
The output of this transformation is located in WORK/Applications/ContentManager/Work/PreProcessing/TAL/TransformationDriver/TASKID/IMAN. The expected output should be similar to the following for each file in the bundle:
Once you create the IMAN records file, the transform reads the file, tries to locate or create a unique identity for each record, and creates a Registry file. This file is referenced by the validation and transform sub-flows. Usually this file is named TYPE.xml and is located under WORK/Applications/ContentManager/Work/PreProcessing/TAL/TransformationDriver/TASKID/Registry.
Data Validation Sub-Flow
This sub-flow enables the TAL process to determine whether the documents need to be processed or not. Follow these steps to develop this sub-flow:
1. Create a validation-specific sub-flow.
For example, you could create a sub-flow named symptomsValidationSubFlow_PD.xml for the new type SYMPTOMS.
2. Create the validation support XSL files.
Be sure to do these things as part of this step:
Check whether the type applies for the current bundle category or not.
For example, you could create this XSL file for the new type SYMPTOMS:
CONFIG/Applications/ContentManager/Config/Common/Templates/Validation/checkIfSYMPTOMSFileTobeExcluded.xsl.
Include timestamp-based validation.
That is, compare the current object time stamp from the bundle with the one from the registry file. For example, you could create this XSL file for the new type SYMPTOMS:
CONFIG/Applications/ContentManager/Config/Common/Templates/Validation/SYMPTOMSValidation.xsl.
3. Include the new validation sub-flow in the main sub-flow (validationSubFlow_PD.xml).
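The timestamp-based validation in step 2 is implemented in XSL, but the comparison itself can be illustrated with a short Python sketch. The rule used here (process the object when the registry has no entry or when the bundle timestamp is newer) and the ISO 8601 timestamp format are assumptions for the example; the actual rule and format depend on your registry file and stylesheet.

```python
from datetime import datetime


def needs_processing(bundle_timestamp, registry_timestamp=None):
    """Decide whether an object should be processed.

    Assumed rule: process when the object is not yet in the registry,
    or when the bundle copy is newer than the registered copy.
    Timestamps are assumed to be ISO 8601 strings.
    """
    if registry_timestamp is None:
        return True
    bundle_time = datetime.fromisoformat(bundle_timestamp)
    registry_time = datetime.fromisoformat(registry_timestamp)
    return bundle_time > registry_time


# Placeholder timestamps for illustration only.
print(needs_processing("2024-01-15T10:00:00"))
print(needs_processing("2024-01-15T10:00:00", "2024-01-10T08:30:00"))
print(needs_processing("2024-01-15T10:00:00", "2024-01-15T10:00:00"))
```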
Transformation Sub-Flow
Follow these steps to create the new transformation sub-flow:
1. Develop the transform-specific sub-flow.
Create a sub-flow for the new type that handles transforming the content according to the repository expectations (for example, aligned with the template and type property definitions). Besides the transformation, the sub-flow initializes and closes notification-related logging.
The following basic operations are done by the transform flows:
Initialize notification specific logging
Set the object type first in the sub-flow
The output folder under DCTM_Output has the same name as the one passed here.
Transform the content using the stylesheet through the transformer class.
Call the executeTransformer() method of TransformExecutor for transformation by providing the following information:
Object type
XSL path
This contains the full path to the transform XSL file that transforms the bundle input into DCTM output. Usually transform-specific stylesheets are located in the %CONFIG%/Applications/ContentManager/Config/Common/Templates/TransformationDriver/TYPE directory.
The valid input file name patterns
This indicates the file to be processed. It is not a regular expression.
Close notification specific logging
Usually these flows are located in the CONFIG/System/Config/Flows/TransformationDriver directory and have the naming convention TYPESubFlow_PD.xml. For example, for a new type SYMPTOMS, the sub-flow would be named SYMPTOMSSubFlow_PD.xml and the transform style sheet created for this type would be named transformSYMPTOMS.xsl.
2. Include the sub-flow for the new type in the product (product-specific sub-flow) or the shared collection specific sub-flows (CONFIG/System/Config/Flows/TransformationDriver/SharedFamilyFlow_PD.xml), depending on the collection where you expect it to load.
3. Develop your transform XSL.
The XSL file should do the following things:
Convert source XML present in bundle to target XML understood by PTC Arbortext Content Delivery.
Generate a manifest file, if required, in the temporary location.
Generate the SourcesList XML in shared mode.
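The naming conventions from step 1 (TYPESubFlow_PD.xml for the sub-flow, transformTYPE.xsl for the style sheet in the TYPE directory) can be sketched as a small Python helper. The function and dictionary key names are assumptions for the example, and the locations are the "usual" ones named above rather than fixed requirements.

```python
from pathlib import PurePosixPath

# Usual locations named in this section.
FLOWS_DIR = PurePosixPath("CONFIG/System/Config/Flows/TransformationDriver")
TEMPLATES_DIR = PurePosixPath(
    "CONFIG/Applications/ContentManager/Config/Common/Templates/TransformationDriver"
)


def transform_artifacts(object_type):
    """Return the conventional sub-flow and style sheet paths for a type.

    Hypothetical helper: it only applies the documented naming pattern.
    """
    return {
        "sub_flow": FLOWS_DIR / f"{object_type}SubFlow_PD.xml",
        "stylesheet": TEMPLATES_DIR / object_type / f"transform{object_type}.xsl",
    }


artifacts = transform_artifacts("SYMPTOMS")
print(artifacts["sub_flow"])
print(artifacts["stylesheet"])
```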
Follow these guidelines for developing the new transformation sub-flow:
Usually non-indexed content, such as images, is stored on the file system.
For shared mode, usually one copy of an object is maintained across all segments or collections.
Use these guidelines for shared mode:
During processing make sure that the same source name is created for all object types regardless of the bundle or collection to which they belong.
Create a SourcesList XML file.
Create a TransformCollectionList.xml file that lists the collections to which the data is to be loaded.
For non-shared mode you maintain different copies of an object in each collection.
During processing make sure that you create a unique source name or persistent ID for all object types in each collection.
Update the MIME type mapping, if needed.
There is a default set of supported extensions in PTC Arbortext Content Delivery. If you want to add a new file type to be processed, then you must update the MIME type mapping file.
Use the regular structure for the DCTM directory under WORK/DCTM_OUTPUT/COLLECTION/BUNDLEID/TYPE.
Usually the type-specific directory contains the following subdirectories:
Update – Contains the source XML along with metadata that needs to be loaded into the repository
UpdateHierarchy – Contains a listing of all source XML files in the Update folder
Images – Contains items associated with the source that cannot be indexed, such as images.
Deleted – Contains the source files that need to be deleted.
Post Process Sub-Flow
In the post process sub-flow, deletion-related tasks are done for all the types after the saveRegistry step for all the object types completes. Usually deletion-related flows are triggered by the presence of the excludedObjects.xml file. By default, deletions are handled by a generic excluded object handler.
In most cases, you must develop your own exclude style sheet named excludedObjects.xsl in the CONFIG/Applications/ContentManager/Config/Common/Templates/TransformationDriver/TYPE folder to identify objects specific to the new type.
Once all the sub-flows complete, load-specific tasks are triggered for each DCTM output content generated while processing this bundle.
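The trigger condition for deletion-related flows, namely the presence of the excludedObjects.xml file, can be illustrated with a minimal Python check. The directory in which the file is expected is not stated in this section, so the function takes it as a parameter; the function name is an assumption for the example.

```python
import tempfile
from pathlib import Path


def deletion_flow_triggered(directory):
    """Return True if excludedObjects.xml exists in the given directory.

    Hypothetical helper; the directory to inspect is an assumption
    supplied by the caller.
    """
    return (Path(directory) / "excludedObjects.xml").is_file()


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    before = deletion_flow_triggered(tmp)
    (Path(tmp) / "excludedObjects.xml").write_text("<root/>", encoding="utf-8")
    after = deletion_flow_triggered(tmp)

print(before, after)
```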