Comparing Unrelated Documents
CLI EQUIVALENT 
im diffsegments --compareUnrelatedDocuments
This capability handles use cases in which two unrelated documents must be compared. Consider the following use case:
In a typical industry scenario, an organization receives inputs from multiple OEMs. The organization may choose to store these inputs in separate input documents. These input documents form the basis of subsequently evolved requirement documents. The Compare Unrelated Documents functionality helps the organization to:
locate the common items from these different requirements documents.
locate the common items from the input and requirement documents when the inputs change over time.
The Compare Unrelated Documents feature lets you view textual differences between two documents of different types. Two documents are considered unrelated if they do not have the same root ID. In other words, unrelated documents are:
Those that are not the same document or a branch or a version of the same document
Those that are not the As Of configurations of the same document or As Of configurations of a branch or version of the same document
For example, this feature lets you compare a requirements document with an input document or any unrelated requirements document.
Note: The two unrelated documents being compared must have the same primary long text field.
The output is presented in the Document Difference view, which shows structural and content changes side-by-side in two document panes. Node additions, deletions, moves, and text changes are highlighted. You can use navigation tools to review and navigate the textual and structural differences in the document.
Note: If the Document Difference view is not visible after upgrading to the latest version of Windchill RV&S, you must customize the existing Windchill RV&S viewsets and the shortcut menu to enable this view in the GUI. Viewset and shortcut menu customization is saved and remembered the next time you access the client.
To customize existing viewsets, select ViewSets > Customize. On the Actions tab, select Workflows and Documents > Document > Document to make the option visible in the Document menu.
To customize the right-click shortcut menu, right-click and select Customize This Menu. Click Add Action. Select Workflows and Documents/Item and then select Compare Unrelated Documents.
For more information on customizing viewsets or shortcut menus, see Customizing a ViewSet or Customizing Shortcut Menus.
To compare two unrelated documents in the GUI
Note: The unrelated document difference calculation is performed on the server and is memory intensive. Performing this operation repeatedly may cause the server to run out of memory, so perform it cautiously.
Performance also depends on the complexity of the documents, the extent of the changes (additions, deletions, and modifications), and the number of affected nodes. Complex documents with many changes take longer to display.
1. Select Document > Compare Unrelated Documents.
The Compare Unrelated Documents window displays.
2. Specify a primary document to compare by entering a Document ID or by clicking Select and locating the document.
3. You can select an As of configuration option for the source and the target documents to compare them at two different points in time. The following table describes the As of options for document comparisons:
As of option: What to do
Now: Select Now to compare the current version of the document.
Revision: Select Revision. A list of all of the labels associated with revisions displays. Select the revision that you want to perform the document comparison against. Note: This option is available only when you are comparing a document that allows versioning.
Label: Select Label. A list of all labels on the document displays. Select the label that you want to perform the document comparison against.
Branch: Select Branch. A list of all the branches associated with the document displays. Select the branch that corresponds to the time you want to compare the document as of. Note: This option is available only when you are comparing a document that was branched or is a branch.
4. Optionally, you can select the Hide items without differences checkbox to hide items that have no content or structure differences.
5. When searching for existing items, you can filter by visible text or fields, and you can search by item ID. For more information, see Document Finder.
Reviewing Document Differences
The Document Difference view contains two document panes. The source document you specified first displays in the left pane. The target document you specified second displays in the right pane. You can navigate through each pane independently. The document pane currently in focus is outlined in blue.
The view visually presents differences using color-coded highlighting, color-coded connector arrows, text difference highlighting, and interactive row header icons representing the type of difference. Color-coded connector arrows dynamically join content and field-level differences between panes when both differences are at least partially visible in each pane. As you scroll through either document, the connector arrows refresh to connect visible content in the view.
The following table describes how the view visually represents document differences:
Added content
Description: An item that is not present in the source document but is present in the target document. A placeholder offers a visual indication of where the addition was made in the source document.
Highlight/connector arrow color: green
Interface icon: Added content

Moved content
Description: An item present in both the source document and target document that:
has a different parent
has the same parent but is in a different relative position
has a parent that has been moved
Highlight/connector arrow color: blue
Interface icon: Moved content

Deleted content
Description: An item that is present in the source document but is not present in the target document. A placeholder offers a visual indication of where the deletion was made in the target document.
Highlight/connector arrow color: red
Interface icon: Deleted content

Modified content
Description: Any non-structural change to item or field content.
Highlight/connector arrow color: yellow
Interface icon: Modified content

Moved and modified content
Highlight/connector arrow color: purple
Interface icons: Moved content and Modified content
Text differences
The Document Difference view compares document content at the text level, including alpha-numeric characters, punctuation, spaces, and hypertext link text. Text differences are highlighted in each document pane, as follows:
Deleted text is highlighted red in the source document in the left pane.
Added text is highlighted green in the target document in the right pane.
Changed text displays as a deletion in the left pane and an addition in the right pane.
The view does not compare or highlight the following document elements:
Text formatting, such as italics, font attributes, or numbered lists
Images
Hypertext link target content
Short text field content
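The text-level highlighting described above can be approximated with a word-level diff. The following is a minimal sketch using Python's standard difflib module, not the product's own comparison logic; the function name text_differences and the word-based tokenization are assumptions made for illustration. Words that would be highlighted red (deleted, left pane) and green (added, right pane) are returned as two lists, and a changed word appears in both.

```python
import difflib


def text_differences(source: str, target: str):
    """Return (deleted, added) word lists for two plain-text strings.

    Illustrative only: 'deleted' loosely corresponds to red highlighting in
    the source (left) pane, 'added' to green highlighting in the target
    (right) pane. Formatting and images are not considered.
    """
    src_words = source.split()
    tgt_words = target.split()
    matcher = difflib.SequenceMatcher(a=src_words, b=tgt_words, autojunk=False)
    deleted, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A 'replace' is reported as a deletion plus an addition,
        # matching how changed text is displayed in the two panes.
        if tag in ("delete", "replace"):
            deleted.extend(src_words[i1:i2])
        if tag in ("insert", "replace"):
            added.extend(tgt_words[j1:j2])
    return deleted, added


deleted, added = text_differences(
    "The pump shall start within 5 seconds",
    "The pump shall start within 3 seconds",
)
print("deleted:", deleted)
print("added:", added)
```

In this example the changed value is reported once as a deletion and once as an addition, while the unchanged words are not reported at all.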
Click an interactive row header icon in one document pane to see the related content at the top of the other pane. To compare content side-by-side, click the coordinating row header icons in both panes. Hover over a row header to display a tool tip with section information. Status details also display in the bottom left corner of the view window.
You can perform item operations, such as editing or viewing items, from the Item menu or from the shortcut menu with a specific node selected.
The view displays all items by default. To hide items that have no field or document structure differences, select View > Hide items without differences.
Refresh the view at any time by selecting View > Refresh or by pressing F5.
Key Considerations for Compare Unrelated Documents Difference View Results
Using a proprietary algorithm, the documents are compared based on the contents of the primary text field values of both documents. The algorithm calculates a probabilistic similarity score between the nodes of the two documents and identifies the pairs of best matching nodes with the highest score. Note that the accuracy of the algorithm improves as the number of words in the content increases.
Matching Nodes
Two nodes are considered the best match if they meet any of the following criteria:
They have exactly the same content.
They have greater than 50% of common text content.
They have greater than 50% of common text content and are at a similar distance from their root/parent nodes.
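The matching criteria above can be illustrated with a small sketch. The product's algorithm is proprietary, so this example substitutes a simple word-overlap ratio from Python's difflib as a stand-in score and ignores the distance-from-parent criterion entirely. The names similarity and best_matches, the node IDs, and the greedy pairing strategy are all assumptions made for illustration.

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Stand-in score in [0, 1]; NOT the product's proprietary algorithm."""
    return difflib.SequenceMatcher(a=a.split(), b=b.split(), autojunk=False).ratio()


def best_matches(source_nodes: dict, target_nodes: dict, threshold: float = 0.5) -> dict:
    """Greedily pair source nodes with their highest-scoring unused target node.

    Only pairs scoring above the threshold (more than 50% common text in this
    sketch) are kept. Node distance from root/parent is not modeled here.
    """
    candidates = sorted(
        (
            (similarity(s_text, t_text), s_id, t_id)
            for s_id, s_text in source_nodes.items()
            for t_id, t_text in target_nodes.items()
        ),
        reverse=True,
    )
    pairs, used_targets = {}, set()
    for score, s_id, t_id in candidates:
        if score <= threshold:
            break  # candidates are sorted, so no later pair can qualify
        if s_id in pairs or t_id in used_targets:
            continue
        pairs[s_id] = (t_id, score)
        used_targets.add(t_id)
    return pairs


source = {
    "1.1": "The pump shall start within 5 seconds",
    "1.2": "The housing shall be painted blue",
}
target = {
    "2.1": "The pump shall start within 3 seconds",
    "2.2": "Operators must wear gloves",
}
print(best_matches(source, target))
```

Here only node 1.1 finds a match (node 2.1); node 1.2 has no target node with more than 50% common text and would remain unmatched.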
Classification of Nodes Depending On Similarity Scores
Matching nodes are classified under the following types depending on their probabilistic similarity scores.
Note: Comparison results taken at different times on the same set of documents may not be identical if either of the documents has been modified in between.
If the score is 100%, then the nodes are considered to be identical.
If the score is less than 50%, it implies that there is no similar content in the nodes and therefore they are highlighted as either Added or Deleted.
It is important to note that the distance of the matching nodes from their root/parent nodes is also considered while calculating the scores. Therefore, the nodes moved with reference to their parents or the parents of their matching nodes are highlighted as Moved.
If the score is greater than or equal to 50% but less than 100%, the content in the matched nodes has changed to some extent and the nodes are highlighted as Changed. If these nodes have also been moved anywhere within the document, they are highlighted as Moved and Changed.
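The classification rules above can be summarized as a small decision function. This is a sketch of the documented thresholds only; the function name classify, the 0 to 100 score scale, and the boolean moved flag are assumptions made for illustration, and the way the product actually computes the score and detects moves is not shown.

```python
def classify(score: float, moved: bool) -> str:
    """Map a similarity score (0-100) and a moved flag to a highlight label.

    Hypothetical helper that mirrors the documented thresholds:
    below 50 means no match, 100 means identical content, and anything
    in between means changed content.
    """
    if score < 50:
        # No similar content: the node appears as Added (target only)
        # or Deleted (source only).
        return "Added or Deleted"
    if score == 100:
        return "Moved" if moved else "Identical"
    return "Moved and Changed" if moved else "Changed"


for score, moved in [(100, False), (100, True), (75, False), (50, True), (30, False)]:
    print(score, moved, "->", classify(score, moved))
```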
The following are some illustrations that describe this in more detail.
The following image displays two unrelated documents in their original state. Document ID 642 is the source document and document ID 1079 is the target document. Although no actual move operation was performed in the target document, Sections 2.1 and 1.2 are identified as Moved because they are an exact match.
In the next image, Section 1.2 has been moved to Section 2.1 in the target document 1079. As a result, the documents are shown as identical.
The following image shows the same set of documents compared after Section 2.1 is moved to Section 2.4.
The following image illustrates a situation in which the same content is added in a different location in the target document. In this example, the text from Section 2.4 is copied to Section 2.7.
From the above examples, it is apparent that the differences are always calculated with respect to the actual text in the other document, independent of the actual operations performed in each individual document.
Comparison of Subdocuments
Included subdocuments are compared separately based on their identical summary/description short text field values. If neither document contains a subdocument with a matching short text field value, the entire subdocument is highlighted as Added or Deleted.
Inserted subdocuments are displayed as a single node in the differences view.
Known Issues and Limitations
PTC does not recommend concurrent comparison of large unrelated documents, as this may lead to a significant increase in server resource usage.
Comparing additional fields of two unrelated documents is not supported. Because the nodes are compared and matched based on their text only, a comparison of additional fields may appear completely out of context and could be misleading in the comparison view.
For example, if the text content of a requirement node and an input node match exactly, the nodes will be shown as identical. However, that does not necessarily mean that the other fields visible on these two nodes are comparable. Any attempt to compare these fields may generate meaningless results.
Navigating Document Differences
You can navigate incrementally through each difference in the current document pane using View menu options, or by using the following toolbar or shortcut key options:
Previous difference: click Previous difference on the toolbar or press F7
Next difference: click Next difference on the toolbar or press F8
First difference: click First difference on the toolbar or press CTRL+F7
Last difference: click Last difference on the toolbar or press CTRL+F8
Select View > Wrap Difference Search to configure the Previous and Next difference search options to wrap to the top or bottom of the document pane. Clear the Wrap Difference Search option to stop at the beginning or end of the document pane.
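The effect of the Wrap Difference Search option can be modeled as follows. This is a hypothetical sketch of the described behavior, not the client's implementation; the function names, the use of row numbers to identify differences, and returning None to mean "stay where you are" are assumptions made for illustration.

```python
from typing import List, Optional


def next_difference(diff_rows: List[int], current_row: int, wrap: bool) -> Optional[int]:
    """Return the row of the next difference after current_row.

    With wrap enabled, the search continues from the top of the pane.
    Without it, None is returned once the last difference is passed.
    """
    later = [row for row in sorted(diff_rows) if row > current_row]
    if later:
        return later[0]
    if wrap and diff_rows:
        return min(diff_rows)
    return None


def previous_difference(diff_rows: List[int], current_row: int, wrap: bool) -> Optional[int]:
    """Return the row of the previous difference before current_row."""
    earlier = [row for row in sorted(diff_rows) if row < current_row]
    if earlier:
        return earlier[-1]
    if wrap and diff_rows:
        return max(diff_rows)
    return None


rows = [3, 8, 15]
print(next_difference(rows, 15, wrap=False))
print(next_difference(rows, 15, wrap=True))
print(previous_difference(rows, 3, wrap=True))
```

With wrapping cleared, the search from the last difference returns nothing; with wrapping selected, it continues at the first difference in the pane.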
Navigating the Document Panes
The following document pane keyboard navigation options are available in the Document Difference view for the current document pane:
Shortcut key: Action
CTRL+TAB: Switches the focus to the other document pane.
TAB: Selects the next field.
Up arrow key: Selects the previous row header.
Down arrow key: Selects the next row header.
HOME: Selects the first row header.
END: Selects the last row header.
SPACEBAR: Selects the row header icon in focus and moves to the corresponding difference in the other document pane.
PAGE UP: Scrolls up approximately one page.
PAGE DOWN: Scrolls down approximately one page.