Comparing Unrelated Documents
CLI EQUIVALENT | im diffsegments --compareUnrelatedDocuments |
This capability supports use cases in which two unrelated documents must be compared. Consider the following use case:
In a typical industry scenario, an organization receives inputs from multiple OEMs. The organization may choose to store these inputs in separate input documents. These input documents form the basis of subsequently evolved requirement documents. The Compare Unrelated Documents functionality helps the organization to:
• locate the common items from these different requirements documents.
• locate the common items from the input and requirement documents if the inputs change over time.
The Compare Unrelated Documents functionality lets you compare textual differences between two documents of different types. The documents being compared are considered unrelated if they do not have the same root ID. In other words, unrelated documents are:
◦ Those that are not the same document or a branch or a version of the same document
◦ Those that are not the As Of configurations of the same document or As Of configurations of a branch or version of the same document
For example, this feature lets you compare a requirements document with an input document or any unrelated requirements document.
|
The two unrelated documents being compared must have the same primary long text field.
|
The output is presented in the Document Difference view, which shows structural and content changes side-by-side in two document panes. Node additions, deletions, moves, and text changes are highlighted. You can use navigation tools to review and navigate the textual and structural differences in the document.
|
If the Document Difference view is not visible after upgrading to the latest version of Windchill RV&S, you must customize the existing Windchill RV&S viewsets and the shortcut menu to enable this view in the GUI. Viewset and shortcut menu customization is saved and remembered the next time you access the client.
To customize existing viewsets, select > . On the Actions tab, select > > to make the option visible in the Document menu.
To customize the right-click shortcut menu, right-click and select Customize This Menu. Click Add Action. Select Workflows and Documents/Item and then select Compare Unrelated Documents.
|
To compare two unrelated documents in the GUI
|
Unrelated document difference calculation is performed on the server and is a memory-intensive operation. Performing this operation repeatedly may cause the server to run out of memory, so use it cautiously.
Performance also depends on the complexity of the documents, the extent of the additions, deletions, and modifications in them, and the number of affected nodes. The more changes there are and the more complex the documents, the longer the comparison takes to display.
|
1. Select > .
The Compare Unrelated Documents window displays.
2. Specify a primary document to compare by entering a Document ID or by clicking Select and locating the documents.
3. You can select an As of configuration option for the source and the target documents to compare them at two different points in time. The following table describes the As of options for document comparisons:
To compare a document as of | Do this... |
Now | Select Now to compare the current version of the document. |
Revision | Select Revision. A list of all of the labels associated with revisions displays. Select the revision that you want to perform the document comparison against. This option is available only when you are comparing a document that allows versioning. |
Label | Select Label. A list of all labels on the document displays. Select the label that you want to perform the document comparison against. |
Branch | Select Branch. A list of all the branches associated with the document displays. Select the branch that corresponds to the time you want to compare the document as of. This option is available only when you are comparing a document that was branched or is a branch. |
4. Optionally, you can select the Hide items without differences checkbox to hide items that have no content or structure differences.
5. When searching for existing items, you can filter by visible text or fields, and you can search by item ID. For more information, see Document Finder.
Reviewing Document Differences
The Document Difference view contains two document panes. The source document you specified first displays in the left pane. The target document you specified second displays in the right pane. You can navigate through each pane independently. The document pane currently in focus is outlined in blue.
The view visually presents differences using color-coded highlighting, color-coded connector arrows, text difference highlighting, and interactive row header icons representing the type of difference. Color-coded connector arrows dynamically join content and field-level differences between panes when both differences are at least partially visible in each pane. As you scroll through either document, the connector arrows refresh to connect visible content in the view.
The following table describes how the view visually represents document differences:
Type of Document Difference | Highlight/Connector Arrow Color | Interface Icon |
Added content: An item that is not present in the source document but is present in the target document. A placeholder offers a visual indication of where the addition was made in the source document. | green | |
Moved content: An item present in both the source document and target document that has a different parent, has the same parent but is in a different relative position, or has a parent that has been moved. | blue | |
Deleted content: An item that is present in the source document but is not present in the target document. A placeholder offers a visual indication of where the deletion was made in the target document. | red | |
Modified content: Any non-structural change to item or field content. | yellow | |
Moved and modified content | purple | |
Text differences
The Document Difference view compares document content at the text level, including alpha-numeric characters, punctuation, spaces, and hypertext link text. Text differences are highlighted in each document pane, as follows:
• Deleted text is highlighted red in the source document in the left pane.
• Added text is highlighted green in the target document in the right pane.
• Changed text displays as a deletion in the left pane and an addition in the right pane.
The view does not compare or highlight the following document elements:
• Text formatting, such as italics, font attributes, or numbered lists
• Images
• Hypertext link target content
• Short text field content
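The text-level comparison described above can be illustrated with a small sketch. It is not the product's implementation: it uses Python's standard `difflib` module as a stand-in, and the names `tokenize` and `text_differences` are hypothetical. It splits each text into words, spaces, and punctuation, then reports what would be highlighted as deleted in the left pane and as added in the right pane. Changed text appears as both a deletion and an addition.

```python
import difflib
import re


def tokenize(text):
    # Split into words, whitespace runs, and single punctuation marks,
    # so that each of them takes part in the comparison.
    return re.findall(r"\w+|\s+|[^\w\s]", text)


def text_differences(source, target):
    """Return (deleted, added): text only in source, text only in target.

    Illustrative stand-in built on difflib, not the product's algorithm.
    """
    src, tgt = tokenize(source), tokenize(target)
    matcher = difflib.SequenceMatcher(None, src, tgt, autojunk=False)
    deleted, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.append("".join(src[i1:i2]))  # left pane, red
        if tag in ("insert", "replace"):
            added.append("".join(tgt[j1:j2]))    # right pane, green
    return deleted, added
```

For example, comparing "The pump shall start." with "The pump shall stop quickly." in this sketch yields "start" as deleted text and "stop quickly" as added text.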
Click an interactive row header icon in one document pane to see the related content at the top of the other pane. To compare content side-by-side, click the coordinating row header icons in both panes. Hover over a row header to display a tool tip with section information. Status details also display in the bottom left corner of the view window.
You can perform item operations, such as editing or viewing items, from the Item menu or from the shortcut menu with a specific node selected.
The view displays all items by default. To hide items that have no field or document structure differences, select > .
Refresh the view at any time by selecting > or by pressing F5.
Key Considerations for Compare Unrelated Documents Difference View Results
Using a proprietary algorithm, the documents are compared based on the contents of the primary text field values of both documents. The algorithm calculates a probabilistic similarity score between the nodes of the two documents and identifies the pairs of best matching nodes with the highest scores. The accuracy of the algorithm improves as the number of words in the content increases.
Matching Nodes
Two nodes are considered a best match if they meet any of the following criteria:
• They have exactly the same content.
• They have greater than 50% of common text content.
• They have greater than 50% common text and are at a similar distance from their root/parent nodes.
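Because the matching algorithm is proprietary, the following is only a rough sketch of the general idea under stated assumptions. The word-overlap ratio from Python's `difflib` stands in for the probabilistic similarity score, node depth stands in for the distance from the root/parent node, and the names `similarity` and `best_matches` are hypothetical.

```python
import difflib


def similarity(a, b):
    """Word-level overlap between two texts, from 0.0 to 1.0 (stand-in metric)."""
    return difflib.SequenceMatcher(None, a.split(), b.split(), autojunk=False).ratio()


def best_matches(source, target, threshold=0.5):
    """Greedily pair nodes by score.

    source and target map a node ID to a (text, depth) tuple.
    Returns {source_id: (target_id, score)} for the pairs that were matched.
    """
    candidates = []
    for s_id, (s_text, s_depth) in source.items():
        for t_id, (t_text, t_depth) in target.items():
            score = similarity(s_text, t_text)
            if score > threshold:
                # Sort key: highest score first, then smallest depth difference.
                candidates.append((-score, abs(s_depth - t_depth), s_id, t_id))
    candidates.sort()

    pairs, used_source, used_target = {}, set(), set()
    for neg_score, _, s_id, t_id in candidates:
        if s_id not in used_source and t_id not in used_target:
            pairs[s_id] = (t_id, -neg_score)
            used_source.add(s_id)
            used_target.add(t_id)
    return pairs
```

In this sketch, a node that has both an exact copy and a near copy in the other document is paired with the exact copy, and nodes with no candidate above the threshold stay unpaired.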
Classification of Nodes Depending On Similarity Scores
Matching nodes are classified under the following types depending on their probabilistic similarity scores.
|
Comparison results taken at different intervals on the same set of documents (if either of the documents has been modified) may not be identical.
|
• If the score is 100%, then the nodes are considered to be identical.
• If the score is less than 50%, it implies that there is no similar content in the nodes and therefore they are highlighted as either Added or Deleted.
• The distance of the matching nodes from their root/parent nodes is also considered when calculating the scores. Therefore, nodes that have moved relative to their parents, or relative to the parents of their matching nodes, are highlighted as Moved.
• If the score is greater than or equal to 50% but less than 100%, the content in the matched nodes has changed to some extent and the nodes are highlighted as Changed. If these nodes have also moved within the document, they are highlighted as Moved and Changed.
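The classification rules above can be summarized as a small decision function. This is a sketch of the stated rules only: the function name `classify`, the use of `None` for an unmatched node, and the representation of scores as fractions between 0.0 and 1.0 are assumptions made for illustration.

```python
def classify(score, moved=False):
    """Map a similarity score to the highlight category described above.

    score: similarity from 0.0 to 1.0, or None when the node has no match.
    moved: True if the node moved relative to its parent or its match's parent.
    """
    if score is None or score < 0.5:
        return "Added or Deleted"  # no similar content in the other document
    if score == 1.0:
        return "Moved" if moved else "Identical"
    return "Moved and Changed" if moved else "Changed"
```

For instance, a score of 0.8 with no move gives Changed, while a score of 1.0 on a relocated node gives Moved.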
The following are some illustrations that describe this in more detail.
The following image displays two unrelated documents in their original state. Document 642 is the source document and document 1079 is the target document. Although no move operation was performed in the target document, Sections 2.1 and 1.2 are identified as Moved because they are an exact match.
In the next image, Section 1.2 is moved to Section 2.1 in target document 1079. As a result, the documents are shown as identical.
The following image shows the same set of documents compared after Section 2.1 is moved to Section 2.4.
The following image illustrates a situation when the same content is added in a different location in the target document. In this example, the text from Section 2.4 is copied to Section 2.7.
From the above examples, it is apparent that the differences are always calculated with respect to the actual text in the other document, independent of the operations actually performed in each individual document.
Comparison of Subdocuments
Included subdocuments are compared separately, paired by identical summary/description short text field values. If the other document does not contain a subdocument with a matching short text field value, the entire subdocument is highlighted as Added or Deleted.
Inserted subdocuments are displayed as a single node in the differences view.
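The pairing of included subdocuments can be sketched as follows. The function name `pair_subdocuments` and the dictionary inputs are hypothetical, and the handling of several subdocuments that share the same summary (first come, first paired) is an assumption, since the behavior in that case is not described here.

```python
def pair_subdocuments(source_subdocs, target_subdocs):
    """Pair included subdocuments whose summary short text is identical.

    Each argument maps a subdocument ID to its summary text.
    Returns (paired, deleted, added), where deleted holds source-only IDs
    and added holds target-only IDs.
    """
    target_by_summary = {}
    for t_id, summary in target_subdocs.items():
        target_by_summary.setdefault(summary, []).append(t_id)

    paired, deleted = [], []
    for s_id, summary in source_subdocs.items():
        matches = target_by_summary.get(summary)
        if matches:
            paired.append((s_id, matches.pop(0)))  # compared separately
        else:
            deleted.append(s_id)  # whole subdocument highlighted as Deleted

    # Target subdocuments left unpaired are highlighted as Added.
    added = [t_id for ids in target_by_summary.values() for t_id in ids]
    return paired, deleted, added
```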
Known Issues and Limitations
• PTC does not recommend concurrent comparison of large unrelated documents, as this may lead to a significant increase in server resource usage.
• Comparing two unrelated documents with additional fields is not supported. Because the nodes are compared and matched based on their text only, a comparison of additional fields may appear completely out of context and could be misleading in the comparison view.
For example, if the text content of a requirement node and an input node match exactly, the nodes will be shown as identical. However, that does not necessarily mean that the other fields visible on these two nodes are comparable. Any attempt to compare these fields may generate meaningless results.
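If comparisons are started from a script, one way to follow the recommendation against concurrent comparisons is to let only one run at a time. The sketch below assumes a caller-supplied `compare` function (hypothetical, for example a wrapper around the CLI command) and serializes calls within a single client process only. It does not limit comparisons started by other users or clients.

```python
import threading

# At most one comparison in flight from this process.
_compare_slot = threading.Semaphore(1)


def run_comparison(compare, source_id, target_id):
    """Call the caller-supplied compare(source_id, target_id) one at a time.

    Other threads calling this function wait until the running comparison ends.
    """
    with _compare_slot:
        return compare(source_id, target_id)
```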
Navigating Document Differences
You can navigate incrementally through each difference in the current document pane using View menu options, or by using the following toolbar or shortcut key options:
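• Previous difference: click the toolbar button or press F7
• Next difference: click the toolbar button or press F8
• First difference: click the toolbar button or press CTRL+F7
• Last difference: click the toolbar button or press CTRL+F8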
Select > to configure the Previous and Next difference search options to wrap to the top or bottom of the document pane. Clear the Wrap Difference Search option to stop at the beginning or end of the document pane.
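The effect of the Wrap Difference Search option can be sketched with two small functions. The names and the list-of-row-indexes representation are hypothetical. With wrapping on, the search continues from the other end of the pane. With wrapping off, it stops and returns `None`.

```python
import bisect


def next_difference(diff_rows, current, wrap=True):
    """Return the first difference row after current, or None if there is none.

    diff_rows: sorted list of row indexes that contain differences.
    """
    i = bisect.bisect_right(diff_rows, current)
    if i < len(diff_rows):
        return diff_rows[i]
    return diff_rows[0] if wrap and diff_rows else None  # wrap to the top


def previous_difference(diff_rows, current, wrap=True):
    """Return the last difference row before current, or None if there is none."""
    i = bisect.bisect_left(diff_rows, current)
    if i > 0:
        return diff_rows[i - 1]
    return diff_rows[-1] if wrap and diff_rows else None  # wrap to the bottom
```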
Navigating the Document Panes
The following document pane keyboard navigation options are available in the Document Difference view for the current document pane:
Shortcut Keys | Action |
CTRL+TAB | Switches the focus to the other document pane. |
TAB | Selects the next field. |
up arrow key | Selects the previous row header. |
down arrow key | Selects the next row header. |
HOME | Selects the first row header. |
END | Selects the last row header. |
SPACEBAR | Selects the row header icon in focus and moves to the corresponding difference in the other document pane. |
PAGE UP | Scrolls up approximately one page. |
PAGE DOWN | Scrolls down approximately one page. |