Filtering Text Field Searches
Note: For more detailed information about analyzers, filters, and tokenizers, see the following link:
Every text field uses the com.ptc.solr.analysis.PTCWordDelimiterFilterFactory filter. This filter splits words into subwords and performs optional transformations on subword groups. Words are split into subwords using the following rules:
Rule: Split on intra-word delimiters (by default, all non-alphanumeric characters).
Example: "Wi-Fi" splits into "Wi" and "Fi".
Rule: Split on case transitions.
Example: "TransAM" splits into "Trans" and "AM".
Rule: Leading and trailing intra-word delimiters on each subword are ignored.
Example: "__hello---there, 'dude'" splits into "hello", "there", and "dude".
Rule: Trailing "'s" characters are removed for each subword.
Note: This step is not performed in a separate filter because of possible subword combinations.
Example: "O'Neil's" splits into "O" and "Neil".
This filter is a replica of solr.WordDelimiterFilter, which is shipped with Solr. It has been customized to protect the following characters: ".", "-", and "_".
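The splitting rules listed above can be approximated in a few lines of Python. This is only a rough sketch for reasoning about the rules; it is not the PTC filter, the function name split_subwords is hypothetical, and it does not model the protection of ".", "-", and "_" described above.

```python
import re

def split_subwords(text):
    """Approximate the four splitting rules (illustration only)."""
    subwords = []
    # Split on runs of non-alphanumeric characters; empty pieces from leading
    # or trailing delimiters are discarded below. Apostrophes are kept for now.
    for chunk in re.split(r"[^A-Za-z0-9']+", text):
        # Remove a trailing "'s" before any further splitting.
        chunk = re.sub(r"'[sS]$", "", chunk)
        # Remaining apostrophes act as ordinary delimiters.
        for part in chunk.split("'"):
            # Split on lower-to-upper case transitions.
            pieces = re.split(r"(?<=[a-z])(?=[A-Z])", part)
            subwords.extend(p for p in pieces if p)
    return subwords

print(split_subwords("Wi-Fi"))
print(split_subwords("TransAM"))
print(split_subwords("__hello---there, 'dude'"))
print(split_subwords("O'Neil's"))
```

Each call reproduces the corresponding example from the rules above, which makes the sketch a convenient way to predict how a new part name or number might be split.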
Splitting is affected by the following parameters:
generateWordParts=1: Parts of words are generated, for example "whistle-blower" = "whistle" "blower"
generateNumberParts=1: Number subwords are generated, for example "500-42" = "500" "42"
catenateWords=1: Maximum runs of word parts are catenated, for example "re-confirm" = "reconfirm"
catenateNumbers=1: Maximum runs of number parts are catenated, for example "500-42" = "50042"
catenateAll=1: All subword parts are catenated, for example "wi-fi-4000" = "wifi4000"
splitOnCaseChange=1: Words are split on case transitions, for example "PowerShot" = "Power" "Shot"
preserveOriginal=1: Original words are included in the subwords, for example "500-42" = "500" "42" "500-42"
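To see how these parameters combine, the following Python sketch models their effect on a single token. It is a toy model under stated assumptions: the function word_delimiter and its keyword names are hypothetical, it splits only on non-alphanumeric characters, and it does not model splitOnCaseChange or the PTC protected characters.

```python
import re
from itertools import groupby

def word_delimiter(token, generate_word_parts=True, generate_number_parts=True,
                   catenate_words=False, catenate_numbers=False,
                   catenate_all=False, preserve_original=False):
    """Toy model of the parameter effects; not the real filter."""
    # Split on runs of non-alphanumeric characters and drop empty pieces.
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]
    out = []
    for part in parts:
        is_number = part.isdigit()  # anything not all digits counts as a word
        if (is_number and generate_number_parts) or \
           (not is_number and generate_word_parts):
            out.append(part)
    # Catenate maximal runs of adjacent word parts or number parts.
    for is_number, run in groupby(parts, key=str.isdigit):
        run = list(run)
        wanted = catenate_numbers if is_number else catenate_words
        if wanted and len(run) > 1:
            out.append("".join(run))
    if catenate_all and len(parts) > 1:
        out.append("".join(parts))
    if preserve_original:
        out.append(token)
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(out))

print(word_delimiter("500-42", catenate_numbers=True))
print(word_delimiter("500-42", preserve_original=True))
```

In this model the catenated and original forms are added alongside the generated parts, which is why a search for either "500" or "50042" could match the same indexed value.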
The com.ptc.solr.analysis.PTCSpecialCharacterFilterFactory filter is also used. This filter creates sub-tokens for tokens that end with PTC protected special characters. Currently there are only three protected special characters:
dot or period (.)
dash (-)
underscore (_)
Sub-tokens are created with the following rules:
Rule: Tokens ending with a period (.)
Example: "dot." = "dot.", "dot"
Rule: Tokens ending with a dash (-)
Example: "dash-" = "dash-", "dash"
Rule: Tokens ending with an underscore (_)
Example: "under_" = "under_", "under"
Note: Ensure that the same order of tokenizers is maintained at indexing and query time. For a given word, the tokens generated at query time should be the same as those generated at indexing time.
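The sub-token rules for protected special characters can be illustrated with a short Python sketch. The function name special_character_subtokens is hypothetical, and the sketch assumes that only a single trailing protected character is stripped; the source does not say how repeated trailing characters are handled.

```python
# The three protected special characters named in this section.
PROTECTED = (".", "-", "_")

def special_character_subtokens(token):
    """Return the token plus a sub-token without its trailing protected character."""
    if len(token) > 1 and token.endswith(PROTECTED):
        return [token, token[:-1]]
    return [token]

for sample in ("dot.", "dash-", "under_", "plain"):
    print(sample, "=", special_character_subtokens(sample))
```

Applying the same function to both indexed text and query text mirrors the requirement that tokens match at indexing and query time.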
Stop Words
The words listed in $solr-home\wblib\conf\stopwords.txt are not indexed. These should be words that a user would not enter in a meaningful search, such as “if” or “not”. To include these words in searches, remove them from stopwords.txt.
For English, the text field is used and is configured using the StopFilterFactory filter.
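As a rough illustration of what stop word filtering does, the Python sketch below drops listed words from a token stream. It assumes one word per line with "#" comment lines, as in the stopwords.txt file shipped with Solr, and it assumes case-insensitive matching, which may not match your configuration. The function names and the sample tokens are hypothetical.

```python
def load_stop_words(text):
    """Parse stop words from file content: one word per line, '#' starts a comment line."""
    words = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            words.add(line.lower())
    return words

def remove_stop_words(tokens, stop_words):
    """Drop tokens that appear in the stop word set (case-insensitive here)."""
    return [t for t in tokens if t.lower() not in stop_words]

stop_words = load_stop_words("# sample entries\nif\nnot\n")
print(remove_stop_words(["bolt", "if", "not", "steel"], stop_words))
```

Removing an entry from the file content has the effect described above: the word is no longer dropped and becomes searchable once the data is indexed again.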
Synonyms
The synonym entries in $solr-home\wblib\conf\synonyms.txt ensure that searching on one word can find records with synonymous words. You can edit this file to enter or remove synonyms.
The SynonymFilterFactory filter is configured for English text fields.
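The effect of a synonym entry can be sketched in Python as follows. The sketch assumes the comma-separated form of synonyms.txt, where every word on a line is treated as equivalent; lines that use the "=>" mapping form are skipped for brevity. The function names and the sample entry are hypothetical.

```python
def load_synonym_groups(text):
    """Build a lookup from each word to its group of equivalent words."""
    index = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and explicit "a => b" mappings (not modeled here).
        if not line or line.startswith("#") or "=>" in line:
            continue
        group = [w.strip().lower() for w in line.split(",") if w.strip()]
        for word in group:
            index.setdefault(word, set()).update(group)
    return index

def expand(term, index):
    """Return the term together with its synonyms, sorted for stable output."""
    key = term.lower()
    return sorted(index.get(key, {key}))

groups = load_synonym_groups("bolt, screw, fastener\n")
print(expand("Screw", groups))
print(expand("nut", groups))
```

A search on any word in a group can then match records that contain any other word in that group, which is the behavior the synonym file is meant to provide.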
autoCommit
Windchill uses the Solr auto commit feature to commit index information automatically after certain criteria are met.
You can configure autoCommit in solrconfig.xml.
These criteria are specified under the following element:
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>1000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
maxDocs
The maximum number of uncommitted Windchill business object documents allowed before an autocommit is triggered.
maxTime
The maximum time (in milliseconds) after a Windchill business object document is added before an autocommit is triggered.
Indexed searches perform better when the maxTime and maxDocs values are higher.
However, an object does not appear in search results until its index information is committed, so higher values also delay when new objects become searchable.
Note: Use higher values when you run bulk indexing.
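The interaction between maxDocs and maxTime can be sketched in Python. The example reads both values from an autoCommit element and applies a simplified trigger check: a commit is due when either limit is reached. The function names are hypothetical, and the check is a model of the behavior described above, not Solr's implementation.

```python
import xml.etree.ElementTree as ET

# Same structure as the solrconfig.xml fragment in this section.
SNIPPET = """
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>1000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
"""

def read_auto_commit(xml_text):
    """Return (maxDocs, maxTime) from an updateHandler fragment."""
    root = ET.fromstring(xml_text.strip())
    node = root.find(".//autoCommit")
    return int(node.findtext("maxDocs")), int(node.findtext("maxTime"))

def commit_due(pending_docs, ms_since_oldest_pending, max_docs, max_time):
    """Simplified model: commit when either limit is reached and work is pending."""
    if pending_docs == 0:
        return False
    return pending_docs >= max_docs or ms_since_oldest_pending >= max_time

max_docs, max_time = read_auto_commit(SNIPPET)
print(commit_due(1000, 5, max_docs, max_time))
print(commit_due(10, 100, max_docs, max_time))
```

With the values shown, a single new object waits for the time limit before it is committed, while a bulk load reaches the document limit first. This is the trade-off between commit frequency and search freshness described above.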
Enabling Alphanumeric Splits
By default, Windchill search does not tokenize alphanumeric transitions. For example, the string “ABC123” is indexed as “ABC123”.
You can customize Solr to enable alphanumeric splits. When enabled, the string “ABC123” is indexed as the following:
ABC123
ABC
123
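The difference between the two settings can be illustrated with a small Python sketch. The function index_tokens and its split_on_numerics flag are hypothetical stand-ins for the splitOnNumerics setting, and the sketch models only letter-to-digit and digit-to-letter transitions.

```python
import re

def index_tokens(token, split_on_numerics=False):
    """Return the tokens indexed for a value, with or without alphanumeric splits."""
    tokens = [token]
    if split_on_numerics:
        # Break the value into alternating runs of letters and digits.
        parts = re.findall(r"[A-Za-z]+|[0-9]+", token)
        if len(parts) > 1:
            tokens.extend(parts)
    return tokens

print(index_tokens("ABC123"))
print(index_tokens("ABC123", split_on_numerics=True))
```

With splitting enabled, a search for "ABC" or "123" alone can match the object, at the cost of a larger index.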
To enable alphanumeric splitting, perform the following actions:
1. Stop Windchill.
2. Navigate to the following file:
/solr-home/wblib/conf/conf_generic_field_types.xml
3. Locate all instances of the following: splitOnNumerics="0"
And replace with the following: splitOnNumerics="1"
4. Restart Windchill.
5. Once Windchill is restarted, re-index data using the Bulk Index Tool.
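Step 3 can be scripted if the file contains many instances. The Python sketch below is one possible approach, not a PTC-provided tool: the function names are hypothetical, and the sample filter line is invented for illustration. Back up the file and stop Windchill before running anything like this against a real installation.

```python
from pathlib import Path

OLD = 'splitOnNumerics="0"'
NEW = 'splitOnNumerics="1"'

def enable_numeric_splits(xml_text):
    """Return the updated text and the number of instances that were replaced."""
    return xml_text.replace(OLD, NEW), xml_text.count(OLD)

def patch_file(path):
    """Rewrite the given file in place and return the number of replacements."""
    path = Path(path)
    updated, count = enable_numeric_splits(path.read_text(encoding="utf-8"))
    if count:
        path.write_text(updated, encoding="utf-8")
    return count

# Hypothetical sample line; the real file content will differ.
sample = '<filter class="example.Filter" splitOnNumerics="0"/>'
print(enable_numeric_splits(sample))
```

To apply the change, call patch_file with the path from step 2, then continue with steps 4 and 5.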