Language Settings
Documents to be published via your Arbortext Styler stylesheet may contain language definitions. If you want these language definitions to drive styling or formatting such as hyphenation or translation, you must set up your stylesheet to recognize the definitions. The Language tab of the Stylesheet Properties dialog box contains the options for these settings. You can access the tab from the Stylesheet Properties dialog box, or by choosing the > menu option.
You can configure the following language settings:
• Default document language: confirming the language of a document will ensure that the correct language-specific formatting or information is provided when your document is published. The document language value will be analyzed when Arbortext Styler is processing the following:
◦ Hyphenation
The default document language information provided here will give Arbortext Styler the information it needs to set the required hyphenation language when:
1. The Hyphenation field in the Breaks category is set to a value of (Use document language), and the document language is not specified in the document scope.
2. The Hyphenation field is set to <Derive> and there is no value to inherit.
◦ Generated text translation
Arbortext Styler will use the document language specification methods defined here to assess the document language that is in effect when it encounters a piece of generated text that is set to be translated. It can then output the correct language version of that generated text. This applies when the document language is not specified in the document scope.
• Mappings for language codes that could be encountered in documents
Documents to be formatted by the stylesheet could contain generic or non-standard language codes. Here you can provide more specific information about the actual language Arbortext Styler should use when it encounters such codes. You can also provide alternatives to the code that is encountered, for example swapping US English (en-US) for UK English (en-GB).
For example, a stylesheet may have its hyphenation Language field in the Breaks category set to (Use document language), and a document formatted by the stylesheet has de as the code for its document language. This specifies German, but does not make clear whether Reformed or Traditional German is intended. This distinction must be made to apply the correct German hyphenation rules. A mapping from de to de-1996 (German, Reformed) in the Document Language Mapping field ensures that the reformed rules are used in hyphenation.
Similarly, if a document contains a non-standard language code, a mapping here could provide the additional information to identify the correct language. For example, if the document’s language code is ENG, a mapping here could align it with the en-US (United States) language.
Languages for generated text, and hyphenation set to (Use document language), must be specified via standard language codes. You will need to provide mappings to standard codes if your document uses non-standard ones.
• Source and target languages for translation of generated text
You can use unqualified language codes here even if you have entered a mapping to a more specific code in the Document Language Mapping field. Your generated text will still match if its target includes the more generic code. For example, if you have mapped de to de-1996 to implement the correct hyphenation rules for a German document, translation of generated text will still match translations for de if they exist. Note that you should still set a target for the qualified language code if you have translations with that code.
Where possible, use generic language codes. For example, if your document requires translation into one German language only, use de as the language code. This will work for de-1901, de-1996, and de-DE. You only need to specify the variation of a language if you want to translate to multiple variations of the same base language.
• Value of Lang attribute when generating tagged PDF or HTML outputs.
If the source document specifies a language via the recognized language attribute identified for the stylesheet, this will be used as the value of the Lang attribute in output. If the source document does not specify a language in this way, the value of the stylesheet’s Default language setting will be used.
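The fallback described for the Lang attribute can be sketched in a few lines of Python. This is only an illustration, not Arbortext Styler code: the function name output_lang is hypothetical, the sketch assumes the recognized language attribute is xml:lang unless told otherwise, and it only inspects the root element of the document.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def output_lang(xml_text: str, default_language: str, attribute: str = XML_LANG) -> str:
    """Pick the value to write to the Lang attribute in output.

    Hypothetical helper: 'attribute' stands in for the recognized
    language attribute identified for the stylesheet.
    """
    root = ET.fromstring(xml_text)
    # Assumption: only the root element is checked for the attribute.
    value = root.get(attribute)
    # Fall back to the stylesheet's default language when nothing is set.
    return value if value else default_language
```

For a document whose root element carries xml:lang="de", the sketch returns de; for a document with no language attribute, it returns whatever default is passed in.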
How (Use document language) is Determined
The rules below control how language settings are matched in the context of generated text language and hyphenation when the hyphenation language field is set to (Use document language). It is a two-step process:
1. The document language is determined
2. The document language is matched to a generated text language or hyphenation rule
The following rules define how Arbortext Styler determines the document language:
1. A document language is derived from the document by following the Document language specification parameters defined in the Language tab of the Stylesheet Properties dialog box.
2. If no document language can be extracted from the document, the value of the Default document language field is set as the document language.
3. If the extracted document language is defined with a Document language mapping entry, the mapped Language to use (alias) value becomes the document language. Otherwise the extracted document language is set as the document language.
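The three rules above can be sketched as a small Python function. This is a minimal illustration under stated assumptions, not the product's implementation: the function and parameter names are hypothetical, the mapping lookup is assumed to be case insensitive, and the sketch applies mappings only to a code extracted from the document, because the rules do not say whether the default value is also mapped.

```python
from typing import Dict, Optional


def resolve_document_language(
    extracted: Optional[str],
    default_language: str,
    mappings: Dict[str, str],
) -> str:
    """Return the effective document language.

    extracted: code read from the document (rule 1), or None if none was found.
    default_language: value of the Default document language field.
    mappings: Document language mapping entries, code -> alias.
    """
    # Rule 2: nothing could be extracted, so the default is used.
    if not extracted:
        return default_language
    # Rule 3: a mapping entry replaces the extracted code with its alias.
    # Case-insensitive lookup is an assumption of this sketch.
    lowered = {code.lower(): alias for code, alias in mappings.items()}
    return lowered.get(extracted.lower(), extracted)
```

With a mapping of de to de-1996, a document that declares de resolves to de-1996, while a document with no language resolves to the default passed in.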
The following rules define the precedence used to determine a match between the document language and a target. A target is either a language with an associated hyphenation rule or a generated text language. Note that language matching is case insensitive.
1. If the document language exactly matches the target language, for example if both have a value of en-GB, the target is used.
2. If the first part of the document language (before a hyphen) exactly matches a target language, it is considered a match. The largest portion of the target language to match takes precedence. For example, for a document language of en-US-nor, Arbortext Styler will test en-US-nor, en-US, and en in that order to find a match.
3. If no match is found after carrying out the two sets of tests above, the next steps differ between generated text and hyphenation targets:
◦ For generated text targets, the Source Language defined in the Generated Text field is used. You may see a warning message if this happens.
◦ For hyphenation targets, hyphenation is deactivated. You will not be presented with a warning in this case as this may be intentional. Some languages do not include hyphenation rules and should not be hyphenated.
You may see a warning if the target is invalid.
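The matching precedence described above can be sketched in Python as follows. This is an illustrative sketch only: all function names are hypothetical, targets are modeled as a plain list of language codes, and returning None stands in for "hyphenation is deactivated".

```python
from typing import Iterable, List, Optional


def candidate_codes(document_language: str) -> List[str]:
    """List the codes to test, longest first (rules 1 and 2)."""
    parts = document_language.split("-")
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]


def match_target(document_language: str, targets: Iterable[str]) -> Optional[str]:
    """Return the matching target, or None if no target matches."""
    # Language matching is case insensitive.
    by_lower = {target.lower(): target for target in targets}
    for code in candidate_codes(document_language):
        hit = by_lower.get(code.lower())
        if hit is not None:
            return hit
    return None


def generated_text_language(
    document_language: str, targets: Iterable[str], source_language: str
) -> str:
    """Rule 3 for generated text: fall back to the source language."""
    return match_target(document_language, targets) or source_language


def hyphenation_language(
    document_language: str, targets: Iterable[str]
) -> Optional[str]:
    """Rule 3 for hyphenation: None means hyphenation is deactivated."""
    return match_target(document_language, targets)
```

For a document language of en-US-nor, candidate_codes yields en-US-nor, en-US, and en in that order, so a more specific target always wins over a generic one when both exist.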