About XML Parser Rules with Examples

Parser rules describe the structure of an XML file type. Each parser rule specifies how WorldServer treats matching XML elements: it categorizes the elements (for example, as translatable or not translatable) and assigns a tag type. You set these rules for each XML file type in the Parser Rules dialog box.

Key components of a parser rule

A parser rule has the following key components:

Rule identifier
The specification of the nodes to which the rule applies. This is one of the most important properties of a parser rule. The software uses XPath to specify the applicable nodes, and it uses the same specification to identify the rule, so you cannot have two rules with the same XPath location.
Translate setting
Content that is translatable can be edited. Content that is not translatable might or might not be displayed: that depends on its tag type.

Content is translatable if any of the following is true:

  • The content is not subject to a parser rule.
  • The content is set in the parser rule to translatable.
  • The content inherits the setting translatable from its ancestor.
Tag type setting
A parser rule can force an element to be presented to the translator as a tag. A tag can also have translatable text. A tag can be specified as any of the following:
Inline with the text
The content is displayed to the translator.
Structure tag
The element creates a new paragraph unit in the target file.
Not specified
The tag is marked as a structure tag.
Note: You cannot set both the Translate setting and the tag type to Not specified.
Whitespace
For XML file types, you can choose whether the file filter software preserves whitespace or normalizes it. Preservation means that whitespace is not modified. Normalization means replacing whitespace with a single space. Normalization is done when the file is converted to SDLXLIFF. You can set the whitespace setting for a file type as a whole in the Whitespace page. You can also set the whitespace setting in a parser rule. Where a parser rule applies, it overrides the setting from the Whitespace page.
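The difference between the two whitespace settings, and the way a parser rule overrides the file-type setting, can be sketched in a few lines of Python. This is an illustration only, not the file filter's actual code: the function names are hypothetical, and it assumes that normalization collapses each run of spaces, tabs and line breaks into one space without trimming the ends.

```python
import re
from typing import Optional


def normalize_whitespace(text: str) -> str:
    """Collapse each run of whitespace into one space (assumed behaviour)."""
    return re.sub(r"\s+", " ", text)


def effective_preserve(page_preserve: bool, rule_preserve: Optional[bool]) -> bool:
    """A parser rule setting, when present, overrides the Whitespace page setting."""
    return page_preserve if rule_preserve is None else rule_preserve


def apply_whitespace(text: str, preserve: bool) -> str:
    """Preservation leaves the text untouched; otherwise it is normalized."""
    return text if preserve else normalize_whitespace(text)


sample = "First line\n    second\tline"

# File type says "preserve", but a parser rule for this element says "normalize".
preserve = effective_preserve(page_preserve=True, rule_preserve=False)
print(repr(apply_whitespace(sample, preserve)))  # 'First line second line'
```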
Formatting and Structure information
The formatting and structure information specifies how tags and tagged content are presented.
  • Formatting applies to inline tags and specifies how the contents of an inline tag appear to the translator. For example, you might want text within an inline tag pair <b> ... </b> to be displayed as bold text.
  • Structure info applies to structure tags and determines how text is displayed on the translator's screen. Structure info is called Context on the Parser rule dialog box.

Example: Select an element for translation based on an attribute value

The following parser rule selects all elements that have the attribute-value pair translate="yes", and ensures that these elements are presented for translation:

Rule identifier
The XPath identifier is:
//*[@translate="yes"]

This specifies that the rule applies to all elements that have the attribute translate with the value yes.

Translate setting
Always translatable.

Add the translate setting in the Properties group, Translate box.
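To see which elements such a rule identifier would match, you can try the expression outside the product. The sketch below uses Python's standard xml.etree.ElementTree module on a made-up sample document. It only illustrates the XPath selection, not how WorldServer applies the rule. ElementTree expects a relative path, so //* is written as .//* here, which matches every element below the root.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
SAMPLE = """<doc>
  <title translate="yes">Welcome</title>
  <code translate="no">init()</code>
  <section>
    <para translate="yes">Read this first.</para>
    <para>No attribute here.</para>
  </section>
</doc>"""

root = ET.fromstring(SAMPLE)

# Equivalent of //*[@translate="yes"] for elements below the root.
matches = root.findall('.//*[@translate="yes"]')
selected = [(el.tag, el.text) for el in matches]

print(selected)  # [('title', 'Welcome'), ('para', 'Read this first.')]
```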

Example: XPath and equivalent element specification

The SDL file support software uses XPath to specify elements, and you can use XPath too. However, you can also specify the element more simply in the Add/Edit Parser Rule dialog box.

XPath syntax
//text
Equivalent element syntax
In the Rule group box, Element box, enter the element name as: text

Example: XPath and equivalent for an element attribute pair

XPath syntax
//diagram/@address
Equivalent element syntax
  • In the Rule group box, Element box, enter the element name as: diagram
  • In the Rule group box, Attribute box, enter the attribute name as: address
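The two XPath forms above, //text and //diagram/@address, can be tried against a small sample with Python's standard xml.etree.ElementTree module. This is a sketch for experimenting with the expressions, and the sample document is invented. ElementTree supports only a subset of XPath and cannot return attribute nodes, so the second expression is approximated by selecting the diagram elements that carry an address attribute and then reading the attribute value.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
SAMPLE = """<page>
  <text>Hello</text>
  <diagram address="img/a.png"><text>Caption</text></diagram>
  <diagram/>
</page>"""

root = ET.fromstring(SAMPLE)

# //text : every <text> element, at any depth.
text_elements = [el.text for el in root.findall(".//text")]

# //diagram/@address : approximated as "diagram elements that have an
# address attribute", followed by a lookup of that attribute.
addresses = [el.get("address") for el in root.findall(".//diagram[@address]")]

print(text_elements)  # ['Hello', 'Caption']
print(addresses)      # ['img/a.png']
```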

Example: Use XPath to specify an attribute value pair

The XML example includes elements that have an attribute translate, which can take the values yes and no. The intention is clear: all content within the scope of an attribute value translate="yes" should be translated.
The XPath expression to specify this scope is:
//*[@translate="yes"]
We can break down this expression as follows:
//*
Specifies all elements, at any depth.
[...]
Square brackets delimit selection criteria. To specify elements by the values of an attribute (for example with =, > or <), surround the expression with square brackets.
@translate
Specifies the translate attribute.
="yes"
Specifies elements for which the attribute has the value "yes".
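The breakdown can be followed step by step by evaluating each part of the expression in turn and watching the selection narrow. The sketch below does this with Python's standard xml.etree.ElementTree module on an invented three-element sample. Note that ElementTree implements only part of XPath: equality predicates work, but comparisons with > or < are not available there.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
SAMPLE = """<doc>
  <a translate="yes">one</a>
  <b translate="no">two</b>
  <c>three</c>
</doc>"""

root = ET.fromstring(SAMPLE)

# //*  : all elements below the root, at any depth.
all_elements = root.findall(".//*")

# [@translate]  : keep only elements that have the attribute at all.
with_attribute = root.findall(".//*[@translate]")

# [@translate="yes"]  : keep only elements whose attribute value is "yes".
with_yes = root.findall('.//*[@translate="yes"]')

for label, found in [("//*", all_elements),
                     ("//*[@translate]", with_attribute),
                     ('//*[@translate="yes"]', with_yes)]:
    print(label, [el.tag for el in found])
```

Each step selects a subset of the previous one: three elements, then two, then one.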