Use the parser rule settings available on the Add Rule, Edit Rule, and Copy Rule pages to define the properties of the HTML parser rules. These settings help WorldServer better separate translatable from non-translatable text and correctly display the content extracted from HTML documents.
Rule section | Description |
---|---|
Name | The name of the element for which you are modifying the parser rule. For example, the rule that affects the `<a>` elements of HTML documents is named `a`. Note: Parser rules are case-insensitive, so WorldServer treats 'TITLE' and 'title' as the same name. |
Conditions | The conditions that define the extraction settings. Specify the conditions under which WorldServer should extract the content inside the selected element. For example, you might modify the `a` rule so that WorldServer extracts the content of an `a` element only if that element is placed inside a text paragraph written in English. To do this, create a condition that checks the language of the paragraphs and the location of the `a` element inside the structure of the HTML documents. |
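
The Conditions example above can be sketched outside WorldServer. The following is a minimal illustration in Python, not WorldServer's implementation: it uses the standard-library `html.parser` module, the names `ConditionalLinkExtractor` and `extract_links` are hypothetical, and it assumes that the paragraph language is declared with a `lang` attribute on the `<p>` element.

```python
from html.parser import HTMLParser


class ConditionalLinkExtractor(HTMLParser):
    """Collects the text of rule elements that sit inside an English paragraph."""

    def __init__(self, rule_name="a"):
        super().__init__()
        # Rule names are compared case-insensitively, so "A" and "a" match.
        self.rule_name = rule_name.lower()
        self.paragraph_langs = []  # language of each open <p>, innermost last
        self.capturing = False
        self.extracted = []

    def _in_english_paragraph(self):
        if not self.paragraph_langs:
            return False
        lang = self.paragraph_langs[-1]
        return lang == "en" or lang.startswith("en-")

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names already lowercased.
        if tag == "p":
            self.paragraph_langs.append((dict(attrs).get("lang") or "").lower())
        elif tag == self.rule_name and self._in_english_paragraph():
            self.capturing = True
            self.extracted.append("")

    def handle_endtag(self, tag):
        if tag == "p" and self.paragraph_langs:
            self.paragraph_langs.pop()
        elif tag == self.rule_name:
            self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.extracted[-1] += data


def extract_links(html, rule_name="a"):
    """Returns the text extracted under the (hypothetical) condition."""
    parser = ConditionalLinkExtractor(rule_name)
    parser.feed(html)
    parser.close()
    return parser.extracted
```

Passing `"A"` or `"a"` as `rule_name` selects the same elements, which mirrors the case-insensitivity note in the Name row.
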
Option | Description |
---|---|
Attributes | The localization setting that determines whether the attributes of an element become editable after extraction. Specify which attributes of the selected HTML element should be extracted as editable text in WorldServer and which should be extracted as non-editable text. For example, if you do not want to translate the ToolTip of a hyperlink, change the Translate property of the `title` attribute inside the `a` rule. The next time that you open an HTML document, WorldServer extracts the ToolTips from hyperlinks but does not allow you to edit them. Note: Element attributes for which you do not specify a localization setting appear as non-translatable in the Editor. |
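
The attribute behavior described above, including the default for attributes without a setting, can be pictured with a short sketch. This is an illustration only: `classify_attributes` and the settings dictionary are hypothetical names, and the boolean flag is an assumed stand-in for the Translate property of each attribute.

```python
def classify_attributes(attributes, translate_settings):
    """Maps each attribute name to True (editable) or False (non-editable).

    `translate_settings` is a hypothetical stand-in for the per-attribute
    Translate property of a rule. Attributes without an entry default to
    non-editable, matching the note in the Attributes row.
    """
    settings = {name.lower(): flag for name, flag in translate_settings.items()}
    return {name: settings.get(name.lower(), False) for name in attributes}


# Assumed settings for the `a` rule: the ToolTip (`title`) is not translated.
a_rule_settings = {"title": False, "aria-label": True}

link_attributes = {
    "href": "guide.html",
    "title": "Open the guide",
    "aria-label": "Guide",
}
editable = classify_attributes(link_attributes, a_rule_settings)
```

Here `href` has no setting, so it is treated as non-editable, the same way as `title`.
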
Option | Description |
---|---|
Translate | The localization setting that determines whether the content of the selected element becomes editable after extraction. Specify whether WorldServer should allow you to translate the content extracted from the selected element in the Editor. You can set the Translate property to one of the following options: |
Whitespace | The setting that defines how WorldServer handles any extra whitespace characters it finds in the translatable content extracted from the selected HTML element. Specify whether you want WorldServer to keep or remove extra whitespace. To edit the whitespace settings for non-translatable content and element attributes, use the Whitespace in tags option on the global Whitespace page. Set the Whitespace property to one of the following: |
Tag Type | The settings that control how the HTML elements are displayed in the Editor. HTML elements are extracted and shown in the Editor as tags. The translatable content inside the elements is displayed as editable text. Tags can be displayed as inline tags or structure tags. Note: For more information on inline tags and structure tags, see the WorldServer–Studio integration documentation. |
Segmentation Hint (applicable to inline tags) | Segmentation hints help WorldServer segment the HTML document more accurately when converting it to a translatable format. A segmentation hint determines whether WorldServer positions the element within a segment, positions it outside the segment, or forces a segmentation break. Set the Segmentation Hint to one of the following: |
Formatting | The settings that define how the content extracted by the parser looks in the Editor. Click Edit and select one of the following options for each of the available styles: The Sample box shows a preview of how the text extracted by the rule looks in the WorldServer Editor. |
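
As a rough illustration of the keep-or-remove choice in the Whitespace row, the sketch below applies one of two modes to extracted text. The mode names `"preserve"` and `"normalize"` are hypothetical and are not the option names shown in WorldServer. Treating "remove extra whitespace" as collapsing runs of whitespace into a single space and trimming the ends is also an assumption.

```python
import re


def apply_whitespace_setting(text, setting):
    """Applies a hypothetical whitespace mode to extracted translatable text."""
    if setting == "preserve":
        # Keep every whitespace character exactly as it was extracted.
        return text
    if setting == "normalize":
        # Collapse runs of spaces, tabs, and line breaks, then trim the ends.
        return re.sub(r"\s+", " ", text).strip()
    raise ValueError(f"unknown whitespace setting: {setting!r}")


raw = "  Click   the\n link "
kept = apply_whitespace_setting(raw, "preserve")
cleaned = apply_whitespace_setting(raw, "normalize")
```
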
Type of element | Setting | Description |
---|---|---|
Standard | | Offers a list of standard HTML structure elements with predefined context information. Choose Custom if you want to create your own element and customize its context information. |
Custom | | For custom elements, you can specify the following properties: |
Custom | Purpose | |
Custom | Document Explorer | Select what information is displayed in the Document Structure tree in the Editor view. You can choose to display only the name of the element, the entire content of the element, or no information at all. |
Custom | Name | Specify a name for the element. By default, WorldServer also uses this name for the Code and Identifier fields, but you can edit them if you want to use different names. |
Custom | Description | Specify a description for your element. The description is displayed in the Additional Information column of the Document Structure Information dialog box. |
Custom | Color | Specify the background color for displaying the element in the Document Structure column and in the Document Structure Information dialog box. |
Custom | Formatting | Specify the font, size, color, and style for displaying the content of the element in the Editor view. You can choose to inherit the formatting from the element's parent or to activate or deactivate a certain style. |
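
The default described for the Name setting, where the Code and Identifier fields follow the name unless you edit them, can be modeled as follows. This is only a sketch: the class `CustomStructureElement` and its field names are hypothetical and do not correspond to a WorldServer API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomStructureElement:
    """Hypothetical model of the context information of a custom element."""

    name: str
    description: str = ""
    code: Optional[str] = None
    identifier: Optional[str] = None

    def __post_init__(self):
        # Code and Identifier fall back to the name unless set explicitly.
        if self.code is None:
            self.code = self.name
        if self.identifier is None:
            self.identifier = self.name


default_element = CustomStructureElement("Sidebar")
edited_element = CustomStructureElement("Sidebar", code="SB")
```
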