The embedded content of a document is content that is not processed by the normal file filter rules. For example, an XML file might have HTML content embedded inside a CDATA section. You can set options to convert embedded content into tags in the output document, and can also specify these tags as being translatable or non-translatable.
Embedded content page
The options to process embedded content apply to the following file types:
- Excel (all versions)
- Java resources
- Generic XML (Any XML)
- New XML file types
- Enable embedded content processing
- Select to enable processing of embedded content.
Document structure information
This defines which content is embedded content.
Tag definition rules
These specify how to treat the embedded content defined in the Document structure information box.
- Start Tag Expression (Placeholder)
- This is a regular expression that identifies embedded content and converts each occurrence to a placeholder tag. For example, to convert all HTML <br> (line break) tags to placeholder tags, enter: <br.*?>
- Start Tag Expression and End Tag Expression (Tag Pair)
- These are regular expressions that identify embedded content by start and end tags. The start and end tags may enclose some content or none.
The processor tries to match the tag pair before it tries to match each tag expression on its own. That is, it looks for any section of text that starts with the Start Tag expression and ends with the End Tag expression before it tries to match individual start and end tags.
For example, to identify all HTML <tr>...</tr> (table row) tag pairs, enter:
- Start Tag: <tr.*?>
- End Tag: </tr>
- Tag Type
- Choose between the following:
- Placeholder—Converts embedded content to standalone (placeholder) tags.
- Tag Pair—Identifies tag pairs (a start tag and an end tag) in the embedded content.
- Translate
- Text within tag pairs can be translatable or non-translatable. Not translatable means that the content between the tag pairs is displayed to the translator as locked content. Placeholder tags are Not translatable.
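The matching behavior described above can be tried out with a short Python sketch. It reuses the example expressions <br.*?>, <tr.*?> and </tr> from this page; the helper name find_embedded_tags and the sample string are hypothetical, and Python's re module is only a stand-in, since the regular expression engine used by the product may behave differently. The two passes are a simplification of "tag pairs first, then individual tags".

```python
import re

# Expressions copied from the examples above. Python's re module is used
# here only for illustration; the product's regex engine may differ.
PLACEHOLDER = r"<br.*?>"
START_TAG = r"<tr.*?>"
END_TAG = r"</tr>"


def find_embedded_tags(text):
    """Hypothetical helper: collect tag pairs first, then standalone tags."""
    # Pass 1: text that starts with the start tag and ends with the end tag.
    pair_re = re.compile(f"({START_TAG})(.*?)({END_TAG})", re.DOTALL)
    pairs = [m.groups() for m in pair_re.finditer(text)]
    # Pass 2: standalone (placeholder) tags anywhere in the text.
    placeholders = re.findall(PLACEHOLDER, text)
    return {"pairs": pairs, "placeholders": placeholders}


sample = 'Intro<br/><tr class="a">cell<br>more</tr>'
result = find_embedded_tags(sample)
print(result["pairs"])         # (start tag, enclosed content, end tag) tuples
print(result["placeholders"])  # every match of the placeholder expression
```

Because .*? is non-greedy, each expression stops at the first closing > it finds, so attributes such as class="a" are included in the start tag without swallowing the content that follows.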
Advanced Settings
The Advanced Settings specify how tags are displayed.
- Inside text the tag acts as a word end
- This option changes the behavior of cursor placement in the Editor window. When selected, the editor treats the tag as a word for the purposes of navigation. For example, in the editor, pressing Ctrl+Left Arrow moves the cursor to the beginning of the tag and Ctrl+Right Arrow moves the cursor to the end of the tag.
- Text lines can be wrapped after the tag
- Select this option if a line break after this tag does not mark the end of a segment. For example:
Gather ye rosebuds while ye may,<br>
Old Time is still a-flying: <br>
And this same flower that smiles to-day <br>
To-morrow will be dying.
- Tags represent formatting only and can be hidden in the editor
- When this option is selected, text is formatted correctly and the standard formatting tags (for example, bold, italic, and font type) are not displayed.
Selecting this option does not mean that the tag is always hidden; the user can change the editor settings to force the tag to be displayed.
- Tags represent the text
- Placeholder (standalone) tags only.
A tag can have a text equivalent. For example, the entity tag &quot; has the text equivalent ".
- Segmentation Hint
- A segmentation hint is a property of a tag that helps the software to segment the file better when converting the file to a translatable format: whether to position the tag within a segment or outside of the segment, or to force a segmentation break. Choose one of the following options:
- Include—If selected, the tag is displayed in the editor, even if it has no associated text. You would rarely select this option.
- Include with text—If selected, when the tag has associated text, the tag is displayed in the editor. Example: the tag specifies a footnote marker. Where this is the case, the translator needs the ability to move the marker to another word in the same sentence, so the tag should be included with the text.
- Exclude—If selected, the software will, where possible, use the tag or tag pair to segment the text. For example, if
<p>...</p> or <br> tags are marked Exclude and an XML document includes embedded HTML code, the software uses the HTML tags <p>...</p> and <br> to segment the document. This segmentation is in addition to the segmentation that is already applied to the embedding XML code.
- May exclude, Undefined—These two are effectively the same. The editor determines whether the tag is part of the text.
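As a rough illustration of the Exclude hint, the following Python sketch splits embedded HTML into segments at <p> and <br> tags. It is only a model of the idea, not the software's segmentation logic: the function name segment_on_excluded_tags, the EXCLUDED_TAGS pattern and the sample text are all hypothetical.

```python
import re

# Assumption: <p>...</p> and <br> are the tags marked Exclude.
# \b keeps the pattern from also matching tags such as <pre>.
EXCLUDED_TAGS = re.compile(r"</?p\b.*?>|<br\b.*?>")


def segment_on_excluded_tags(embedded):
    """Hypothetical helper: split text wherever an excluded tag occurs."""
    parts = EXCLUDED_TAGS.split(embedded)
    # Drop the empty strings left between adjacent tags.
    return [part.strip() for part in parts if part.strip()]


html = "<p>First sentence.</p><p>Second sentence.<br>Third sentence.</p>"
segments = segment_on_excluded_tags(html)
print(segments)
```

In this model the excluded tags fall outside every segment, which is the opposite of Include with text, where the tag stays inside the segment so that the translator can move it.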