Embedded Content

The embedded content of a document is content that is not processed by the normal file filter rules. For example, an XML file might have HTML content embedded inside a CDATA section. You can set options to convert embedded content into tags in the output document, and you can specify whether these tags are translatable or non-translatable.

Embedded content page

The options to process embedded content apply to the following file types:

Enable embedded content processing
Select to enable processing of embedded content.

Document structure information

This defines which content is embedded content.

Tag definition rules

These specify how to treat the embedded content defined in the Document structure information box.
Start Tag Expression (Placeholder)
This is a regular expression that identifies embedded content, and converts each occurrence to a placeholder tag. For example, to convert all HTML <br> (line break) tags to placeholder tags, enter <br.*?>
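As a rough illustration of what this expression matches, the sketch below applies the pattern from the text with Python's re module. The source does not say which regular expression engine the product uses, so Python is only a stand-in here, and the function name to_placeholders and the token format are hypothetical.

```python
import re

# Pattern taken from the text above; the engine (Python's re) is an assumption.
PLACEHOLDER = re.compile(r"<br.*?>")

def to_placeholders(text):
    """Split text into ('text', ...) runs and ('placeholder', ...) tags."""
    tokens = []
    pos = 0
    for match in PLACEHOLDER.finditer(text):
        if match.start() > pos:
            # Plain text between the previous tag and this one.
            tokens.append(("text", text[pos:match.start()]))
        tokens.append(("placeholder", match.group(0)))
        pos = match.end()
    if pos < len(text):
        # Trailing text after the last tag.
        tokens.append(("text", text[pos:]))
    return tokens

sample = 'First line<br>Second line<br class="x"/>Third line'
tokens = to_placeholders(sample)
```

Because .*? accepts any characters up to the next >, the same pattern also matches variants such as <br/> and <br class="x"/>. It would equally match any other tag whose name begins with br, so a stricter expression may be preferable in some files.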
Start Tag Expression and End Tag Expression (Tag Pair)
These are regular expressions that identify embedded content by start and end tags. The start and end tags may enclose some content or none.

The processor tries to match the tag pair before it tries to match each tag expression individually. That is, it first looks for any section of text that starts with the Start Tag expression and ends with the End Tag expression, and only then tries to match individual start and end tags.

For example, to identify all HTML <tr>...</tr> (table row) tag pairs, enter:
  • Start Tag: <tr.*?>
  • End Tag: </tr>
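The matching order described above (whole tag pair first, then individual start and end tags) can be sketched as follows. This is an illustration only, not the product's implementation: it assumes Python's re module, assumes that content between the tags may span line breaks, and uses a hypothetical classify function.

```python
import re

# Expressions from the example above.
START = r"<tr.*?>"
END = r"</tr>"

# Alternatives are tried left to right, so a complete pair is preferred
# over a lone start or end tag. re.DOTALL (content may span lines) is an
# assumption made for this sketch.
TOKEN = re.compile(
    f"(?P<pair>{START}.*?{END})|(?P<start>{START})|(?P<end>{END})",
    re.DOTALL,
)

def classify(text):
    """Return (kind, matched_text) for each pair or unpaired tag found."""
    return [(m.lastgroup, m.group(0)) for m in TOKEN.finditer(text)]

sample = "<tr><td>a</td></tr> stray </tr> and <tr> alone"
result = classify(sample)
```

In this sample the first match is the whole <tr>...</tr> pair. The stray </tr> and the final <tr> have no partner, so they fall through to the individual end and start expressions. An empty pair such as <tr></tr> also matches as a pair, since the tags may enclose no content.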
Tag Type
Choose between the following:
  • Placeholder—Converts embedded content to standalone (placeholder) tags.
  • Tag Pair—Identifies tag pairs (a start tag and an end tag) in the embedded content.
Translate

Text within tag pairs can be Translatable or Not translatable. Not translatable means that the content between the start and end tags is displayed to the translator as locked content.

Placeholder tags are Not translatable.

Advanced Settings

The Advanced Settings specify how tags are displayed.

Inside text the tag acts as a word end
This option changes the behavior of cursor placement in the Editor window.

When selected, the editor treats the tag as a word for the purposes of navigation. For example, in the editor, pressing Ctrl+Left Arrow will move the cursor to the beginning of the tag and Ctrl+Right Arrow will move the cursor to the end of the tag.

Text lines can be wrapped after the tag
Selecting this option indicates that a line break after this tag does not indicate the end of a segment. For example:

Gather ye rosebuds while ye may,<br>
Old Time is still a-flying:<br>
And this same flower that smiles to-day<br>
To-morrow will be dying.

Tags represent formatting only and can be hidden in the editor
When this option is selected, text is formatted correctly and the standard formatting tags (for example, bold, italic, and font type) are not displayed.

Selecting this option does not mean that the tag is always hidden; the user can change the editor settings to force the tag to be displayed.

Tags represent the text
This option applies to placeholder (standalone) tags only.

A tag can have a text equivalent. For example, the entity tag &quot; has the text equivalent ".
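To make the idea of a text equivalent concrete, the sketch below maps a few entity tags to the characters they stand for. Only &quot; comes from the text above. The other entries are standard HTML entities added for illustration, and the names TEXT_EQUIVALENTS and text_view are hypothetical.

```python
import re

# Entity tag -> text equivalent. Only &quot; is taken from the text above.
TEXT_EQUIVALENTS = {
    "&quot;": '"',
    "&amp;": "&",
    "&lt;": "<",
    "&gt;": ">",
}

# One expression that matches any of the known entity tags literally.
ENTITY = re.compile("|".join(re.escape(e) for e in TEXT_EQUIVALENTS))

def text_view(source):
    """Show each known entity tag as its text equivalent."""
    return ENTITY.sub(lambda m: TEXT_EQUIVALENTS[m.group(0)], source)

shown = text_view("She said &quot;yes&quot; &amp; left")
```

The substitution runs in a single pass, so a replacement character such as & is never re-read as the start of another entity.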

Segmentation Hint
A segmentation hint is a property of a tag that helps the software segment the file when converting it to a translatable format. The hint determines whether the tag is positioned within a segment, positioned outside the segment, or forces a segmentation break. Choose one of the following options:
  • Include—If selected, the tag is displayed in the editor, even if it has no associated text. You would rarely select this option.
  • Include with text—If selected, when the tag has associated text, the tag is displayed in the editor. Example: the tag specifies a footnote marker. Where this is the case, the translator needs the ability to move the marker to another word in the same sentence, so the tag should be included with the text.
  • Exclude—If selected, the software will, where possible, use the tag or tag pair to segment the text. For example, if <p>...</p> or <br> tags are marked Exclude, then if an XML document includes embedded HTML code, the software will use the HTML tags <p>...</p> and <br> to segment the document. This segmentation is in addition to the segmentation that is already applied to the embedding XML code.
  • May exclude, Undefined—These two are effectively the same. The editor determines whether the tag is part of the text.
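The effect of the Exclude hint can be approximated as splitting the embedded HTML wherever an excluded tag occurs. The sketch below is a simplification under stated assumptions: the expressions for <p>, </p> and <br> are written for this example rather than taken from the product, the segment function is hypothetical, and real segmentation also applies the rules of the embedding XML.

```python
import re

# Tags treated as Exclude in this sketch: <p ...>, </p> and <br ...>.
# These expressions are illustrative assumptions, not product defaults.
EXCLUDE = re.compile(r"</?p\b[^>]*>|<br\b[^>]*>")

def segment(embedded_html):
    """Split embedded HTML at excluded tags and drop empty pieces."""
    pieces = EXCLUDE.split(embedded_html)
    return [piece.strip() for piece in pieces if piece.strip()]

sample = "<p>First paragraph.</p><p>Line one<br>Line two</p>"
segments = segment(sample)
```

Here the excluded tags end up outside every segment, which yields three segments: the first paragraph and the two lines on either side of the <br>. A tag marked Include with text would instead stay inside the segment alongside its text.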