Determining file types for segmentation

File types process the data in project assets to expose the translatable content and to hide the non-translatable content. Translatable content is presented in segments; the action that file types perform on assets is called segmentation. File type configurations allow you to customize how file types process the information.

Most of the time, the Segment Asset automatic action segments assets in a project workflow. Apart from this automatic action, assets are also segmented when you perform certain ad hoc operations. Whether WorldServer performs segmentation through the automatic action or through one of these ad hoc operations, the process is the same and is described in the sections that follow.

Associating file types with MIME types

The segmentation process is performed by a file type that is appropriate to the format of the asset. For example, if the asset has the .xml extension, WorldServer uses the file type associated with the text/xml MIME type in the MIME type table (see Management > Administration > Customization > Custom component type: MIME Types). The default MIME type for the .xml file extension is text/xml, and files of that type use the Any XML File Type file type by default.

If the MIME type does not have an associated file type, the asset is considered not segmentable (that is, not translatable).
Note: You can modify MIME types by clicking the corresponding links in the MIME Type Name column. In the Edit Custom MIME Types Component window, you can associate an available file type with a MIME type or give the MIME type a different name. Make sure you use names and associations that will make sense to everyone who uses your WorldServer system.
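
The lookup described above can be modeled in a few lines of Python. This is an illustrative sketch only, not WorldServer code: the dictionaries stand in for the MIME type table, the .bin entry is an invented example, and the names MIME_BY_EXTENSION, FILE_TYPE_BY_MIME, file_type_for_asset, and is_segmentable are hypothetical.

```python
import os
from typing import Optional

# Hypothetical, simplified stand-ins for the MIME type table.
# Only the .xml -> text/xml -> "Any XML File Type" row comes from the text above.
MIME_BY_EXTENSION = {
    ".xml": "text/xml",
    ".bin": "application/octet-stream",  # assumed example entry
}

FILE_TYPE_BY_MIME = {
    "text/xml": "Any XML File Type",
    "application/octet-stream": None,  # assumed: MIME type without a file type
}


def file_type_for_asset(asset_name: str) -> Optional[str]:
    """Return the file type that would segment the asset, or None."""
    extension = os.path.splitext(asset_name)[1].lower()
    mime_type = MIME_BY_EXTENSION.get(extension)
    if mime_type is None:
        # Assumption: an extension with no MIME type is treated like a
        # MIME type with no file type, that is, as not segmentable.
        return None
    return FILE_TYPE_BY_MIME.get(mime_type)


def is_segmentable(asset_name: str) -> bool:
    """An asset is segmentable only if its MIME type has a file type."""
    return file_type_for_asset(asset_name) is not None
```

In this model, an .xml asset resolves to Any XML File Type, while an asset whose MIME type has no associated file type is reported as not segmentable.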

Choosing file type configurations

If the default configuration is the only configuration available for a certain file type, that default configuration is used. If there are multiple configurations, WorldServer checks the project type or the AIS property to determine which configuration to use, depending on how the project was created:
  • Upload Files and Create Projects on the Legacy Home page

    The file type configuration is the one specified in the project type of the project.

  • Create New Project on the Assignments > Project page

    As in the first case, the file type configuration is the one specified in the project type of the project.

  • Project > Create Project or ad hoc in the WorldServer Explorer

    You can assign a specific file type configuration for a target asset or folder in the Change Properties window. Go to Management > Asset Interface System > View and Change Properties. (You can also get to this window by going to Explorer > Asset > Properties.)

You must create at least one file type configuration (in Linguistic Tool Setup > File Types > <File Type>: Add) for the File Type list to be displayed in the Change Properties window.
Note: If you have selected an asset, in the Change Properties window, you can see a list of configurations available for the file format corresponding to that asset. If you have selected a folder instead of an asset, you can see a list of all the available file type configurations, because WorldServer cannot determine the file formats included in the folder.

When WorldServer consults the File Type property for a project, MIME type, or AIS folder, it checks whether the assigned configuration has the correct file type association. If so, that file type is used. If the file formats do not match, the default configuration of the file type is used. See the "File type groups" topic for information about applying multiple file type configurations to a folder in AIS.
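
The selection rules in this section can be summarized as a short Python sketch. It is a model of the described behavior under stated assumptions, not the WorldServer implementation: FileTypeConfig and choose_configuration are hypothetical names, and the "assigned" argument stands for whatever configuration the project type or AIS property supplies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FileTypeConfig:
    """Hypothetical model of a file type configuration."""
    name: str
    file_type: str
    is_default: bool = False


def choose_configuration(
    asset_file_type: str,
    available: List[FileTypeConfig],
    assigned: Optional[FileTypeConfig] = None,
) -> FileTypeConfig:
    """Pick the configuration used to segment an asset of the given file type."""
    candidates = [c for c in available if c.file_type == asset_file_type]
    default = next((c for c in candidates if c.is_default), None)
    if default is None:
        raise ValueError(f"no default configuration for {asset_file_type!r}")

    # Only the default exists: nothing to choose.
    if len(candidates) == 1:
        return default

    # The assigned configuration is used only if it belongs to the
    # asset's file type; otherwise fall back to the default.
    if assigned is not None and assigned.file_type == asset_file_type:
        return assigned
    return default
```

For example, if a folder is assigned a configuration that belongs to a different file type than the asset being segmented, this sketch returns the default configuration for the asset's own file type, which mirrors the fallback described above.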

Asset re-segmentation

WorldServer re-segments assets when it detects that the file type configuration has changed. However, in some cases WorldServer does not detect the change, and you might have to modify the file in order to change its timestamp. A file type configuration can depend on many factors. For example, changing sentence-breaking rules affects segmentation. Unless you are sure that an asset has been re-segmented after configuration changes, you should force re-segmentation by modifying the source file to change the timestamp.
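
If the source files live on a file system that you can reach directly, the timestamp can be updated with a small script. The following Python sketch is one possible approach and rests on two assumptions: that the asset is backed by an ordinary file, and that a changed modification time is enough for the change to be noticed. If it is not, edit and save the file instead. The function name touch_source_file is hypothetical.

```python
import os
import time
from pathlib import Path
from typing import Optional, Union


def touch_source_file(path: Union[str, Path], when: Optional[float] = None) -> float:
    """Set the file's access and modification times and return the new mtime.

    `when` is a POSIX timestamp in seconds; it defaults to the current time.
    The file content is left unchanged.
    """
    target = Path(path)
    if not target.is_file():
        raise FileNotFoundError(f"not a regular file: {target}")
    stamp = time.time() if when is None else when
    os.utime(target, (stamp, stamp))
    return target.stat().st_mtime
```

Calling touch_source_file on a source file updates only its timestamps, so the content that is segmented afterwards is identical to the content before the call.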