To determine whether a file should be processed with the HTML file type, WorldServer checks the content of the file against the conditions that you define on the Detection page. Specify a DOCTYPE declaration, a root element, or a namespace declaration. This helps WorldServer identify a file as an HTML document and apply the correct extraction settings.
Option | Description |
---|---|
DOCTYPE Declaration | To specify a DOCTYPE declaration, type the element name and click OK on the Detection page. For example: `<!DOCTYPE html>` |
Root Elements | To specify a root element, click Add... and type the element name without the brackets. For example, type html to match a document such as `<html> <title>Hello world!</title> </html>`. If WorldServer finds an `<html>` element in the selected file, it processes the file as an HTML document and applies the HTML extraction settings. |
Namespace Declarations | To specify a namespace declaration, click Add... and type the URI (Uniform Resource Identifier) that identifies the namespace. For example: `<html xmlns:xhtml="http://www.w3.org/xhtml/">` |
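
The three detection options in the table above can be pictured as simple content checks. The following Python sketch is illustrative only and is not WorldServer's implementation: the function `looks_like_html` and the rule lists `DOCTYPES`, `ROOT_ELEMENTS`, and `NAMESPACE_URIS` are hypothetical names, and the sketch assumes that a file is accepted as soon as any one rule matches.

```python
import re

# Hypothetical detection rules, one list per option in the table above.
DOCTYPES = ["html"]
ROOT_ELEMENTS = ["html"]
NAMESPACE_URIS = ["http://www.w3.org/xhtml/"]


def looks_like_html(content: str) -> bool:
    """Return True if the content matches any of the configured rules.

    Assumption: rules are combined with OR; the real product may differ.
    """
    # Rule 1: a DOCTYPE declaration whose name is in the configured list.
    doctype = re.search(r"<!DOCTYPE\s+([^\s>]+)", content, re.IGNORECASE)
    if doctype and doctype.group(1).lower() in DOCTYPES:
        return True

    # Rule 2: the first tag that starts with a letter is treated as the root
    # element. Simplification: tags inside comments are not skipped.
    root = re.search(r"<([A-Za-z][\w:.-]*)", content)
    if root and root.group(1).lower() in ROOT_ELEMENTS:
        return True

    # Rule 3: any xmlns or xmlns:prefix attribute with a configured URI.
    uris = re.findall(r"""xmlns(?::[\w.-]+)?\s*=\s*["']([^"']*)["']""", content)
    return any(uri in NAMESPACE_URIS for uri in uris)
```

For instance, under these assumptions `looks_like_html("<!DOCTYPE html>\n<p>hi</p>")` matches on the DOCTYPE rule, while a file whose root element is `<note>` and that declares no listed namespace matches no rule and would be left to another file type.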