Configuration

Standard Document Intake Workflow

Vertesia Semantic DocPrep can be set as the default text rendition for PDF documents. When this feature is active, PDF documents are first converted to structured Markdown (or XML, if specified), and then the document type and its properties are determined.

Go to Workflow > Rules and open the Standard Document Intake rule.

Next, open the configuration tab and add the following:

{
  "useSemanticLayer": true,
  "output_format": "markdown"
}

Document Intake Rule

Configuration Options

| Option | Type | Description |
| --- | --- | --- |
| useSemanticLayer | boolean | Enables or disables Semantic DocPrep processing (defaults to false) |
| output_format | string | Output format: "markdown", "xml" (defaults to "markdown") |
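
To make the defaults in the options table concrete, here is a minimal Python sketch that merges a partial configuration onto those defaults and rejects values the table does not list. The helper name build_intake_config and the validation rules are assumptions for illustration only; they are not part of any Vertesia API, and the platform may validate the configuration differently.

```python
import json

# Defaults and allowed values as listed in the options table.
DEFAULTS = {"useSemanticLayer": False, "output_format": "markdown"}
ALLOWED_FORMATS = {"markdown", "xml"}


def build_intake_config(overrides=None):
    """Hypothetical helper: apply documented defaults, then validate the result."""
    config = {**DEFAULTS, **(overrides or {})}

    # Reject keys that the options table does not document.
    unknown = set(config) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown option(s): {sorted(unknown)}")

    if not isinstance(config["useSemanticLayer"], bool):
        raise TypeError("useSemanticLayer must be a boolean")

    if config["output_format"] not in ALLOWED_FORMATS:
        raise ValueError(
            f"output_format must be one of {sorted(ALLOWED_FORMATS)}"
        )

    return config


if __name__ == "__main__":
    # Enable Semantic DocPrep with XML output instead of the default Markdown.
    config = build_intake_config({"useSemanticLayer": True, "output_format": "xml"})
    # Print JSON that could be pasted into the rule's configuration tab.
    print(json.dumps(config, indent=2))
```

Checking a configuration this way before pasting it into the rule can catch a misspelled option name or an unsupported format early. Omitting output_format yields "markdown", matching the documented default.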
