Configuration
Standard Document Intake Workflow
Vertesia Semantic DocPrep can be set as the default text rendition for PDF documents. When this feature is active, PDF documents are first transformed into structured Markdown (or XML, if specified), and then the document type and its properties are determined.
Go to Workflow > Rules and open the Standard Document Intake rule.
Next, open the configuration tab and add the following:
{
  "useSemanticLayer": true,
  "output_format": "markdown"
}
Configuration Options
| Option | Type | Description |
|---|---|---|
| useSemanticLayer | boolean | Enables or disables Semantic DocPrep processing (defaults to false) |
| output_format | string | Output format: "markdown" or "xml" (defaults to "markdown") |
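
As an illustration of the options above, a configuration that switches the rendition to XML might look like the following. This is a minimal sketch that assumes only the two documented keys are needed; any other settings already present in the rule's configuration would have to be kept alongside them.

```json
{
  "useSemanticLayer": true,
  "output_format": "xml"
}
```

Because output_format defaults to "markdown", it only needs to be set explicitly when XML output is wanted.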