Configuration

First, let's go over a few concepts:

Workflow DSL: The workflow DSL is a JSON-based language used to define workflows. A definition is composed of a list of steps, and each step is either an activity or a child workflow.

Activities: Activities are the building blocks of workflows. They are the individual tasks that are executed by the workflow worker. Details about the Workflow Activities are in the Workflow Activities section.

Child Workflows: Child workflows are workflows that are executed as part of another workflow. They are useful for breaking down complex workflows into smaller, more manageable units.

Prerequisites

To create and update workflow definitions in Vertesia, you need the Vertesia CLI. If you haven't installed or configured it yet, please have a look at the documentation.

Workflow Definition

A workflow definition is a JSON structure that contains at least the following:

a name
a description
an array of steps
input variables
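
As a rough illustration, the Python sketch below checks that a definition carries the four elements listed above. It assumes that input variables live under the vars key, as in the example in this section. The helpers missing_keys and check_definition are hypothetical and are not part of the Vertesia CLI.

```python
import json

# Top-level keys matching the list above. Mapping "input variables" to the
# "vars" key is an assumption based on the example in this section.
REQUIRED_KEYS = ("name", "description", "steps", "vars")


def missing_keys(definition: dict) -> list[str]:
    """Return the required top-level keys absent from a definition."""
    return [key for key in REQUIRED_KEYS if key not in definition]


def check_definition(text: str) -> list[str]:
    """Parse JSON text and report structural problems as messages."""
    definition = json.loads(text)
    problems = [f"missing '{key}'" for key in missing_keys(definition)]
    if "steps" in definition and not isinstance(definition["steps"], list):
        problems.append("'steps' must be an array")
    return problems


if __name__ == "__main__":
    sample = '{"name": "MyWorkflow", "description": "Demo", "steps": []}'
    for problem in check_definition(sample):
        print(problem)
```

A check like this only catches structural slips before a definition is uploaded; it says nothing about whether the step names refer to existing activities.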

Below is an example of an intake workflow that can be triggered when a new text document is uploaded to Vertesia.

{
  "name": "MyWorkflow",
  "description": "This is my workflow.",
  "vars": {
    "interactionsNames": {
      "extractInformation": "sys:ExtractInformation",
      "selectDocumentType": "sys:SelectDocumentType",
      "generateMetadataModel": "sys:GenerateMetadataModel",
      "chunkDocument": "sys:ChunkDocument"
    }
  },
  "steps": [
    {
      "name": "setDocumentStatus",
      "params": {
        "status": "processing"
      }
    },
    {
      "title": "Extract text from the current document",
      "name": "generateObjectText",
      "type": "workflow",
      "output": "extractResult"
    },
    {
      "title": "Generate or assign a content type for the current document",
      "name": "generateOrAssignContentType",
      "import": ["interactionsNames"],
      "params": {
        "interactionNames": {
          "generateMetadataModel": "${interactionsNames.generateMetadataModel}",
          "selectDocumentType": "${interactionsNames.selectDocumentType}"
        }
      },
      "condition": {
        "extractResult.hasText": {
          "$eq": true
        }
      }
    },
    {
      "title": "Generate document properties from text content",
      "name": "generateDocumentProperties",
      "import": ["interactionsNames"],
      "params": {
        "interactionName": "${interactionsNames.extractInformation}"
      },
      "condition": {
        "extractResult.hasText": {
          "$eq": true
        }
      }
    },
    {
      "title": "Chunk the current document text",
      "name": "chunkDocument",
      "import": ["interactionsNames"],
      "params": {
        "interactionName": "${interactionsNames.chunkDocument}",
        "createParts": true
      },
      "condition": {
        "extractResult.hasText": {
          "$eq": true
        }
      }
    },
    {
      "name": "generateEmbeddings",
      "title": "Generate embeddings for text",
      "params": {
        "type": "text",
        "force": false
      }
    },
    {
      "name": "setDocumentStatus",
      "params": {
        "status": "completed"
      }
    }
  ]
}
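
To see the shape of a definition like the one above at a glance, the following Python sketch lists each step with its kind and whether it is conditional. It assumes that a step without a type property is an activity, which the example suggests but does not state. The function summarize_steps is a hypothetical helper.

```python
def summarize_steps(definition: dict) -> list[tuple[str, str, bool]]:
    """Return (name, kind, is_conditional) for each step, in order."""
    summary = []
    for step in definition.get("steps", []):
        # Assumption: a step that omits "type" is an activity.
        kind = step.get("type", "activity")
        summary.append((step["name"], kind, "condition" in step))
    return summary


if __name__ == "__main__":
    # Abridged version of the intake workflow shown above.
    definition = {
        "steps": [
            {"name": "setDocumentStatus", "params": {"status": "processing"}},
            {"name": "generateObjectText", "type": "workflow", "output": "extractResult"},
            {"name": "chunkDocument", "condition": {"extractResult.hasText": {"$eq": True}}},
        ]
    }
    for name, kind, conditional in summarize_steps(definition):
        suffix = " (conditional)" if conditional else ""
        print(f"{name}: {kind}{suffix}")
```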

Workflow Variables

The DSL supports variables that can be used to store data and pass it between steps. Variables are defined in the vars property of the workflow. The value of a variable can be a literal value or a reference to another variable. References to variables are enclosed in ${}. For example, the following DSL defines a variable named myVariable with the value "Hello World!":

{
  "vars": {
    "myVariable": "Hello World!"
  }
}

The value of myVariable can then be referenced in other parts of the DSL using ${myVariable}. For example, the following DSL logs the value of myVariable to the console:

{
  "steps": [
    {
      "type": "activity",
      "name": "log",
      "params": {
        "message": "The value of myVariable is: ${myVariable}"
      }
    }
  ]
}
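
The substitution described above can be pictured with a small Python sketch. It is not the engine's implementation: the reference grammar, the dotted-path lookup (as in ${interactionsNames.chunkDocument}), and the rule that a string consisting of a single reference keeps the referenced value's type are all assumptions made for illustration. The names REFERENCE, lookup, and resolve are hypothetical.

```python
import re

# Matches ${name} or ${name.nested.path}. This grammar is an assumption.
REFERENCE = re.compile(r"\$\{([A-Za-z_][\w.]*)\}")


def lookup(path: str, variables: dict):
    """Walk a dotted path such as 'interactionsNames.chunkDocument'."""
    value = variables
    for part in path.split("."):
        value = value[part]  # raises KeyError for an unknown variable
    return value


def resolve(template: str, variables: dict):
    """Resolve ${...} references found in a string.

    A string that is exactly one reference returns the referenced value
    unchanged, so objects stay objects. Otherwise each reference is
    replaced by the string form of its value.
    """
    whole = REFERENCE.fullmatch(template)
    if whole:
        return lookup(whole.group(1), variables)
    return REFERENCE.sub(lambda m: str(lookup(m.group(1), variables)), template)


if __name__ == "__main__":
    variables = {"myVariable": "Hello World!"}
    print(resolve("The value of myVariable is: ${myVariable}", variables))
```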

Conditions

The DSL supports conditions that can be used to control the flow of the workflow. Conditions are defined in the condition property of a step. The value of a condition is a JSON object that describes the condition. The following operators are supported:

Operator	Description
`$eq`	Equal to
`$ne`	Not equal to
`$gt`	Greater than
`$gte`	Greater than or equal to
`$lt`	Less than
`$lte`	Less than or equal to
`$in`	In array
`$nin`	Not in array
`$regexp`	Matches regular expression

For example, the following DSL defines a step that only executes if the value of the variable myVariable is equal to "Hello World!":

{
  "steps": [
    {
      "type": "activity",
      "name": "log",
      "condition": {
        "myVariable": {
          "$eq": "Hello World!"
        }
      },
      "params": {
        "message": "The value of myVariable is: ${myVariable}"
      }
    }
  ]
}
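
The operators in the table can be modelled with a short Python sketch. It assumes the condition shape used in the intake workflow example, where a variable path maps to an object of operators ("extractResult.hasText": {"$eq": true}), and it assumes that several entries are combined with a logical AND and that $regexp matches anywhere in the value. The names OPERATORS, lookup, and evaluate are hypothetical; the actual engine may behave differently.

```python
import re

# One predicate per operator listed in the table above.
OPERATORS = {
    "$eq": lambda actual, expected: actual == expected,
    "$ne": lambda actual, expected: actual != expected,
    "$gt": lambda actual, expected: actual > expected,
    "$gte": lambda actual, expected: actual >= expected,
    "$lt": lambda actual, expected: actual < expected,
    "$lte": lambda actual, expected: actual <= expected,
    "$in": lambda actual, expected: actual in expected,
    "$nin": lambda actual, expected: actual not in expected,
    # Assumption: the pattern may match anywhere in the value.
    "$regexp": lambda actual, expected: re.search(expected, str(actual)) is not None,
}


def lookup(path: str, variables: dict):
    """Resolve a dotted path; a missing key yields None."""
    value = variables
    for part in path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def evaluate(condition: dict, variables: dict) -> bool:
    """Return True when every operator on every variable path holds."""
    for path, checks in condition.items():
        actual = lookup(path, variables)
        for operator, expected in checks.items():
            if not OPERATORS[operator](actual, expected):
                return False
    return True


if __name__ == "__main__":
    condition = {"extractResult.hasText": {"$eq": True}}
    print(evaluate(condition, {"extractResult": {"hasText": True}}))
```

In this sketch an ordering operator such as $gt raises a TypeError when the variable is missing, since None cannot be compared with a number; a real evaluator would need to decide how to treat that case.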

Fetch

The DSL supports fetching data from external sources during the workflow execution. The fetch property of a step is used to define the data to fetch. The value of the fetch property is a JSON object that describes the data to fetch. The following properties are supported:

Property	Description
`type`	The type of data to fetch.
`source`	The source of the data.
`query`	The query to use to fetch the data.
`select`	The fields to select from the fetched data.
`limit`	The maximum number of results to fetch.
`on_not_found`	How to handle not found objects.

For example, the following DSL defines a step that fetches a document from the store:

{
  "steps": [
    {
      "type": "activity",
      "name": "fetchDocument",
      "fetch": {
        "type": "document",
        "query": {
          "id": "${documentId}"
        }
      },
      "output": "document"
    }
  ]
}
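
Because a misspelled property in a fetch object is easy to miss, a quick local check against the property names in the table can help. The Python sketch below is one way to do it; unknown_fetch_properties is a hypothetical helper, and whether the engine itself rejects or ignores unknown properties is not stated here.

```python
# Property names copied from the table above.
FETCH_PROPERTIES = {"type", "source", "query", "select", "limit", "on_not_found"}


def unknown_fetch_properties(definition: dict) -> dict[str, list[str]]:
    """Map each step name to the fetch keys that are not in the table."""
    report = {}
    for step in definition.get("steps", []):
        fetch = step.get("fetch")
        if isinstance(fetch, dict):
            unknown = sorted(set(fetch) - FETCH_PROPERTIES)
            if unknown:
                report[step["name"]] = unknown
    return report


if __name__ == "__main__":
    definition = {
        "steps": [
            {
                "type": "activity",
                "name": "fetchDocument",
                # "querry" is a deliberate typo for the demonstration.
                "fetch": {"type": "document", "querry": {"id": "${documentId}"}},
                "output": "document",
            }
        ]
    }
    print(unknown_fetch_properties(definition))
```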

Projection

The DSL supports projecting data from the result of an activity. The projection property of a step is used to define the data to project. The value of the projection property is a JSON object that describes the data to project. The following operators are supported:

Operator	Description
`$include`	Include the specified fields.
`$exclude`	Exclude the specified fields.

For example, the following DSL defines a step that projects the name and description fields from the result of the fetchDocument activity:

{
  "steps": [
    {
      "type": "activity",
      "name": "fetchDocument",
      "fetch": {
        "type": "document",
        "query": {
          "id": "${documentId}"
        }
      },
      "output": "document"
    },
    {
      "type": "activity",
      "name": "projectDocument",
      "params": {
        "document": "${document}"
      },
      "projection": {
        "$include": [
          "name",
          "description"
        ]
      },
      "output": "projectedDocument"
    }
  ]
}
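
The effect of the two projection operators can be sketched in Python on a plain dictionary. This is only a model of the behaviour described above: it assumes that fields are top-level keys, that $include takes precedence when both operators are present, and that fields named in $include but absent from the data are skipped. The function project and the sample document are hypothetical.

```python
def project(data: dict, projection: dict) -> dict:
    """Apply an $include or $exclude projection to a flat dictionary."""
    if "$include" in projection:
        # Keep only the listed fields, skipping any that are absent.
        return {key: data[key] for key in projection["$include"] if key in data}
    if "$exclude" in projection:
        dropped = set(projection["$exclude"])
        return {key: value for key, value in data.items() if key not in dropped}
    # No operator given: return a shallow copy of the input.
    return dict(data)


if __name__ == "__main__":
    document = {"id": "123", "name": "Report", "description": "Q1 summary", "text": "..."}
    print(project(document, {"$include": ["name", "description"]}))
```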