Search Configuration

Vertesia supports two search backends: MongoDB Atlas Vector Search (default) and Elasticsearch (optional). This guide covers how to configure, monitor, and troubleshoot both.

MongoDB Atlas Vector Search

MongoDB Atlas Vector Search is the default search backend, providing vector-based semantic search integrated directly with your document store.

How It Works

Automatic indexing: When embeddings are enabled, documents are automatically indexed
Vector similarity: Uses the HNSW (Hierarchical Navigable Small World) algorithm for fast approximate similarity search
Integrated storage: Vectors stored alongside documents in MongoDB

Vector Index Management

Vector indexes are automatically created when you enable embeddings. Each embedding type has its own index:

vsearch_text - Text embeddings index
vsearch_image - Image/vision embeddings index
vsearch_properties - Properties embeddings index

Monitoring Index Status

Check vector index status through the embeddings status endpoint:

{
  "vectorIndex": {
    "status": "READY",
    "name": "vsearch_text",
    "type": "hnsw"
  }
}

Status values:

Status	Description
READY	Index is operational
PENDING	Index is being built
FAILED	Index creation failed
DOES_NOT_EXIST	No index (embeddings not enabled)

Limitations

MongoDB Atlas Vector Search provides semantic/vector search only. For full-text search with stemming, fuzzy matching, or complex aggregations, enable Elasticsearch.

Elasticsearch

Elasticsearch provides advanced search capabilities including full-text search, hybrid search, and powerful aggregations.

When to Use Elasticsearch

Enable Elasticsearch when you need:

Full-text search with stemming, fuzzy matching, and phrase queries
Hybrid search combining vector and full-text with configurable weights
Aggregations for analytics, facets, and document statistics
DSL queries for complete control over search behavior
High-volume search with dedicated search infrastructure

Prerequisites

Before enabling Elasticsearch, make sure you have:

Elasticsearch infrastructure enabled for your account
Embeddings configured (for vector search capabilities)
Project settings permission (project:settings_write)

Enabling Elasticsearch

Get Status

Check the current Elasticsearch status for your project:

Get Elasticsearch Status

curl --location --request GET \
  'https://api.vertesia.io/api/v1/commands/elasticsearch/status' \
  --header 'Authorization: Bearer <YOUR_JWT_TOKEN>'

Example Response:

{
  "infrastructureEnabled": true,
  "indexingEnabled": true,
  "queriesEnabled": true,
  "indexStats": {
    "documentCount": 15234,
    "sizeInBytes": 52428800
  },
  "mongoDocumentCount": 15234,
  "reindexProgress": null
}
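As a minimal sketch of how a client might interpret this response, the Python helper below checks whether queries can be served and how many documents are still missing from the Elasticsearch index. The field names come from the example response above; the function name `summarize_status` and the keys of its result are hypothetical, not part of the Vertesia API.

```python
from typing import Any


def summarize_status(status: dict[str, Any]) -> dict[str, Any]:
    """Summarize a status response shaped like the example above (hypothetical helper)."""
    index_stats = status.get("indexStats") or {}
    es_count = index_stats.get("documentCount", 0)
    mongo_count = status.get("mongoDocumentCount", 0)
    return {
        # All three flags must be true before queries are served by Elasticsearch.
        "ready_for_queries": bool(
            status.get("infrastructureEnabled")
            and status.get("indexingEnabled")
            and status.get("queriesEnabled")
        ),
        # Documents present in MongoDB but not yet counted in the Elasticsearch index.
        "missing_documents": max(mongo_count - es_count, 0),
        # True when the response carries a reindexProgress object.
        "has_reindex_progress": status.get("reindexProgress") is not None,
    }


example_status = {
    "infrastructureEnabled": True,
    "indexingEnabled": True,
    "queriesEnabled": True,
    "indexStats": {"documentCount": 15234, "sizeInBytes": 52428800},
    "mongoDocumentCount": 15234,
    "reindexProgress": None,
}

print(summarize_status(example_status))
```

A non-zero `missing_documents` value right after enabling indexing is expected while the initial sync is still running.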

Enable Indexing

Enable Elasticsearch indexing for your project. This creates the index and starts syncing documents:

Enable Elasticsearch Indexing

curl --location --request POST \
  'https://api.vertesia.io/api/v1/commands/elasticsearch/enable-indexing' \
  --header 'Authorization: Bearer <YOUR_JWT_TOKEN>'

Enable Queries

Enable Elasticsearch queries. Indexing must be enabled first:

Enable Elasticsearch Queries

curl --location --request POST \
  'https://api.vertesia.io/api/v1/commands/elasticsearch/enable-queries' \
  --header 'Authorization: Bearer <YOUR_JWT_TOKEN>'

Disable Queries

Disable Elasticsearch queries while keeping indexing active:

Disable Elasticsearch Queries

curl --location --request POST \
  'https://api.vertesia.io/api/v1/commands/elasticsearch/disable-queries' \
  --header 'Authorization: Bearer <YOUR_JWT_TOKEN>'

Disable Indexing

Disable Elasticsearch indexing entirely:

Disable Elasticsearch Indexing

curl --location --request POST \
  'https://api.vertesia.io/api/v1/commands/elasticsearch/disable-indexing' \
  --header 'Authorization: Bearer <YOUR_JWT_TOKEN>'
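The same commands can be prepared from a script. The sketch below uses only the Python standard library and builds the requests without sending them; the URLs and HTTP methods mirror the curl examples above, while the helper names `build_command` and `enable_sequence` are hypothetical. Pass a request to `urllib.request.urlopen` to execute it.

```python
import urllib.request

BASE_URL = "https://api.vertesia.io/api/v1/commands/elasticsearch"

# Command names and HTTP methods as used in the curl examples above.
COMMANDS = {
    "status": "GET",
    "enable-indexing": "POST",
    "enable-queries": "POST",
    "disable-queries": "POST",
    "disable-indexing": "POST",
}


def build_command(command: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the request for one Elasticsearch command."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    return urllib.request.Request(
        f"{BASE_URL}/{command}",
        method=COMMANDS[command],
        headers={"Authorization": f"Bearer {token}"},
    )


def enable_sequence(token: str) -> list[urllib.request.Request]:
    """Requests in the required order: indexing first, then queries."""
    return [build_command("enable-indexing", token), build_command("enable-queries", token)]


for request in enable_sequence("<YOUR_JWT_TOKEN>"):
    print(request.get_method(), request.full_url)
```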

Reindexing

Reindexing rebuilds the Elasticsearch index from MongoDB. You may need it when:

Enabling Elasticsearch for an existing project that already has documents
Changing embedding dimensions
Repairing index corruption or sync issues
Recovering from failures

Trigger Reindex

curl --location --request POST \
  'https://api.vertesia.io/api/v1/commands/elasticsearch/reindex' \
  --header 'Authorization: Bearer <YOUR_JWT_TOKEN>' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "recreateIndex": false
  }'

Parameters:

Parameter	Type	Description
`recreateIndex`	boolean	If `true`, drops and recreates the index. Use when changing dimensions or mappings.

Zero-Downtime Reindexing

Vertesia uses alias-based reindexing for zero downtime:

A new index is created with updated mappings
Documents are batch-indexed to the new index
The alias is atomically swapped from old to new
The old index is deleted

During reindexing, queries continue to work against the existing index.
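The four steps above can be illustrated with a toy in-memory model. This is only a sketch of the alias-swap pattern, not Vertesia's or Elasticsearch's implementation; the class `AliasedIndexStore` and the index names are hypothetical.

```python
class AliasedIndexStore:
    """Toy model of alias-based reindexing: readers only ever go through the alias."""

    def __init__(self, alias: str, initial_index: str, documents: dict):
        self.indices = {initial_index: dict(documents)}
        self.aliases = {alias: initial_index}

    def search_ids(self, alias: str) -> list:
        """Resolve the alias and return the document ids of the index behind it."""
        return sorted(self.indices[self.aliases[alias]])

    def reindex(self, alias: str, new_index: str, source_documents: dict, batch_size: int = 2) -> None:
        old_index = self.aliases[alias]
        if new_index in self.indices:
            raise ValueError(f"index already exists: {new_index}")
        # 1. Create the new index (empty, with whatever new mappings apply).
        self.indices[new_index] = {}
        # 2. Batch-index documents into the new index; the alias still points
        #    at the old index, so queries are unaffected during this loop.
        items = list(source_documents.items())
        for start in range(0, len(items), batch_size):
            self.indices[new_index].update(items[start:start + batch_size])
        # 3. Swap the alias from the old index to the new one in a single step.
        self.aliases[alias] = new_index
        # 4. Delete the old index.
        del self.indices[old_index]


store = AliasedIndexStore("docs", "docs_v1", {"a": "old"})
store.reindex("docs", "docs_v2", {"a": "new", "b": "new", "c": "new"})
print(store.search_ids("docs"), list(store.indices))
```

Because readers resolve the alias on every query, they see either the complete old index or the complete new one, never a partially built index.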

Monitoring Progress

Check reindex progress through the status endpoint:

{
  "reindexProgress": {
    "workflowId": "reindex-project-abc123",
    "status": "running",
    "processedDocuments": 5000,
    "totalDocuments": 15234,
    "startedAt": "2024-01-15T10:30:00Z"
  }
}
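A small helper can turn this payload into a progress figure. The sketch below assumes the field names shown in the example above; the function name `reindex_progress_summary` is hypothetical.

```python
from typing import Any, Optional


def reindex_progress_summary(progress: Optional[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Return percent complete and remaining documents, or None if no reindex is reported."""
    if not progress:
        return None
    total = progress.get("totalDocuments") or 0
    processed = progress.get("processedDocuments") or 0
    # Guard against a zero total, which would otherwise divide by zero.
    percent = round(100 * processed / total, 1) if total > 0 else 0.0
    return {
        "status": progress.get("status"),
        "percent": percent,
        "remaining": max(total - processed, 0),
    }


example_progress = {
    "workflowId": "reindex-project-abc123",
    "status": "running",
    "processedDocuments": 5000,
    "totalDocuments": 15234,
    "startedAt": "2024-01-15T10:30:00Z",
}

print(reindex_progress_summary(example_progress))
```

For the example payload this reports 32.8 percent complete with 10234 documents remaining.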

Hybrid Search

Hybrid search combines full-text and vector search to improve relevance. When both search types return results, their scores are combined using one of the aggregation methods below.

Score Aggregation Methods

Method	Algorithm	Best For
RRF	Reciprocal Rank Fusion	When relevance scores from different sources aren't directly comparable
RSF	Relevance Score Fusion	When you want to combine normalized scores directly
Smart	Automatic selection	General use, automatically picks the best method
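To make the difference between the two fusion styles concrete, here is a minimal Python sketch. Reciprocal Rank Fusion is shown in its standard form, where each document scores the sum of 1 / (k + rank) over the result lists; the constant k = 60 is a commonly used default, not a documented Vertesia setting. The score-based variant is an assumption: it min-max normalizes each list and sums the results, which may differ from how Vertesia's RSF is actually computed. Function names are hypothetical.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked id lists using ranks only, ignoring raw scores."""
    fused: dict[str, float] = {}
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending fused score, then by id for a deterministic order.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


def normalized_score_fusion(score_lists: list[dict[str, float]]) -> list[tuple[str, float]]:
    """Fuse by summing min-max normalized scores (assumed form of score-based fusion)."""
    fused: dict[str, float] = {}
    for scores in score_lists:
        if not scores:
            continue
        low, high = min(scores.values()), max(scores.values())
        span = high - low
        for doc_id, score in scores.items():
            # When all scores in a list are equal, treat each as fully relevant.
            normalized = (score - low) / span if span else 1.0
            fused[doc_id] = fused.get(doc_id, 0.0) + normalized
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


full_text_scores = {"a": 10.0, "b": 5.0, "c": 0.0}   # e.g. BM25-style scores
vector_scores = {"b": 0.9, "c": 0.5, "d": 0.1}       # e.g. cosine similarities

print(reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "d"]]))
print(normalized_score_fusion([full_text_scores, vector_scores]))
```

The rank-based method never compares a full-text score with a similarity value, which is why it suits sources whose scores live on different scales.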

Weight Configuration

Control the relative importance of each search type:

{
  "query": {
    "full_text": "quarterly report",
    "vector": { "text": "financial analysis" },
    "weights": {
      "full_text": 2,
      "vector": 3
    }
  }
}

A higher weight gives that search type more influence on the final ranking. With the configuration above, the vector weight (3) is 1.5 times the full-text weight (2).
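The arithmetic behind such weights can be sketched as follows. This assumes the weights are normalized to shares and applied to per-type scores that are already on a common 0 to 1 scale; how Vertesia applies them internally is not specified here, and the function names are hypothetical.

```python
import math


def weight_shares(weights: dict[str, float]) -> dict[str, float]:
    """Convert raw weights into shares that sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: value / total for name, value in weights.items()}


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-type scores (assumed normalized to 0..1) using the weight shares."""
    shares = weight_shares(weights)
    # A search type that returned no score for this document contributes 0.
    return sum(share * scores.get(name, 0.0) for name, share in shares.items())


weights = {"full_text": 2, "vector": 3}
shares = weight_shares(weights)
print(shares)  # full_text gets 2/5 of the influence, vector gets 3/5

# Hypothetical normalized scores for a single document.
combined = weighted_score({"full_text": 0.8, "vector": 0.5}, weights)
print(math.isclose(combined, 0.62))
```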

Dynamic Scaling

When enabled, dynamic scaling adjusts weights automatically if one search type is unavailable:

{
  "query": {
    "full_text": "quarterly report",
    "vector": { "text": "financial analysis" },
    "dynamic_scaling": "on"
  }
}
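One plausible reading of this behavior, shown purely as an illustration, is that the weight of an unavailable search type is redistributed across the types that did return results. The sketch below implements that assumption; it is not taken from Vertesia's implementation, and `scale_weights` is a hypothetical name.

```python
def scale_weights(weights: dict[str, float], available: set[str]) -> dict[str, float]:
    """Renormalize weights over the search types that are actually available."""
    active = {name: value for name, value in weights.items() if name in available and value > 0}
    total = sum(active.values())
    if total == 0:
        # No usable search type: nothing to weight.
        return {}
    return {name: value / total for name, value in active.items()}


weights = {"full_text": 2, "vector": 3}

# Both types available: the 2:3 ratio is preserved.
print(scale_weights(weights, {"full_text", "vector"}))

# Vector search unavailable (for example, no embeddings yet): full text takes all the weight.
print(scale_weights(weights, {"full_text"}))
```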

Index Configuration Tools

Agents can query and update index configuration using built-in tools:

get_index_configuration

Retrieves the current index status and configuration.

Returns:

Index status (exists, healthy)
Document count and size
Embedding dimensions for each type
Field mappings

update_index_configuration

Updates index configuration with options to change embedding dimensions or trigger reindexing.

Parameters:

Parameter	Type	Description
`embedding_dimensions`	object	New dimensions for `text`, `image`, or `properties`
`force_reindex`	boolean	Trigger a full reindex
`user_confirmed`	boolean	Required confirmation (must use `ask_user` first)
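As a rough sketch of how a caller might assemble arguments for this tool, the helper below validates the three parameters from the table above before building the argument object. The parameter names come from the table; the helper name, the validation rules, and the example dimension value are assumptions for illustration only.

```python
from typing import Any, Optional

EMBEDDING_TYPES = {"text", "image", "properties"}


def build_update_arguments(
    embedding_dimensions: Optional[dict[str, int]] = None,
    force_reindex: bool = False,
    user_confirmed: bool = False,
) -> dict[str, Any]:
    """Build the argument object for update_index_configuration (hypothetical helper)."""
    if not user_confirmed:
        # The tool requires explicit confirmation, obtained through ask_user first.
        raise ValueError("user_confirmed must be true; ask the user before updating")
    arguments: dict[str, Any] = {"user_confirmed": True}
    if embedding_dimensions:
        unknown = set(embedding_dimensions) - EMBEDDING_TYPES
        if unknown:
            raise ValueError(f"unknown embedding types: {sorted(unknown)}")
        for name, size in embedding_dimensions.items():
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
                raise ValueError(f"dimension for {name} must be a positive integer")
        arguments["embedding_dimensions"] = dict(embedding_dimensions)
    if force_reindex:
        arguments["force_reindex"] = True
    return arguments


# 1024 is an illustrative dimension, not a recommended value.
print(build_update_arguments({"text": 1024}, force_reindex=True, user_confirmed=True))
```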

Troubleshooting

Documents Not Appearing in Search

Check that indexing is enabled (indexingEnabled: true)
Verify embeddings are configured and generating
Allow time for async indexing to complete
Check the status endpoint for sync issues (compare indexStats.documentCount with mongoDocumentCount)

Dimension Mismatch Errors

If you changed embedding dimensions:

Recalculate embeddings with the new dimensions
Trigger a reindex with recreateIndex: true

Search Returns No Results

Verify documents exist in MongoDB (mongoDocumentCount)
Check that the Elasticsearch document count (indexStats.documentCount) matches
Test with broader queries or match_all
Verify query syntax is correct

Reindex Stuck or Failed

Check workflow status in the Vertesia UI
Look for errors in workflow history
Ensure sufficient permissions
Try triggering a new reindex (this cancels the stuck one)

Best Practices

Index Management

Enable queries only after initial indexing completes
Compare document counts between MongoDB (mongoDocumentCount) and Elasticsearch (indexStats.documentCount)
Schedule reindexing during low-traffic periods

Search Configuration

Start with smart score aggregation
Tune weights based on search quality feedback
Use facets for navigation and filtering
Enable analyze for complex queries that benefit from LLM summarization

Performance

Use appropriate limit values (avoid fetching more than needed)
Use count_only for pagination totals
Stream large results to artifacts with output_artifact
Consider DSL mode for complex aggregations

Next Steps

Content Overview - Understanding the full search architecture
Embeddings Configuration - Configure embeddings for vector search
Built-in Tools - Learn about query_documents and index tools
Commands API - Full API reference for Elasticsearch commands