Configuration

Manage RAG parameters, retriever settings, and system prompts

Hybrid Settings

Configure hybrid retriever fusion and combination settings

Fusion mode for combining results: 'reciprocal_rerank' (Reciprocal Rank Fusion), 'relative_score' (relative scoring), 'dist_based_score' (distance-based), or 'simple' (simple reordering).

Number of query variations to generate for retrieval. Higher values improve recall but increase latency.
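For intuition, here is a minimal, self-contained sketch of Reciprocal Rank Fusion (the 'reciprocal_rerank' mode): each result contributes 1 / (k + rank) to its document's fused score. The document IDs and the conventional k=60 smoothing constant are illustrative, not taken from this system.

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse several ranked lists of doc IDs into one ranking.

    Each appearance of a document at 1-based position `rank`
    adds 1 / (k + rank) to that document's fused score.
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Two retrievers (e.g. vector and BM25) return different orderings:
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
# doc_b ranks first: it appears near the top of both lists.
```

Documents that appear high in several lists accumulate the largest fused scores, which is why RRF is robust to the incomparable raw scores of vector and keyword retrievers.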

Ingestion Settings

Configure document chunking parameters and node parser selection

Number of characters to overlap between consecutive chunks. Overlap helps maintain context across chunk boundaries. Typically 10-20% of chunk_size.

Size of text chunks when splitting documents. Larger chunks preserve more context but may exceed token limits. Typical range: 512-2048.
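As a rough sketch of how chunk_size and chunk_overlap interact, here is a naive character-based splitter (not the actual node parser used here, which splits on structure or sentences):

```python
def split_text(text, chunk_size=1024, chunk_overlap=128):
    """Naive character splitter: each chunk starts
    chunk_size - chunk_overlap characters after the previous one,
    so consecutive chunks share chunk_overlap characters."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = split_text("abcdefghij" * 30, chunk_size=100, chunk_overlap=20)
# Each adjacent pair shares the last 20 characters of the earlier chunk.
```

Production splitters usually count tokens rather than characters and respect sentence or heading boundaries; this only shows the overlap arithmetic behind the 10-20% guideline.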

Node parser to use for markdown documents: 'markdown' (MarkdownNodeParser - preserves structure) or 'sentence' (SentenceSplitter - simple text splitting).

Enable context-based retrieval. When enabled, a short description of how each chunk relates to the overall document is generated and prepended to the chunk before indexing, which helps retrieval for chunks that are ambiguous in isolation.

LLM Settings

Configure the OpenAI models used for generation and embedding

OpenAI embedding model for vector search. Options: text-embedding-3-small (fast), text-embedding-3-large (most capable), text-embedding-ada-002 (legacy).

OpenAI LLM model to use for generation. Options include: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, etc.

Temperature for LLM generation (0-2). Lower values make output more deterministic, higher values make it more creative.

Prompt Settings

Customize system prompts for different channels and query enhancement

Base system prompt that applies to all responses. Contains company branding and general instructions.

Prompt for context extraction. Used to extract the context of a chunk from the whole document.

Channel-specific prompt for email responses. Defines tone, style, and rules for email communication.

Prompt for intent detection. Used to identify user intents or query intents from text.

Prompt for query enhancement/optimization. Used to transform user queries into retrieval-optimized queries.

Prompt for draft refinement. Used to refine the draft response based on the refinement request.

Channel-specific prompt for WhatsApp responses. Defines tone, style, and rules for WhatsApp communication.

RAG Settings

Configure RAG query parameters that affect retrieval and response generation

Response synthesis mode: 'compact' (faster, combines chunks), 'refine' (iterative refinement), 'tree_summarize' (tree-based), 'simple_summarize' (single call), 'accumulate' (concatenate all), or 'generation' (ignore context).

Minimum similarity score (0-1) for retrieved chunks. Higher values return more relevant but fewer results.

Number of top chunks to retrieve from the knowledge base. Higher values provide more context but may include less relevant information.
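A minimal sketch of how the similarity cutoff and top_k combine (chunk names and scores are invented for illustration):

```python
def select_chunks(scored_chunks, top_k=5, similarity_cutoff=0.7):
    """Drop chunks scoring below the cutoff, then keep the top_k best."""
    kept = [c for c in scored_chunks if c[1] >= similarity_cutoff]
    kept.sort(key=lambda c: c[1], reverse=True)
    return kept[:top_k]

hits = [("chunk1", 0.91), ("chunk2", 0.65), ("chunk3", 0.78), ("chunk4", 0.74)]
best = select_chunks(hits, top_k=2, similarity_cutoff=0.7)
# chunk2 falls below the cutoff; top_k then keeps the two highest of the rest.
```

This is why raising the cutoff can return fewer than top_k chunks: the threshold is applied before the count limit.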

Retriever Settings

Configure vector and BM25 retriever parameters

Enable BM25 keyword-based retrieval. When enabled, combines with vector search for hybrid retrieval.

Language for BM25 full-text search. Must match PostgreSQL text search configuration.

Maximum number of chunks to retrieve from BM25 keyword search.
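For intuition about what BM25 rewards, here is a minimal scoring function over pre-tokenized documents (the system itself uses PostgreSQL full-text search; k1=1.5 and b=0.75 are conventional defaults, and the sample corpus is invented):

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Score one document (a list of tokens) against query terms
    using the standard BM25 term-frequency saturation formula."""
    avgdl = sum(len(d) for d in corpus) / len(corpus)
    n_docs = len(corpus)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)       # document frequency
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(term)                            # term frequency
        score += idf * tf * (k1 + 1) / (
            tf + k1 * (1 - b + b * len(doc) / avgdl)
        )
    return score

docs = [["hybrid", "search", "rag"],
        ["vector", "search"],
        ["bm25", "keyword", "search"]]
# "search" appears in every doc (low IDF); "bm25" is rare (high IDF),
# so a hit on "bm25" scores higher than a hit on "search".
```

Rare, discriminative terms dominate the score, which is exactly what makes BM25 a useful complement to semantic vector search in the hybrid retriever.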

Enable the re-ranker to select the best-performing chunks retrieved from similarity search.

HNSW index search parameter. Higher values improve recall but slow down search. Typical range: 64-512.

Maximum number of chunks to retrieve from vector search before filtering by similarity threshold.

Enable intent-based filtering of retrieved chunks. When enabled, only chunks with the specified intents will be retrieved.
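A minimal sketch of intent-based filtering, assuming each chunk carries an `intents` metadata list (the field name and sample data are hypothetical):

```python
def filter_by_intent(chunks, allowed_intents):
    """Keep only chunks tagged with at least one of the allowed intents."""
    allowed = set(allowed_intents)
    return [c for c in chunks if allowed & set(c.get("intents", []))]

chunks = [
    {"text": "Refund policy...", "intents": ["billing", "refund"]},
    {"text": "Setup guide...", "intents": ["onboarding"]},
]
billing_chunks = filter_by_intent(chunks, ["billing"])
# Only the first chunk survives: it is the only one tagged "billing".
```

In practice the allowed intents would come from the intent-detection prompt described above, so retrieval narrows to chunks relevant to the detected query intent.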