RAG Service Configuration

This document describes all configuration options for the RAG service.

Overview

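The RAG service uses pydantic-settings for type-safe configuration. Values are read from environment variables and from a `.env` file; variable names are matched case-insensitively, and unknown variables are ignored.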

# config.py
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",
    )

Environment Variables

Qdrant Configuration

Variable	Type	Default	Description
`QDRANT_HOST`	string	`qdrant`	Qdrant server hostname
`QDRANT_PORT`	int	`6333`	Qdrant server port
`QDRANT_COLLECTION_NAME`	string	`academic_documents`	Collection name

Example:

QDRANT_HOST=localhost
QDRANT_PORT=6333
QDRANT_COLLECTION_NAME=my_documents

Ollama Configuration

Variable	Type	Default	Description
`OLLAMA_HOST`	string	`ollama`	Ollama server hostname
`OLLAMA_PORT`	int	`11434`	Ollama API port
`OLLAMA_MODEL`	string	`nomic-embed-text`	Embedding model

Example:

OLLAMA_HOST=localhost
OLLAMA_PORT=11434
OLLAMA_MODEL=nomic-embed-text
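
Client code usually combines the host and port variables for Qdrant and Ollama into a base URL. The following is a minimal standard-library sketch of that step. The `service_url` helper and the `http` scheme are illustrative assumptions, not part of the service's code.

```python
import os

def service_url(prefix, default_host, default_port, env=None):
    """Build http://host:port from <PREFIX>_HOST and <PREFIX>_PORT (hypothetical helper)."""
    env = os.environ if env is None else env
    host = env.get(f"{prefix}_HOST", default_host)
    port = int(env.get(f"{prefix}_PORT", default_port))  # env values arrive as strings
    return f"http://{host}:{port}"

# Defaults mirror the Qdrant and Ollama tables; only the hosts are overridden here.
example_env = {"QDRANT_HOST": "localhost", "OLLAMA_HOST": "localhost"}
print(service_url("QDRANT", "qdrant", 6333, example_env))
print(service_url("OLLAMA", "ollama", 11434, example_env))
```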

RAG Parameters

Variable	Type	Default	Description
`EMBEDDING_DIMENSION`	int	`768`	Vector dimension (must match model)
`TOP_K_RESULTS`	int	`5`	Default search results count
`SIMILARITY_THRESHOLD`	float	`0.5`	Minimum similarity score (0-1)

Example:

EMBEDDING_DIMENSION=768
TOP_K_RESULTS=10
SIMILARITY_THRESHOLD=0.6
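
To illustrate how `TOP_K_RESULTS` and `SIMILARITY_THRESHOLD` interact, here is a minimal sketch that filters scored hits in plain Python. It assumes that higher scores mean closer matches and that the threshold is inclusive. The `select_hits` name is hypothetical, and the service may delegate this filtering to the vector database rather than doing it in application code.

```python
def select_hits(scored, top_k=5, threshold=0.5):
    """Keep hits scoring at least `threshold`, best first, at most `top_k` of them."""
    kept = [(doc, score) for doc, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

hits = [("a", 0.91), ("b", 0.42), ("c", 0.67), ("d", 0.55)]
print(select_hits(hits, top_k=2, threshold=0.5))
```

Raising the threshold can return fewer than `top_k` results; lowering it never returns more than `top_k`.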

Chunking Parameters

Variable	Type	Default	Description
`CHUNK_SIZE`	int	`1000`	Maximum characters per chunk
`CHUNK_OVERLAP`	int	`200`	Overlap between consecutive chunks

Example:

CHUNK_SIZE=500
CHUNK_OVERLAP=100
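
The two chunking values can be read as a sliding window over the document text. The sketch below assumes a purely character-based splitter; the service's actual chunker may respect sentence or paragraph boundaries, and `chunk_text` is a hypothetical name.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
```

An overlap equal to or larger than the chunk size would stop the window from advancing, which is why the sketch rejects it.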

API Configuration

Variable	Type	Default	Description
`API_HOST`	string	`0.0.0.0`	API bind address
`API_PORT`	int	`8081`	API port
`CORS_ORIGINS`	list[str]	`["*"]`	Allowed CORS origins (JSON array)

Example:

API_HOST=0.0.0.0
API_PORT=8081
CORS_ORIGINS=["http://localhost:5173"]
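
Because `CORS_ORIGINS` is a list, pydantic-settings decodes its environment value as JSON, so a bare URL or a comma-separated string is rejected. The sketch below reproduces that check with the standard library so a value can be tested before deployment; `parse_origins` is a hypothetical helper.

```python
import json

def parse_origins(raw):
    """Decode a CORS_ORIGINS value written as a JSON array of strings."""
    value = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError("CORS_ORIGINS must be a JSON array of strings")
    return value

print(parse_origins('["http://localhost:5173", "http://localhost:3000"]'))
```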

Storage Configuration

Variable	Type	Default	Description
`DOCUMENTS_PATH`	string	`/app/documents`	Documents directory

Example:

DOCUMENTS_PATH=/data/documents

Configuration Class

Full settings class definition:

class Settings(BaseSettings):
    """RAG Service configuration settings."""
    
    # Qdrant configuration
    qdrant_host: str = "qdrant"
    qdrant_port: int = 6333
    qdrant_collection_name: str = "academic_documents"
    
    # Ollama configuration
    ollama_host: str = "ollama"
    ollama_port: int = 11434
    ollama_model: str = "nomic-embed-text"
    
    # RAG parameters
    embedding_dimension: int = 768
    top_k_results: int = 5
    similarity_threshold: float = 0.5
    
    # Chunking parameters
    chunk_size: int = 1000
    chunk_overlap: int = 200
    
    # API configuration
    api_host: str = "0.0.0.0"
    api_port: int = 8081
    cors_origins: list[str] = ["*"]
    
    # Documents storage
    documents_path: str = "/app/documents"
    
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",
    )

settings = Settings()
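
The class as shown coerces types but does not enforce relationships between fields, such as the overlap being smaller than the chunk size. A minimal sketch of such checks over a plain dictionary follows. It could be fed with `settings.model_dump()`; the `find_config_problems` name and the specific rules are assumptions, not part of the service.

```python
def find_config_problems(cfg):
    """Return a list of human-readable problems found in a settings dictionary."""
    problems = []
    if not 0.0 <= cfg["similarity_threshold"] <= 1.0:
        problems.append("similarity_threshold must be within [0, 1]")
    if cfg["top_k_results"] < 1:
        problems.append("top_k_results must be at least 1")
    if cfg["chunk_size"] < 1:
        problems.append("chunk_size must be at least 1")
    if not 0 <= cfg["chunk_overlap"] < cfg["chunk_size"]:
        problems.append("chunk_overlap must be >= 0 and smaller than chunk_size")
    for key in ("qdrant_port", "ollama_port", "api_port"):
        if not 1 <= cfg[key] <= 65535:
            problems.append(f"{key} must be a valid TCP port")
    return problems

# Documented defaults for the fields the checks look at.
defaults = {
    "qdrant_port": 6333, "ollama_port": 11434, "api_port": 8081,
    "top_k_results": 5, "similarity_threshold": 0.5,
    "chunk_size": 1000, "chunk_overlap": 200,
}
print(find_config_problems(defaults))
```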

Environment Files

Development (.env)

# Qdrant (local)
QDRANT_HOST=localhost
QDRANT_PORT=6333

# Ollama (local)
OLLAMA_HOST=localhost
OLLAMA_PORT=11434

# Development settings
DOCUMENTS_PATH=./documents
TOP_K_RESULTS=10
SIMILARITY_THRESHOLD=0.4

Docker (.env.docker)

# Qdrant (container)
QDRANT_HOST=qdrant
QDRANT_PORT=6333

# Ollama (container)
OLLAMA_HOST=ollama
OLLAMA_PORT=11434

# Production settings
DOCUMENTS_PATH=/app/documents
TOP_K_RESULTS=5
SIMILARITY_THRESHOLD=0.5

Example File (.env.example)

# ===========================================
# RAG Service Configuration
# ===========================================

# Qdrant Vector Database
QDRANT_HOST=qdrant
QDRANT_PORT=6333
QDRANT_COLLECTION_NAME=academic_documents

# Ollama Embedding Service
OLLAMA_HOST=ollama
OLLAMA_PORT=11434
OLLAMA_MODEL=nomic-embed-text

# Embedding Configuration
EMBEDDING_DIMENSION=768

# Search Parameters
TOP_K_RESULTS=5
SIMILARITY_THRESHOLD=0.5

# Chunking Parameters
CHUNK_SIZE=1000
CHUNK_OVERLAP=200

# API Server
API_HOST=0.0.0.0
API_PORT=8081
CORS_ORIGINS=["*"]

# Document Storage
DOCUMENTS_PATH=/app/documents
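
With pydantic-settings, a variable set in the process environment takes priority over the same variable in the `.env` file, which in turn overrides the class default. The sketch below illustrates that order with a deliberately simplified parser that ignores quoting and `export` prefixes; `parse_env_file` and `resolve` are hypothetical names.

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments (simplified)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip().upper()] = value.strip()
    return values

def resolve(name, default, file_values, environ):
    """Apply the precedence: process environment, then .env file, then default."""
    if name in environ:
        return environ[name]
    return file_values.get(name, default)

file_values = parse_env_file("# Development settings\nTOP_K_RESULTS=10\nSIMILARITY_THRESHOLD=0.4\n")
print(resolve("TOP_K_RESULTS", "5", file_values, {"TOP_K_RESULTS": "3"}))  # environment wins
print(resolve("TOP_K_RESULTS", "5", file_values, {}))                      # .env file value
print(resolve("CHUNK_SIZE", "1000", file_values, {}))                      # class default
```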

Docker Compose Integration

Service Definition

rag_service:
  build:
    context: .
    dockerfile: ./rag_service/Dockerfile
  container_name: rag_service
  environment:
    - QDRANT_HOST=qdrant
    - QDRANT_PORT=6333
    - OLLAMA_HOST=ollama
    - OLLAMA_PORT=11434
    - DOCUMENTS_PATH=/app/documents
  ports:
    - "8081:8081"
  volumes:
    - ./rag_service/documents:/app/documents
  depends_on:
    - qdrant
    - ollama

Dependent Services

qdrant:
  image: qdrant/qdrant:latest
  ports:
    - "6333:6333"
  volumes:
    - qdrant_storage:/qdrant/storage

ollama:
  image: ollama/ollama:latest
  ports:
    - "11435:11434"
  volumes:
    - ollama_models:/root/.ollama

Configuration by Environment

Local Development

# Start dependencies
docker compose up -d qdrant ollama

# Run service locally
cd rag_service
export QDRANT_HOST=localhost
export OLLAMA_HOST=localhost
# The compose file in this document publishes Ollama on host port 11435
export OLLAMA_PORT=11435
uvicorn rag_service.api:app --reload --port 8081

Docker Development

# Use docker-compose defaults
docker compose up -d rag_service

Production

# Production settings
export SIMILARITY_THRESHOLD=0.6
export TOP_K_RESULTS=5
export CHUNK_SIZE=800

# Or use .env file
cp .env.production .env
docker compose -f docker-compose.prod.yml up -d

Tuning Guidelines

Embedding Model Selection

Model	Dimensions	`EMBEDDING_DIMENSION`
`nomic-embed-text`	768	`768`
`mxbai-embed-large`	1024	`1024`
`all-minilm`	384	`384`

⚠️ Important: Changing EMBEDDING_DIMENSION requires recreating the Qdrant collection.
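
A mismatch between `OLLAMA_MODEL` and `EMBEDDING_DIMENSION` can be caught before any vectors are written. The sketch below checks the pair against the table above. The lookup table contains only those three models, and `check_dimension` is a hypothetical helper.

```python
KNOWN_DIMENSIONS = {
    "nomic-embed-text": 768,
    "mxbai-embed-large": 1024,
    "all-minilm": 384,
}

def check_dimension(model, embedding_dimension):
    """Return True/False for known models, or None when the model is not in the table."""
    base_name = model.split(":", 1)[0]  # drop an Ollama tag such as ":latest"
    expected = KNOWN_DIMENSIONS.get(base_name)
    if expected is None:
        return None
    return expected == embedding_dimension

print(check_dimension("nomic-embed-text", 768))
print(check_dimension("mxbai-embed-large", 768))
```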

Chunking Strategy

Document Type	`CHUNK_SIZE`	`CHUNK_OVERLAP`
Short notes	500	50
Lecture slides	800	100
Textbook chapters	1000	200
Code documentation	1200	150
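
The choice of chunk size and overlap also determines how many chunks, and therefore how many embedding calls, a document produces. The sketch below gives a rough estimate, assuming a character-based sliding window that advances by `CHUNK_SIZE - CHUNK_OVERLAP`; the service's actual splitter may produce different counts, and `estimate_chunk_count` is a hypothetical name.

```python
def estimate_chunk_count(text_length, chunk_size, chunk_overlap):
    """Estimate the number of chunks for a sliding window of chunk_size with chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    if text_length <= 0:
        return 0
    if text_length <= chunk_size:
        return 1
    step = chunk_size - chunk_overlap
    remaining = text_length - chunk_size
    return -(-remaining // step) + 1  # integer ceiling division, plus the first chunk

# A 10,000-character document under two of the presets above.
print(estimate_chunk_count(10_000, 500, 50))
print(estimate_chunk_count(10_000, 1000, 200))
```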

Search Quality

Use Case	`TOP_K_RESULTS`	`SIMILARITY_THRESHOLD`
Precise answers	3-5	0.7-0.8
Exploration	10-15	0.4-0.5
Comprehensive	20+	0.3

Validation

Check Current Configuration

from rag_service.config import settings

print(f"Qdrant: {settings.qdrant_host}:{settings.qdrant_port}")
print(f"Ollama: {settings.ollama_host}:{settings.ollama_port}")
print(f"Model: {settings.ollama_model}")
print(f"Chunk size: {settings.chunk_size}")

Health Check Verification

# Check service health
curl http://localhost:8081/health

# Example response; collection.name reflects QDRANT_COLLECTION_NAME
{
  "status": "healthy",
  "qdrant_connected": true,
  "collection": {
    "name": "academic_documents",
    "points_count": 156
  }
}
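
For scripted checks, the health response can be reduced to a few fields. The sketch below parses a response body shaped like the example above. It assumes only the fields shown there, and `summarize_health` is a hypothetical helper.

```python
import json

def summarize_health(body):
    """Extract the status and collection details from a /health response body."""
    data = json.loads(body)
    collection = data.get("collection") or {}
    return {
        "ok": data.get("status") == "healthy" and data.get("qdrant_connected") is True,
        "collection": collection.get("name"),
        "points": collection.get("points_count", 0),
    }

sample = '{"status": "healthy", "qdrant_connected": true, "collection": {"name": "academic_documents", "points_count": 156}}'
print(summarize_health(sample))
```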

Troubleshooting

Connection Issues

Error	Check	Solution
Qdrant connection refused	`QDRANT_HOST`, `QDRANT_PORT`	Verify Qdrant is running
Ollama connection failed	`OLLAMA_HOST`, `OLLAMA_PORT`	Check Ollama container
Model not found	`OLLAMA_MODEL`	Pull model: `ollama pull nomic-embed-text`

Dimension Mismatch

Error: Vector dimension mismatch

Solution:

1. Delete the existing collection.
2. Update EMBEDDING_DIMENSION to match the model.
3. Re-index the documents.

CORS Errors

# Allow specific origins
CORS_ORIGINS=["http://localhost:5173","http://localhost:3000"]

# Or allow all (development only)
CORS_ORIGINS=["*"]

Related Documentation

Architecture - System design
Development - Local setup
Deployment - Production config
