Vector Store with Qdrant

This document describes the vector storage and retrieval system using Qdrant for semantic search.

Overview

The RAG service uses Qdrant as the vector database for storing document embeddings and performing similarity searches.

graph TB
    subgraph "VectorStoreService"
        Index[index_documents]
        Search[search]
        Info[get_collection_info]
    end
    
    subgraph "Qdrant :6333"
        Collection[(academic_documents)]
        Vectors[Vector Index]
        Payloads[Metadata Payloads]
    end
    
    Index -->|upsert| Collection
    Search -->|query_points| Collection
    Info -->|get_collection| Collection
    Collection --> Vectors
    Collection --> Payloads

VectorStoreService Class

Located in embeddings/store.py:

class VectorStoreService:
    """Service for managing Qdrant vector store."""
    
    def index_documents(self, documents: list[Document], auto_chunk: bool = True) -> int:
        """Index documents into the vector store."""
        
    def search(self, query: str, top_k: int = 5, 
               score_threshold: float = 0.5,
               filters: dict[str, str] | None = None) -> list[SearchResult]:
        """Perform semantic search over indexed documents."""
        
    def get_collection_info(self) -> dict[str, Any]:
        """Get information about the Qdrant collection."""
        
    def delete_collection(self) -> None:
        """Delete the collection and all its data."""

Collection Configuration

Default Settings

Setting	Value	Description
Name	`academic_documents`	Collection name
Distance	`COSINE`	Similarity metric
Vector Size	`768`	Embedding dimensions
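
To make the `COSINE` setting concrete, here is a minimal sketch of cosine similarity in plain Python. It is illustrative only: Qdrant computes the score internally, and `cosine_similarity` is a hypothetical helper name, not part of the service.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score close to 1.0 regardless of magnitude;
# orthogonal vectors score 0.0.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

In the real collection each vector has 768 components, matching the configured vector size.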

Collection Initialization

The collection is created automatically on first access:

def _init_collection(self):
    collections = self.client.get_collections().collections
    collection_names = [c.name for c in collections]
    
    if self.collection_name not in collection_names:
        self.client.create_collection(
            collection_name=self.collection_name,
            vectors_config=VectorParams(
                size=settings.embedding_dimension,
                distance=Distance.COSINE,
            ),
        )

Document Indexing

Indexing Flow

sequenceDiagram
    participant Client
    participant VectorStore
    participant DocProcessor
    participant EmbeddingService
    participant Qdrant
    
    Client->>VectorStore: index_documents(docs)
    VectorStore->>DocProcessor: chunk_documents(docs)
    DocProcessor-->>VectorStore: chunks[]
    
    VectorStore->>EmbeddingService: embed_documents(texts)
    EmbeddingService-->>VectorStore: embeddings[]
    
    VectorStore->>VectorStore: _create_points()
    VectorStore->>Qdrant: upsert(points)
    Qdrant-->>VectorStore: success
    VectorStore-->>Client: indexed_count
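
The flow in the diagram can be sketched end to end with in-memory stand-ins. Everything here is an assumption made for illustration: `chunk_text`, `fake_embed`, and `InMemoryStore` are hypothetical placeholders for the document processor, the embedding service, and Qdrant, and the real `index_documents` may differ.

```python
from dataclasses import dataclass, field


def chunk_text(text: str, chunk_size: int = 40) -> list[str]:
    """Stand-in for the document processor: fixed-size character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def fake_embed(texts: list[str], dim: int = 4) -> list[list[float]]:
    """Stand-in for the embedding service: one toy vector per text."""
    return [[float(len(t) % (k + 2)) for k in range(dim)] for t in texts]


@dataclass
class InMemoryStore:
    """Stand-in for Qdrant: points are stored keyed by their id."""
    points: dict[int, dict] = field(default_factory=dict)

    def upsert(self, points: list[dict]) -> None:
        for point in points:
            self.points[point["id"]] = point  # an existing id is overwritten


def index_documents(docs: list[str], store: InMemoryStore) -> int:
    # Chunk, embed, build points, upsert, then report how many were indexed.
    chunks = [chunk for doc in docs for chunk in chunk_text(doc)]
    embeddings = fake_embed(chunks)
    points = [
        {"id": idx, "vector": vec, "payload": {"content": chunk, "chunk_id": idx}}
        for idx, (chunk, vec) in enumerate(zip(chunks, embeddings))
    ]
    store.upsert(points)
    return len(points)


store = InMemoryStore()
indexed_count = index_documents(["a" * 100], store)  # 100 chars -> 40 + 40 + 20
```

The returned count is the number of chunks, not the number of source documents, which is why a single document can yield several points.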

Usage

from rag_service.embeddings.store import get_vector_store
from rag_service.models import Document, DocumentMetadata

vector_store = get_vector_store()

# Create document
doc = Document(
    content="Fuzzy logic is a form of many-valued logic...",
    metadata=DocumentMetadata(
        asignatura="logica-difusa",
        tipo_documento="apuntes",
        tema="Introducción"
    )
)

# Index with automatic chunking
indexed_count = vector_store.index_documents([doc], auto_chunk=True)
print(f"Indexed {indexed_count} chunks")

Point Structure

Each document chunk is stored as a Qdrant point:

PointStruct(
    id=idx,
    vector=embedding,  # 768-dim float array
    payload={
        "content": "Document text content...",
        "filename": "tema1.pdf",
        "asignatura": "logica-difusa",
        "tipo_documento": "apuntes",
        "tema": "Introducción",
        "chunk_id": 0,
        # ... other metadata
    }
)
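
As a rough illustration of how such a payload could be assembled, the sketch below flattens chunk text and metadata into one dict. `ChunkMetadata` and `build_payload` are hypothetical names introduced here; the service's actual `_create_points` logic is not shown in this document.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ChunkMetadata:
    """Hypothetical metadata model mirroring some of the payload fields above."""
    asignatura: str
    tipo_documento: str
    tema: Optional[str] = None
    filename: Optional[str] = None


def build_payload(content: str, metadata: ChunkMetadata, chunk_id: int) -> dict:
    """Flatten chunk text and metadata into a single payload dict."""
    payload = {"content": content, "chunk_id": chunk_id}
    # Skip unset fields so the payload only carries real values.
    payload.update({k: v for k, v in asdict(metadata).items() if v is not None})
    return payload


payload = build_payload(
    "Document text content...",
    ChunkMetadata(asignatura="logica-difusa", tipo_documento="apuntes", tema="Introducción"),
    chunk_id=0,
)
```

Keeping metadata as flat top-level keys lets each field be used directly as a filter key in searches.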

Semantic Search

Search Parameters

Parameter	Type	Default	Description
`query`	string	required	Natural language query
`top_k`	int	5	Maximum results
`score_threshold`	float	0.5	Minimum similarity (0-1)
`filters`	dict	None	Metadata filters

Basic Search

results = vector_store.search(
    query="What is a membership function?",
    top_k=5
)

for result in results:
    print(f"Score: {result.score:.2f}")
    print(f"Content: {result.content[:100]}...")
    print(f"Subject: {result.metadata['asignatura']}")

Filtered Search

# Filter by subject
results = vector_store.search(
    query="Docker containers",
    filters={"asignatura": "iv"}
)

# Filter by subject and document type
results = vector_store.search(
    query="Practice exercises",
    filters={
        "asignatura": "logica-difusa",
        "tipo_documento": "ejercicios"
    }
)

Score Threshold

Adjust the minimum similarity score:

# High precision (fewer, more relevant results)
results = vector_store.search(
    query="fuzzy sets",
    score_threshold=0.8
)

# High recall (more results, some less relevant)
results = vector_store.search(
    query="fuzzy sets",
    score_threshold=0.3
)
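
The precision/recall trade-off can be seen on a toy list of scored candidates. This is a sketch only: the candidate texts and scores are made up, `apply_threshold` is a hypothetical helper, and keeping scores equal to the threshold is an assumption about the cutoff behaviour.

```python
def apply_threshold(
    scored: list[tuple[str, float]],
    score_threshold: float,
    top_k: int = 5,
) -> list[tuple[str, float]]:
    """Keep candidates at or above the threshold, best first, at most top_k."""
    # Assumption: a score exactly equal to the threshold is kept.
    kept = [(text, score) for text, score in scored if score >= score_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


# Made-up scores for illustration only.
candidates = [
    ("fuzzy sets definition", 0.91),
    ("membership functions", 0.74),
    ("crisp sets", 0.52),
    ("docker volumes", 0.31),
]

high_precision = apply_threshold(candidates, score_threshold=0.8)
high_recall = apply_threshold(candidates, score_threshold=0.3)
```

With these numbers the strict threshold keeps one candidate and the loose one keeps all four, including the unrelated "docker volumes" entry.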

Filter Implementation

Filters are converted to Qdrant conditions:

def search(self, query, filters=None):
    qdrant_filter = None
    
    if filters:
        conditions = []
        for key, value in filters.items():
            conditions.append(
                FieldCondition(
                    key=key,
                    match=MatchValue(value=value),
                )
            )
        qdrant_filter = Filter(must=conditions)
    
    # query_embedding is the embedded query vector; its computation is omitted from this excerpt
    results = self.client.query_points(
        collection_name=self.collection_name,
        query=query_embedding,
        query_filter=qdrant_filter,
        ...
    )
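
Because every condition is placed in `must`, a point is returned only if all of its filtered fields match exactly. The plain-Python sketch below mimics that semantics on payload dicts; `matches_filters` is a hypothetical helper for illustration and does not call Qdrant.

```python
from typing import Optional


def matches_filters(payload: dict[str, str], filters: Optional[dict[str, str]]) -> bool:
    """Return True when the payload satisfies every filter (logical AND)."""
    if not filters:
        return True  # no filters means no restriction
    return all(payload.get(key) == value for key, value in filters.items())


payloads = [
    {"asignatura": "logica-difusa", "tipo_documento": "apuntes"},
    {"asignatura": "logica-difusa", "tipo_documento": "ejercicios"},
    {"asignatura": "iv", "tipo_documento": "apuntes"},
]

selected = [
    p for p in payloads
    if matches_filters(p, {"asignatura": "logica-difusa", "tipo_documento": "ejercicios"})
]
```

Only the second payload is selected, since it is the only one that matches both the subject and the document type.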

Available Filter Fields

Field	Type	Example
`asignatura`	string	`"logica-difusa"`
`tipo_documento`	string	`"apuntes"`
`tema`	string	`"Conjuntos difusos"`
`autor`	string	`"Profesor"`
`fuente`	string	`"PRADO UGR"`
`idioma`	string	`"es"`

Collection Management

Get Collection Info

info = vector_store.get_collection_info()
print(info)
# {
#     "name": "academic_documents",
#     "vectors_count": 156,
#     "points_count": 156,
#     "status": "green"
# }

Delete Collection

⚠️ Warning: This permanently deletes all indexed documents.

vector_store.delete_collection()
# Collection will be recreated on next index operation

Configuration

Environment variables:

Variable	Default	Description
`QDRANT_HOST`	`qdrant`	Qdrant server hostname
`QDRANT_PORT`	`6333`	Qdrant server port
`QDRANT_COLLECTION_NAME`	`academic_documents`	Collection name
`TOP_K_RESULTS`	`5`	Default search results
`SIMILARITY_THRESHOLD`	`0.5`	Default score threshold
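
A minimal sketch of reading these variables with their documented defaults, using only the standard library. `QdrantSettings` and `load_settings` are hypothetical names; the service's own settings object may be implemented differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class QdrantSettings:
    host: str
    port: int
    collection_name: str
    top_k_results: int
    similarity_threshold: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> QdrantSettings:
    """Read settings from the environment, falling back to documented defaults."""
    env = os.environ if env is None else env
    return QdrantSettings(
        host=env.get("QDRANT_HOST", "qdrant"),
        port=int(env.get("QDRANT_PORT", "6333")),
        collection_name=env.get("QDRANT_COLLECTION_NAME", "academic_documents"),
        top_k_results=int(env.get("TOP_K_RESULTS", "5")),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.5")),
    )


# Passing a dict instead of os.environ keeps the example deterministic.
defaults = load_settings({})
overridden = load_settings({"QDRANT_HOST": "localhost", "SIMILARITY_THRESHOLD": "0.7"})
```

The default host `qdrant` matches the container name in the Docker setup; when running the service outside Docker, `QDRANT_HOST=localhost` is the likely override.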

Qdrant Client Usage

Direct Client Access

from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)

# List collections
collections = client.get_collections()
print([c.name for c in collections.collections])

# Get collection details
info = client.get_collection("academic_documents")
print(f"Points: {info.points_count}")
print(f"Status: {info.status}")

Qdrant Web UI

Access the Qdrant dashboard:

http://localhost:6333/dashboard

Features:

Browse collections
View points and payloads
Test searches
Monitor performance

Performance Optimization

Indexing Batch Size

For large document sets:

# Process in batches to avoid memory issues
BATCH_SIZE = 100
for i in range(0, len(documents), BATCH_SIZE):
    batch = documents[i:i + BATCH_SIZE]
    vector_store.index_documents(batch)

Search Optimization

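Use filters: narrow the search scope with metadata filters such as `asignatura` or `tipo_documento`
Tune top_k: request only as many results as you need
Adjust threshold: raise `score_threshold` for precision, lower it for recall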

Index Configuration

For production, consider tuning the collection's optimizer and HNSW index parameters, for example:

# In Qdrant config
{
    "optimizers_config": {
        "indexing_threshold": 10000,
        "memmap_threshold": 50000
    },
    "hnsw_config": {
        "m": 16,
        "ef_construct": 100
    }
}

Error Handling

Common Errors

Error	Cause	Solution
`Connection refused`	Qdrant not running	Start Qdrant container
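`Collection not found`	Collection was deleted or never created	It is recreated automatically on the next index operation
`Dimension mismatch`	Embedding model output size differs from the collection's vector size	Delete the collection and re-index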
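
A dimension mismatch can be caught before anything is sent to Qdrant. The check below is a sketch under the assumption that embeddings arrive as lists of floats; `validate_embeddings` is a hypothetical helper, not part of the service.

```python
EXPECTED_DIMENSION = 768  # the Vector Size documented for the collection


def validate_embeddings(
    embeddings: list[list[float]],
    expected_dim: int = EXPECTED_DIMENSION,
) -> None:
    """Raise ValueError if any embedding has the wrong number of dimensions."""
    for position, vector in enumerate(embeddings):
        if len(vector) != expected_dim:
            raise ValueError(
                f"embedding {position} has {len(vector)} dimensions, "
                f"expected {expected_dim}"
            )


# A correctly sized batch passes silently.
validate_embeddings([[0.0] * 768, [0.1] * 768])
```

Failing early gives a clearer message than an error raised by the server during upsert.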

Retry Logic

from tenacity import retry, stop_after_attempt, wait_exponential

# Try up to 3 times, waiting exponentially longer between attempts
@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1))
def robust_search(query):
    return vector_store.search(query)
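
To show what the retry decorator does conceptually, here is a standard-library sketch of retrying with exponential backoff. It is not tenacity's implementation and does not reproduce its exact wait schedule; `retry_with_backoff` and `flaky_search` are hypothetical names, and the `sleep` parameter is injectable so the example runs without actually waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


calls = {"count": 0}
delays: list[float] = []


def flaky_search() -> str:
    """Fails twice, then succeeds, imitating a store that is still starting."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("Qdrant not reachable yet")
    return "ok"


# Record the requested delays instead of sleeping.
result = retry_with_backoff(flaky_search, sleep=delays.append)
```

Retrying is mainly useful for transient failures such as a refused connection while the container starts; errors like a dimension mismatch will fail identically on every attempt.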

Testing

Unit Tests

# tests/test_vector_store.py

def test_index_documents(vector_store, sample_documents):
    count = vector_store.index_documents(sample_documents)
    assert count > 0

def test_search(vector_store):
    results = vector_store.search("test query")
    assert isinstance(results, list)
    for result in results:
        assert hasattr(result, 'content')
        assert hasattr(result, 'score')
        assert 0 <= result.score <= 1

Integration Tests

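import pytest

from rag_service.embeddings.store import get_vector_store
from rag_service.models import Document


@pytest.mark.integration
def test_full_pipeline():
    vector_store = get_vector_store()

    # Index
    doc = Document(content="Test content", metadata=...)
    indexed = vector_store.index_documents([doc])
    assert indexed > 0

    # Search
    results = vector_store.search("Test")
    assert len(results) > 0
    assert "Test" in results[0].content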

Docker Setup

Qdrant Container

# docker-compose.yml
qdrant:
  image: qdrant/qdrant:latest
  container_name: qdrant
  ports:
    - "6333:6333"
    - "6334:6334"  # gRPC
  volumes:
    - qdrant_storage:/qdrant/storage
  restart: unless-stopped

Verify Connection

# Check Qdrant health
curl http://localhost:6333/healthz

# List collections
curl http://localhost:6333/collections
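
The same health check can be done from Python, for example in a startup script. This is a minimal sketch using only the standard library; `is_qdrant_healthy` is a hypothetical helper, and treating any network or HTTP error as "unhealthy" is a simplifying assumption.

```python
import http.client
import urllib.request


def is_qdrant_healthy(base_url: str = "http://localhost:6333", timeout: float = 2.0) -> bool:
    """Return True if the /healthz endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error responses.
        return False
```

A caller might poll this function a few times before indexing, rather than failing on the first `Connection refused`.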

Related Documentation

Embeddings - Embedding service
Document Processing - Chunking
API Endpoints - Search API
Architecture - System design