RAG Service Development

This document covers local development setup, testing, and debugging for the RAG service.

Prerequisites

Python 3.12+
Docker (for Qdrant and Ollama)
uv or pip for package management

Local Setup

1. Start Dependencies

# Start Qdrant and Ollama
docker compose up -d qdrant ollama

# Initialize embedding model
docker exec ollama ollama pull nomic-embed-text

# Verify services
curl http://localhost:6333/healthz     # Qdrant
curl http://localhost:11434/api/tags   # Ollama
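
The same checks can be scripted when curl output is hard to read at a glance. The following is a minimal sketch using only the standard library; the URLs come from the commands above, while `http_ok` and `check_services` are hypothetical helper names, not part of the service.

```python
import urllib.error
import urllib.request

# URLs taken from the curl commands above.
SERVICES = {
    "qdrant": "http://localhost:6333/healthz",
    "ollama": "http://localhost:11434/api/tags",
}


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def check_services(services=SERVICES, probe=http_ok) -> dict:
    """Map each service name to whether its endpoint responded."""
    return {name: probe(url) for name, url in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'up' if ok else 'DOWN'}")
```

The `probe` parameter exists only so the function can be exercised without running containers.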

2. Install Python Dependencies

cd rag_service

# Using uv (recommended)
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"

# Or using pip
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

3. Configure Environment

# Create .env file
cat > .env << EOF
QDRANT_HOST=localhost
QDRANT_PORT=6333
OLLAMA_HOST=localhost
OLLAMA_PORT=11434
DOCUMENTS_PATH=./documents
EOF
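
A quick sanity check of the resulting file can catch typos before the service starts. This is a sketch under assumptions: it treats the five keys from the example above as required (the service may well define defaults for some of them), and `parse_env` and `validate_env` are hypothetical helpers, not how `config.py` actually loads its settings.

```python
# Keys from the example .env above; treating all of them as required is an assumption.
REQUIRED_KEYS = {
    "QDRANT_HOST",
    "QDRANT_PORT",
    "OLLAMA_HOST",
    "OLLAMA_PORT",
    "DOCUMENTS_PATH",
}


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def validate_env(values: dict) -> list:
    """Return a list of human-readable problems (empty when all is well)."""
    problems = [f"missing {key}" for key in sorted(REQUIRED_KEYS - values.keys())]
    for key in ("QDRANT_PORT", "OLLAMA_PORT"):
        if key in values and not values[key].isdigit():
            problems.append(f"{key} is not a number: {values[key]!r}")
    return problems


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as fh:
        for problem in validate_env(parse_env(fh.read())):
            print(problem)
```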

4. Run the Service

# Development mode with auto-reload
uvicorn rag_service.api:app --reload --port 8081

# Or using the module
python -m rag_service

5. Verify Installation

# Health check
curl http://localhost:8081/health

# API info
curl http://localhost:8081/
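
With `--reload`, the service can take a moment to come up, so a single curl may fail spuriously. One way to handle that in scripts is to poll the health endpoint. The sketch below assumes only the `/health` URL shown above; `wait_until_healthy` and its parameters are hypothetical.

```python
import time
import urllib.error
import urllib.request


def _http_ok(url: str) -> bool:
    """Return True if a GET on `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=2.0) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        return False


def wait_until_healthy(
    url: str = "http://localhost:8081/health",
    attempts: int = 10,
    delay: float = 0.5,
    probe=None,
    sleep=time.sleep,
) -> bool:
    """Poll `url` up to `attempts` times, pausing `delay` seconds between tries."""
    probe = probe or _http_ok
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:  # no pause after the final attempt
            sleep(delay)
    return False


if __name__ == "__main__":
    print("healthy" if wait_until_healthy() else "service did not come up")
```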

Project Structure

rag_service/
├── api.py                    # FastAPI application entry point
├── config.py                 # Pydantic settings
├── models.py                 # Request/response models
├── logging_config.py         # Structured logging
├── __main__.py               # Module entry point
├── pyproject.toml            # Dependencies
│
├── routes/                   # API endpoints
│   ├── general.py            # /, /health
│   ├── search_index.py       # /search, /index
│   ├── files.py              # /files, /upload
│   └── subjects.py           # /subjects
│
├── documents/                # Document processing
│   ├── file_loader.py        # Load PDF, TXT, MD
│   ├── document_processor.py # Text chunking
│   └── file_utils.py         # File system ops
│
├── embeddings/               # Vector operations
│   ├── embeddings.py         # Ollama embeddings
│   └── store.py              # Qdrant vector store
│
├── tests/                    # Test suite
│   ├── conftest.py           # Fixtures
│   ├── test_*.py             # Unit tests
│   └── test_integration.py   # Integration tests
│
└── documents/                # Sample documents

Development Workflow

Running Tests

cd rag_service

# All tests
pytest tests/ -v

# Unit tests only (no external services)
pytest tests/ -m "not integration" -v

# Integration tests (requires Qdrant/Ollama)
pytest tests/ -m integration -v

# With coverage
pytest tests/ -v --cov=. --cov-report=html

Test Markers

# pytest.ini
[pytest]
markers =
    unit: Unit tests (no external dependencies)
    integration: Integration tests (require external services)
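
Integration tests fail noisily when Qdrant or Ollama is not running. A cheap TCP probe lets the test suite detect that up front. This is a sketch, not part of the existing test suite: `port_open` and `services_available` are hypothetical helpers, and the ports are the defaults used elsewhere in this document.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolvable hosts.
        return False


def services_available(
    endpoints=(("localhost", 6333), ("localhost", 11434)),
) -> bool:
    """True only if every (host, port) pair accepts connections."""
    return all(port_open(host, port) for host, port in endpoints)
```

In `conftest.py`, the result could feed a `pytest.mark.skipif(not services_available(), reason="Qdrant/Ollama not running")` condition so that integration tests are skipped rather than failed.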

Code Quality

# Linting
ruff check .

# Formatting
black .
isort .

# Type checking
mypy .

Testing Strategies

Unit Tests

Test individual components without external services:

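# tests/test_document_processor.py
# Module paths follow the Project Structure section.
from rag_service.documents.document_processor import DocumentProcessor
from rag_service.models import Document


def test_chunk_document(sample_metadata):  # fixture defined in conftest.py
    processor = DocumentProcessor(chunk_size=100, chunk_overlap=20)
    doc = Document(content="x" * 500, metadata=sample_metadata)

    chunks = processor.chunk_document(doc)

    assert len(chunks) > 1
    assert all(len(c.content) <= 120 for c in chunks)  # Allow overlap
    assert all(c.metadata.chunk_id is not None for c in chunks)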
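
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sliding-window sketch. It is not the service's `DocumentProcessor` (which, judging by the assertion above, may emit chunks up to `chunk_size + chunk_overlap` long); `chunk_text` is a hypothetical function that only illustrates the general technique.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    parts = chunk_text("x" * 500, chunk_size=100, chunk_overlap=20)
    print(len(parts), [len(p) for p in parts])
```

Each window starts 80 characters after the previous one, so the last 20 characters of one chunk reappear at the start of the next.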

Mock External Services

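# tests/test_search.py
from unittest.mock import Mock, patch

# Assumed location: the same module that exposes get_vector_store
from rag_service.embeddings.store import VectorStoreService


# Decorators apply bottom-up, so the OllamaEmbeddings mock is the first argument
@patch('rag_service.embeddings.store.QdrantClient')
@patch('rag_service.embeddings.embeddings.OllamaEmbeddings')
def test_search_with_mocks(mock_ollama, mock_qdrant):
    # Configure mocks: a 768-dimensional vector and an empty result set
    mock_ollama.return_value.embed_query.return_value = [0.1] * 768
    mock_qdrant.return_value.query_points.return_value = Mock(points=[])

    # Test search
    store = VectorStoreService()
    results = store.search("test query")

    assert results == []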
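
The order of stacked `@patch` decorators is a common source of swapped mocks. The following self-contained sketch demonstrates the rule with standard-library targets (`os.getcwd` and `os.listdir`, chosen only for illustration); `describe_cwd` is a hypothetical function.

```python
import os
from unittest.mock import patch


# The decorator closest to the function is applied first, so its mock
# becomes the first positional argument.
@patch("os.listdir")
@patch("os.getcwd")
def describe_cwd(mock_getcwd, mock_listdir):
    mock_getcwd.return_value = "/fake"
    mock_listdir.return_value = ["a.txt"]
    # Both calls below hit the mocks, not the real filesystem.
    return os.getcwd(), os.listdir(os.getcwd())


if __name__ == "__main__":
    print(describe_cwd())
```

Once the call returns, both patches are undone and `os.getcwd` behaves normally again.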

Integration Tests

Test with real external services:

# tests/test_integration.py
import pytest

from rag_service.embeddings.store import get_vector_store
from rag_service.models import Document, DocumentMetadata


@pytest.mark.integration
def test_full_indexing_pipeline():
    """Test complete indexing and search flow."""
    store = get_vector_store()

    # Index a document
    doc = Document(
        content="Docker is a containerization platform.",
        metadata=DocumentMetadata(
            asignatura="iv",
            tipo_documento="teoria"
        )
    )
    indexed = store.index_documents([doc])
    assert indexed > 0

    # Search for it
    results = store.search("What is Docker?")
    assert len(results) > 0
    assert "Docker" in results[0].content

Fixtures

# tests/conftest.py
import pytest
from rag_service.models import Document, DocumentMetadata

@pytest.fixture
def sample_metadata():
    return DocumentMetadata(
        asignatura="test-subject",
        tipo_documento="test-type"
    )

@pytest.fixture
def sample_document(sample_metadata):
    return Document(
        content="Test document content for unit testing.",
        metadata=sample_metadata,
        doc_id="test-doc"
    )

@pytest.fixture
def temp_documents_dir(tmp_path):
    """Create temporary documents directory structure."""
    docs = tmp_path / "documents"
    (docs / "test-subject" / "test-type").mkdir(parents=True)
    return docs
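
The `temp_documents_dir` fixture suggests a `documents/<asignatura>/<tipo_documento>/` layout. Assuming that layout holds (it is inferred from the fixture, not confirmed by the service code), metadata could be derived from a file's location roughly as follows. `metadata_from_path` is a hypothetical helper.

```python
from pathlib import Path


def metadata_from_path(root: Path, file_path: Path) -> dict:
    """Derive metadata from <root>/<asignatura>/<tipo_documento>/<file>."""
    # relative_to raises ValueError if file_path is not under root.
    parts = file_path.relative_to(root).parts
    if len(parts) < 3:
        raise ValueError(f"expected <subject>/<type>/<file>, got {file_path}")
    return {
        "asignatura": parts[0],
        "tipo_documento": parts[1],
        "filename": parts[-1],
    }


if __name__ == "__main__":
    root = Path("documents")
    print(metadata_from_path(root, root / "test-subject" / "test-type" / "notes.md"))
```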

Debugging

Enable Debug Logging

# logging_config.py
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("rag_service")
logger.setLevel(logging.DEBUG)
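
Since `logging_config.py` is described as structured logging, debug output is often easier to filter when each record is a JSON object. The sketch below shows one way to do that with the standard library; `JsonFormatter` and `configure_debug_logging` are hypothetical names and may differ from what the module actually defines.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_debug_logging(name: str = "rag_service") -> logging.Logger:
    """Attach a JSON stream handler at DEBUG level to the named logger."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]  # replace handlers so repeated calls do not duplicate output
    logger.propagate = False
    return logger


if __name__ == "__main__":
    configure_debug_logging().debug("indexed %d chunks", 3)
```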

Debug Endpoints

# Check collection info
curl http://localhost:8081/collection/info

# List all files
curl http://localhost:8081/files

# Check subjects
curl http://localhost:8081/subjects

Interactive Debugging

# Run a one-off check with the service modules loaded
python -c "
from rag_service.embeddings.store import get_vector_store
from rag_service.config import settings

print(f'Qdrant: {settings.qdrant_host}:{settings.qdrant_port}')

store = get_vector_store()
info = store.get_collection_info()
print(f'Collection: {info}')
"

Debug in VSCode

// .vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "RAG Service",
            "type": "python",
            "request": "launch",
            "module": "uvicorn",
            "args": [
                "rag_service.api:app",
                "--reload",
                "--port", "8081"
            ],
            "env": {
                "QDRANT_HOST": "localhost",
                "OLLAMA_HOST": "localhost"
            }
        },
        {
            "name": "Debug Tests",
            "type": "python",
            "request": "launch",
            "module": "pytest",
            "args": ["-v", "-s", "tests/"]
        }
    ]
}

API Development

FastAPI Interactive Docs

Swagger UI: http://localhost:8081/docs
ReDoc: http://localhost:8081/redoc

Test Requests

# Upload a file
curl -X POST http://localhost:8081/upload \
  -F "file=@test.pdf" \
  -F 'metadata={"asignatura":"test","tipo_documento":"apuntes","auto_index":true}'

# Search
curl -X POST http://localhost:8081/search \
  -H "Content-Type: application/json" \
  -d '{"query": "test query", "top_k": 5}'

# Index directly
curl -X POST http://localhost:8081/index \
  -H "Content-Type: application/json" \
  -d '[{"content": "Test content", "metadata": {"asignatura": "test", "tipo_documento": "apuntes"}}]'
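
The search request above can also be issued from Python without extra dependencies. This sketch mirrors the curl call using `urllib`; the endpoint and payload come from the example above, while `build_search_request` and `search` are hypothetical helper names and the response is assumed to be JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8081"


def build_search_request(query: str, top_k: int = 5, base_url: str = BASE_URL):
    """Build the POST /search request without sending it."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, top_k: int = 5):
    """Send the request and decode the JSON response (requires a running service)."""
    with urllib.request.urlopen(build_search_request(query, top_k), timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(search("test query"))
```

Separating request construction from sending keeps the payload easy to inspect while debugging.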

Example Upload Script

# upload_example.py
import json

import requests

# Metadata is sent as a JSON string in the "metadata" form field
metadata = {
    "asignatura": "logica-difusa",
    "tipo_documento": "apuntes",
    "tema": "Conjuntos difusos",
    "auto_index": True
}

# Upload and index a document; the context manager closes the file afterwards
with open('tema1.pdf', 'rb') as pdf:
    response = requests.post(
        "http://localhost:8081/upload",
        files={'file': pdf},
        data={"metadata": json.dumps(metadata)}
    )

print(response.json())

Docker Development

Build Image

# Build with dev dependencies
docker build -f rag_service/Dockerfile \
  --build-arg INSTALL_DEV=true \
  -t rag-service:dev .

Run Container

docker run -it --rm \
  -p 8081:8081 \
  -e QDRANT_HOST=host.docker.internal \
  -e OLLAMA_HOST=host.docker.internal \
  -v "$(pwd)/documents:/app/documents" \
  rag-service:dev

Docker Compose Development

# Start all services
docker compose up -d

# View logs
docker compose logs -f rag_service

# Restart after code changes
docker compose restart rag_service

# Rebuild and restart
docker compose up -d --build rag_service

Troubleshooting

Common Issues

Issue	Solution
`Connection refused` to Qdrant	Ensure Qdrant is running: `docker compose up -d qdrant`
`Connection refused` to Ollama	Ensure Ollama is running: `docker compose up -d ollama`
`Model not found`	Pull model: `docker exec ollama ollama pull nomic-embed-text`
`Dimension mismatch`	Delete the collection and ensure `EMBEDDING_DIMENSION` matches the embedding model's output size (768 for `nomic-embed-text`)
`File not found`	Check `DOCUMENTS_PATH` configuration
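
A dimension mismatch is easier to diagnose when it is caught before vectors reach Qdrant. The following is a minimal sketch, assuming the 768-dimensional vectors used in the mock example earlier; `check_dimension` is a hypothetical helper and not part of the service.

```python
# Assumed to mirror EMBEDDING_DIMENSION; 768 matches the mocked vector length above.
EXPECTED_DIMENSION = 768


def check_dimension(vector, expected: int = EXPECTED_DIMENSION):
    """Return the vector unchanged, or raise if its length is unexpected."""
    if len(vector) != expected:
        raise ValueError(
            f"Dimension mismatch: got {len(vector)}, collection expects {expected}"
        )
    return vector


if __name__ == "__main__":
    check_dimension([0.1] * 768)      # passes silently
    try:
        check_dimension([0.1] * 384)  # e.g. a different embedding model
    except ValueError as exc:
        print(exc)
```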

Check Service Status

# Qdrant
curl http://localhost:6333/healthz

# Ollama models
curl http://localhost:11434/api/tags

# RAG Service
curl http://localhost:8081/health

Reset Vector Store

from rag_service.embeddings.store import get_vector_store

store = get_vector_store()
store.delete_collection()
# Collection will be recreated on next index

Related Documentation

Architecture - System design
Configuration - Environment variables
API Endpoints - API reference
Deployment - Production setup
