RAG Service Development
This document covers local development setup, testing, and debugging for the RAG service.
Prerequisites
- Python 3.12+
- Docker (for Qdrant and Ollama)
- uv or pip for package management
Local Setup
1. Start Dependencies
# Start Qdrant and Ollama
docker compose up -d qdrant ollama
# Initialize embedding model
docker exec ollama ollama pull nomic-embed-text
# Verify services
curl http://localhost:6333/healthz # Qdrant
curl http://localhost:11434/api/tags # Ollama
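Containers can take a few seconds to accept connections after `docker compose up -d` returns, so a single curl may fail on a cold start. The sketch below polls the same two endpoints until they answer. `http_ok` and `wait_until_ready` are hypothetical helper names, not part of the service, and the retry counts are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Iterable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False


def wait_until_ready(
    urls: Iterable[str],
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Poll until every URL passes `probe`; return the URLs still failing."""
    pending = list(urls)
    for attempt in range(attempts):
        # Keep only the URLs that are not healthy yet
        pending = [u for u in pending if not probe(u)]
        if not pending:
            break
        if attempt < attempts - 1:
            sleep(delay)
    return pending
```

Calling `wait_until_ready(["http://localhost:6333/healthz", "http://localhost:11434/api/tags"])` returns an empty list once both services respond, or the list of URLs that never came up.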
2. Install Python Dependencies
cd rag_service
# Using uv (recommended)
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"
# Or using pip
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
3. Configure Environment
# Create .env file
cat > .env << EOF
QDRANT_HOST=localhost
QDRANT_PORT=6333
OLLAMA_HOST=localhost
OLLAMA_PORT=11434
DOCUMENTS_PATH=./documents
EOF
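For orientation, here is a rough sketch of how these variables could map onto typed settings. The service's own config.py uses Pydantic settings; this stand-in uses only the standard library, and the class name `LocalSettings` and the two URL properties are assumptions made for illustration. It reads from the process environment, so it assumes the variables have already been exported or loaded from .env by some other means.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LocalSettings:
    # Defaults mirror the .env values shown above
    qdrant_host: str = "localhost"
    qdrant_port: int = 6333
    ollama_host: str = "localhost"
    ollama_port: int = 11434
    documents_path: str = "./documents"

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "LocalSettings":
        """Build settings from environment variables, falling back to defaults."""
        return cls(
            qdrant_host=env.get("QDRANT_HOST", cls.qdrant_host),
            qdrant_port=int(env.get("QDRANT_PORT", cls.qdrant_port)),
            ollama_host=env.get("OLLAMA_HOST", cls.ollama_host),
            ollama_port=int(env.get("OLLAMA_PORT", cls.ollama_port)),
            documents_path=env.get("DOCUMENTS_PATH", cls.documents_path),
        )

    @property
    def qdrant_url(self) -> str:
        return f"http://{self.qdrant_host}:{self.qdrant_port}"

    @property
    def ollama_url(self) -> str:
        return f"http://{self.ollama_host}:{self.ollama_port}"
```

Port values are converted with `int`, so a malformed value such as `QDRANT_PORT=abc` fails at startup rather than at the first request.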
4. Run the Service
# Development mode with auto-reload
uvicorn rag_service.api:app --reload --port 8081
# Or using the module
python -m rag_service
5. Verify Installation
# Health check
curl http://localhost:8081/health
# API info
curl http://localhost:8081/
Project Structure
rag_service/
├── api.py # FastAPI application entry point
├── config.py # Pydantic settings
├── models.py # Request/response models
├── logging_config.py # Structured logging
├── __main__.py # Module entry point
├── pyproject.toml # Dependencies
│
├── routes/ # API endpoints
│ ├── general.py # /, /health
│ ├── search_index.py # /search, /index
│ ├── files.py # /files, /upload
│ └── subjects.py # /subjects
│
├── documents/ # Document processing
│ ├── file_loader.py # Load PDF, TXT, MD
│ ├── document_processor.py # Text chunking
│ └── file_utils.py # File system ops
│
├── embeddings/ # Vector operations
│ ├── embeddings.py # Ollama embeddings
│ └── store.py # Qdrant vector store
│
├── tests/ # Test suite
│ ├── conftest.py # Fixtures
│ ├── test_*.py # Unit tests
│ └── test_integration.py # Integration tests
│
└── documents/ # Sample documents
Development Workflow
Running Tests
cd rag_service
# All tests
pytest tests/ -v
# Unit tests only (no external services)
pytest tests/ -m "not integration" -v
# Integration tests (requires Qdrant/Ollama)
pytest tests/ -m integration -v
# With coverage
pytest tests/ -v --cov=. --cov-report=html
Test Markers
# pytest.ini
[pytest]
markers =
    unit: Unit tests (no external dependencies)
    integration: Integration tests (require external services)
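Because integration tests need Qdrant and Ollama, it can help to skip them automatically when those ports are closed instead of letting them fail. The helper below is one possible reachability check; `service_reachable` is a hypothetical name and the half-second timeout is an arbitrary choice.

```python
import socket


def service_reachable(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts
        return False
```

In a test module this could be combined with `pytest.mark.skipif(not service_reachable("localhost", 6333), reason="Qdrant not running")` alongside the `integration` marker.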
Code Quality
# Linting
ruff check .
# Formatting
black .
isort .
# Type checking
mypy .
Testing Strategies
Unit Tests
Test individual components without external services:
# tests/test_document_processor.py
from rag_service.documents.document_processor import DocumentProcessor
from rag_service.models import Document


def test_chunk_document(sample_metadata):  # fixture defined in tests/conftest.py
    processor = DocumentProcessor(chunk_size=100, chunk_overlap=20)
    doc = Document(content="x" * 500, metadata=sample_metadata)
    chunks = processor.chunk_document(doc)
    assert len(chunks) > 1
    assert all(len(c.content) <= 120 for c in chunks)  # Allow overlap
    assert all(c.metadata.chunk_id is not None for c in chunks)
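To make the chunk-size assertions more concrete, here is a minimal sketch of sliding-window chunking with overlap. `chunk_text` is a hypothetical function written for illustration only; the service's `DocumentProcessor` may split differently (for example on sentence boundaries, which would explain the looser 120-character bound in the test).

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks
```

With `chunk_size=100` and `chunk_overlap=20`, each window starts 80 characters after the previous one, so consecutive chunks repeat 20 characters of context.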
Mock External Services
# tests/test_search.py
from unittest.mock import Mock, patch

from rag_service.embeddings.store import VectorStoreService


@patch('rag_service.embeddings.store.QdrantClient')
@patch('rag_service.embeddings.embeddings.OllamaEmbeddings')
def test_search_with_mocks(mock_ollama, mock_qdrant):
    # Configure mocks
    mock_ollama.return_value.embed_query.return_value = [0.1] * 768
    mock_qdrant.return_value.query_points.return_value = Mock(points=[])
    # Test search
    store = VectorStoreService()
    results = store.search("test query")
    assert results == []
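The argument order in a test with stacked `@patch` decorators is easy to get backwards: decorators are applied bottom-up, so the decorator closest to the function supplies the first mock argument. The standalone demonstration below patches two standard-library functions rather than the service's classes, purely so it can run without Qdrant or Ollama installed.

```python
import os
from unittest.mock import patch


@patch("os.getcwd")  # outermost decorator -> second mock argument
@patch("os.getpid")  # innermost decorator -> first mock argument
def demo(mock_getpid, mock_getcwd):
    mock_getpid.return_value = 4242
    mock_getcwd.return_value = "/fake/dir"
    # Inside the function both names resolve to the mocks
    return os.getpid(), os.getcwd()
```

Once `demo()` returns, both functions are restored to their originals, which is why patches applied this way do not leak between tests.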
Integration Tests
Test with real external services:
# tests/test_integration.py
import pytest

from rag_service.embeddings.store import get_vector_store
from rag_service.models import Document, DocumentMetadata


@pytest.mark.integration
def test_full_indexing_pipeline():
    """Test complete indexing and search flow."""
    store = get_vector_store()
    # Index a document
    doc = Document(
        content="Docker is a containerization platform.",
        metadata=DocumentMetadata(
            asignatura="iv",
            tipo_documento="teoria"
        )
    )
    indexed = store.index_documents([doc])
    assert indexed > 0
    # Search for it
    results = store.search("What is Docker?")
    assert len(results) > 0
    assert "Docker" in results[0].content
Fixtures
# tests/conftest.py
import pytest

from rag_service.models import Document, DocumentMetadata


@pytest.fixture
def sample_metadata():
    return DocumentMetadata(
        asignatura="test-subject",
        tipo_documento="test-type"
    )


@pytest.fixture
def sample_document(sample_metadata):
    return Document(
        content="Test document content for unit testing.",
        metadata=sample_metadata,
        doc_id="test-doc"
    )


@pytest.fixture
def temp_documents_dir(tmp_path):
    """Create temporary documents directory structure."""
    docs = tmp_path / "documents"
    (docs / "test-subject" / "test-type").mkdir(parents=True)
    return docs
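The `temp_documents_dir` fixture suggests a `documents/<asignatura>/<tipo_documento>/` layout on disk. Assuming that layout holds, metadata could be derived from a file's location roughly as follows. `metadata_from_path` is a hypothetical helper, not a function from file_utils.py, and the returned keys are chosen to match the metadata fields used elsewhere in this document.

```python
from pathlib import Path


def metadata_from_path(docs_root: Path, file_path: Path) -> dict[str, str]:
    """Derive subject and document type from <root>/<subject>/<type>/<file>."""
    # relative_to raises ValueError if file_path is not under docs_root
    relative = file_path.relative_to(docs_root)
    if len(relative.parts) < 3:
        raise ValueError(f"expected <subject>/<type>/<file>, got {relative}")
    subject, doc_type = relative.parts[0], relative.parts[1]
    return {
        "asignatura": subject,
        "tipo_documento": doc_type,
        "filename": relative.name,
    }
```

A unit test could build a file under the fixture's directory and assert that the derived values are `test-subject` and `test-type`.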
Debugging
Enable Debug Logging
# logging_config.py
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("rag_service")
logger.setLevel(logging.DEBUG)
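Hard-coding DEBUG is fine for a quick session, but a level passed in by name is easier to toggle. The sketch below shows one way to do that; `configure_logging` and the format string are assumptions for illustration, not the contents of the service's logging_config.py.

```python
import logging


def configure_logging(level_name: str = "INFO") -> logging.Logger:
    """Configure root logging and return the rag_service logger at the given level."""
    level = getattr(logging, level_name.upper(), None)
    if not isinstance(level, int):
        raise ValueError(f"unknown log level: {level_name!r}")
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logger = logging.getLogger("rag_service")
    logger.setLevel(level)
    return logger
```

It could be driven from an environment variable, for example `configure_logging(os.environ.get("LOG_LEVEL", "INFO"))`, where `LOG_LEVEL` is a hypothetical variable name.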
Debug Endpoints
# Check collection info
curl http://localhost:8081/collection/info
# List all files
curl http://localhost:8081/files
# Check subjects
curl http://localhost:8081/subjects
Interactive Debugging
# Run a one-off Python snippet with the service loaded
python -c "
from rag_service.embeddings.store import get_vector_store
from rag_service.config import settings
print(f'Qdrant: {settings.qdrant_host}:{settings.qdrant_port}')
store = get_vector_store()
info = store.get_collection_info()
print(f'Collection: {info}')
"
Debug in VSCode
// .vscode/launch.json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "RAG Service",
      "type": "python",
      "request": "launch",
      "module": "uvicorn",
      "args": [
        "rag_service.api:app",
        "--reload",
        "--port", "8081"
      ],
      "env": {
        "QDRANT_HOST": "localhost",
        "OLLAMA_HOST": "localhost"
      }
    },
    {
      "name": "Debug Tests",
      "type": "python",
      "request": "launch",
      "module": "pytest",
      "args": ["-v", "-s", "tests/"]
    }
  ]
}
API Development
FastAPI Interactive Docs
- Swagger UI: http://localhost:8081/docs
- ReDoc: http://localhost:8081/redoc
Test Requests
# Upload a file
curl -X POST http://localhost:8081/upload \
  -F "file=@test.pdf" \
  -F 'metadata={"asignatura":"test","tipo_documento":"apuntes","auto_index":true}'
# Search
curl -X POST http://localhost:8081/search \
  -H "Content-Type: application/json" \
  -d '{"query": "test query", "top_k": 5}'
# Index directly
curl -X POST http://localhost:8081/index \
  -H "Content-Type: application/json" \
  -d '[{"content": "Test content", "metadata": {"asignatura": "test", "tipo_documento": "apuntes"}}]'
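The search request above can also be issued from Python without extra dependencies. This is a sketch using only the standard library; `build_search_request` and `search` are hypothetical helper names, and the shape of the JSON response is not assumed, so the decoded body is returned as-is.

```python
import json
import urllib.request


def build_search_request(
    query: str, top_k: int = 5, base_url: str = "http://localhost:8081"
) -> urllib.request.Request:
    """Build the same POST /search request as the curl example."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, top_k: int = 5, base_url: str = "http://localhost:8081"):
    """Send the request and return the decoded JSON body."""
    request = build_search_request(query, top_k, base_url)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the payload easy to inspect in a debugger before anything touches the network.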
Example Upload Script
# upload_example.py
import json

import requests

# Upload and index a document
metadata = {
    "asignatura": "logica-difusa",
    "tipo_documento": "apuntes",
    "tema": "Conjuntos difusos",
    "auto_index": True
}
# Open the PDF in a context manager so the handle is closed after the request
with open('tema1.pdf', 'rb') as pdf:
    response = requests.post(
        "http://localhost:8081/upload",
        files={'file': pdf},
        data={"metadata": json.dumps(metadata)}
    )
print(response.json())
Docker Development
Build Image
# Build with dev dependencies
docker build -f rag_service/Dockerfile \
  --build-arg INSTALL_DEV=true \
  -t rag-service:dev .
Run Container
docker run -it --rm \
  -p 8081:8081 \
  -e QDRANT_HOST=host.docker.internal \
  -e OLLAMA_HOST=host.docker.internal \
  -v "$(pwd)/documents:/app/documents" \
  rag-service:dev
Docker Compose Development
# Start all services
docker compose up -d
# View logs
docker compose logs -f rag_service
# Restart after code changes
docker compose restart rag_service
# Rebuild and restart
docker compose up -d --build rag_service
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| Connection refused to Qdrant | Ensure Qdrant is running: docker compose up -d qdrant |
| Connection refused to Ollama | Ensure Ollama is running: docker compose up -d ollama |
| Model not found | Pull model: docker exec ollama ollama pull nomic-embed-text |
| Dimension mismatch | Delete collection, ensure EMBEDDING_DIMENSION matches model |
| File not found | Check DOCUMENTS_PATH configuration |
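A dimension mismatch is usually easier to diagnose if it is caught before the vector reaches Qdrant. The check below is a minimal sketch of that idea; `validate_embedding` is a hypothetical helper, and the value 768 is an assumption based on the nomic-embed-text vectors used in the mock example, so it should be replaced by whatever EMBEDDING_DIMENSION is set to.

```python
from typing import Sequence

# Assumption: EMBEDDING_DIMENSION for nomic-embed-text
EXPECTED_DIMENSION = 768


def validate_embedding(
    vector: Sequence[float], expected_dim: int = EXPECTED_DIMENSION
) -> list[float]:
    """Return the vector as a list, or raise if its length is unexpected."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, expected {expected_dim}; "
            "check EMBEDDING_DIMENSION and recreate the collection if the model changed"
        )
    return list(vector)
```

Running this on a single embedding after switching models shows right away whether the existing collection needs to be deleted and rebuilt.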
Check Service Status
# Qdrant
curl http://localhost:6333/healthz
# Ollama models
curl http://localhost:11434/api/tags
# RAG Service
curl http://localhost:8081/health
Reset Vector Store
from rag_service.embeddings.store import get_vector_store
store = get_vector_store()
store.delete_collection()
# Collection will be recreated on next index
Related Documentation
- Architecture - System design
- Configuration - Environment variables
- API Endpoints - API reference
- Deployment - Production setup