Troubleshooting

Common issues and their solutions.

Container Issues

Services Not Starting

Symptom: Containers fail to start or stay in the “Restarting” state.

# Check container status
docker compose ps

# View logs for specific service
docker compose logs -f backend
docker compose logs -f chatbot

# Restart specific service
docker compose restart backend

# Full restart
docker compose down && docker compose up -d

Port Already in Use

Symptom: An error message reports that the port is already in use.

# Find process using port
lsof -i :8000

# Stop the process (use kill -9 only if it ignores a plain kill)
kill <PID>

# Or change port in docker-compose.yml

Container Resource Issues

Symptom: Containers are slow or crash.

# Check resource usage
docker stats

# Increase memory limits in docker-compose.yml
services:
  chatbot:
    deploy:
      resources:
        limits:
          memory: 2G

Database Issues

MongoDB Connection Refused

Symptom: Connection refused or Authentication failed.

# Verify MongoDB is running
docker exec mongo-service mongosh --eval "db.runCommand('ping')"

# Check credentials match .env
grep MONGO .env

# Test connection manually (replace root:example with the credentials from .env)
mongosh "mongodb://root:example@localhost:27017"

MongoDB Data Corruption

Symptom: Database errors or missing data.

# Reset database (WARNING: -v removes the project's Compose volumes, not only MongoDB's, and deletes all data in them)
docker compose down -v
docker compose up -d

# Re-seed users
python scripts/seed_users.py

LLM Issues

Gemini API Key Invalid

Symptom: 401 Unauthorized or API key not valid.

- Verify your API key at Google AI Studio
- Check the key is set correctly in the current shell:
```
echo "$GOOGLE_API_KEY"
```
- Ensure there are no extra spaces or quotes around the value in .env
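
The .env check can be scripted. The following is a minimal sketch, not part of the project: check_env_file is a hypothetical helper, and it assumes the file uses plain KEY=value lines. It reports line numbers only, so key values are never printed.

```shell
# Hypothetical helper: flag .env lines with stray whitespace or quoted values.
# $1 = path to the env file (defaults to ./.env). Assumes plain KEY=value lines.
check_env_file() {
  env_file=${1:-.env}
  [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; return 2; }
  awk -v sq="'" '
    /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
    /^[^=]*[ \t]=/ || /^[^=]*=[ \t]/ || /[ \t]$/ {
      printf "line %d: whitespace around \"=\" or at end of line\n", NR
      bad = 1
      next
    }
    $0 ~ ("^[^=]*=[\"" sq "]") {        # value starts with a single or double quote
      printf "line %d: value is wrapped in quotes\n", NR
      bad = 1
    }
    END { exit bad }                    # non-zero exit if anything was flagged
  ' "$env_file"
}
```

Calling check_env_file .env from the project root exits with status 0 when no line is flagged and status 1 otherwise.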

Ollama Model Not Found

Symptom: model 'nomic-embed-text' not found.

# List available models
docker exec ollama-service ollama list

# Pull model manually
docker exec ollama-service ollama pull nomic-embed-text

# Or run init script
./scripts/init_ollama.sh

LLM Response Timeout

Symptom: Requests time out or take too long.

- Check LLM service health:

curl http://localhost:11434/api/tags  # Ollama

- Increase the timeout in the configuration
- Use a faster model for development

Python Issues

Import Errors

Symptom: ModuleNotFoundError: No module named 'backend'.

# Activate virtual environment
source .venv/bin/activate

# Ensure packages are installed in development mode
uv pip install -e ./backend -e ./rag_service -e ./chatbot -e .

# Verify installation
uv pip list | grep tfg

Version Conflicts

Symptom: Dependency conflicts or version mismatches.

# Recreate virtual environment
rm -rf .venv
uv venv
source .venv/bin/activate
uv pip install -e ./backend -e ./rag_service -e ./chatbot -e ".[test,quality,dev]"

Python Version Too Old

Symptom: Python 3.13 required.

# Check Python version
python --version

# Use pyenv to install 3.13
pyenv install 3.13
pyenv local 3.13
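
To check the interpreter version in a script rather than by eye, a comparison along these lines may help. This is a minimal sketch: version_at_least is a hypothetical helper, not part of the project, and it compares only the major and minor numbers.

```shell
# Hypothetical helper: succeed if version $1 (e.g. "3.12.4") is at least $2 (e.g. "3.13").
# Compares major and minor only; assumes both arguments contain "major.minor".
version_at_least() {
  have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
  need_major=${2%%.*}; need_rest=${2#*.}; need_minor=${need_rest%%.*}
  [ "$have_major" -gt "$need_major" ] ||
    { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

# Usage: compare the active interpreter against the required version.
# current=$(python --version 2>&1 | awk '{print $2}')
# version_at_least "$current" 3.13 || echo "Python $current is too old"
```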

Frontend Issues

npm Install Fails

Symptom: Dependency installation errors.

# Remove installed modules and the lockfile, clear the npm cache, then reinstall
cd frontend
rm -rf node_modules package-lock.json
npm cache clean --force
npm install

Build Errors

Symptom: TypeScript or build errors.

# Check for type errors
cd frontend
npm run check

# Fix linting issues
npm run check:fix

CORS Errors

Symptom: Access-Control-Allow-Origin errors in browser.

- Verify the backend is running on the expected port
- Check the CORS configuration in backend/api.py
- Ensure the frontend is using the correct API URL
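
To see what the backend actually returns, you can send a preflight request with curl and read the response headers. The sketch below is an illustration under assumptions: allowed_origin is a hypothetical helper, and the origin http://localhost:5173 is a placeholder for your frontend's real URL.

```shell
# Hypothetical helper: print the Access-Control-Allow-Origin value found in
# HTTP response headers read from stdin; exit non-zero if the header is absent.
allowed_origin() {
  tr -d '\r' | awk '
    {
      sep = index($0, ":")   # split at the first colon only; values may contain colons
      if (sep && tolower(substr($0, 1, sep - 1)) == "access-control-allow-origin") {
        value = substr($0, sep + 1)
        sub(/^[ \t]+/, "", value)
        print value
        found = 1
      }
    }
    END { exit !found }
  '
}

# Usage: send a preflight request the way a browser would.
# The origin below is an assumption; replace it with your frontend URL.
# curl -s -o /dev/null -D - -X OPTIONS http://localhost:8000/health \
#   -H "Origin: http://localhost:5173" \
#   -H "Access-Control-Request-Method: GET" | allowed_origin
```

If the helper prints nothing and exits non-zero, the backend did not send the header for that origin, which points at the CORS configuration rather than the frontend.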

Test Issues

Tests Not Found

Symptom: no tests ran or collected 0 items.

# Ensure correct directory
pwd  # Should be project root

# Check test discovery
pytest --collect-only tests/

# Verify markers
pytest --markers | grep -E "unit|integration"

Integration Tests Fail

Symptom: Tests fail with connection errors.

- Verify services are running:

docker compose ps
curl http://localhost:8000/health

- Wait for services to be healthy
- Check test environment variables
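
Waiting for a service to become healthy can be scripted as a retry loop. The following is a minimal sketch: wait_for is a hypothetical helper, not part of the project, and the attempt count and delay in the usage comment are arbitrary.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# $1 = max attempts, $2 = seconds between attempts, remaining args = command.
wait_for() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only if another attempt will follow
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "gave up after $attempts attempts: $*" >&2
  return 1
}

# Usage: poll the backend health endpoint for up to about a minute.
# wait_for 30 2 curl -fsS http://localhost:8000/health && echo "backend is healthy"
```

The -f flag makes curl exit non-zero on HTTP error responses, so the loop keeps retrying until the endpoint returns a success status.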

Fixture Not Found

Symptom: fixture 'xyz' not found.

# Check conftest.py exists
ls tests/infrastructure/conftest.py

# View available fixtures
pytest --fixtures tests/infrastructure/

Network Issues

Service Cannot Reach Another Service

Symptom: Timeout errors between services.

# Test connectivity from container
docker exec tfg-chatbot curl http://rag_service:8081/health

# Check network
docker network ls
docker network inspect tfg-chatbot_default

DNS Resolution Fails

Symptom: Could not resolve host.

# Use IP instead of hostname for testing
docker inspect tfg-rag-service | grep IPAddress

# Restart containers
docker compose down && docker compose up -d

Performance Issues

Slow Startup

Symptom: Services take a long time to start.

- Pre-pull images:
```
docker compose pull
```

- Use the --no-rebuild flag when running tests:

./scripts/run_tests.sh all --no-rebuild

High Memory Usage

Symptom: The system becomes slow or reports out-of-memory (OOM) errors.

- Check memory usage:
```
docker stats
```
- Reduce parallel test workers
- Use smaller LLM models for development

Getting Help

Check Documentation

- ADRs: Architecture decisions in docs/ADR/
- Copilot Instructions: Technical details in .github/copilot-instructions.md
- API Docs: Run the services and visit their /docs endpoints

Debug Logging

Enable debug logging for more information:

export LOG_LEVEL=DEBUG
docker compose up

Report Issues

If you can’t resolve an issue:

- Check GitHub Issues for known problems
- Create a new issue with:
  - Steps to reproduce
  - Expected vs actual behavior
  - Logs and error messages
  - Environment details (OS, Python version, etc.)
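
The environment details can be gathered with a small script and pasted into the issue. This is a sketch only: report_env is a hypothetical helper, and the list of tools is an assumption based on the ones this guide mentions.

```shell
# Hypothetical helper: print basic environment details for a bug report.
# Tools that are not installed are reported as "not found" instead of failing.
report_env() {
  echo "OS: $(uname -srm)"
  for tool in python docker node; do
    if command -v "$tool" >/dev/null 2>&1; then
      # Keep only the first line of each tool's version output
      echo "$tool: $("$tool" --version 2>&1 | head -n 1)"
    else
      echo "$tool: not found"
    fi
  done
}
```

Running report_env prints one line for the operating system and one line per tool.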