
chore(ci): add comprehensive GitHub Actions CI/CD workflow

- Implement test jobs with Python 3.11 and 3.12 matrix support
- Configure services for PostgreSQL and Redis with health checks
- Include code formatting, linting, type checking, and security scanning steps
- Add unit and integration tests with coverage reporting and upload
- Integrate SonarCloud scanning for code quality assurance
- Add security scanning jobs with Trivy and dependency checks
- Automate Docker image building with metadata extraction and multi-platform support
- Generate and upload software bill of materials (SBOM)
- Include container security scan using Trivy for built images
- Define deployment workflows for staging and production with kubectl
- Perform smoke tests and health checks after deployment
- Send deployment notifications to Slack channels
- Automate GitHub release creation on tagged commits
- Add cleanup job to remove old container image versions
- Support manual deployment dispatch with environment choice input

2025-08-26 10:55:12 -04:00

18 KiB


Discord Voice Chat Quote Bot - Deployment Guide

📋 Table of Contents

Overview
Prerequisites
Environment Setup
Docker Deployment
Kubernetes Deployment
CI/CD Pipeline
Configuration
Monitoring & Observability
Security
Troubleshooting
Backup & Recovery
Scaling
Maintenance

🔍 Overview

The Discord Voice Chat Quote Bot is an AI-powered system built for production deployment, with high availability, horizontal scaling, and monitoring in mind. This guide covers deployment with both Docker Compose and Kubernetes.

Architecture Components

Discord Bot Application: Main bot service with AI integration
PostgreSQL: Primary database for structured data
Redis: Caching and session management
Qdrant: Vector database for AI embeddings and memory
Prometheus: Metrics collection and monitoring
Grafana: Visualization and dashboards
Nginx: Reverse proxy and load balancing
Ollama: Optional local AI inference

🛠️ Prerequisites

System Requirements

Minimum Requirements

CPU: 4 cores (2.0+ GHz)
RAM: 8 GB
Storage: 50 GB SSD
Network: 100 Mbps

Recommended for Production

CPU: 8+ cores (3.0+ GHz)
RAM: 16+ GB
Storage: 200+ GB NVMe SSD
Network: 1 Gbps

Software Requirements

# Docker Environment
Docker >= 24.0.0
Docker Compose >= 2.20.0

# Kubernetes Environment
Kubernetes >= 1.28
kubectl >= 1.28
Helm >= 3.12 (optional)

# Development/CI
Git >= 2.40
Python >= 3.11
Node.js >= 18 (for some tools)

External Dependencies

Discord Bot Application and Token
AI Provider API Keys (OpenAI, Anthropic, Groq)
Domain name and SSL certificates (for production)
Cloud storage for backups (optional)

🔧 Environment Setup

1. Clone Repository

git clone https://github.com/your-org/discord-quote-bot.git
cd discord-quote-bot

2. Create Environment Configuration

# Copy environment template
cp .env.example .env.production

# Edit configuration
nano .env.production

3. Generate Secrets

# Make deployment script executable
chmod +x scripts/deploy.sh

# Setup environment (creates directories, generates secrets)
./scripts/deploy.sh setup

4. Configure Required Variables

Edit .env.production with your actual values:

# Discord Configuration
DISCORD_BOT_TOKEN=your_discord_bot_token_here
DISCORD_CLIENT_ID=your_discord_client_id_here
DISCORD_GUILD_ID=your_primary_guild_id_here

# AI Providers
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
GROQ_API_KEY=your_groq_api_key_here

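# Database passwords (auto-generated in secrets/)
# Note: Docker Compose reads env files literally and does not run $(...);
# use the contents of each secrets file as the value.
POSTGRES_PASSWORD=<contents of secrets/postgres_password.txt>
REDIS_PASSWORD=<contents of secrets/redis_password.txt>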

# Optional: External monitoring
SENTRY_DSN=your_sentry_dsn_here
WEBHOOK_URL_ALERTS=your_alert_webhook_here
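Because env files are read literally, the generated passwords have to end up in .env.production as plain values. A minimal sketch of how that could be automated, assuming the secrets/ layout shown above; the `SECRET_FILES` mapping and the `render_secret_lines` helper are hypothetical names, not part of the repository's scripts:

```python
from pathlib import Path

# Hypothetical mapping: variable name -> secret file created by the setup step.
SECRET_FILES = {
    "POSTGRES_PASSWORD": "secrets/postgres_password.txt",
    "REDIS_PASSWORD": "secrets/redis_password.txt",
}


def render_secret_lines(base_dir: Path, secret_files: dict = SECRET_FILES) -> list:
    """Return KEY=value lines with each value read from its secret file."""
    lines = []
    for name, rel_path in secret_files.items():
        # Strip the trailing newline most generators leave at the end of the file.
        value = (base_dir / rel_path).read_text(encoding="utf-8").strip()
        if not value:
            raise ValueError(f"{rel_path} is empty")
        lines.append(f"{name}={value}")
    return lines
```

Calling `render_secret_lines(Path("."))` from the repository root returns lines that can be appended to .env.production.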

🐳 Docker Deployment

Production Docker Compose

1. Deploy Full Stack

# Full deployment
./scripts/deploy.sh deploy

# Or manually
docker-compose -f docker-compose.production.yml up -d

2. Verify Deployment

# Check service status
./scripts/deploy.sh status

# Check health
./scripts/deploy.sh health

# View logs
./scripts/deploy.sh logs bot

3. Access Services

Bot Health: http://localhost:8080/health
Prometheus: http://localhost:9090
Grafana: http://localhost:3000 (default login admin/admin123; change it before exposing the service)
Bot Dashboard: http://localhost:8080/dashboard
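Services can take a while to come up after `up -d`, so a single request to the health endpoint may fail even when the deployment is fine. A minimal polling sketch, assuming the health endpoint answers HTTP 200 when the bot is ready; `http_ok` and `wait_until_healthy` are illustrative helpers, not part of scripts/deploy.sh:

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not ready".
        return False


def wait_until_healthy(check: Callable[[], bool], attempts: int = 30, delay: float = 2.0) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_until_healthy(lambda: http_ok("http://localhost:8080/health"))` waits up to about a minute for the bot.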

Docker Commands

# Build images
./scripts/deploy.sh build

# Update deployment
./scripts/deploy.sh update

# Create backup
./scripts/deploy.sh backup

# Stop services
./scripts/deploy.sh stop

# View real-time logs
docker-compose -f docker-compose.production.yml logs -f bot

# Scale bot service
docker-compose -f docker-compose.production.yml up -d --scale bot=3

# Execute commands in containers
docker-compose -f docker-compose.production.yml exec bot bash
docker-compose -f docker-compose.production.yml exec postgres psql -U quotes_user quotes_db

☸️ Kubernetes Deployment

1. Prepare Kubernetes Environment

# Verify cluster access
kubectl cluster-info

# Create namespace
kubectl apply -f k8s/namespace.yaml

# Create secrets
kubectl create secret generic bot-secrets \
  --from-literal=DISCORD_BOT_TOKEN="$(cat secrets/discord_token.txt)" \
  --from-literal=POSTGRES_PASSWORD="$(cat secrets/postgres_password.txt)" \
  --from-literal=REDIS_PASSWORD="$(cat secrets/redis_password.txt)" \
  --from-literal=OPENAI_API_KEY="$OPENAI_API_KEY" \
  --from-literal=ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  -n discord-quote-bot

2. Deploy Services

# Deploy in order
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/pvc.yaml
kubectl apply -f k8s/postgres.yaml
kubectl apply -f k8s/redis.yaml
kubectl apply -f k8s/qdrant.yaml

# Wait for PostgreSQL to be ready
kubectl wait --for=condition=ready pod -l app=postgres -n discord-quote-bot --timeout=300s

# Deploy main application
kubectl apply -f k8s/deployment.yaml

# Deploy monitoring
kubectl apply -f k8s/monitoring.yaml

3. Verify Kubernetes Deployment

# Check pod status
kubectl get pods -n discord-quote-bot

# Check services
kubectl get services -n discord-quote-bot

# View logs
kubectl logs -l app=discord-bot -n discord-quote-bot -f

# Port forward for testing
kubectl port-forward service/discord-bot-service 8080:8080 -n discord-quote-bot

4. Kubernetes Management Commands

# Scale deployment
kubectl scale deployment discord-bot --replicas=5 -n discord-quote-bot

# Rolling update
kubectl set image deployment/discord-bot discord-bot=discord-quote-bot:v1.1.0 -n discord-quote-bot

# Check rollout status
kubectl rollout status deployment/discord-bot -n discord-quote-bot

# Rollback deployment
kubectl rollout undo deployment/discord-bot -n discord-quote-bot

# View resource usage
kubectl top pods -n discord-quote-bot
kubectl top nodes

🔄 CI/CD Pipeline

GitHub Actions Setup

Configure Repository Secrets:

GITHUB_TOKEN: (automatic)
SONAR_TOKEN: SonarCloud token
CODECOV_TOKEN: Codecov token
STAGING_KUBECONFIG: Base64-encoded kubeconfig for staging
PRODUCTION_KUBECONFIG: Base64-encoded kubeconfig for production
SLACK_WEBHOOK: Slack webhook for notifications
SENTRY_DSN: Sentry error tracking
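The kubeconfig secrets are stored Base64-encoded so that a multi-line YAML file fits in a single secret value. A small sketch of the round trip, assuming standard Base64 with no line wrapping on the encoding side; both helper names are illustrative:

```python
import base64


def encode_kubeconfig(raw: bytes) -> str:
    """Encode kubeconfig bytes as one unwrapped Base64 line."""
    return base64.b64encode(raw).decode("ascii")


def decode_kubeconfig(encoded: str) -> bytes:
    """Decode a Base64 secret value, tolerating wrapped lines and stray whitespace."""
    compact = "".join(encoded.split())
    return base64.b64decode(compact, validate=True)
```

The value returned by `encode_kubeconfig(open("kubeconfig", "rb").read())` is what would be pasted into the repository secret.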

Workflow Triggers:

- Push to main: Deploy to production
- Push to develop: Deploy to staging
- Pull requests: Run tests only
- Tags v*: Create release and deploy to production

Pipeline Stages:

- Test: Unit tests, integration tests, security scans
- Build: Docker image build and push to registry
- Security: Container vulnerability scanning
- Deploy: Automated deployment to staging/production
- Verify: Smoke tests and health checks
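The trigger rules above amount to a small decision table. The sketch below restates them as a function, purely as an illustration of the routing; the real logic lives in the workflow file, and the `deployment_target` name is hypothetical. Refs use the full form GitHub exposes (for example `refs/heads/main`).

```python
from typing import Optional


def deployment_target(event: str, ref: str) -> Optional[str]:
    """Return 'production', 'staging', or None when only tests should run."""
    if event == "pull_request":
        return None  # pull requests run tests only
    if event == "push":
        if ref == "refs/heads/main" or ref.startswith("refs/tags/v"):
            return "production"
        if ref == "refs/heads/develop":
            return "staging"
    return None
```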

Manual Deployment

# Trigger manual deployment
gh workflow run "CI/CD Pipeline" \
  --field deploy_environment=production \
  --field image_tag=v1.0.0

⚙️ Configuration

Environment Variables

Critical Configuration

# Discord
DISCORD_BOT_TOKEN=required          # Bot token from Discord Developer Portal
DISCORD_CLIENT_ID=required          # Application ID
DISCORD_GUILD_ID=optional          # Primary guild for testing

# Database
POSTGRES_PASSWORD=auto-generated    # Secure password
REDIS_PASSWORD=auto-generated       # Secure password

# AI Providers (at least one required)
OPENAI_API_KEY=optional
ANTHROPIC_API_KEY=optional
GROQ_API_KEY=optional
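The rules above (two required Discord values, at least one AI provider key) are easy to check before starting the stack. A minimal sketch, assuming the variable names listed in this guide; `config_problems` is an illustrative helper and not an existing part of the bot:

```python
from typing import Mapping

REQUIRED = ("DISCORD_BOT_TOKEN", "DISCORD_CLIENT_ID")
AI_PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GROQ_API_KEY")


def config_problems(env: Mapping[str, str]) -> list:
    """Return human-readable problems; an empty list means the config passes."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name, "").strip()]
    # At least one provider key must be present and non-empty.
    if not any(env.get(name, "").strip() for name in AI_PROVIDER_KEYS):
        problems.append("set at least one of: " + ", ".join(AI_PROVIDER_KEYS))
    return problems
```

Passing `os.environ` checks the current process environment.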

Feature Flags

FEATURE_VOICE_RECORDING=true        # Enable voice recording
FEATURE_SPEAKER_RECOGNITION=true    # Enable speaker ID
FEATURE_LAUGHTER_DETECTION=true     # Enable laughter analysis
FEATURE_QUOTE_EXPLANATION=true      # Enable AI explanations
FEATURE_FEEDBACK_SYSTEM=true        # Enable RLHF feedback
FEATURE_MEMORY_SYSTEM=true          # Enable long-term memory
FEATURE_TTS=true                    # Enable text-to-speech
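Environment variables are always strings, so `FEATURE_TTS=false` is truthy unless it is parsed explicitly. A sketch of one way to read these flags; the accepted spellings and the `feature_enabled` helper are assumptions, and the bot's actual parsing may differ:

```python
from typing import Mapping

TRUE_VALUES = {"1", "true", "yes", "on"}
FALSE_VALUES = {"0", "false", "no", "off"}


def feature_enabled(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Read FEATURE_<name> from env as a boolean, falling back to default if unset."""
    raw = env.get(f"FEATURE_{name}")
    if raw is None or not raw.strip():
        return default
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    # Fail loudly on typos instead of silently enabling or disabling a feature.
    raise ValueError(f"FEATURE_{name} has unrecognised value {raw!r}")
```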

Performance Tuning

MAX_WORKERS=6                       # Worker processes
MAX_MEMORY_MB=6144                 # Memory limit
AUDIO_PROCESSING_THREADS=4          # Audio processing
MAX_CONCURRENT_TASKS=100            # Async task limit

SSL/TLS Configuration

Generate Self-Signed Certificates (Development)

mkdir -p config/ssl
openssl req -x509 -newkey rsa:4096 -keyout config/ssl/bot.key -out config/ssl/bot.crt -days 365 -nodes

Production Certificates (Let's Encrypt)

# Using certbot
certbot certonly --standalone -d your-domain.com
cp /etc/letsencrypt/live/your-domain.com/fullchain.pem config/ssl/bot.crt
cp /etc/letsencrypt/live/your-domain.com/privkey.pem config/ssl/bot.key

📊 Monitoring & Observability

Metrics Collection

Prometheus Metrics

Application metrics: /metrics endpoint
System metrics: Node Exporter
Custom business metrics: Quote analysis, user engagement

Key Metrics to Monitor

# Application Health
- bot_requests_total
- bot_errors_total  
- bot_response_time_seconds
- bot_quotes_processed_total

# System Resources
- node_cpu_seconds_total
- node_memory_MemAvailable_bytes
- node_filesystem_avail_bytes

# Database Performance
- postgres_connections_active
- postgres_query_duration_seconds
- redis_connected_clients

Grafana Dashboards

Pre-configured Dashboards

Bot Overview: General health and performance
AI Performance: AI provider metrics and response times
Database Health: PostgreSQL and Redis metrics
System Resources: CPU, memory, disk, network
User Engagement: Quote statistics and user activity

Custom Alerts

# High Error Rate (more than 0.1 errors per second, averaged over 5 minutes)
- alert: HighErrorRate
  expr: rate(bot_errors_total[5m]) > 0.1
  for: 2m
  
# High Memory Usage  
- alert: HighMemoryUsage
  expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) > 0.85
  for: 5m

# Database Connection Issues
- alert: DatabaseConnectionFailure
  expr: postgres_up == 0
  for: 1m
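To make the thresholds concrete, the two arithmetic expressions can be worked through by hand. The functions below are a simplified model: Prometheus' `rate()` also handles counter resets and extrapolates at the window edges, which this sketch ignores.

```python
def memory_usage_fraction(mem_available_bytes: float, mem_total_bytes: float) -> float:
    """Mirror of 1 - (MemAvailable / MemTotal) from the HighMemoryUsage rule."""
    return 1 - mem_available_bytes / mem_total_bytes


def per_second_rate(count_start: float, count_end: float, window_seconds: float) -> float:
    """Simplified per-second increase of a counter over a window."""
    return (count_end - count_start) / window_seconds


GIB = 2**30
# 2 GiB available out of 16 GiB is 87.5% used, above the 0.85 threshold.
memory_alert = memory_usage_fraction(2 * GIB, 16 * GIB) > 0.85
# 45 new errors in 5 minutes is 0.15 errors per second, above the 0.1 threshold.
error_alert = per_second_rate(100, 145, 300) > 0.1
```

Each alert still fires only after its condition has held for the `for:` duration.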

Log Management

Structured Logging

{
  "timestamp": "2024-01-15T10:30:00Z",
  "level": "INFO",
  "service": "discord-bot",
  "component": "quote_analyzer",
  "user_id": "12345",
  "guild_id": "67890",
  "message": "Quote analyzed successfully",
  "metadata": {
    "quote_id": 123,
    "funny_score": 8.5,
    "processing_time": 0.245
  },
  "correlation_id": "abc123",
  "request_id": "req_456"
}
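One way to produce entries in this shape is a custom formatter for Python's standard logging module. This is a sketch under assumptions: the field names follow the example above, the logger name is used as the component, and `JsonFormatter` is not necessarily how the bot implements it.

```python
import json
import logging
from datetime import datetime, timezone

CONTEXT_FIELDS = ("user_id", "guild_id", "metadata", "correlation_id", "request_id")


class JsonFormatter(logging.Formatter):
    """Format log records as single-line JSON objects."""

    def __init__(self, service: str = "discord-bot"):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "service": self.service,
            "component": record.name,
            "message": record.getMessage(),
        }
        # Context is optional and arrives through logging's `extra=` argument.
        for key in CONTEXT_FIELDS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)
```

Attach it with `handler.setFormatter(JsonFormatter())` and pass context as `logger.info("Quote analyzed successfully", extra={"user_id": "12345"})`.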

Log Aggregation

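# Follow Docker logs and keep a copy on disk (tee does not rotate the file)
docker-compose logs --tail=1000 -f bot | tee logs/bot.log

# Kubernetes logs
kubectl logs -l app=discord-bot -n discord-quote-bot --tail=1000 -f

# Search logs with jq (strip the Compose prefix; skip lines that are not JSON)
docker-compose logs --no-log-prefix bot | jq -R 'fromjson? | select(.level=="ERROR")'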
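Compose normally prefixes each log line with the service name, and startup output is often not JSON at all, so a log filter has to tolerate both. A minimal sketch of the same ERROR filter in Python, assuming one JSON object per line with a `level` field as in the structured logging example; `select_level` is an illustrative name:

```python
import json
from typing import Iterable, Iterator


def select_level(lines: Iterable[str], level: str = "ERROR") -> Iterator[dict]:
    """Yield parsed JSON log entries whose level matches, skipping everything else."""
    for line in lines:
        # Ignore anything before the first "{", such as a "bot-1  | " prefix.
        start = line.find("{")
        if start == -1:
            continue
        try:
            entry = json.loads(line[start:])
        except json.JSONDecodeError:
            continue
        if isinstance(entry, dict) and entry.get("level") == level:
            yield entry
```

It can read from a pipe, for example by iterating over `sys.stdin`.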

🔒 Security

Security Best Practices

Container Security

# Dockerfile: run as a non-root user
USER appuser

# docker run flag: read-only root filesystem where possible
--read-only

# docker run flag: no new privileges
--security-opt no-new-privileges:true

# docker run flag: drop all Linux capabilities
--cap-drop ALL

Network Security

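# Network policies (Kubernetes)
# Listing Ingress and Egress with no rules denies all traffic to and from the pod.
# Add egress rules for DNS, the databases, and outbound HTTPS (Discord, AI providers).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: discord-bot-netpol
  namespace: discord-quote-bot
spec:
  podSelector:
    matchLabels:
      app: discord-bot
  policyTypes:
  - Ingress
  - Egress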

Secrets Management

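# Create secrets from files so values stay out of shell history
# (for production, consider an external secret manager)
kubectl create secret generic bot-secrets \
  --from-file=discord-token=./secrets/discord_token.txt \
  --from-file=postgres-password=./secrets/postgres_password.txt \
  -n discord-quote-bot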

# Rotate secrets regularly
./scripts/rotate-secrets.sh

Security Monitoring

Vulnerability Scanning

# Scan Docker images
trivy image discord-quote-bot:latest

# Scan filesystem
trivy fs .

# Run Trivy in server mode for shared scans (port 8080 is used by the bot in this guide, so use a free port)
trivy server --listen 0.0.0.0:4954

Security Alerts

Failed authentication attempts
Unusual API usage patterns
Resource abuse detection
Data access anomalies

🔧 Troubleshooting

Common Issues

Bot Not Starting

# Check logs
docker-compose logs bot

# Common causes:
# 1. Invalid Discord token
# 2. Database connection failure
# 3. Missing environment variables
# 4. Port conflicts

# Verify configuration
docker-compose exec bot python -c "
import os
print('Discord token:', 'SET' if os.getenv('DISCORD_BOT_TOKEN') else 'MISSING')
"

Database Connection Issues

# Test PostgreSQL connection
docker-compose exec postgres psql -U quotes_user -d quotes_db -c "SELECT 1;"

# Check Redis connection  
docker-compose exec redis redis-cli ping

# View connection logs
docker-compose logs postgres
docker-compose logs redis

High Resource Usage

# Monitor resource usage
docker stats

# Check system resources
htop
iotop
nethogs

# Optimize configuration
# Reduce worker processes, adjust memory limits

AI Provider Issues

# Test API connections
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/models

# Check rate limits
# Monitor error rates in metrics

# Enable fallback providers
FEATURE_AI_FALLBACK=true

Debug Mode

Enable Debug Logging

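# Temporary debug mode (starts a second bot process inside the running container)
docker-compose exec bot python main.py --debug

# Debug logging for the recreated container (requires the compose file to pass LOG_LEVEL through;
# set LOG_LEVEL=DEBUG in .env.production to make it persistent)
LOG_LEVEL=DEBUG docker-compose up -d bot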

Health Check Debugging

# Detailed health status
curl -s http://localhost:8080/health/detailed | jq

# Component-specific checks
curl -s http://localhost:8080/health/database
curl -s http://localhost:8080/health/redis
curl -s http://localhost:8080/health/ai-providers

💾 Backup & Recovery

Automated Backups

Database Backups

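# PostgreSQL backup (-T disables TTY allocation so the redirected dump is clean)
docker-compose exec -T postgres pg_dump -U quotes_user quotes_db > backups/postgres_$(date +%Y%m%d_%H%M%S).sql

# Redis backup (redis-cli writes inside the container, so dump there and copy the file out;
# /data is the data directory of the official Redis image, adjust if yours differs)
docker-compose exec -T redis redis-cli --rdb /data/backup.rdb
docker-compose cp redis:/data/backup.rdb backups/redis_$(date +%Y%m%d_%H%M%S).rdb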

# Automated backup script
./scripts/backup.sh
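Timestamped backups accumulate quickly, so an automated job usually also prunes old ones. A sketch of a keep-the-newest-N policy based on the `YYYYMMDD_HHMMSS` stamp used in the file names above; the retention count and the `backups_to_delete` helper are assumptions, and scripts/backup.sh may handle this differently:

```python
import re
from datetime import datetime

# Matches the _YYYYMMDD_HHMMSS stamp before an extension or at the end of a name.
STAMP = re.compile(r"_(\d{8}_\d{6})(?:\.|$)")


def backups_to_delete(file_names, keep: int = 7) -> list:
    """Return the names that fall outside the `keep` most recent backups."""
    dated = []
    for name in file_names:
        match = STAMP.search(name)
        if match:  # names without a timestamp are never selected for deletion
            stamp = datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
            dated.append((stamp, name))
    dated.sort(reverse=True)  # newest first
    return [name for _, name in dated[keep:]]
```

Apply it to one backup family at a time (for example only the `postgres_*.sql` files), otherwise the limit is shared across PostgreSQL, Redis, and Qdrant backups.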

Vector Database Backup

# Qdrant snapshot
curl -X POST "http://localhost:6333/collections/quotes_memory/snapshots"

# Copy snapshot files
cp -r data/qdrant/snapshots backups/qdrant_$(date +%Y%m%d_%H%M%S)

Recovery Procedures

Database Recovery

# PostgreSQL restore (-T is required so stdin can be redirected from the dump file)
docker-compose exec -T postgres psql -U quotes_user quotes_db < backups/postgres_20240115_120000.sql

# Redis restore
docker-compose stop redis
cp backups/redis_20240115_120000.rdb data/redis/dump.rdb
docker-compose start redis

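# Qdrant restore (snapshots are recovered through the snapshot API, not by copying them into storage/)
# <snapshot-file> is a placeholder; the file:// path is as seen inside the Qdrant container (assumed /qdrant/snapshots)
cp -r backups/qdrant_20240115_120000/* data/qdrant/snapshots/
curl -X PUT "http://localhost:6333/collections/quotes_memory/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{"location": "file:///qdrant/snapshots/quotes_memory/<snapshot-file>"}'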

Disaster Recovery

# Complete system restore
./scripts/restore.sh backups/full_backup_20240115.tar.gz

# Verify restoration
./scripts/verify-restore.sh

📈 Scaling

Horizontal Scaling

Docker Compose Scaling

# Scale bot service
docker-compose -f docker-compose.production.yml up -d --scale bot=3

# Load balancer configuration required
# Update nginx.conf for multiple bot instances

Kubernetes Scaling

# Manual scaling
kubectl scale deployment discord-bot --replicas=5 -n discord-quote-bot

# Auto-scaling (HPA configured)
kubectl get hpa -n discord-quote-bot

# Check node usage before scaling the cluster (adding nodes is cloud-provider specific)
kubectl top nodes

Vertical Scaling

Resource Adjustments

# Update docker-compose.production.yml
deploy:
  resources:
    limits:
      memory: 8G      # Increased from 6G
      cpus: '4'       # Increased from 3
    reservations:
      memory: 4G      # Increased from 3G  
      cpus: '2'       # Increased from 1.5

Performance Optimization

# Tune PostgreSQL
shared_buffers = 1GB
effective_cache_size = 4GB
maintenance_work_mem = 256MB

# Tune Redis
maxmemory 4gb
maxmemory-policy allkeys-lru

# Optimize bot configuration
MAX_WORKERS=8
MAX_CONCURRENT_TASKS=200
AUDIO_PROCESSING_THREADS=8

🔄 Maintenance

Regular Maintenance Tasks

Weekly

# Update dependencies
./scripts/update-dependencies.sh

# Clean up old logs
find logs/ -name "*.log" -mtime +7 -delete

# Vacuum database
docker-compose exec postgres psql -U quotes_user quotes_db -c "VACUUM ANALYZE;"

# Clean Docker images
docker system prune -f

Monthly

# Security updates
./scripts/security-updates.sh

# Backup verification
./scripts/verify-backups.sh

# Performance analysis
./scripts/performance-report.sh

# Test certificate renewal (if using Let's Encrypt); drop --dry-run to actually renew
certbot renew --dry-run

Quarterly

# Full security audit
./scripts/security-audit.sh

# Capacity planning review
./scripts/capacity-analysis.sh

# Disaster recovery test
./scripts/dr-test.sh

# Dependencies audit
pip-audit
safety check

Update Procedures

Application Updates

# Automated update via CI/CD
git tag v1.1.0
git push origin v1.1.0

# Manual update
./scripts/deploy.sh update

# Rollback if needed
kubectl rollout undo deployment/discord-bot -n discord-quote-bot

Security Patches

# System updates
apt update && apt upgrade -y

# Container base image updates  
docker pull python:3.11-slim
./scripts/deploy.sh build

# Dependency updates
pip install --upgrade -r requirements.txt

📞 Support

Documentation

Monitoring

Grafana Dashboards: http://localhost:3000
Prometheus Metrics: http://localhost:9090
Application Health: http://localhost:8080/health

Emergency Contacts

On-call Engineer: [Your contact info]
Infrastructure Team: [Team contact]
Security Team: [Security contact]

Useful Commands Quick Reference

# Status and health
./scripts/deploy.sh status
./scripts/deploy.sh health
curl -f http://localhost:8080/health

# Logs and debugging
./scripts/deploy.sh logs bot
kubectl logs -l app=discord-bot -n discord-quote-bot -f

# Backup and restore
./scripts/deploy.sh backup
./scripts/restore.sh backup_file.tar.gz

# Scaling and updates
kubectl scale deployment discord-bot --replicas=5 -n discord-quote-bot
./scripts/deploy.sh update

This deployment guide provides comprehensive instructions for deploying and managing the Discord Voice Chat Quote Bot in production environments. Follow the security best practices and monitoring guidelines to ensure reliable operation.
