Scalability (CLAUDE-MD-Scalability)
Scale-focused architecture for growing applications: load balancing, caching strategies, and horizontal scaling patterns.
Claude Code Configuration for Scalability-Focused Development
🌐 CRITICAL: Build for Scale from Day One
MANDATORY RULE: Every component must be designed for horizontal scaling:
- Stateless Services → No local state, all state externalized (see the sketch after this list)
- Load Distribution → Balance across multiple instances
- Cache Everything → Multi-layer caching strategy
- Queue Heavy Work → Async processing for intensive tasks
- Monitor at Scale → Distributed tracing and metrics
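As an illustration of the first rule, a minimal stateless-service sketch. Assumptions in this example: an Express app, the express-session and connect-redis packages, and a Redis instance reachable via REDIS_URL. Session state lives in Redis, so any instance behind the load balancer can serve any request.
// Stateless service sketch: no in-process session state (all names below are illustrative)
const express = require('express');
const session = require('express-session');
const RedisStore = require('connect-redis').default;
const { createClient } = require('redis');
const redisClient = createClient({ url: process.env.REDIS_URL });
redisClient.connect().catch(console.error);
const app = express();
app.use(session({
store: new RedisStore({ client: redisClient }), // state externalized to Redis
secret: process.env.SESSION_SECRET,
resave: false,
saveUninitialized: false
}));
app.get('/whoami', (req, res) => {
// Works on any instance, because the session is read from Redis rather than local memory
res.json({ userId: req.session.userId || null });
});
app.listen(process.env.PORT || 3000);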
🚀 Scalability Swarm Configuration
Initialize Scale-Ready Swarm
// ✅ CORRECT: Scalability-focused swarm setup
[Single Message with BatchTool]:
// Distributed topology for scalability
mcp__claude-flow__swarm_init {
topology: "mesh", // Distributed coordination
maxAgents: 8, // Scale team
strategy: "adaptive" // Dynamic scaling
}
// Specialized scalability agents
mcp__claude-flow__agent_spawn { type: "architect", name: "Scale Architect" }
mcp__claude-flow__agent_spawn { type: "specialist", name: "Load Balancer Expert" }
mcp__claude-flow__agent_spawn { type: "specialist", name: "Cache Engineer" }
mcp__claude-flow__agent_spawn { type: "specialist", name: "Queue Specialist" }
mcp__claude-flow__agent_spawn { type: "specialist", name: "Database Scaler" }
mcp__claude-flow__agent_spawn { type: "monitor", name: "Scale Monitor" }
mcp__claude-flow__agent_spawn { type: "optimizer", name: "Performance Tuner" }
mcp__claude-flow__agent_spawn { type: "coordinator", name: "Scale Lead" }
// Scalability todos
TodoWrite { todos: [
{id: "arch", content: "Design scalable architecture", status: "in_progress", priority: "high"},
{id: "stateless", content: "Make services stateless", status: "pending", priority: "high"},
{id: "loadbalance", content: "Implement load balancing", status: "pending", priority: "high"},
{id: "cache-layers", content: "Setup multi-layer caching", status: "pending", priority: "high"},
{id: "queue-system", content: "Implement message queuing", status: "pending", priority: "high"},
{id: "db-sharding", content: "Design database sharding", status: "pending", priority: "high"},
{id: "auto-scale", content: "Setup auto-scaling", status: "pending", priority: "medium"},
{id: "circuit-break", content: "Add circuit breakers", status: "pending", priority: "medium"},
{id: "rate-limit", content: "Implement rate limiting", status: "pending", priority: "medium"},
{id: "monitoring", content: "Setup distributed monitoring", status: "pending", priority: "high"}
]}
🏗️ Scalable Architecture Patterns
Microservices Architecture
// Scale Architect Agent Task
Task(`You are the Scale Architect agent designing for massive scale.
MANDATORY COORDINATION:
1. START: npx claude-flow@alpha hooks pre-task --description "Designing scalable architecture"
2. DESIGN: Create horizontally scalable service architecture
3. STORE: npx claude-flow@alpha hooks notify --message "Architecture decisions: [patterns]"
4. VALIDATE: Ensure all components can scale independently
IMPLEMENT:
- Service mesh architecture
- API gateway pattern
- Event-driven communication
- CQRS for read/write separation (sketched below)
- Saga pattern for distributed transactions
- Service discovery and registration
`)
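As a concrete illustration of the read/write separation above, a minimal CQRS sketch. The in-memory event bus and read model below are stand-ins for the real broker and store, and orderStore is a hypothetical write-model repository.
// CQRS sketch: commands mutate the write model and emit events; queries hit a separate read model
const { EventEmitter } = require('events');
const bus = new EventEmitter(); // stand-in for Kafka/RabbitMQ
// Write side: validate, persist, emit
async function createOrderCommand(orderStore, order) {
if (!order.items || order.items.length === 0) throw new Error('Empty order');
await orderStore.insert(order); // hypothetical write-model store
bus.emit('order.created', order);
}
// Read side: denormalized view kept up to date asynchronously from events
const ordersByUser = new Map(); // stand-in for a read-optimized store (e.g. Elasticsearch)
bus.on('order.created', (order) => {
const list = ordersByUser.get(order.userId) || [];
list.push({ id: order.id, total: order.total, status: 'created' });
ordersByUser.set(order.userId, list);
});
// Query side never touches the write model
function getOrdersForUser(userId) {
return ordersByUser.get(userId) || [];
}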
// Implementation structure
[BatchTool]:
// Create microservice structure
Bash("mkdir -p services/{api-gateway,user-service,order-service,payment-service,notification-service}")
Bash("mkdir -p infrastructure/{service-mesh,load-balancer,message-queue,cache}")
// API Gateway
Write("services/api-gateway/index.js", `
const express = require('express');
const { createProxyMiddleware } = require('http-proxy-middleware');
const CircuitBreaker = require('opossum');
const RateLimiter = require('express-rate-limit');
const app = express();
// Rate limiting
const limiter = RateLimiter({
windowMs: 1 * 60 * 1000, // 1 minute
max: 100, // limit each IP to 100 requests per windowMs
message: 'Too many requests',
standardHeaders: true,
legacyHeaders: false,
});
app.use(limiter);
// Service registry
const services = {
users: process.env.USER_SERVICE_URL || 'http://user-service:3001',
orders: process.env.ORDER_SERVICE_URL || 'http://order-service:3002',
payments: process.env.PAYMENT_SERVICE_URL || 'http://payment-service:3003'
};
// Circuit breaker for each service
const breakers = {};
for (const [name, url] of Object.entries(services)) {
const proxy = createProxyMiddleware({ target: url, changeOrigin: true });
// opossum wraps a promise-returning function, so adapt the proxy middleware:
// resolve when the proxied response finishes, reject on proxy error
breakers[name] = new CircuitBreaker(
(req, res) => new Promise((resolve, reject) => {
res.on('finish', resolve);
proxy(req, res, reject);
}),
{
timeout: 3000,
errorThresholdPercentage: 50,
resetTimeout: 30000
}
);
}
// Route to services through their circuit breakers
const routeVia = (name) => (req, res, next) => breakers[name].fire(req, res).catch(next);
app.use('/api/users', routeVia('users'));
app.use('/api/orders', routeVia('orders'));
app.use('/api/payments', routeVia('payments'));
// Health check
app.get('/health', (req, res) => {
const health = {
status: 'healthy',
services: {}
};
for (const [name, breaker] of Object.entries(breakers)) {
health.services[name] = {
state: breaker.opened ? 'open' : (breaker.halfOpen ? 'half-open' : 'closed'),
failures: breaker.stats.failures,
successes: breaker.stats.successes
};
}
res.json(health);
});
// Start the gateway
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => console.log('API gateway listening on port ' + PORT));
`)
⚖️ Load Balancing Strategies
Multi-Layer Load Balancing
// Load Balancer Expert Agent
Task(`You are the Load Balancer Expert implementing distribution strategies.
IMPLEMENT:
1. DNS load balancing for geographic distribution
2. Layer 4 load balancing for TCP/UDP
3. Layer 7 load balancing for HTTP/HTTPS
4. Consistent hashing for session affinity (sketched after this task)
5. Health checks and automatic failover
`)
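Session affinity via hashing can also be done at the application layer. A simplified sketch follows (plain modulo hashing, so it remaps more sessions than a true consistent-hash ring when backends change; the ConsistentHash class under Database Scaling below shows the full ring). Backend URLs are placeholders.
// Hash-based session affinity at the proxy layer (simplified sketch)
const crypto = require('crypto');
const backends = ['http://api1:3000', 'http://api2:3000', 'http://api3:3000'];
function pickBackend(sessionId) {
// The same session id always maps to the same backend while the list is stable
const hash = crypto.createHash('md5').update(sessionId).digest().readUInt32BE(0);
return backends[hash % backends.length];
}
// In a Node proxy (e.g. http-proxy), route each request by its session cookie or IP:
// const target = pickBackend(req.cookies.sessionId || req.ip);
// proxy.web(req, res, { target });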
// HAProxy configuration
Write("infrastructure/load-balancer/haproxy.cfg", `
global
maxconn 50000
log stdout local0
user haproxy
group haproxy
daemon
# Performance tuning
tune.ssl.default-dh-param 2048
tune.bufsize 32768
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 10000
# Statistics
stats enable
stats uri /stats
stats refresh 30s
# Consul DNS resolvers (referenced by server-template below; the nameserver
# address is an assumption — point it at your Consul agent)
resolvers consul
nameserver consul1 127.0.0.1:8600
resolve_retries 3
timeout resolve 1s
timeout retry 1s
# API Backend
backend api_servers
balance roundrobin
option httpchk GET /health
# Dynamic server addition via API
server-template api 10 _api._tcp.service.consul resolvers consul resolve-prefer ipv4 check
# Stick sessions using cookies
cookie SERVERID insert indirect nocache
# Circuit breaker behavior
option allbackups
frontend api_frontend
bind *:80
bind *:443 ssl crt /etc/ssl/certs/api.pem
# Rate limiting
stick-table type ip size 100k expire 30s store http_req_rate(10s)
http-request track-sc0 src
http-request deny if { sc_http_req_rate(0) gt 100 }
# Compression
compression algo gzip
compression type text/html text/plain text/css application/json
# Security headers
http-response set-header X-Frame-Options DENY
http-response set-header X-Content-Type-Options nosniff
default_backend api_servers
`)
// Nginx load balancer alternative
Write("infrastructure/load-balancer/nginx.conf", `
upstream api_backend {
least_conn; # Least connections algorithm
# Health checks
zone backend 64k;
# Servers with weights
server api1.example.com:3000 weight=5 max_fails=3 fail_timeout=30s;
server api2.example.com:3000 weight=5 max_fails=3 fail_timeout=30s;
server api3.example.com:3000 weight=3 max_fails=3 fail_timeout=30s;
# Backup servers
server api-backup1.example.com:3000 backup;
server api-backup2.example.com:3000 backup;
# Keep alive connections
keepalive 32;
}
# Shared zones must live in the http context (outside the server block)
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api_cache:10m max_size=1g;
server {
listen 80;
listen 443 ssl http2;
# SSL configuration
ssl_certificate /etc/nginx/ssl/cert.pem;
ssl_certificate_key /etc/nginx/ssl/key.pem;
ssl_protocols TLSv1.2 TLSv1.3;
# Rate limiting
limit_req zone=api burst=200 nodelay;
# Caching
proxy_cache api_cache;
proxy_cache_valid 200 1m;
proxy_cache_use_stale error timeout invalid_header updating;
location / {
proxy_pass http://api_backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
# Headers
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Timeouts
proxy_connect_timeout 5s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
# Buffering
proxy_buffering on;
proxy_buffer_size 4k;
proxy_buffers 8 4k;
}
# Health check endpoint
location /health {
access_log off;
return 200 "healthy\\n";
add_header Content-Type text/plain;
}
}
`)
💾 Multi-Layer Caching Strategy
Comprehensive Caching Implementation
// Cache Engineer Agent
Task(`You are the Cache Engineer implementing multi-layer caching.
MANDATORY COORDINATION:
1. Browser cache with proper headers (middleware sketch after this task)
2. CDN cache for static assets
3. Application-level memory cache
4. Redis for distributed cache
5. Database query cache
OPTIMIZE:
- Cache invalidation strategies
- Cache warming techniques (sketched at the end of this caching section)
- Cache hit ratio monitoring
`)
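Browser and CDN cache headers (item 1) are easy to centralize in middleware; a minimal sketch with illustrative paths and max-age values:
// Cache-Control middleware sketch (Express)
function cacheHeaders(req, res, next) {
if (req.path.startsWith('/static/')) {
// Fingerprinted assets: cache aggressively in browsers and on the CDN
res.set('Cache-Control', 'public, max-age=31536000, immutable');
} else if (req.method === 'GET' && req.path.startsWith('/api/')) {
// Short shared cache for GET APIs; CDN serves stale while revalidating
res.set('Cache-Control', 'public, max-age=0, s-maxage=30, stale-while-revalidate=60');
} else {
res.set('Cache-Control', 'no-store');
}
next();
}
// app.use(cacheHeaders);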
// Caching infrastructure
[BatchTool]:
Write("infrastructure/cache/cache-manager.js", `
const Redis = require('ioredis');
const LRU = require('lru-cache'); // v6 API (v7+ renamed maxAge → ttl and del → delete)
class CacheManager {
constructor() {
// Local memory cache (L1)
this.memoryCache = new LRU({
max: 500,
maxAge: 1000 * 60 * 5, // 5 minutes
updateAgeOnGet: true
});
// Redis cache (L2)
this.redisCache = new Redis.Cluster([
{ host: 'redis-node-1', port: 6379 },
{ host: 'redis-node-2', port: 6379 },
{ host: 'redis-node-3', port: 6379 }
], {
enableReadyCheck: true,
maxRetriesPerRequest: 3,
retryDelayOnFailover: 100,
retryDelayOnClusterDown: 300,
});
// Cache statistics
this.stats = {
hits: 0,
misses: 0,
l1Hits: 0,
l2Hits: 0
};
}
async get(key, options = {}) {
// Check L1 cache
const l1Value = this.memoryCache.get(key);
if (l1Value !== undefined) {
this.stats.hits++;
this.stats.l1Hits++;
return l1Value;
}
// Check L2 cache
try {
const l2Value = await this.redisCache.get(key);
if (l2Value) {
this.stats.hits++;
this.stats.l2Hits++;
const parsed = JSON.parse(l2Value);
// Populate L1 cache
this.memoryCache.set(key, parsed);
return parsed;
}
} catch (error) {
console.error('Redis error:', error);
}
this.stats.misses++;
return null;
}
async set(key, value, options = {}) {
const ttl = options.ttl || 3600; // 1 hour default
// Set in L1 cache
this.memoryCache.set(key, value);
// Set in L2 cache
try {
await this.redisCache.setex(
key,
ttl,
JSON.stringify(value)
);
} catch (error) {
console.error('Redis set error:', error);
}
}
async invalidate(pattern) {
// Clear from L1 cache. Note: the in-memory match treats `pattern` as a
// regular expression, while Redis SCAN treats it as a glob — pass a
// pattern that is valid for both, or adapt one side here.
const keys = this.memoryCache.keys();
for (const key of keys) {
if (key.match(pattern)) {
this.memoryCache.del(key);
}
}
// Clear from L2 cache: SCAN is per node, so walk every master in the cluster
try {
const masters = this.redisCache.nodes('master');
await Promise.all(masters.map((node) => new Promise((resolve, reject) => {
const stream = node.scanStream({ match: pattern, count: 100 });
stream.on('data', (keys) => {
// Delete through the cluster client so each key routes to its own slot
keys.forEach((key) => this.redisCache.del(key).catch(() => {}));
});
stream.on('end', resolve);
stream.on('error', reject);
})));
} catch (error) {
console.error('Redis invalidation error:', error);
}
}
getHitRatio() {
const total = this.stats.hits + this.stats.misses;
return total > 0 ? this.stats.hits / total : 0;
}
}
`)
// CDN configuration
Write("infrastructure/cache/cdn-config.js", `
// CloudFront configuration example
const cloudFrontConfig = {
DefaultCacheBehavior: {
TargetOriginId: 'api-origin',
ViewerProtocolPolicy: 'redirect-to-https',
AllowedMethods: ['GET', 'HEAD', 'OPTIONS', 'PUT', 'POST', 'PATCH', 'DELETE'],
CachedMethods: ['GET', 'HEAD', 'OPTIONS'],
Compress: true,
ForwardedValues: {
QueryString: true,
Headers: ['Authorization', 'CloudFront-Forwarded-Proto'],
Cookies: { Forward: 'none' }
},
MinTTL: 0,
DefaultTTL: 86400,
MaxTTL: 31536000
},
CacheBehaviors: [
{
PathPattern: '/api/*',
TargetOriginId: 'api-origin',
ViewerProtocolPolicy: 'https-only',
MinTTL: 0,
DefaultTTL: 0,
MaxTTL: 0,
ForwardedValues: {
QueryString: true,
Headers: ['*'],
Cookies: { Forward: 'all' }
}
},
{
PathPattern: '/static/*',
TargetOriginId: 's3-origin',
ViewerProtocolPolicy: 'redirect-to-https',
MinTTL: 86400,
DefaultTTL: 604800,
MaxTTL: 31536000,
Compress: true
}
]
};
`)
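Cache warming can reuse the CacheManager above. A minimal sketch, assuming a hypothetical db.query for the hottest rows and a 'product:<id>' key convention:
// Cache warming on startup/deploy (sketch) — query and key names are illustrative
const cache = new CacheManager();
async function warmCache(db) {
const hotProducts = await db.query('SELECT * FROM products ORDER BY views DESC LIMIT 100');
await Promise.all(hotProducts.map((product) =>
cache.set('product:' + product.id, product, { ttl: 600 })
));
console.log('Warmed', hotProducts.length, 'entries; hit ratio so far:', cache.getHitRatio());
}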
📬 Message Queue Architecture
Scalable Async Processing
// Queue Specialist Agent
Task(`You are the Queue Specialist implementing distributed message processing.
IMPLEMENT:
1. Message broker setup (RabbitMQ/Kafka)
2. Work queue patterns
3. Pub/sub for event distribution
4. Dead letter queues
5. Message persistence and replay
`)
// RabbitMQ implementation
[BatchTool]:
Write("infrastructure/message-queue/rabbitmq-manager.js", `
const amqp = require('amqplib');
class RabbitMQManager {
constructor(urls) {
this.urls = urls; // Array of RabbitMQ URLs for HA
this.connection = null;
this.channel = null;
this.consumers = new Map();
}
async connect() {
try {
// amqplib has no built-in cluster failover or auto-reconnect, so try each
// node URL in turn (use amqp-connection-manager for automatic reconnection)
let connection = null;
for (const url of this.urls) {
try {
connection = await amqp.connect(url, { heartbeat: 60 });
break;
} catch (err) {
console.warn('RabbitMQ node unavailable:', url);
}
}
if (!connection) throw new Error('No RabbitMQ node reachable');
this.connection = connection;
this.channel = await this.connection.createChannel();
// Set prefetch for work distribution
await this.channel.prefetch(10);
// Handle connection events
this.connection.on('error', this.handleError.bind(this));
this.connection.on('close', this.handleClose.bind(this));
} catch (error) {
console.error('RabbitMQ connection error:', error);
throw error;
}
}
async createQueue(name, options = {}) {
const dlqName = \`\${name}.dlq\`;
const queueOptions = {
durable: true,
exclusive: false,
autoDelete: false,
arguments: {
'x-message-ttl': 86400000, // 24 hours
'x-max-length': 1000000,
'x-overflow': 'reject-publish',
// Dead-lettering is declared on the queue itself: rejected/expired
// messages are re-routed to the DLQ via the default exchange
'x-dead-letter-exchange': '',
'x-dead-letter-routing-key': dlqName,
...options.arguments
}
};
// Create the dead letter queue first, then the main queue that points at it
await this.channel.assertQueue(dlqName, {
durable: true,
arguments: {
'x-message-ttl': 604800000 // 7 days
}
});
await this.channel.assertQueue(name, queueOptions);
}
async publish(queue, message, options = {}) {
const messageBuffer = Buffer.from(JSON.stringify(message));
const publishOptions = {
persistent: true,
timestamp: Date.now(),
contentType: 'application/json',
...options
};
return this.channel.sendToQueue(queue, messageBuffer, publishOptions);
}
async consume(queue, handler, options = {}) {
const { consumerTag } = await this.channel.consume(
queue,
async (msg) => {
if (!msg) return;
try {
const content = JSON.parse(msg.content.toString());
await handler(content, msg);
// Acknowledge message
this.channel.ack(msg);
} catch (error) {
console.error('Message processing error:', error);
// Reject and send to DLQ after max retries
const retries = ((msg.properties.headers || {})['x-retries'] || 0) + 1;
if (retries >= 3) {
this.channel.reject(msg, false); // dead-lettered via the queue's x-dead-letter-* arguments
} else {
// Republish with retry count
await this.publish(queue, JSON.parse(msg.content.toString()), {
headers: { 'x-retries': retries }
});
this.channel.ack(msg);
}
}
},
{ noAck: false, ...options }
);
this.consumers.set(queue, consumerTag);
}
async createExchange(name, type = 'topic', options = {}) {
await this.channel.assertExchange(name, type, {
durable: true,
autoDelete: false,
...options
});
}
// Pub/Sub pattern
async publishToExchange(exchange, routingKey, message) {
const messageBuffer = Buffer.from(JSON.stringify(message));
return this.channel.publish(
exchange,
routingKey,
messageBuffer,
{ persistent: true }
);
}
}
`)
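A usage sketch wiring the manager into an async email worker; the node URLs, queue name, and sendEmail helper are placeholders:
// Example wiring (sketch)
const mq = new RabbitMQManager(['amqp://rabbit-1:5672', 'amqp://rabbit-2:5672']);
async function startEmailWorker() {
await mq.connect();
await mq.createQueue('email.send');
await mq.consume('email.send', async (job) => {
// Heavy work happens off the request path
await sendEmail(job.to, job.subject, job.body); // hypothetical helper
});
}
// Producers elsewhere publish and return immediately:
// await mq.publish('email.send', { to: 'user@example.com', subject: 'Welcome', body: '...' });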
// Kafka alternative for high throughput
Write("infrastructure/message-queue/kafka-manager.js", `
const { Kafka, CompressionTypes } = require('kafkajs');
class KafkaManager {
constructor(brokers) {
this.kafka = new Kafka({
clientId: 'scalable-app',
brokers: brokers,
connectionTimeout: 10000,
requestTimeout: 30000,
retry: {
initialRetryTime: 100,
retries: 8
}
});
this.producer = null;
this.consumers = new Map();
}
async initProducer() {
this.producer = this.kafka.producer({
allowAutoTopicCreation: false,
transactionTimeout: 30000,
idempotent: true,
maxInFlightRequests: 1 // idempotence requires at most one in-flight request
// compression is set per send(), not on the producer
});
await this.producer.connect();
}
async produce(topic, messages, options = {}) {
const records = messages.map(msg => ({
key: msg.key || null,
value: JSON.stringify(msg.value),
partition: msg.partition,
headers: msg.headers
}));
await this.producer.send({
topic,
messages: records,
acks: -1, // wait for all in-sync replicas
timeout: 30000,
compression: CompressionTypes.GZIP
});
}
async createConsumer(groupId, topics, handler) {
const consumer = this.kafka.consumer({
groupId,
sessionTimeout: 30000,
rebalanceTimeout: 60000,
heartbeatInterval: 3000,
maxBytesPerPartition: 1048576, // 1MB
maxWaitTimeInMs: 5000
});
await consumer.connect();
await consumer.subscribe({
topics,
fromBeginning: false
});
await consumer.run({
autoCommit: false,
eachMessage: async ({ topic, partition, message }) => {
try {
const value = JSON.parse(message.value.toString());
await handler({
topic,
partition,
offset: message.offset,
key: message.key?.toString(),
value,
headers: message.headers,
timestamp: message.timestamp
});
// Commit offset after successful processing
await consumer.commitOffsets([{
topic,
partition,
offset: (parseInt(message.offset) + 1).toString()
}]);
} catch (error) {
console.error('Kafka message processing error:', error);
// Implement retry logic or DLQ
}
}
});
this.consumers.set(groupId, consumer);
}
}
`)
🗄️ Database Scaling Strategies
Horizontal Database Scaling
// Database Scaler Agent
Task(`You are the Database Scaler implementing horizontal scaling patterns.
IMPLEMENT:
1. Read replicas for load distribution
2. Database sharding strategies
3. Connection pooling optimization (sketched after this task)
4. Query result caching
5. Write-through caching
`)
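Connection pooling (item 3) is mostly configuration; a minimal sketch assuming PostgreSQL and the node-postgres Pool, with illustrative limits:
// Connection pool per service instance (sketch)
const { Pool } = require('pg');
const pool = new Pool({
host: process.env.DB_HOST,
database: process.env.DB_NAME,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
max: 20, // cap per instance so N instances stay under the database's connection limit
idleTimeoutMillis: 30000, // recycle idle connections
connectionTimeoutMillis: 2000 // fail fast instead of queueing forever
});
async function getUser(id) {
// The pool checks a client out, runs the query, and returns it automatically
const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);
return rows[0];
}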
// Database sharding implementation
[BatchTool]:
Write("infrastructure/database/shard-manager.js", `
const crypto = require('crypto');
class ShardManager {
constructor(shards) {
this.shards = shards; // Array of database connections
this.shardCount = shards.length;
this.consistentHash = new ConsistentHash(shards);
}
// Hash-based sharding
getShardByKey(key) {
const hash = crypto.createHash('md5').update(key).digest('hex');
const shardIndex = parseInt(hash.substring(0, 8), 16) % this.shardCount;
return this.shards[shardIndex];
}
// Range-based sharding
getShardByRange(value) {
const rangeSize = Math.ceil(Number.MAX_SAFE_INTEGER / this.shardCount);
const shardIndex = Math.floor(value / rangeSize);
return this.shards[Math.min(shardIndex, this.shardCount - 1)];
}
// Geographic sharding
getShardByRegion(region) {
const regionMap = {
'us-east': this.shards[0],
'us-west': this.shards[1],
'eu-west': this.shards[2],
'asia-pacific': this.shards[3]
};
return regionMap[region] || this.shards[0];
}
// Execute query on specific shard
async query(shardKey, sql, params) {
const shard = this.getShardByKey(shardKey);
return shard.query(sql, params);
}
// Execute query on all shards (scatter-gather)
async queryAll(sql, params) {
const promises = this.shards.map(shard =>
shard.query(sql, params)
);
const results = await Promise.all(promises);
return this.mergeResults(results);
}
// Flatten the per-shard result sets into a single array
mergeResults(results) {
return results.flat();
}
// Cross-shard join (simplified)
async crossShardJoin(table1, table2, joinKey) {
// Step 1: Get data from all shards
const table1Data = await this.queryAll(
\`SELECT * FROM \${table1}\`
);
const table2Data = await this.queryAll(
\`SELECT * FROM \${table2}\`
);
// Step 2: Perform join in application
const joined = [];
for (const row1 of table1Data) {
for (const row2 of table2Data) {
if (row1[joinKey] === row2[joinKey]) {
joined.push({ ...row1, ...row2 });
}
}
}
return joined;
}
// Rebalancing data across shards
async rebalance() {
// Implementation for moving data between shards
// This is complex and should be done carefully
}
}
// Consistent hashing for dynamic shard addition/removal
class ConsistentHash {
constructor(nodes, virtualNodes = 150) {
this.nodes = nodes;
this.virtualNodes = virtualNodes;
this.ring = new Map();
this.sortedKeys = [];
this.buildRing();
}
buildRing() {
for (const node of this.nodes) {
for (let i = 0; i < this.virtualNodes; i++) {
const virtualKey = \`\${node.id}-\${i}\`;
const hash = this.hash(virtualKey);
this.ring.set(hash, node);
}
}
this.sortedKeys = Array.from(this.ring.keys()).sort((a, b) => a - b);
}
hash(key) {
return crypto.createHash('md5').update(key).digest().readUInt32BE(0);
}
getNode(key) {
const hash = this.hash(key);
// Binary search for the first key >= hash
let left = 0;
let right = this.sortedKeys.length - 1;
while (left <= right) {
const mid = Math.floor((left + right) / 2);
if (this.sortedKeys[mid] === hash) {
return this.ring.get(this.sortedKeys[mid]);
} else if (this.sortedKeys[mid] < hash) {
left = mid + 1;
} else {
right = mid - 1;
}
}
// Wrap around to the first node
const index = left >= this.sortedKeys.length ? 0 : left;
return this.ring.get(this.sortedKeys[index]);
}
}
`)
// Read replica configuration
Write("infrastructure/database/read-replica-manager.js", `
class ReadReplicaManager {
constructor(master, replicas) {
this.master = master;
this.replicas = replicas;
this.currentReplica = 0;
this.healthChecks = new Map();
this.startHealthChecks();
}
// Round-robin read distribution
getReadConnection() {
const healthyReplicas = this.replicas.filter(replica =>
this.healthChecks.get(replica.id) !== false
);
if (healthyReplicas.length === 0) {
console.warn('No healthy replicas, falling back to master');
return this.master;
}
const replica = healthyReplicas[this.currentReplica % healthyReplicas.length];
this.currentReplica++;
return replica;
}
// Always write to master
getWriteConnection() {
return this.master;
}
// Execute read query with replica
async read(sql, params) {
const connection = this.getReadConnection();
try {
return await connection.query(sql, params);
} catch (error) {
console.error('Read replica error:', error);
// Fallback to master
return this.master.query(sql, params);
}
}
// Execute write query on master
async write(sql, params) {
return this.master.query(sql, params);
}
// Health checking
startHealthChecks() {
setInterval(() => {
this.replicas.forEach(async (replica) => {
try {
await replica.query('SELECT 1');
this.healthChecks.set(replica.id, true);
} catch (error) {
console.error(\`Replica \${replica.id} health check failed:\`, error);
this.healthChecks.set(replica.id, false);
}
});
}, 5000);
}
// Lag monitoring
async getReplicationLag() {
const lags = await Promise.all(
this.replicas.map(async (replica) => {
try {
const result = await replica.query('SHOW SLAVE STATUS');
return {
replica: replica.id,
lag: result[0].Seconds_Behind_Master
};
} catch (error) {
return {
replica: replica.id,
lag: null,
error: error.message
};
}
})
);
return lags;
}
}
`)
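A usage sketch of the replica manager above; masterPool and the replica pools are placeholder connection objects exposing query(sql, params):
// Read/write splitting in a request handler (sketch)
async function example(masterPool, replicaPools, userId, orderId) {
const db = new ReadReplicaManager(masterPool, replicaPools);
// Reads fan out across healthy replicas (round-robin); writes always hit the master
const orders = await db.read('SELECT * FROM orders WHERE user_id = ?', [userId]);
await db.write('UPDATE orders SET status = ? WHERE id = ?', ['shipped', orderId]);
// Alert if replicas fall too far behind for read-your-writes use cases
const lag = await db.getReplicationLag();
if (lag.some((r) => r.lag === null || r.lag > 5)) {
console.warn('Replication lag detected:', lag);
}
return orders;
}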
🚀 Auto-Scaling Implementation
Dynamic Resource Scaling
// Auto-scaling configuration
[BatchTool]:
Write("infrastructure/auto-scaling/scaler.js", `
const AWS = require('aws-sdk');
const k8s = require('@kubernetes/client-node');
class AutoScaler {
constructor(config) {
this.config = config;
this.metrics = new Map();
this.scalingInProgress = false;
// AWS Auto Scaling
this.autoScaling = new AWS.AutoScaling();
// Kubernetes client
const kc = new k8s.KubeConfig();
kc.loadFromDefault();
this.k8sApi = kc.makeApiClient(k8s.AppsV1Api);
this.k8sMetrics = new k8s.Metrics(kc); // metrics.k8s.io helper (pod/node usage)
}
// Monitor metrics and trigger scaling
async monitorAndScale() {
if (this.scalingInProgress) return;
try {
const metrics = await this.collectMetrics();
const scalingDecision = this.makeScalingDecision(metrics);
if (scalingDecision.action !== 'none') {
this.scalingInProgress = true;
await this.executeScaling(scalingDecision);
this.scalingInProgress = false;
}
} catch (error) {
console.error('Auto-scaling error:', error);
this.scalingInProgress = false;
}
}
async collectMetrics() {
// The get*Metrics() helpers below are hooks to your metrics backend
// (e.g. Prometheus or CloudWatch) and are not implemented here.
const metrics = {
cpu: await this.getCPUMetrics(),
memory: await this.getMemoryMetrics(),
requestRate: await this.getRequestRate(),
responseTime: await this.getResponseTime(),
queueLength: await this.getQueueLength()
};
// Store metrics history
this.metrics.set(Date.now(), metrics);
// Keep only last 5 minutes
const fiveMinutesAgo = Date.now() - 300000;
for (const [timestamp] of this.metrics) {
if (timestamp < fiveMinutesAgo) {
this.metrics.delete(timestamp);
}
}
return metrics;
}
makeScalingDecision(metrics) {
const {
cpuThreshold = 70,
memoryThreshold = 80,
responseTimeThreshold = 1000,
minInstances = 2,
maxInstances = 20
} = this.config;
// Get current instance count
const currentInstances = this.getCurrentInstances(); // hook: query the orchestrator for the live count
// Scale up conditions
if (
metrics.cpu > cpuThreshold ||
metrics.memory > memoryThreshold ||
metrics.responseTime > responseTimeThreshold
) {
const targetInstances = Math.min(
currentInstances + Math.ceil(currentInstances * 0.5),
maxInstances
);
return {
action: 'scale-up',
currentInstances,
targetInstances,
reason: this.getScaleReason(metrics)
};
}
// Scale down conditions
if (
metrics.cpu < cpuThreshold * 0.3 &&
metrics.memory < memoryThreshold * 0.3 &&
metrics.responseTime < responseTimeThreshold * 0.3
) {
const targetInstances = Math.max(
currentInstances - Math.floor(currentInstances * 0.25),
minInstances
);
return {
action: 'scale-down',
currentInstances,
targetInstances,
reason: 'Low resource utilization'
};
}
return { action: 'none' };
}
async executeScaling(decision) {
console.log(\`Executing \${decision.action}: \${decision.currentInstances} -> \${decision.targetInstances}\`);
// Kubernetes scaling (depending on client version, the scale patch may need
// a 'Content-Type: application/merge-patch+json' request header)
if (this.config.platform === 'kubernetes') {
await this.k8sApi.patchNamespacedDeploymentScale(
this.config.deploymentName,
this.config.namespace,
{
spec: { replicas: decision.targetInstances }
}
);
}
// AWS Auto Scaling
if (this.config.platform === 'aws') {
await this.autoScaling.setDesiredCapacity({
AutoScalingGroupName: this.config.asgName,
DesiredCapacity: decision.targetInstances,
HonorCooldown: false
}).promise();
}
// Log scaling event
await this.logScalingEvent(decision);
// Wait for instances to stabilize
await this.waitForStabilization(decision.targetInstances);
}
// Predictive scaling based on patterns
async predictiveScale() {
const history = Array.from(this.metrics.values());
if (history.length < 10) return;
// Simple trend analysis
const cpuTrend = this.calculateTrend(history.map(m => m.cpu));
const requestTrend = this.calculateTrend(history.map(m => m.requestRate));
// Predict future load
if (cpuTrend > 0.1 || requestTrend > 0.15) {
console.log('Predictive scaling: Increasing trend detected');
const currentInstances = this.getCurrentInstances();
const targetInstances = Math.min(
currentInstances + 2,
this.config.maxInstances
);
await this.executeScaling({
action: 'predictive-scale-up',
currentInstances,
targetInstances,
reason: 'Predicted load increase'
});
}
}
calculateTrend(values) {
if (values.length < 2) return 0;
// Simple linear regression
const n = values.length;
const sumX = (n * (n - 1)) / 2;
const sumY = values.reduce((a, b) => a + b, 0);
const sumXY = values.reduce((sum, y, x) => sum + x * y, 0);
const sumX2 = (n * (n - 1) * (2 * n - 1)) / 6;
const slope = (n * sumXY - sumX * sumY) / (n * sumX2 - sumX * sumX);
return slope;
}
}
`)
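A usage sketch that schedules the scaler; the platform settings, thresholds, and intervals are illustrative:
// Example scheduling loop (sketch) — all config values are placeholders
const scaler = new AutoScaler({
platform: 'kubernetes',
deploymentName: 'api-gateway',
namespace: 'production',
cpuThreshold: 70,
memoryThreshold: 80,
responseTimeThreshold: 1000,
minInstances: 2,
maxInstances: 20
});
// Reactive check every 30s, predictive check every 5 minutes
setInterval(() => scaler.monitorAndScale().catch(console.error), 30000);
setInterval(() => scaler.predictiveScale().catch(console.error), 300000);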
🔐 Scalability Security Patterns
Rate Limiting and DDoS Protection
// Distributed rate limiting
Write("infrastructure/security/rate-limiter.js", `
const Redis = require('ioredis');
class DistributedRateLimiter {
constructor(redisCluster) {
this.redis = redisCluster;
this.scripts = this.loadLuaScripts();
}
loadLuaScripts() {
// Sliding window rate limiting
const slidingWindow = \`
local key = KEYS[1]
local window = tonumber(ARGV[1])
local limit = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local clearBefore = now - window
redis.call('zremrangebyscore', key, 0, clearBefore)
local current = redis.call('zcard', key)
if current < limit then
redis.call('zadd', key, now, now)
redis.call('expire', key, math.ceil(window / 1000)) -- window is in ms; EXPIRE takes seconds
return { 1, limit - current - 1 }
else
return { 0, 0 }
end
\`;
// Token bucket algorithm
const tokenBucket = \`
local key = KEYS[1]
local rate = tonumber(ARGV[1])
local capacity = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])
local bucket = redis.call('hmget', key, 'tokens', 'last_refill')
local tokens = tonumber(bucket[1]) or capacity
local last_refill = tonumber(bucket[2]) or now
local elapsed = math.max(0, now - last_refill) / 1000 -- ms -> seconds, since rate is tokens/sec
local new_tokens = math.min(capacity, tokens + (elapsed * rate))
if new_tokens >= requested then
new_tokens = new_tokens - requested
redis.call('hmset', key, 'tokens', new_tokens, 'last_refill', now)
redis.call('expire', key, math.ceil(capacity / rate))
return { 1, new_tokens }
else
redis.call('hmset', key, 'tokens', new_tokens, 'last_refill', now)
return { 0, new_tokens }
end
\`;
return { slidingWindow, tokenBucket };
}
async checkLimit(identifier, options = {}) {
const {
window = 60000, // 1 minute
limit = 100,
algorithm = 'sliding'
} = options;
const key = \`rate_limit:\${identifier}\`;
const now = Date.now();
try {
let result;
if (algorithm === 'sliding') {
result = await this.redis.eval(
this.scripts.slidingWindow,
1,
key,
window,
limit,
now
);
} else if (algorithm === 'token') {
const rate = limit / (window / 1000); // tokens per second
result = await this.redis.eval(
this.scripts.tokenBucket,
1,
key,
rate,
limit,
now,
1
);
}
return {
allowed: result[0] === 1,
remaining: result[1],
resetAt: now + window
};
} catch (error) {
console.error('Rate limit error:', error);
// Fail open in case of Redis failure
return { allowed: true, remaining: limit, resetAt: now + window };
}
}
// Global rate limiting across all instances
async checkGlobalLimit(options = {}) {
const {
window = 60000,
limit = 10000,
namespace = 'global'
} = options;
return this.checkLimit(namespace, { window, limit });
}
// IP-based rate limiting
async checkIPLimit(ip, options = {}) {
return this.checkLimit(\`ip:\${ip}\`, options);
}
// User-based rate limiting
async checkUserLimit(userId, options = {}) {
return this.checkLimit(\`user:\${userId}\`, options);
}
// API key rate limiting with tiers
async checkAPIKeyLimit(apiKey, tier = 'basic') {
const limits = {
basic: { window: 3600000, limit: 1000 }, // 1000/hour
pro: { window: 3600000, limit: 10000 }, // 10000/hour
enterprise: { window: 3600000, limit: 100000 } // 100000/hour
};
return this.checkLimit(\`api:\${apiKey}\`, limits[tier]);
}
}
`)
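A sketch of wiring the limiter into Express middleware; the Redis cluster nodes and limits are placeholders:
// Express integration (sketch)
const Redis = require('ioredis');
const limiter = new DistributedRateLimiter(new Redis.Cluster([
{ host: 'redis-node-1', port: 6379 },
{ host: 'redis-node-2', port: 6379 }
]));
async function rateLimitMiddleware(req, res, next) {
const result = await limiter.checkIPLimit(req.ip, { window: 60000, limit: 100 });
res.set('X-RateLimit-Remaining', String(result.remaining));
if (!result.allowed) {
return res.status(429).json({ error: 'Too many requests', retryAt: result.resetAt });
}
next();
}
// app.use(rateLimitMiddleware);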
📊 Scalability Monitoring Dashboard
// Real-time scalability metrics
Write("infrastructure/monitoring/scalability-dashboard.js", `
const express = require('express');
const WebSocket = require('ws');
class ScalabilityDashboard {
constructor() {
this.app = express();
this.metrics = {
instances: new Map(),
loadBalancer: {},
cache: {},
database: {},
queues: {}
};
this.setupEndpoints();
this.startMetricsCollection(); // stub: wire this to your metrics sources
this.app.listen(process.env.DASHBOARD_PORT || 3001);
}
setupEndpoints() {
// Dashboard HTML
this.app.get('/', (req, res) => {
res.send(\`
<!DOCTYPE html>
<html>
<head>
<title>Scalability Dashboard</title>
<style>
.grid { display: grid; grid-template-columns: repeat(3, 1fr); gap: 20px; }
.metric-box {
background: #f5f5f5;
padding: 20px;
border-radius: 8px;
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
}
.metric-title { font-size: 18px; font-weight: bold; margin-bottom: 10px; }
.metric-value { font-size: 36px; color: #2196F3; }
.sub-metric { font-size: 14px; color: #666; margin-top: 5px; }
.status-good { color: #4CAF50; }
.status-warning { color: #FF9800; }
.status-critical { color: #F44336; }
</style>
</head>
<body>
<h1>Scalability Dashboard</h1>
<div class="grid">
<div class="metric-box">
<div class="metric-title">Active Instances</div>
<div class="metric-value" id="instances">-</div>
<div class="sub-metric" id="instance-health">Health: -</div>
</div>
<div class="metric-box">
<div class="metric-title">Load Balancer</div>
<div class="metric-value" id="requests-per-sec">-</div>
<div class="sub-metric" id="lb-distribution">Distribution: -</div>
</div>
<div class="metric-box">
<div class="metric-title">Cache Hit Rate</div>
<div class="metric-value" id="cache-hit-rate">-</div>
<div class="sub-metric" id="cache-size">Size: -</div>
</div>
<div class="metric-box">
<div class="metric-title">Database Connections</div>
<div class="metric-value" id="db-connections">-</div>
<div class="sub-metric" id="db-replicas">Replicas: -</div>
</div>
<div class="metric-box">
<div class="metric-title">Queue Depth</div>
<div class="metric-value" id="queue-depth">-</div>
<div class="sub-metric" id="queue-processing">Processing: -/sec</div>
</div>
<div class="metric-box">
<div class="metric-title">Response Time</div>
<div class="metric-value" id="response-time">-</div>
<div class="sub-metric" id="response-percentiles">p95: -, p99: -</div>
</div>
</div>
<canvas id="scaling-chart" width="800" height="400"></canvas>
<script>
const ws = new WebSocket('ws://localhost:3002/metrics');
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
// Update metrics
document.getElementById('instances').textContent = data.instances.count;
document.getElementById('instance-health').textContent =
'Health: ' + data.instances.healthy + '/' + data.instances.count;
document.getElementById('requests-per-sec').textContent =
data.loadBalancer.requestsPerSec + ' req/s';
document.getElementById('lb-distribution').textContent =
'Distribution: ' + data.loadBalancer.distribution;
document.getElementById('cache-hit-rate').textContent =
(data.cache.hitRate * 100).toFixed(1) + '%';
document.getElementById('cache-size').textContent =
'Size: ' + data.cache.size + ' items';
document.getElementById('db-connections').textContent =
data.database.activeConnections + '/' + data.database.maxConnections;
document.getElementById('db-replicas').textContent =
'Replicas: ' + data.database.replicas + ' (lag: ' + data.database.replicationLag + 's)';
document.getElementById('queue-depth').textContent = data.queues.depth;
document.getElementById('queue-processing').textContent =
'Processing: ' + data.queues.processingRate + '/sec';
document.getElementById('response-time').textContent =
data.performance.responseTime + 'ms';
document.getElementById('response-percentiles').textContent =
'p95: ' + data.performance.p95 + 'ms, p99: ' + data.performance.p99 + 'ms';
// Update chart
updateScalingChart(data.history);
};
function updateScalingChart(history) {
// Chart implementation
}
</script>
</body>
</html>
\`);
});
// WebSocket for real-time updates
const wss = new WebSocket.Server({ port: 3002 });
wss.on('connection', (ws) => {
const interval = setInterval(async () => {
// getCurrentMetrics() is async, so await it before serializing
ws.send(JSON.stringify(await this.getCurrentMetrics()));
}, 1000);
ws.on('close', () => {
clearInterval(interval);
});
});
}
async getCurrentMetrics() {
return {
instances: await this.getInstanceMetrics(),
loadBalancer: await this.getLoadBalancerMetrics(),
cache: await this.getCacheMetrics(),
database: await this.getDatabaseMetrics(),
queues: await this.getQueueMetrics(),
performance: await this.getPerformanceMetrics(),
history: this.getScalingHistory()
};
}
}
`)
🎯 Scalability Checklist
Pre-Launch Scalability Verification
- Architecture ✓
  - Stateless services
  - Microservices deployed
  - Service mesh configured
  - API gateway active
- Load Balancing ✓
  - Multiple load balancers
  - Health checks configured
  - Failover tested
  - Geographic distribution
- Caching ✓
  - CDN configured
  - Redis cluster running
  - Cache invalidation tested
  - Hit rates optimized
- Database ✓
  - Read replicas active
  - Sharding implemented
  - Connection pooling
  - Query optimization
- Message Queues ✓
  - Queue cluster running
  - Dead letter queues
  - Retry logic tested
  - Monitoring active
- Auto-scaling ✓
  - Policies configured
  - Tested under load
  - Predictive scaling
  - Cost controls
Remember: Design for 10x growth from day one!