Cluster Service

Enterprise-grade process clustering capabilities for Node.js and Bun applications.

Overview

The XyPriss Cluster Service enables horizontal scaling through worker process management, load balancing, health monitoring, and inter-process communication.

Architecture

Core Components

  • Cluster Manager - Central orchestrator for worker lifecycle
  • Process Monitor - Real-time resource monitoring
  • Memory Manager - Per-worker memory limits, usage tracking, and leak detection
  • IPC Manager - Inter-process communication system
  • Load Balancer - Request distribution across workers
  • Health Monitor - Worker health checking and recovery

Basic Configuration

import { createServer } from 'xypriss';

const app = createServer({
    cluster: {
        enabled: true,
        config: {
            workers: 'auto', // or a specific number, e.g. 4
            resources: {
                maxMemoryPerWorker: '256MB',
                maxCpuPerWorker: 50 // percent
            }
        }
    }
});
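
The values 'auto' and '256MB' above are strings that the cluster has to interpret. The sketch below shows one plausible interpretation. Both resolveWorkerCount and parseMemoryLimit are hypothetical helpers, not part of the XyPriss API, and the reading of 'auto' as "one worker per CPU core" is an assumption.

```javascript
// Hypothetical helpers for illustration; not part of the XyPriss API.
const UNIT_BYTES = { B: 1, KB: 1024, MB: 1024 ** 2, GB: 1024 ** 3 };

// Convert a string such as '256MB' into a byte count.
function parseMemoryLimit(limit) {
    const match = /^(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)$/i.exec(String(limit).trim());
    if (!match) {
        throw new Error(`Unrecognized memory limit: ${limit}`);
    }
    return Math.round(Number(match[1]) * UNIT_BYTES[match[2].toUpperCase()]);
}

// Assumption: 'auto' means one worker per available CPU core.
function resolveWorkerCount(setting, cpuCount) {
    if (setting === 'auto') {
        return Math.max(1, cpuCount);
    }
    if (Number.isInteger(setting) && setting > 0) {
        return setting;
    }
    throw new Error(`Invalid workers setting: ${setting}`);
}
```

In Node.js, the cpuCount argument could come from os.cpus().length.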

Advanced Configuration

const clusterConfig = {
    enabled: true,
    workers: 4,
    
    resources: {
        maxMemoryPerWorker: '512MB',
        maxCpuPerWorker: 75, // percent
        memoryManagement: {
            enabled: true,
            memoryWarningThreshold: 80, // percent
            memoryCriticalThreshold: 95, // percent
            memoryLeakDetection: true,
            garbageCollectionHint: true
        }
    },
    
    processManagement: {
        respawn: true,
        maxRestarts: 3,
        restartDelay: 2000, // ms
        gracefulShutdownTimeout: 30000, // ms
        zombieDetection: true
    },
    
    healthCheck: {
        enabled: true,
        interval: 30000, // ms
        timeout: 10000, // ms
        maxFailures: 3,
        endpoint: '/health'
    },
    
    autoScaling: {
        enabled: true,
        minWorkers: 2,
        maxWorkers: 8,
        scaleUpThreshold: {
            cpu: 75, // percent
            memory: 80, // percent
            responseTime: 1500 // ms
        }
    }
};
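
To illustrate how the processManagement settings might interact, here is a minimal sketch of a restart decision. planRestart is a hypothetical function, and it assumes that maxRestarts caps the number of restarts per worker and that restartDelay is in milliseconds. The actual XyPriss behavior may differ.

```javascript
// Hypothetical restart decision driven by the processManagement settings.
function planRestart(restartCount, policy) {
    if (!policy.respawn) {
        return { restart: false, reason: 'respawn disabled' };
    }
    if (restartCount >= policy.maxRestarts) {
        return { restart: false, reason: 'maxRestarts reached' };
    }
    // Wait restartDelay milliseconds before starting the replacement worker.
    return { restart: true, delayMs: policy.restartDelay };
}

const policy = { respawn: true, maxRestarts: 3, restartDelay: 2000 };
console.log(planRestart(0, policy)); // { restart: true, delayMs: 2000 }
console.log(planRestart(3, policy)); // { restart: false, reason: 'maxRestarts reached' }
```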

Resource Management

Memory Management

  • Real-time memory usage tracking per worker
  • Automatic detection of memory leaks
  • Garbage-collection hints issued under memory pressure (garbageCollectionHint)
  • Hard and soft memory limits with enforcement
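
The warning and critical thresholds and the leak detection listed above can be sketched as follows. Both functions are hypothetical. The sketch assumes the thresholds are percentages of the per-worker limit, and the leak check is a deliberately naive heuristic (strictly increasing samples), not the detector XyPriss uses.

```javascript
// Hypothetical classification of a worker's memory usage.
function classifyMemoryPressure(usedBytes, limitBytes, thresholds) {
    const percent = (usedBytes / limitBytes) * 100;
    if (percent >= thresholds.memoryCriticalThreshold) return 'critical';
    if (percent >= thresholds.memoryWarningThreshold) return 'warning';
    return 'ok';
}

// Naive leak heuristic: every sample is larger than the previous one
// and total growth over the window is at least minGrowthBytes.
function looksLikeLeak(samples, minGrowthBytes) {
    if (samples.length < 2) return false;
    for (let i = 1; i < samples.length; i++) {
        if (samples[i] <= samples[i - 1]) return false;
    }
    return samples[samples.length - 1] - samples[0] >= minGrowthBytes;
}
```

In Node.js, the samples could be taken from process.memoryUsage().rss at a fixed interval.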

CPU Management

  • Per-worker CPU usage tracking
  • Configurable CPU usage limits
  • Request routing that favors less-loaded workers
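
As an illustration of load-based routing, the sketch below picks the worker with the lowest CPU usage and prefers workers under the configured limit. pickLeastLoadedWorker and the worker object shape ({ id, cpuPercent }) are assumptions made for this example. The routing strategy XyPriss actually applies is not specified here.

```javascript
// Hypothetical least-loaded selection; workers are { id, cpuPercent } objects.
function pickLeastLoadedWorker(workers, maxCpuPerWorker) {
    if (workers.length === 0) {
        return null;
    }
    // Prefer workers below the CPU limit; fall back to all workers if none qualify.
    const eligible = workers.filter((w) => w.cpuPercent < maxCpuPerWorker);
    const pool = eligible.length > 0 ? eligible : workers;
    return pool.reduce((best, w) => (w.cpuPercent < best.cpuPercent ? w : best));
}
```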

Auto-Scaling

Scaling Triggers

  • CPU Utilization - Scale based on CPU usage thresholds
  • Memory Pressure - Scale based on memory consumption
  • Response Time - Scale based on application response times
  • Queue Length - Scale based on request queue depth

Scaling Policies

  • Scale-Up - Add workers when thresholds are exceeded
  • Scale-Down - Remove workers during low utilization
  • Cooldown Periods - Prevent rapid scaling oscillations
  • Min/Max Limits - Enforce scaling boundaries
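
The four policies above can be combined into a single decision function, sketched below under stated assumptions. decideScaling is hypothetical, and the scaleDownThreshold and cooldownMs fields are illustrative names that do not appear in the configuration shown earlier.

```javascript
// Hypothetical scaling decision. scaleDownThreshold and cooldownMs are
// illustrative names, not documented XyPriss options.
function decideScaling(metrics, workerCount, config, nowMs, lastScaleMs) {
    // Cooldown: ignore all triggers shortly after the previous scaling action.
    if (nowMs - lastScaleMs < config.cooldownMs) {
        return 'hold';
    }
    const up = config.scaleUpThreshold;
    const overloaded =
        metrics.cpu > up.cpu ||
        metrics.memory > up.memory ||
        metrics.responseTime > up.responseTime;
    if (overloaded) {
        // Max limit: never grow past maxWorkers.
        return workerCount < config.maxWorkers ? 'scale-up' : 'hold';
    }
    const down = config.scaleDownThreshold;
    const idle = metrics.cpu < down.cpu && metrics.memory < down.memory;
    if (idle) {
        // Min limit: never shrink below minWorkers.
        return workerCount > config.minWorkers ? 'scale-down' : 'hold';
    }
    return 'hold';
}

const scalingConfig = {
    minWorkers: 2,
    maxWorkers: 8,
    scaleUpThreshold: { cpu: 75, memory: 80, responseTime: 1500 },
    scaleDownThreshold: { cpu: 25, memory: 40 }, // assumed values
    cooldownMs: 60000 // assumed value
};
```

Checking the cooldown first means that a burst of high readings right after a scale-up cannot trigger a second one before the new worker has taken load.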

Health Monitoring

Health Checks

const app = createServer({
    cluster: {
        enabled: true,
        config: {
            healthCheck: {
                enabled: true,
                interval: 30000, // ms
                timeout: 10000, // ms
                maxFailures: 3,
                endpoint: '/health'
            }
        }
    }
});
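
One way to read maxFailures is as a limit on consecutive failed checks, after which a worker is treated as unhealthy. The sketch below follows that reading. HealthTracker is a hypothetical class, and the rule that a single success resets the counter is an assumption.

```javascript
// Hypothetical tracker for consecutive health check failures per worker.
class HealthTracker {
    constructor(maxFailures) {
        this.maxFailures = maxFailures;
        this.failures = new Map(); // workerId -> consecutive failure count
    }

    // Record one check result and return the worker's resulting state.
    record(workerId, ok) {
        const count = ok ? 0 : (this.failures.get(workerId) ?? 0) + 1;
        this.failures.set(workerId, count);
        return count >= this.maxFailures ? 'unhealthy' : 'healthy';
    }
}
```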

Metrics Collection

  • System Metrics - CPU, memory, disk, network usage
  • Application Metrics - Request rates, response times, error rates
  • Worker Metrics - Per-worker performance statistics
  • Cluster Metrics - Overall cluster health and performance
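
Cluster metrics can be derived from worker metrics. The following sketch aggregates per-worker counters into a cluster-level summary. summarizeCluster and the metric field names (requests, errors, avgResponseTimeMs) are assumptions for illustration, not the format XyPriss reports.

```javascript
// Hypothetical aggregation of per-worker metrics into cluster metrics.
function summarizeCluster(workerMetrics) {
    let requests = 0;
    let errors = 0;
    let totalResponseTimeMs = 0;
    for (const m of workerMetrics) {
        requests += m.requests;
        errors += m.errors;
        // Weight each worker's average by its request count.
        totalResponseTimeMs += m.avgResponseTimeMs * m.requests;
    }
    return {
        requests,
        errorRate: requests > 0 ? errors / requests : 0,
        avgResponseTimeMs: requests > 0 ? totalResponseTimeMs / requests : 0
    };
}
```

Weighting by request count keeps a nearly idle worker from skewing the cluster-wide average response time.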

Inter-Process Communication

Message Types

  • Broadcast - Send messages to all workers
  • Unicast - Send messages to specific workers
  • Request-Response - Send a message to a worker and await its reply
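
Request-response messaging over a one-way channel is usually built by tagging each request with a correlation id and matching replies against pending requests. The sketch below shows that pattern. RequestChannel is a hypothetical class and does not reflect how the XyPriss IPC Manager is implemented.

```javascript
// Hypothetical request-response layer over a one-way message channel.
class RequestChannel {
    constructor(send) {
        this.send = send;         // function that delivers a message to a worker
        this.pending = new Map(); // correlation id -> { resolve, timer }
        this.nextId = 1;
    }

    // Send a payload and return a promise that resolves with the reply.
    request(payload, timeoutMs = 1000) {
        const id = this.nextId++;
        return new Promise((resolve, reject) => {
            const timer = setTimeout(() => {
                this.pending.delete(id);
                reject(new Error(`Request ${id} timed out`));
            }, timeoutMs);
            // Register before sending so a fast reply is never missed.
            this.pending.set(id, { resolve, timer });
            this.send({ id, payload });
        });
    }

    // Call this for every incoming reply; returns false for unknown ids.
    handleReply(message) {
        const entry = this.pending.get(message.id);
        if (!entry) return false;
        clearTimeout(entry.timer);
        this.pending.delete(message.id);
        entry.resolve(message.payload);
        return true;
    }
}
```

With Node's cluster module, send could wrap worker.send(...) and handleReply could be called from the worker's 'message' event handler.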

Usage Example

// Broadcast to all workers
await app.broadcastToWorkers({
    type: 'config-update',
    data: newConfiguration
});

// Send to one randomly selected worker
await app.sendToRandomWorker({
    type: 'process-task',
    data: taskData
});

Best Practices

Configuration

  • Start with conservative resource limits
  • Enable health checks in production
  • Enable auto-scaling only after load-testing your thresholds
  • Configure appropriate restart policies
  • Enable comprehensive monitoring

Monitoring

  • Monitor key metrics: CPU, memory, response times
  • Set up alerts for critical thresholds
  • Use structured logging for better analysis
  • Implement custom health checks
  • Review cluster performance regularly
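
A custom health check, as recommended above, often combines several named checks into one response. The sketch below shows one possible shape. runHealthChecks, the check names, and the 200/503 status mapping are assumptions for illustration. How the result is wired to the '/health' endpoint depends on the XyPriss routing API and is not shown.

```javascript
// Hypothetical aggregation of named, synchronous health checks.
function runHealthChecks(checks) {
    const results = {};
    let healthy = true;
    for (const [name, check] of Object.entries(checks)) {
        try {
            results[name] = Boolean(check());
        } catch (error) {
            // A check that throws counts as a failure.
            results[name] = false;
        }
        if (!results[name]) healthy = false;
    }
    return {
        status: healthy ? 'ok' : 'degraded',
        statusCode: healthy ? 200 : 503,
        checks: results
    };
}

// Example with two illustrative checks.
const report = runHealthChecks({
    eventLoop: () => true,
    database: () => { throw new Error('connection refused'); }
});
console.log(report.status); // degraded
```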