Centralized Load Balancer (LB) Mode

In Load Balancer (LB) Mode, Bauxite acts as a centralized intercept point for your entire organization. This is the preferred pattern for platform teams that provide “LLM-as-a-Service” to multiple internal departments.

Architecture

Instead of running inside each Pod, Bauxite runs as a central cluster behind a standard Network Load Balancer (NLB) or Ingress. All internal applications point their BASE_URL at this cluster.
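
As a rough illustration of pointing applications at the central cluster, the sketch below stores the address in a ConfigMap. It assumes the Service is named bauxite-intercept and exposes port 80 (as in the Service manifest on this page), that it is deployed in the security-egress namespace, and that the cluster uses the default cluster.local DNS domain. The ConfigMap name and the team-a namespace are hypothetical.

```yaml
# Hypothetical ConfigMap for one application team's namespace.
# Assumes Bauxite's Service is bauxite-intercept in the security-egress
# namespace, listening on port 80 (the HTTP default, so no port in the URL).
apiVersion: v1
kind: ConfigMap
metadata:
  name: llm-client-config   # hypothetical name
  namespace: team-a         # hypothetical application namespace
data:
  # Cluster-internal DNS form: <service>.<namespace>.svc.cluster.local
  BASE_URL: "http://bauxite-intercept.security-egress.svc.cluster.local"
```

A Pod can load this value with envFrom and a configMapRef. The variable name your client SDK actually reads may differ from BASE_URL, so check the SDK's documentation before relying on it.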

[Figure: Bauxite Diagram]

Key Benefits of LB Mode

| Feature | Description |
| --- | --- |
| Global Rate Limiting | Prevent a single “noisy neighbor” app from exhausting your corporate OpenAI/Anthropic quotas. |
| Unified Key Management | Manage your provider API keys in one secure vault rather than distributing them to every app team. |
| Org-Wide Auditing | Centralized Carbon Tracking and security logs for every prompt across the company. |
| Shared KV-Cache | Maximize KV-Aware Routing hits by pooling common RAG prefixes in a centralized memory layer. |

Deployment Configuration

When running in LB Mode, you typically disable the strict 20 MB local limit in favor of a larger shared pool (for example, 512 MB) so that the cluster can handle hundreds of concurrent streams.

# config.yaml (LB Mode Optimization)
mode: load_balancer
max_concurrent_streams: 1000
pii_janitor:
  enabled: true
  vault_type: shared_memory # Uses a faster, non-locking map for high concurrency

Ingress Example (Kubernetes)

To expose the LB cluster to workloads inside the same Kubernetes cluster, use a standard ClusterIP Service that forwards port 80 to port 9090 on the Bauxite Pods:

apiVersion: v1
kind: Service
metadata:
  name: bauxite-intercept
spec:
  selector:
    app: bauxite
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9090
  type: ClusterIP

Security Considerations

While LB Mode is efficient, it concentrates risk: the central cluster becomes a single point of failure, and every prompt in the organization passes through it. To mitigate this:

  1. High Availability: Always run at least 3 replicas so that losing one instance does not interrupt service.
  2. mTLS: Enforce mutual TLS between your internal apps and the Bauxite LB so that traffic inside your network is encrypted and both sides are authenticated.
  3. Namespace Isolation: Deploy the Bauxite LB in a dedicated security-egress namespace with strict NetworkPolicies.
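
The availability and isolation items above could be sketched with standard Kubernetes resources, as below. This is a minimal example under stated assumptions, not a hardened configuration. It reuses the app: bauxite label and container port 9090 from the Service manifest, and assumes Bauxite runs in the security-egress namespace. The resource names and the bauxite-client namespace label are hypothetical.

```yaml
# Sketch only: names marked "hypothetical" are not from the Bauxite docs.
# Keep at least 2 of the 3 replicas running during voluntary disruptions
# such as node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: bauxite-pdb              # hypothetical name
  namespace: security-egress
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: bauxite
---
# Allow inbound traffic to the Bauxite Pods only from namespaces that carry
# an opt-in label. Once this policy selects the Pods, other ingress is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bauxite-allow-clients    # hypothetical name
  namespace: security-egress
spec:
  podSelector:
    matchLabels:
      app: bauxite
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              bauxite-client: "true"   # hypothetical label on client namespaces
      ports:
        - protocol: TCP
          port: 9090                   # the Service's targetPort
```

NetworkPolicies take effect only when the cluster's network plugin enforces them. If traffic reaches Bauxite through an Ingress controller or NLB, that path needs its own allow rule. mTLS is not shown here because it is usually configured in a service mesh or in the proxy's own TLS settings.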