Concepts
Core concepts and architecture of the Faros platform.
Understanding Faros concepts will help you effectively use the platform for managing Kubernetes clusters and AI-powered analysis.
Faros is built around two main concepts:
Clusters are registered Kubernetes environments that connect to Faros via lightweight agents. Each cluster maintains its own lifecycle, provides secure remote access, and exposes data for analysis.
Learn more about Clusters →
AI Agents are intelligent assistants powered by large language models that analyze your clusters, provide recommendations, and help with troubleshooting and optimization.
Learn more about AI Agents →
1 - Clusters
Understanding cluster management in Faros.
Faros provides a centralized platform for managing multiple Kubernetes clusters through a unified interface. Clusters are registered with Faros via lightweight agents that provide secure, read-only access for monitoring and analysis.
A Cluster in Faros represents a registered Kubernetes cluster that is connected to the Faros platform. Each cluster:
- Has a unique name within your organization
- Runs a lightweight Faros agent for connectivity
- Maintains its own lifecycle and status
- Can be accessed remotely via SSH or API
- Exposes metrics and data for AI-powered analysis
Clusters in Faros go through the following phases:
- Pending: Cluster resource has been created but initialization hasn’t started
- Initializing: Cluster is being set up, agent is being configured
- Ready: Cluster is fully connected and operational
- Failed: Cluster encountered an error during setup or operation
- Deleting: Cluster is being removed from Faros
- Deleted: Cluster has been successfully removed
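For illustration, a cluster that has completed setup might report a status similar to the following; the field names mirror the Agent status shown later on this page, but the exact Cluster status schema may differ:

```yaml
# Illustrative status only - the exact Cluster status schema may differ
status:
  phase: Ready
  conditions:
    - type: Ready
      status: "True"
      reason: AgentConnected   # hypothetical condition reason
```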
When you initialize a cluster in Faros, an Agent resource is created. The agent:
- Runs as a deployment in your Kubernetes cluster
- Establishes a secure WebSocket tunnel to Faros
- Uses JWT authentication for secure communication
- Provides read-only access to cluster resources
- Exposes MCP (Model Context Protocol) servers for AI integration
- Sends periodic heartbeats to maintain connection status
The agent is deployed to your cluster using standard Kubernetes manifests:
```yaml
apiVersion: core.faros.sh/v1alpha1
kind: Agent
metadata:
  name: <cluster-name>
spec:
  clusterName: <cluster-name>
  token: <jwt-token>
```
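Assuming the manifest is saved to a file, it can be applied and verified with standard kubectl commands; the plural resource name agents.core.faros.sh is inferred from the kind above and may differ:

```bash
# Apply the agent manifest to the target cluster
kubectl apply -f faros-agent.yaml

# Verify the Agent resource exists (plural resource name is assumed)
kubectl get agents.core.faros.sh
```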
Faros provides secure remote access to your clusters without exposing them to the internet:
```bash
kubectl faros clusters ssh <cluster-name>
```
This opens an interactive terminal session that:
- Uses WebSocket-based SSH tunneling
- Supports full terminal features (colors, resize, signals)
- Authenticates using your Faros credentials
- Provides secure access without VPN or direct network exposure
For AI and LLM integration, clusters expose MCP servers:
```bash
kubectl faros clusters mcp <cluster-name>
```
This provides connection details for AI agents to query cluster data and metrics.
Faros is designed for organizations managing multiple clusters:
- Unified View: List and manage all clusters from one interface
- Consistent Tooling: Same CLI commands work across all clusters
- Centralized Authentication: Single sign-on via OAuth for all clusters
- RBAC Integration: Kubernetes-native access control using ClusterRoleBindings
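As an example of the RBAC integration above, read-only access for a team can be granted with a standard ClusterRoleBinding; the group name here is purely illustrative:

```yaml
# Binds the built-in read-only "view" ClusterRole to an example group
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: faros-readonly-team
subjects:
  - kind: Group
    name: platform-team              # illustrative group name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view                         # built-in read-only role
  apiGroup: rbac.authorization.k8s.io
```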
Faros clusters follow these security principles:
- Read-Only by Default: Agents provide read-only access to cluster data
- No Inbound Connections: Clusters initiate outbound connections only
- Token-Based Authentication: JWT tokens authenticate agents
- Kubernetes-Native RBAC: Standard Kubernetes roles control access
- TLS Encryption: All communication is encrypted in transit
Common scenarios for Faros cluster management:
- Multi-Cluster Monitoring: Track status of production, staging, and development clusters
- AI-Powered Analysis: Connect AI agents to analyze cluster health and performance
- Remote Troubleshooting: SSH into clusters without direct network access
- Team Collaboration: Share cluster access with team members via RBAC
- Compliance Auditing: Centralized access logs and audit trails
2 - AI Agents
Understanding AI agents in Faros and their capabilities.
AI Agents in Faros provide intelligent analysis, recommendations, and automation for your Kubernetes clusters. Powered by large language models, they help identify issues, optimize configurations, and provide actionable insights.
An AI Agent in Faros is a resource that connects a large language model (LLM) to your Kubernetes clusters for intelligent analysis. Each agent:
- Connects to an AI backend (e.g., OpenAI, Anthropic)
- Uses a specific model (e.g., GPT-4, Claude)
- Has secure API key management via Kubernetes secrets
- Can analyze cluster data and provide recommendations
- Integrates with Faros clusters via MCP servers
The backend is the AI service provider:
- OpenAI: GPT-4, GPT-3.5-turbo, GPT-4-turbo
- Additional backends planned for future releases
Different models offer different capabilities:
- GPT-4: Advanced reasoning, complex analysis
- GPT-3.5-turbo: Fast responses, cost-effective
- GPT-4-turbo: Balanced performance and cost
AI Agents go through these phases:
- Pending: Agent resource created, waiting for initialization
- Initializing: Connecting to AI backend, validating credentials
- Ready: Agent is operational and available for tasks
- Failed: Agent encountered an error (invalid API key, network issues)
- Deleting: Agent is being removed
- Deleted: Agent has been successfully removed
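Because AI Agents are regular Kubernetes resources in the intelligence.faros.sh API group (see the resource definition later on this page), the current phase can be read with standard kubectl; the plural resource name is an assumption:

```bash
# Print the lifecycle phase of an agent (plural resource name is assumed)
kubectl get agents.intelligence.faros.sh my-agent \
  -o jsonpath='{.status.phase}'
```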
AI Agents require API keys to connect to backend services. Faros supports two approaches:
Direct API key: when you pass --api-key, the CLI creates and manages the secret for you:
```bash
kubectl faros ai-agents init \
  --name my-agent \
  --backend openai \
  --model gpt-4 \
  --api-key sk-...
```
This creates a Kubernetes secret named <agent-name>-api-key.
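With the example above (--name my-agent), you can confirm the generated secret exists without printing its contents:

```bash
# Check that the CLI-managed secret was created
kubectl get secret my-agent-api-key
```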
External secret reference: for tighter control over secret management, create the secret separately and reference it when initializing the agent:
```bash
# Create the secret
kubectl create secret generic ai-credentials \
  --from-literal=openai-key=sk-...

# Reference it when initializing the agent
kubectl faros ai-agents init \
  --name my-agent \
  --backend openai \
  --model gpt-4 \
  --secret-name ai-credentials \
  --secret-key openai-key
```
AI Agents connect to clusters through MCP (Model Context Protocol) servers. Each cluster exposes MCP servers that agents query for data:
```bash
# Get MCP server details
kubectl faros clusters mcp production-cluster
```
The agent uses these endpoints to:
- Fetch cluster metrics
- Query resource status
- Analyze configurations
- Generate recommendations
Agents can target specific clusters:
```yaml
apiVersion: intelligence.faros.sh/v1alpha1
kind: Agent
metadata:
  name: prod-analyzer
spec:
  backend: openai
  model: gpt-4
  clusterSelector:
    matchLabels:
      environment: production
```
Common scenarios for AI Agents:
Agents analyze resource usage, pod status, and configurations to identify:
- Resource bottlenecks
- Misconfigured services
- Security vulnerabilities
- Cost optimization opportunities
When issues occur, agents help:
- Diagnose error patterns in logs
- Suggest remediation steps
- Identify root causes
- Provide runbook recommendations
Agents review configurations and suggest:
- Resource limit adjustments
- Scaling policies
- Network policy improvements
- Storage optimizations
Agents audit clusters for:
- Security best practices
- Policy violations
- Exposed secrets
- Non-compliant configurations
AI Agents are defined in the intelligence.faros.sh/v1alpha1 API group:
```yaml
apiVersion: intelligence.faros.sh/v1alpha1
kind: Agent
metadata:
  name: production-analyzer
  namespace: default
spec:
  backend: openai
  model: gpt-4
  secretRef:
    name: openai-credentials
    key: api-key
  clusterSelector:
    matchLabels:
      environment: production
status:
  phase: Ready
  lastHeartbeat: "2024-06-02T10:30:00Z"
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: "2024-06-02T10:25:00Z"
```
When managing API keys:
- Use separate secrets for different environments
- Rotate API keys regularly (see the rotation sketch below)
- Use RBAC to restrict secret access
- Never commit secrets to version control
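For example, a key can be rotated in place by re-applying the secret with standard kubectl; it is assumed the agent picks up the new value on its next reconciliation:

```bash
# Replace the key stored in the existing secret
kubectl create secret generic ai-credentials \
  --from-literal=openai-key=sk-... \
  --dry-run=client -o yaml | kubectl apply -f -
```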
When selecting a model:
- Use GPT-4 for complex analysis requiring deep reasoning
- Use GPT-3.5-turbo for quick checks and simple queries
- Consider cost vs. capability trade-offs
When naming agents:
- Use descriptive names, such as prod-cost-analyzer or staging-security-audit
- Include the environment in the name for clarity
- Use consistent naming conventions
When organizing agents:
- Deploy agents in dedicated namespaces
- Use labels for grouping and filtering
- Apply resource quotas to prevent excessive API usage (see the quota example below)
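For instance, an object-count ResourceQuota can cap how many agents exist in a namespace; the plural resource name agents.intelligence.faros.sh is an assumption, and the namespace is illustrative:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ai-agent-quota
  namespace: ai-agents                          # illustrative namespace
spec:
  hard:
    count/agents.intelligence.faros.sh: "5"     # assumed resource name
```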