initial commit

2026-03-25 00:05:57 +01:00
commit 25c7d598ca
63 changed files with 5257 additions and 0 deletions
@@ -0,0 +1,150 @@
+# AgentRun lifecycle
+
+#sympozium #agenty #lifecycle
+
+## Full reconciliation flow
+
+`AgentRunReconciler` is the **largest and most important controller** in the system (~900 lines of code). It manages the full lifecycle of a run, from Pending to a completed run (Succeeded or Failed).
+
+## Phase: Pending → Running
+
+```
+reconcilePending():
+│
+├── 1. validatePolicy()
+│   └── Checks SympoziumPolicy:
+│       - Sandbox requirements
+│       - Tool gating
+│       - Feature gates
+│       - AgentSandbox policy
+│
+├── 2. Agent Sandbox check
+│   └── If agentSandbox.enabled → reconcilePendingAgentSandbox()
+│       (creates a Sandbox CR instead of a Job)
+│
+├── 3. ensureAgentServiceAccount()
+│   └── ServiceAccount "sympozium-agent" in the target namespace
+│
+├── 4. createInputConfigMap()
+│   └── ConfigMap with task, system prompt, memory context
+│
+├── 5. Lookup SympoziumInstance
+│   ├── Memory enabled? → prepend memory instructions
+│   ├── Observability config → inject OTel env vars
+│   ├── Skills inheritance → copy from instance if empty
+│   └── MCP servers → resolve URLs from MCPServer CRs
+│
+├── 6. ensureMCPConfigMap()
+│   └── ConfigMap with the MCP server configuration
+│
+├── 7. resolveSkillSidecars()
+│   └── SkillPack CRDs → resolved sidecar specs
+│
+├── 8. Server mode check
+│   └── mode=server → reconcilePendingServer() (Deployment+Service)
+│
+├── 9. Filter server-only sidecars (task mode)
+│
+├── 10. Memory server readiness check
+│   └── If memory skill → check that the Deployment exists
+│
+├── 11. Build Job
+│   ├── PodBuilder.BuildAgentContainer()
+│   ├── PodBuilder.BuildIPCBridgeContainer()
+│   ├── Skill sidecar containers
+│   ├── MCP bridge sidecar (if MCP servers)
+│   ├── Sandbox sidecar (if enabled)
+│   ├── Memory volumes/init containers
+│   ├── Secret mirroring (system → run namespace)
+│   └── OTel tracing setup
+│
+├── 12. Create ephemeral RBAC
+│   ├── Role + RoleBinding (namespace-scoped, ownerRef)
+│   └── ClusterRole + ClusterRoleBinding (label-based)
+│
+├── 13. NetworkPolicy
+│   └── deny-all + allow DNS + allow NATS
+│
+└── 14. Create Job → Status: Running
+```
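Step 13 can be pictured as a single NetworkPolicy manifest. The sketch below is only an illustration of "deny-all + allow DNS + allow NATS": the policy name, the use of the `sympozium.ai/agentrun` label as a pod selector, the `sympozium-system` namespace as the NATS location, and the ports (53 for DNS, 4222 as the default NATS client port) are all assumptions, not taken from the controller code.

```python
def build_network_policy(run_name: str, namespace: str) -> dict:
    """Build a deny-all NetworkPolicy with egress exceptions for DNS and NATS."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # Hypothetical naming scheme, mirroring the agentrun-<name> pattern.
        "metadata": {"name": f"agentrun-{run_name}", "namespace": namespace},
        "spec": {
            # Assumed: run pods carry the same label used for cluster RBAC lookup.
            "podSelector": {"matchLabels": {"sympozium.ai/agentrun": run_name}},
            # Listing both types with no ingress rules denies all ingress,
            # and only the egress rules below are allowed.
            "policyTypes": ["Ingress", "Egress"],
            "ingress": [],
            "egress": [
                # DNS to any destination.
                {"ports": [{"protocol": "UDP", "port": 53},
                           {"protocol": "TCP", "port": 53}]},
                # NATS, assumed to run in the sympozium-system namespace.
                {
                    "to": [{"namespaceSelector": {"matchLabels": {
                        "kubernetes.io/metadata.name": "sympozium-system"}}}],
                    "ports": [{"protocol": "TCP", "port": 4222}],
                },
            ],
        },
    }


policy = build_network_policy("demo-run", "default")
```

Dumping `policy` with a YAML or JSON serializer gives a manifest that can be compared against what the controller actually creates.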
+
+## Phase: Running
+
+```
+reconcileRunning():
+│
+├── Poll Job status (every 10s via requeue)
+│
+├── Pod Succeeded → extractResults():
+│   ├── Read pod logs
+│   ├── Extract result text
+│   ├── Extract memory markers (__SYMPOZIUM_MEMORY__)
+│   ├── Patch memory ConfigMap
+│   ├── Extract token usage
+│   └── Set status.result, completedAt, tokenUsage
+│   → Status: Succeeded
+│
+├── Pod Failed →
+│   ├── Read pod logs for error
+│   ├── Set status.error, exitCode
+│   └── Status: Failed
+│
+└── Timeout → failRun() → Status: Failed
+```
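The decision logic of the Running phase is small enough to write down as a pure function. This is a minimal sketch, assuming the pod phase and elapsed time are already known; `Decision`, `reconcile_running`, and the parameter names are hypothetical, and only the 10-second requeue and the three outcomes come from the flow above.

```python
from dataclasses import dataclass
from typing import Optional

REQUEUE_SECONDS = 10  # polling interval from the flow above


@dataclass
class Decision:
    phase: str                    # resulting AgentRun phase
    requeue_after: Optional[int]  # seconds until the next poll, None when terminal


def reconcile_running(pod_phase: str, elapsed_s: float, timeout_s: float) -> Decision:
    """Map the observed pod phase and elapsed time to the next AgentRun phase."""
    if pod_phase == "Succeeded":
        return Decision("Succeeded", None)
    if pod_phase == "Failed":
        return Decision("Failed", None)
    if elapsed_s > timeout_s:
        # Corresponds to the Timeout → failRun() branch.
        return Decision("Failed", None)
    # Still running: poll again after the requeue interval.
    return Decision("Running", REQUEUE_SECONDS)
```

Result extraction (logs, memory markers, token usage) is left out on purpose; it happens on the Succeeded branch before the status is written.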
+
+## Phase: Succeeded/Failed
+
+```
+reconcileCompleted():
+│
+├── Clean up ephemeral RBAC
+│   ├── Delete ClusterRole (label: agentrun=<name>)
+│   └── Delete ClusterRoleBinding
+│
+├── Prune run history
+│   └── Keep max 50 runs per instance (DefaultRunHistoryLimit)
+│
+└── Remove finalizer → AgentRun deletable
+```
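History pruning can be sketched as "sort by completion time, keep the newest 50". The text does not say which runs are dropped first, so oldest-first is an assumption here, as are the function name and the dict shape used for a run.

```python
DEFAULT_RUN_HISTORY_LIMIT = 50  # limit named in the flow above


def runs_to_prune(runs: list, limit: int = DEFAULT_RUN_HISTORY_LIMIT) -> list:
    """Return the names of completed runs that fall outside the history limit.

    Each run is a dict with "name" and "completedAt" (ISO 8601, same timezone),
    so plain string comparison orders them chronologically.
    """
    newest_first = sorted(runs, key=lambda r: r["completedAt"], reverse=True)
    # Assumption: everything after the newest `limit` runs is deleted.
    return [r["name"] for r in newest_first[limit:]]


runs = [
    {"name": f"run-{i}", "completedAt": f"2026-03-25T00:{i:02d}:00Z"}
    for i in range(55)
]
pruned = runs_to_prune(runs)
```

With 55 completed runs for one instance, the five oldest end up in `pruned`.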
+
+## Phase: Serving (server mode)
+
+```
+reconcileServing():
+│
+├── Check Deployment health
+├── Check Service health
+├── Reconcile HTTPRoute (Envoy Gateway)
+└── Requeue every 30s
+```
+
+## Deletion handling
+
+```
+reconcileDelete():
+│
+├── Delete server-mode resources (Deployment, Service, HTTPRoute)
+├── Delete ephemeral RBAC
+├── Delete input ConfigMap
+├── Delete MCP ConfigMap
+├── Remove finalizer
+└── AgentRun deleted
+```
+
+## OTel Tracing
+
+Every reconciliation phase is traced:
+- `agentrun.reconcile` - main span
+- `agentrun.create_job` - Job creation
+- Traceparent is propagated to the agent pod via an env var
+- TraceID is stored in `status.traceID`
+
+## Metrics
+
+- `sympozium.agent.runs` - counter (success/failure labels)
+- `sympozium.agent.duration_ms` - histogram of run duration
+- `sympozium.errors` - error counter
+
+---
+
+Related: [[AgentRun]] | [[Cykl życia Agent Pod|Agent Pod lifecycle]] | [[Orchestrator - PodBuilder i Spawner|Orchestrator - PodBuilder and Spawner]]
@@ -0,0 +1,82 @@
+# Ephemeral agent model
+
+#sympozium #agenty #architektura #ephemeral
+
+## Fundamental decision
+
+Sympozium implements an **ephemeral model**: every agent invocation creates a new Kubernetes [[Job]], which is deleted once it finishes. This is the opposite of the "persistent engine" approach used in kagent, LangChain, and CrewAI.
+
+## How it works
+
+```
+User message
+    ↓
+Channel Router creates an AgentRun CR
+    ↓
+AgentRun Reconciler creates a Job:
+  - Agent container (LLM loop)
+  - IPC Bridge sidecar
+  - Skill sidecars (with ephemeral RBAC)
+  - Optionally: sandbox, MCP bridge
+    ↓
+Agent performs the task (seconds to minutes)
+    ↓
+Job ends → Pod deleted → RBAC deleted → Resources released
+```
+
+## Implications
+
+### Isolation (advantages)
+- **Blast radius** limited to a single pod - a misbehaving agent does not affect the others
+- **Resource limits** per invocation - every run has its own CPU/memory limits
+- **[[SecurityContext]]** per run - every pod has a hardened security context
+- **[[NetworkPolicy]]** per run - deny-all egress, only the IPC bridge has network access
+- **[[RBAC]]** per run - credentials exist only for the duration of the run
+
+### Scaling (advantages)
+- **Horizontal scaling** is native - the K8s scheduler spreads pods across nodes
+- **No contention** - each run is a separate pod, so runs do not compete inside a shared process
+- **Auto-cleanup** - Kubernetes garbage collection cleans up after completion
+
+### Cold start (drawbacks and mitigation)
+- **Problem:** Every run = new pod = scheduling + image pull + container start
+- **Typical time:** 5-30 seconds
+- **Mitigation 1:** Warm Pools (SandboxWarmPool) - pre-provisioned sandboxes
+- **Mitigation 2:** ImagePullPolicy: IfNotPresent - images are cached on the nodes
+- **Mitigation 3:** Server mode - Deployment instead of Job for long-lived scenarios
+
+### Statelessness (drawbacks and mitigation)
+- **Problem:** The agent does not remember previous conversations
+- **Mitigation 1:** Persistent Memory (SQLite + FTS5 on a PVC)
+- **Mitigation 2:** Legacy Memory (ConfigMap MEMORY.md)
+- **Mitigation 3:** Session persistence (PostgreSQL)
+
+## Comparison with a persistent engine
+
+| Aspect | Ephemeral (Sympozium) | Persistent (kagent, LangChain) |
+|--------|----------------------|-------------------------------|
+| Isolation | [[Pod]]-level per run | Shared process |
+| Cold start | 5-30s (mitigated by WarmPool) | ~0s (process already running) |
+| State | External (memory, sessions) | In-process |
+| Scaling | Horizontal native | Vertical only |
+| Resource utilization | Pay-per-invocation | Always-on |
+| Failure blast radius | Single pod | Entire engine |
+| Audit trail | Pod logs, CRD status | Engine logs |
+| RBAC | Ephemeral per-run | Standing ServiceAccount |
+
+## When to use the ephemeral model?
+
+Ideal for:
+- **Cluster operations** - kubectl, helm with isolation
+- **Scheduled tasks** - health checks, sweeps
+- **Multi-tenant** - isolation between tenants
+- **Security-sensitive** - untrusted agent code
+- **Batch processing** - one-shot tasks
+
+Less ideal for:
+- **Real-time chat** - the cold start is noticeable (although Server mode addresses this)
+- **State-heavy workflows** - they require persistent memory
+
+---
+
+Related: [[Cykl życia AgentRun|AgentRun lifecycle]] | [[Agent Sandbox - gVisor i Kata|Agent Sandbox - gVisor and Kata]] | [[Sympozium vs kagent]]
@@ -0,0 +1,122 @@
+# Persistent Memory
+
+#sympozium #agenty #memory
+
+## Problem
+
+In the ephemeral model an agent loses its memory when the run ends. Sympozium addresses this with two mechanisms.
+
+## Mechanism 1: Legacy ConfigMap Memory
+
+The simpler, older system:
+
+```
+Agent output contains markers:
+__SYMPOZIUM_MEMORY__
+Key insight: Pod X has recurring OOM issues
+__SYMPOZIUM_MEMORY_END__
+    ↓
+Controller parses the pod logs
+    ↓
+Patch ConfigMap: <instance>-memory
+  Data:
+    MEMORY.md: |
+      # Agent Memory
+      - Key insight: Pod X has recurring OOM issues
+    ↓
+The next run mounts the ConfigMap at /memory/MEMORY.md
+```
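The marker parsing step can be sketched in a few lines. This is a minimal illustration, assuming one memory entry per line between the markers and order-preserving deduplication when merging into `MEMORY.md`; the function names are hypothetical, and only the marker strings and the `# Agent Memory` layout come from the flow above.

```python
import re

# Non-greedy match of everything between one start marker and the next end marker.
MEMORY_BLOCK = re.compile(
    r"__SYMPOZIUM_MEMORY__\n(.*?)\n__SYMPOZIUM_MEMORY_END__", re.DOTALL
)


def extract_memories(pod_logs: str) -> list:
    """Collect non-empty lines from every marker block found in the pod logs."""
    entries = []
    for block in MEMORY_BLOCK.findall(pod_logs):
        for line in block.splitlines():
            line = line.strip()
            if line:
                entries.append(line)
    return entries


def render_memory_md(existing: list, new: list) -> str:
    """Render the MEMORY.md body; duplicates are dropped, order is kept (assumption)."""
    merged = list(dict.fromkeys(existing + new))
    return "# Agent Memory\n" + "".join(f"- {entry}\n" for entry in merged)


logs = (
    "thinking...\n"
    "__SYMPOZIUM_MEMORY__\n"
    "Key insight: Pod X has recurring OOM issues\n"
    "__SYMPOZIUM_MEMORY_END__\n"
    "done\n"
)
memory_md = render_memory_md([], extract_memories(logs))
```

The resulting string is what would be patched into the `MEMORY.md` key of the `<instance>-memory` ConfigMap.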
+
+### Limitations
+- Flat file (Markdown)
+- No semantic search
+- Max size: `maxSizeKB` (default 256KB)
+- No tagging/categorization
+
+## Mechanism 2: SQLite + FTS5 Memory (SkillPack-based)
+
+An advanced system with a dedicated server:
+
+```
+┌─────────────────────────────────────┐
+│ Memory Server (Deployment+Service)   │
+│                                       │
+│  ┌──────────────────┐                │
+│  │  SQLite + FTS5   │                │
+│  │  on PVC (1Gi)    │                │
+│  └──────────────────┘                │
+│  HTTP API: :8080                      │
+│  /health - readiness/liveness        │
+│  /memory/store - write               │
+│  /memory/search - FTS search         │
+└──────────┬──────────────────────────┘
+           │
+Agent Pod (z memory skill):
+  - memory_store → HTTP POST
+  - memory_search → HTTP GET
+```
+
+### Activation
+Add the `memory` skill to a SympoziumInstance:
+```yaml
+skills:
+  - skillPackRef: memory
+```
+
+### What the controller creates
+1. **PVC**: `<instance>-memory-db` (1Gi) - persists the SQLite DB
+2. **Deployment**: `<instance>-memory` - memory-server with the PVC
+3. **Service**: `<instance>-memory` - ClusterIP on port 8080
+
+### Advantages over ConfigMap
+- **Full-text search** (FTS5) - the agent searches its memory by keyword (full-text matching, not semantic search)
+- **Tagging** - memories can be categorized
+- **Scalable** - the DB lives on a PVC, not in a ConfigMap (1 MiB limit)
+- **Upgradeable** - vector search can be added in the future
+- **API-driven** - the agent uses an HTTP API, not files
+
+## Memory in scheduled tasks
+
+```yaml
+# SympoziumSchedule
+spec:
+  includeMemory: true  # Memory context is injected into every run
+```
+
+This is what enables **learning agents**:
+```
+Run 1: Agent discovers a problem → writes it to memory
+Run 2: Agent reads memory → continues from the last state
+Run 3: Agent sees a trend → escalates
+```
+
+## Memory seeds (PersonaPack)
+
+```yaml
+personas:
+  - name: sre-watchdog
+    memory:
+      enabled: true
+      seeds:
+        - "Track repeated pod restarts for trend analysis"
+        - "Remember previous node capacity assessments"
+```
+
+Seeds are the **initial memory**: instructions that tell the agent what to track.
+
+## Memory architecture in the pod
+
+```
+Agent Pod:
+  ├── Init Container: wait-for-memory
+  │   └── curl http://<instance>-memory:8080/health
+  │       (waits until the memory server is ready)
+  ├── Agent Container:
+  │   ├── memory_search("previous issues") → HTTP GET
+  │   └── memory_store("New issue found: ...") → HTTP POST
+  └── (the memory server is EXTERNAL - a Deployment, not a sidecar)
+```
+
+---
+
+Related: [[Model efemerycznych agentów|Ephemeral agent model]] | [[Scheduled Tasks - heartbeaty i swepy|Scheduled Tasks - heartbeats and sweeps]] | [[SympoziumInstance]]
@@ -0,0 +1,101 @@
+# PersonaPacks - agent teams
+
+#sympozium #agenty #personapack #teams
+
+## Concept
+
+PersonaPacks implement the **"AI Team as Code"** pattern: you declare a whole team of agents in a single custom resource, and the controller realizes the intent.
+
+## From a single agent to a team
+
+### Without PersonaPacks (manually):
+```
+For each agent you have to create:
+  1. Secret (API key)
+  2. SympoziumInstance (configuration)
+  3. SympoziumSchedule (schedule)
+  4. ConfigMap memory (initial memory)
+
+For 7 agents = 7 × 4 = 28 K8s resources created by hand
+```
+
+### With PersonaPacks:
+```
+1. Pick a pack in the TUI
+2. Provide the API key
+3. Done - 28 resources created automatically
+```
+
+## The "stamp out" pattern
+
+The PersonaPack controller follows the **operator pattern**:
+
+```
+PersonaPack (desired state)
+    ↓
+PersonaPackReconciler:
+  for each persona in spec.personas:
+    if persona.name not in spec.excludePersonas:
+      ├── createOrUpdate SympoziumInstance
+      │   ├── Name: <pack>-<persona>
+      │   ├── Channels: z channelConfigs
+      │   ├── Skills: z persona.skills
+      │   ├── AuthRefs: z pack.authRefs
+      │   └── Policy: z pack.policyRef
+      ├── createOrUpdate SympoziumSchedule (jeśli persona.schedule)
+      │   ├── Cron: z interval → cron conversion
+      │   ├── Task: persona.schedule.task (z taskOverride prepend)
+      │   └── IncludeMemory: true
+      └── createOrUpdate ConfigMap memory (jeśli persona.memory)
+          └── Seeds: persona.memory.seeds
+```
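The loop above can be sketched as a pure planning function plus the interval → cron conversion. This is an illustration under stated assumptions: the pack is a plain dict with a `name` key, the schedule is named like its instance, the ConfigMap is named `<instance>-memory`, and only `Nm`/`Nh` intervals are handled. None of these helper names come from the controller.

```python
import re


def interval_to_cron(interval: str) -> str:
    """Convert an interval such as "30m" or "2h" into a 5-field cron expression."""
    match = re.fullmatch(r"(\d+)([mh])", interval)
    if not match:
        raise ValueError(f"unsupported interval: {interval!r}")
    value, unit = int(match.group(1)), match.group(2)
    if unit == "m":
        if not 1 <= value <= 59:
            raise ValueError("minute interval must be between 1 and 59")
        return f"*/{value} * * * *"
    if not 1 <= value <= 23:
        raise ValueError("hour interval must be between 1 and 23")
    return f"0 */{value} * * *"


def plan_resources(pack: dict) -> list:
    """List the (kind, name) pairs the reconciler would create or update."""
    excluded = set(pack.get("excludePersonas", []))
    planned = []
    for persona in pack["personas"]:
        if persona["name"] in excluded:
            continue
        name = f'{pack["name"]}-{persona["name"]}'  # <pack>-<persona>
        planned.append(("SympoziumInstance", name))
        if persona.get("schedule"):
            planned.append(("SympoziumSchedule", name))  # assumed naming
        if persona.get("memory"):
            planned.append(("ConfigMap", f"{name}-memory"))
    return planned


pack = {
    "name": "sre",
    "excludePersonas": ["noisy"],
    "personas": [
        {"name": "watchdog", "schedule": {"interval": "5m"}, "memory": {"seeds": ["x"]}},
        {"name": "noisy"},
    ],
}
plan = plan_resources(pack)
```

One caveat with the step syntax: `*/45 * * * *` fires at minute 0 and minute 45 of every hour, so intervals that do not divide 60 evenly are only approximated by cron.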
+
+## Example: developer-team
+
+The `developer-team` pack creates **7 collaborating agents**:
+
+| Agent | Role | Schedule | Skills |
+|-------|------|----------|--------|
+| Tech Lead | Planning, architecture | Every 1h | github-gitops |
+| Backend Dev | Backend implementation | Every 30m | github-gitops, k8s-ops |
+| Frontend Dev | Frontend implementation | Every 30m | github-gitops |
+| QA Engineer | Testing | Every 45m | github-gitops |
+| Code Reviewer | Code review | Every 20m | github-gitops |
+| DevOps Engineer | CI/CD, infra | Every 1h | github-gitops, k8s-ops |
+| Docs Writer | Documentation | Every 2h | github-gitops |
+
+All of these agents:
+- Share a repo (via skill params)
+- Have their own memory (role-specific memory seeds)
+- Run on schedules (heartbeats)
+- Get separate RBAC per run
+
+## Management lifecycle
+
+```
+Install pack → TUI wizard
+  ↓
+Activate (set authRefs) → Controller stamps out the resources
+  ↓
+Running → Agents run according to their schedules
+  ↓
+Exclude persona → Controller deletes that agent's resources
+  ↓
+Delete pack → ownerReferences → K8s GC cleans up EVERYTHING
+```
+
+## Global vs per-persona configuration
+
+| Setting | Pack level | Persona level |
+|------------|-------------|----------------|
+| AuthRefs | Yes (shared) | No |
+| Model | Yes (default) | Yes (override) |
+| Skills | No | Yes |
+| Channels | Yes (channelConfigs) | Yes (channels list) |
+| Policy | Yes (policyRef) | No (inherited) |
+| Task Override | Yes (prepended to schedules) | No |
+| Agent Sandbox | Yes (for all personas) | No |
+
+---
+
+Related: [[PersonaPack]] | [[SympoziumInstance]] | [[Scheduled Tasks - heartbeaty i swepy|Scheduled Tasks - heartbeats and sweeps]]
@@ -0,0 +1,105 @@
+# Scheduled Tasks - heartbeats and sweeps
+
+#sympozium #agenty #scheduling
+
+## Concept
+
+Sympozium treats **agents like CronJobs**: they can be run periodically without user intervention. This is what enables DevOps/SRE scenarios.
+
+## Schedule types
+
+| Type | Purpose | Typical interval | Example |
+|-----|-----|-----------------|---------|
+| **heartbeat** | Regular state checks | 5-30 min | "Check that all pods are healthy" |
+| **scheduled** | Planned tasks | 1-24h | "Morning cluster cost report" |
+| **sweep** | Reviews and cleanup | 1-7 days | "Find and report unused PVCs" |
+
+## Self-scheduling
+
+A distinctive feature: **agents can manage their own schedules**:
+
+```
+Agent receives a task: "Monitor the cluster"
+    ↓
+Agent decides: "I should check every 15 minutes"
+    ↓
+Agent calls a tool: schedule_task(
+  action: "create",
+  schedule: "*/15 * * * *",
+  task: "Check the state of pods in the production namespace"
+)
+    ↓
+/ipc/schedules/create.json → IPC Bridge → NATS: schedule.upsert
+    ↓
+Schedule Router → creates a SympoziumSchedule CR
+    ↓
+SympoziumSchedule Controller → creates an AgentRun every 15 min
+```
+
+The agent can also update, suspend, resume, and delete its own schedules.
+
+## Concurrency control
+
+```yaml
+spec:
+  concurrencyPolicy: Forbid   # Do not create a new run while the previous one is active
+```
+
+| Policy | Behavior | Use case |
+|----------|------------|----------|
+| **Forbid** | Skip the trigger if a run is active | Health checks (we do not want a pile-up) |
+| **Allow** | Create a new run regardless | Independent analyses |
+| **Replace** | Cancel the old run, create a new one | Real-time monitoring |
+
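The three policies reduce to a small decision function evaluated on every cron trigger. A minimal sketch, assuming the controller already knows how many runs of the schedule are active; `Action` and `on_trigger` are hypothetical names.

```python
from enum import Enum

POLICIES = {"Forbid", "Allow", "Replace"}


class Action(Enum):
    SKIP = "skip"                    # drop this trigger
    CREATE = "create"                # create a new AgentRun
    REPLACE = "cancel-then-create"   # cancel the active run, then create


def on_trigger(policy: str, active_runs: int) -> Action:
    """Decide what to do when a schedule fires."""
    if policy not in POLICIES:
        raise ValueError(f"unknown concurrencyPolicy: {policy!r}")
    if active_runs == 0 or policy == "Allow":
        return Action.CREATE
    return Action.SKIP if policy == "Forbid" else Action.REPLACE
```

With no active run all three policies behave the same; they differ only when a trigger arrives while an earlier run is still in progress.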
+## Memory context
+
+```yaml
+spec:
+  includeMemory: true  # Inject MEMORY.md into every run
+```
+
+Thanks to this, a scheduled run:
+1. Reads memory from previous runs
+2. Continues where it left off
+3. Builds context across runs
+
+Example: a monitoring agent on a 30-minute heartbeat:
+- Run 1: "Detected 3 restarts of pod X"
+- Memory: "Pod X has a restart problem (3 so far)"
+- Run 2: "Memory mentions restarts of pod X. Checking: already 7 restarts. Escalating."
+
+## Example with PersonaPack
+
+```yaml
+personas:
+  - name: sre-watchdog
+    schedule:
+      type: heartbeat
+      interval: "5m"
+      task: |
+        Check cluster health:
+        - Pod restarts > 3
+        - Node conditions
+        - PVC usage > 80%
+        Report and create issues for anomalies.
+    memory:
+      enabled: true
+      seeds:
+        - "Track repeated issues for trend analysis"
+        - "Escalate if same issue persists > 3 checks"
+```
+
+## Architectural significance
+
+Scheduled tasks turn Sympozium from a "tool for chatting with AI" into an **autonomous operations platform**:
+
+- Agents run 24/7 without human intervention
+- They build up memory and context
+- They can escalate to humans (via channels)
+- They can modify their own schedules
+
+This is "self-healing infrastructure" driven by AI.
+
+---
+
+Related: [[SympoziumSchedule]] | [[Persistent Memory]] | [[PersonaPacks - zespoły agentów|PersonaPacks - agent teams]]
@@ -0,0 +1,144 @@
+# Skill Sidecars and auto-RBAC
+
+#sympozium #agenty #skills #rbac #security
+
+## Concept
+
+Skill Sidecars are Sympozium's **most important security innovation**. Instead of giving the agent direct access to kubectl/helm/git, tools run in a **separate container with its own ephemeral RBAC**.
+
+## Why does this matter?
+
+### Problem: In-process tool execution
+In frameworks such as kagent/LangChain, tools run in the same process as the agent:
+```
+Agent (with credentials) → tool call → kubectl (with the agent's credentials)
+```
+If the LLM is "persuaded" into making a malicious tool call, that call runs with the full privileges of the process.
+
+### Solution: Sidecar isolation
+```
+Agent (NO credentials) → /ipc/tools/*.json → IPC Bridge → NATS
+    → Skill Sidecar (with its OWN scoped credentials) → kubectl
+```
+
+The agent **never holds** credentials for the K8s API. Only the sidecar has them, and only with least-privilege RBAC.
+
+## Mechanism
+
+### 1. RBAC declaration in SkillPack
+
+```yaml
+# SkillPack CRD
+spec:
+  sidecar:
+    rbac:                    # Namespace-scoped
+      - apiGroups: ["", "apps"]
+        resources: ["pods", "deployments"]
+        verbs: ["get", "list", "watch"]
+    clusterRBAC:             # Cluster-scoped
+      - apiGroups: [""]
+        resources: ["nodes"]
+        verbs: ["get", "list"]
+```
+
+### 2. Auto-provisioning at AgentRun start
+
+```
+AgentRun Pending:
+    ↓
+Controller reads the SkillPack RBAC declarations
+    ↓
+Creates:
+  ├── ServiceAccount: sympozium-agent (in the run's namespace)
+  ├── Role: agentrun-<name>-<skill>
+  │   └── rules: [from SkillPack rbac]
+  ├── RoleBinding: agentrun-<name>-<skill>
+  │   └── roleRef → Role, subject → ServiceAccount
+  ├── ClusterRole: agentrun-<name>-<skill>
+  │   └── rules: [from SkillPack clusterRBAC]
+  └── ClusterRoleBinding: agentrun-<name>-<skill>
+      └── roleRef → ClusterRole, subject → ServiceAccount
+```
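The provisioning step can be sketched as a function that turns SkillPack rules into four manifests. This is an illustration, not the controller's code: the `sympozium.ai/v1alpha1` apiVersion in the owner reference is an assumption, and `build_rbac` is a hypothetical helper. The naming scheme, the `sympozium-agent` ServiceAccount, and the `sympozium.ai/agentrun` label follow the notes.

```python
AGENT_SERVICE_ACCOUNT = "sympozium-agent"
RUN_LABEL = "sympozium.ai/agentrun"
RBAC_GROUP = "rbac.authorization.k8s.io"


def build_rbac(run_name, run_uid, namespace, skill, rules, cluster_rules):
    """Return Role, RoleBinding, ClusterRole, ClusterRoleBinding manifests for one skill."""
    name = f"agentrun-{run_name}-{skill}"
    subject = {"kind": "ServiceAccount", "name": AGENT_SERVICE_ACCOUNT,
               "namespace": namespace}
    owner = {
        "apiVersion": "sympozium.ai/v1alpha1",  # assumed group/version
        "kind": "AgentRun", "name": run_name, "uid": run_uid,
    }

    def ns_meta():
        # Namespaced objects are owned by the AgentRun, so K8s GC removes them.
        return {"name": name, "namespace": namespace, "ownerReferences": [owner]}

    def cluster_meta():
        # Cluster-scoped objects cannot have a namespaced owner,
        # so they are found later by label instead.
        return {"name": name, "labels": {RUN_LABEL: run_name}}

    def binding(kind, role_kind, meta):
        return {
            "apiVersion": f"{RBAC_GROUP}/v1", "kind": kind, "metadata": meta,
            "roleRef": {"apiGroup": RBAC_GROUP, "kind": role_kind, "name": name},
            "subjects": [subject],
        }

    return [
        {"apiVersion": f"{RBAC_GROUP}/v1", "kind": "Role",
         "metadata": ns_meta(), "rules": rules},
        binding("RoleBinding", "Role", ns_meta()),
        {"apiVersion": f"{RBAC_GROUP}/v1", "kind": "ClusterRole",
         "metadata": cluster_meta(), "rules": cluster_rules},
        binding("ClusterRoleBinding", "ClusterRole", cluster_meta()),
    ]


objs = build_rbac(
    "run1", "uid-1", "default", "k8s-ops",
    rules=[{"apiGroups": ["", "apps"], "resources": ["pods", "deployments"],
            "verbs": ["get", "list", "watch"]}],
    cluster_rules=[{"apiGroups": [""], "resources": ["nodes"], "verbs": ["get", "list"]}],
)
```

The split between owner references and labels is what makes the two cleanup paths in the next step necessary.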
+
+### 3. Garbage collection after completion
+
+```
+AgentRun Succeeded/Failed:
+    ↓
+Namespace RBAC:
+  └── Automatic via ownerReference → AgentRun
+      (K8s GC cleans up when the AgentRun is deleted)
+    ↓
+Cluster RBAC:
+  └── Controller looks them up by label:
+      sympozium.ai/agentrun: <name>
+      → Delete ClusterRole
+      → Delete ClusterRoleBinding
+```
+
+## Lifecycle RBAC
+
+```
+AgentRun created
+  ↓
+[RBAC created] ← credentials exist
+  ↓
+Agent pod running - sidecar uses the credentials
+  ↓
+AgentRun completed
+  ↓
+[RBAC deleted] ← credentials no longer exist
+```
+
+**Credential lifetime = AgentRun lifetime** - the equivalent of temporary IAM session credentials in AWS.
+
+## Secret mirroring
+
+If a SkillPack references a Secret (e.g. GH_TOKEN):
+```
+Secret in sympozium-system → Copy in the AgentRun namespace
+    → Mounted in the sidecar at /secrets/<name>/
+    → Deleted after completion
+```
+
+## Parameterization
+
+The same SkillPack can be configured differently per instance:
+```yaml
+# Instance A
+skills:
+  - skillPackRef: github-gitops
+    params:
+      REPO: team-a/repo-a
+
+# Instance B
+skills:
+  - skillPackRef: github-gitops
+    params:
+      REPO: team-b/repo-b
+```
+
+Each param becomes a `SKILL_`-prefixed env var in the sidecar (for example, `REPO` → `SKILL_REPO`).
+
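The param-to-env mapping can be sketched as below. Only the `SKILL_` prefix is taken from the notes; upper-casing the key and replacing non-alphanumeric characters with `_` are assumptions made so that the result is always a valid env var name, and `params_to_env` is a hypothetical helper.

```python
import re


def params_to_env(params: dict) -> list:
    """Turn skill params into container env entries with a SKILL_ prefix."""
    env = []
    for key, value in sorted(params.items()):
        # Assumed normalization: REPO -> SKILL_REPO, base-branch -> SKILL_BASE_BRANCH
        name = "SKILL_" + re.sub(r"[^A-Za-z0-9]", "_", key).upper()
        env.append({"name": name, "value": str(value)})
    return env


env_a = params_to_env({"REPO": "team-a/repo-a"})
```

The list has the same shape as the `env` field of a Kubernetes container spec, so it can be attached to the sidecar container directly.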
+## Security architecture
+
+```
+┌─────────────────────────────────┐
+│          Agent Pod               │
+│                                  │
+│  ┌──────────┐  ┌──────────────┐ │
+│  │  Agent    │  │ Skill Sidecar│ │
+│  │          │  │              │ │
+│  │ NO K8s   │  │ With RBAC:  │ │
+│  │ credentials│ │ - Role      │ │
+│  │          │  │ - Binding   │ │
+│  │ /ipc →   │  │ - SA token  │ │
+│  └──────────┘  └──────────────┘ │
+│       │              ↑           │
+│       └──── IPC ─────┘           │
+└─────────────────────────────────┘
+```
+
+---
+
+Related: [[Efemeryczny RBAC per-run|Ephemeral per-run RBAC]] | [[SkillPack]] | [[Model bezpieczeństwa Defence-in-Depth|Defence-in-Depth security model]]
@@ -0,0 +1,68 @@
+# Sub-agents and hierarchy
+
+#sympozium #agenty #sub-agents
+
+## Concept
+
+Sympozium supports **hierarchical spawning of sub-agents**: an agent can create further AgentRun CRs, which become its "children".
+
+## Configuration
+
+```yaml
+# In SympoziumInstance
+spec:
+  agents:
+    default:
+      subagents:
+        maxDepth: 2              # Max nesting (parent → child → grandchild)
+        maxConcurrent: 5         # Max parallel agent runs in the tree
+        maxChildrenPerAgent: 3   # Max children per agent
+```
+
+## Mechanism
+
+```
+Parent AgentRun
+  ├── spec.parent: null
+  ├── Creates a child AgentRun:
+  │   spec:
+  │     parent:
+  │       runName: parent-run
+  │       sessionKey: parent-session
+  │       spawnDepth: 1
+  │   ↓
+  │   └── Child AgentRun
+  │       ├── Creates a grandchild:
+  │       │   spec.parent.spawnDepth: 2
+  │       │   ↓
+  │       │   └── Grandchild AgentRun
+  │       │       └── maxDepth=2 → CANNOT spawn further
+  │       └── Each agent can have at most 3 children (maxChildrenPerAgent)
+  └── Parent waits for the children's results
+```
+
+## Policy enforcement
+
+`SubagentPolicySpec` in [[SympoziumPolicy]] enforces the limits:
+- `maxDepth: 3` - the admission webhook rejects an AgentRun with spawnDepth > 3
+- `maxConcurrent: 5` - the controller checks how many runs are active
+
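The three limits can be checked together before a child AgentRun is admitted. A minimal sketch, assuming the root run sits at depth 0 and a child's `spawnDepth` is its parent's depth plus one, as in the tree above; `SubagentLimits` and `check_spawn` are hypothetical names, and in the real system the checks are split between the webhook and the controller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubagentLimits:
    max_depth: int = 2
    max_concurrent: int = 5
    max_children_per_agent: int = 3


def check_spawn(limits: SubagentLimits, parent_depth: int,
                parent_children: int, active_in_tree: int) -> Optional[str]:
    """Return a rejection reason, or None when the child may be created."""
    child_depth = parent_depth + 1
    if child_depth > limits.max_depth:
        return f"spawnDepth {child_depth} exceeds maxDepth {limits.max_depth}"
    if parent_children >= limits.max_children_per_agent:
        return "parent already has maxChildrenPerAgent children"
    if active_in_tree >= limits.max_concurrent:
        return "maxConcurrent active runs reached in this tree"
    return None


limits = SubagentLimits()
```

For example, `check_spawn(limits, parent_depth=2, parent_children=0, active_in_tree=3)` returns a depth rejection, which matches the grandchild in the tree that cannot spawn further.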
+## Use case
+
+Sub-agents enable **divide-and-conquer** workflows:
+```
+"Optimize the cluster" (parent)
+  ├── "Check CPU usage per namespace" (child 1)
+  ├── "Check memory usage per namespace" (child 2)
+  └── "Check for unused PVCs" (child 3)
+```
+
+Each child is a separate Job with its own RBAC, isolation, and lifecycle.
+
+## Tracking
+
+The `SPAWN_DEPTH` env var tells the agent how deep in the tree it is. The agent can base decisions on it (for example, not spawning sub-agents when it is already deep).
+
+---
+
+Related: [[AgentRun]] | [[SympoziumPolicy]] | [[Model efemerycznych agentów|Ephemeral agent model]]
@@ -0,0 +1,83 @@
+# Server mode vs Task mode
+
+#sympozium #agenty #server-mode
+
+## Two execution modes
+
+### Task mode (default)
+- Kubernetes **Job** - run-to-completion
+- The pod exists only for the duration of the run
+- Garbage collection after completion
+- For: one-shot tasks, scheduled runs, channel messages
+
+### Server mode
+- Kubernetes **Deployment + Service**
+- The pod is long-lived
+- It is not deleted after "completion"
+- For: web endpoints, long-running services
+
+## When is server mode used?
+
+Server mode is activated when either:
+1. `AgentRun.spec.mode = "server"` (set explicitly)
+2. A SkillPack has `sidecar.requiresServer = true`
+
+Typical use case: the **web-endpoint** skill, which exposes the agent as an HTTP API.
+
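The two activation conditions can be read as a simple "either/or" check. The sketch below is one possible reading: `resolve_mode` is a hypothetical helper, specs are plain dicts, and the notes do not say what happens when a run explicitly sets `mode: task` while one of its SkillPacks requires a server, so that precedence is an assumption.

```python
def resolve_mode(run_spec: dict, skill_packs: list) -> str:
    """Return "server" or "task" for an AgentRun."""
    # Condition 1: mode set explicitly on the AgentRun.
    if run_spec.get("mode") == "server":
        return "server"
    # Condition 2: any referenced SkillPack declares sidecar.requiresServer.
    # Assumption: this also wins over an explicit mode of "task".
    if any(pack.get("sidecar", {}).get("requiresServer") for pack in skill_packs):
        return "server"
    return "task"


web_endpoint = {"sidecar": {"requiresServer": True}}
k8s_ops = {"sidecar": {}}
```

A run with only `k8s_ops` stays in task mode, while adding `web_endpoint` switches it to server mode.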
+## Server mode architecture
+
+```
+SympoziumInstance with skill "web-endpoint"
+    ↓
+Instance Reconciler → creates an AgentRun (mode: server)
+    ↓
+AgentRun Reconciler → reconcilePendingServer():
+  1. Deployment (instead of Job)
+  2. Service (ClusterIP)
+  3. HTTPRoute (Envoy Gateway, if available)
+  4. Auto-generated API key (Secret)
+    ↓
+Web Proxy Sidecar:
+  - /v1/chat/completions (OpenAI-compat)
+  - /sse, /message (MCP protocol)
+  - Rate limiting
+  - Auth via API key
+```
+
+## Request flow in server mode
+
+```
+HTTP Client → Gateway → HTTPRoute → Service → Web Proxy Pod
+    ↓
+Web Proxy creates a per-request AgentRun (mode: task!)
+    ↓
+Normal flow: Job → Agent → result → Web Proxy → HTTP Response
+```
+
+Key point: the web proxy sidecar itself runs in server mode, but **every request is a separate task-mode AgentRun**, so the ephemeral model is kept for isolation.
+
+## Server mode reconciliation
+
+`reconcileServing()` (instead of reconcileRunning):
+- Checks Deployment health
+- Checks Service health
+- Reconciles the HTTPRoute
+- Requeues every 30 seconds
+
+`reconcileDelete()` for server mode:
+- Delete Deployment
+- Delete Service
+- Delete HTTPRoute
+- Delete API key Secret
+
+## Serving phase
+
+```
+AgentRun phases:
+  Pending → Serving (nie Running!)
+  Serving (long-lived) → Succeeded (when deleted)
+```
+
+---
+
+Related: [[Web Endpoints - OpenAI-compat API]] | [[Cykl życia AgentRun|AgentRun lifecycle]] | [[Model efemerycznych agentów|Ephemeral agent model]]