How AIRI Thinks

From Words to Actions — A Peek into AIRI’s Mind

AIRI feels human because every response follows a disciplined, multi-stage pipeline that balances creativity, consistency, and safety. This page walks through that process stage by stage.

Big Picture (Function-First)

  • Consistency: The character never “breaks role” thanks to a structured System Prompt built from your Character Card.
  • Responsiveness: Streaming generation lets AIRI start speaking before the full answer is ready.
  • Emotion & Action: Special inline markers convert plain text into avatar expressions or external actions.
  • Memory: Short-term context (latest chat) and optional long-term vector memories help AIRI reference past events.
  • Safety & Control: Layered filters and system rules block disallowed content before it reaches TTS or other users.

Outcome: Conversations feel natural yet remain on-brand and under your control.

The Thinking Stack (Conceptual Mechanism)

  1. System Prompt Builder
    Combines: Character Card, global rules, localisation settings.
    Enforces: persona, tone, forbidden topics.
  2. Context Builder
    Adds: Latest N chat turns, optional long-term memories (vector search), platform metadata (e.g., caller name, Discord channel topic).
    Goal: Give the LLM enough context without exceeding token budget.
  3. LLM Selection & Call
    Local default: Ollama Llama-3 8B.
    Cloud alternatives: GPT-4, Claude 3, Gemini.
    Common interface: llm.ts abstracts provider quirks; every call requests stream=true.
  4. Streaming Parser
    While tokens arrive, llmmarkerParser.ts continuously:
    • Detects <|EMOTE_*|> markers.
    • Splits text into TTS-friendly chunks.
    • Filters unsafe content flagged by the Safety Layer.
  5. Safety Layer
    • Keyword & regex blocklist.
    • Optional OpenAI Moderation API check.
    • If violation: replace with [content removed] and log incident.
  6. Output Fan-Out
    • Clean text → TTS (desktop) or chat message (Discord, Telegram).
    • Marker events → Avatar expression, Minecraft skill, etc.

Emotion & Action Markers

Marker                      | Effect                       | Example
<|EMOTE_HAPPY|>             | Avatar plays Happy animation | Great job! <|EMOTE_HAPPY|>
<|MOVE_TO|>{"x":10,"z":5}   | Minecraft bot moves          | On my way! <|MOVE_TO|>{"x":10,"z":5}
<|SFX_JINGLE|>              | Plays sound effect           | Time to celebrate! <|SFX_JINGLE|>

Markers are never shown to users; they are parsed into events before text reaches UI/TTS.
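A minimal sketch of how such markers can be stripped and turned into events. It assumes markers always match <|NAME|> with an optional inline JSON payload; the real llmmarkerParser.ts additionally handles markers split across streaming token boundaries, which this version ignores.

```typescript
interface MarkerEvent { name: string; payload?: unknown }

// Strip <|NAME|> markers (plus an optional JSON payload) out of a chunk,
// returning clean text for UI/TTS and a list of events to fan out.
function extractMarkers(text: string): { clean: string; events: MarkerEvent[] } {
  const events: MarkerEvent[] = [];
  const clean = text.replace(/<\|([A-Z_]+)\|>(\{[^}]*\})?/g, (_match, name, json) => {
    events.push({ name, payload: json ? JSON.parse(json) : undefined });
    return ""; // markers never reach the user
  });
  return { clean: clean.trimEnd(), events };
}
```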

Memory Architecture

  • Short-Term (Chat History): Last ~20 turns kept in RAM.
  • Long-Term (Vector DB): Optional; Telegram photos, notable facts stored with embeddings (see db/schema.ts).
  • Ephemeral Scratchpad: For multi-step tasks in Minecraft integration.
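The short-term tier can be pictured as a bounded buffer. The cap of 20 mirrors the figure above; the class name and shape are hypothetical, not AIRI's actual implementation.

```typescript
// Illustrative short-term memory: keeps only the last N chat turns in RAM.
class ShortTermMemory {
  private turns: string[] = [];
  constructor(private readonly cap = 20) {}

  remember(turn: string): void {
    this.turns.push(turn);
    if (this.turns.length > this.cap) this.turns.shift(); // drop the oldest turn
  }

  recall(): string[] {
    return [...this.turns]; // oldest → newest
  }
}
```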

Token Budget Strategy

  1. System Prompt (~600 tokens)
  2. Most recent chat until context limit − safety margin
  3. Memories rated by relevance score until space filled
  4. Hard cutoff prevents model errors.
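The four rules above amount to a greedy fill. In this sketch, token counts are approximated by word counts (real code would use the model's tokenizer), and all names are illustrative.

```typescript
interface Memory { text: string; relevance: number }

// Crude token estimate: one token per whitespace-separated word.
const countTokens = (s: string): number => s.split(/\s+/).filter(Boolean).length;

function fillBudget(
  systemPrompt: string,
  chat: string[],          // oldest → newest
  memories: Memory[],
  contextLimit: number,
  safetyMargin = 8,
): { chat: string[]; memories: Memory[] } {
  let used = countTokens(systemPrompt);        // 1. system prompt goes in first
  const budget = contextLimit - safetyMargin;  // 2. limit minus safety margin

  // 2. Most recent chat first, walking backwards from the newest turn.
  const keptChat: string[] = [];
  for (let i = chat.length - 1; i >= 0; i--) {
    const cost = countTokens(chat[i]);
    if (used + cost > budget) break;
    used += cost;
    keptChat.unshift(chat[i]);
  }

  // 3. Memories by descending relevance until space is filled.
  const keptMemories: Memory[] = [];
  for (const m of [...memories].sort((a, b) => b.relevance - a.relevance)) {
    const cost = countTokens(m.text);
    if (used + cost > budget) continue; // 4. hard cutoff
    used += cost;
    keptMemories.push(m);
  }
  return { chat: keptChat, memories: keptMemories };
}
```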

Safety & Compliance

  • Dual filter (local regex + optional external) keeps latency low while allowing stricter cloud checks.
  • Logs retained locally for 7 days by default (configurable in Settings → Safety); no PII transmitted unless cloud filters enabled.
  • Custom rules can be added in Settings → Safety.
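The local half of the dual filter reduces to a blocklist scan over each outgoing chunk. A minimal sketch, assuming a plain regex blocklist and omitting the optional cloud moderation call:

```typescript
// Hypothetical blocklist; real rules live in Settings → Safety.
const blocklist: RegExp[] = [/\bforbidden\b/i];

function applySafetyFilter(chunk: string): { text: string; flagged: boolean } {
  for (const rule of blocklist) {
    if (rule.test(chunk)) {
      // On violation: replace the text and (in AIRI) log the incident.
      return { text: "[content removed]", flagged: true };
    }
  }
  return { text: chunk, flagged: false };
}
```

Because the regex pass is purely local, it adds negligible latency; only chunks that pass it need the slower cloud check.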

Extending the Brain

  • New Markers: Add to emotions.ts or create a new skill handler.
  • Tool Calling: Upgrade to OpenAI function-calling schema by enabling Tool Mode in developer settings.
  • Memory Plugins: Implement MemoryProvider interface to hook new databases.
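The MemoryProvider interface is named above, but its exact shape is not documented here; the signatures and the in-memory cosine-similarity implementation below are assumptions for illustration.

```typescript
// Assumed shape of a memory plugin: store embedded texts, search by embedding.
interface MemoryProvider {
  store(text: string, embedding: number[]): Promise<void>;
  search(embedding: number[], topK: number): Promise<string[]>;
}

// Toy backend: an array ranked by cosine similarity to the query embedding.
class InMemoryProvider implements MemoryProvider {
  private items: { text: string; embedding: number[] }[] = [];

  async store(text: string, embedding: number[]): Promise<void> {
    this.items.push({ text, embedding });
  }

  async search(embedding: number[], topK: number): Promise<string[]> {
    const dot = (a: number[], b: number[]) => a.reduce((s, v, i) => s + v * b[i], 0);
    const norm = (a: number[]) => Math.sqrt(dot(a, a)) || 1;
    return [...this.items]
      .sort((a, b) =>
        dot(b.embedding, embedding) / norm(b.embedding)
        - dot(a.embedding, embedding) / norm(a.embedding))
      .slice(0, topK)
      .map(item => item.text);
  }
}
```

A real plugin would back the same two methods with an actual vector database instead of an array scan.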

Functional Role  | Code File                                                                 | Description
Prompt Builder   | apps/playground-prompt-engineering/src/composables/useCharacterPrompt.ts | Constructs system prompt from card
LLM Store        | packages/stage-ui/src/stores/llm.ts                                      | Streams responses from local/cloud models
Marker Parser    | packages/stage-ui/src/composables/llmmarkerParser.ts                     | Detects markers & splits text
Safety Filter    | packages/stage-ui/src/utils/safety.ts                                    | Keyword & Moderation API checks
Memory DB Schema | services/telegram-bot/src/db/schema.ts                                   | Vector storage for images & text
Emotion Map      | packages/stage-ui/src/constants/emotions.ts                              | Marker → animation mapping