AI Configuration Reference

This document provides a complete reference for configuring MAID's AI features, including LLM providers and the NPC dialogue system.

Overview

MAID's AI integration consists of two configuration areas:

  1. AI Provider Settings (MAID_AI_*) - Configure the underlying LLM providers
  2. AI Dialogue Settings (MAID_AI_DIALOGUE_*) - Configure NPC dialogue behavior

AI Provider Settings

These settings control the base AI/LLM infrastructure.

Environment Variables

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DEFAULT_PROVIDER | str | "anthropic" | Default provider for all AI features |
| MAID_AI_ANTHROPIC_API_KEY | str | None | Anthropic Claude API key |
| MAID_AI_ANTHROPIC_MODEL | str | "claude-sonnet-4-20250514" | Default Anthropic model |
| MAID_AI_OPENAI_API_KEY | str | None | OpenAI API key |
| MAID_AI_OPENAI_MODEL | str | "gpt-4o" | Default OpenAI model |
| MAID_AI_OLLAMA_HOST | str | "http://localhost:11434" | Ollama server URL |
| MAID_AI_OLLAMA_MODEL | str | "llama3.2" | Default Ollama model |
| MAID_AI_MAX_TOKENS | int | 500 | Default max tokens for completions |
| MAID_AI_TEMPERATURE | float | 0.7 | Default temperature for completions |
| MAID_AI_REQUEST_TIMEOUT | float | 30.0 | API request timeout in seconds |

Provider Selection

Providers are automatically registered based on available API keys:

  1. If MAID_AI_ANTHROPIC_API_KEY is set, Anthropic provider is registered
  2. If MAID_AI_OPENAI_API_KEY is set, OpenAI provider is registered
  3. If MAID_AI_OLLAMA_HOST is set (defaults to localhost), Ollama provider is registered
  4. If no providers are configured, a mock provider is used

The MAID_AI_DEFAULT_PROVIDER setting determines which provider is used by default.
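The registration order above can be sketched as follows. `register_providers` is a hypothetical helper written for illustration, not MAID's actual implementation; it only reflects the rules this section describes.

```python
def register_providers(env: dict) -> list[str]:
    """Illustrative sketch of the provider registration rules.

    Checks the environment variables named in this document, in the
    documented order. In practice MAID_AI_OLLAMA_HOST has a default
    value, so the Ollama provider is typically registered even when
    the variable is not set explicitly.
    """
    providers = []
    if env.get("MAID_AI_ANTHROPIC_API_KEY"):
        providers.append("anthropic")
    if env.get("MAID_AI_OPENAI_API_KEY"):
        providers.append("openai")
    if env.get("MAID_AI_OLLAMA_HOST"):
        providers.append("ollama")
    if not providers:
        providers.append("mock")  # fallback when nothing is configured
    return providers
```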

Available Models

Anthropic Claude:

  • claude-sonnet-4-20250514 - Latest Sonnet, best balance (default)
  • claude-opus-4-20250514 - Opus, highest quality
  • claude-3-5-haiku-20241022 - Haiku, fastest

OpenAI GPT:

  • gpt-4o - GPT-4o, best quality (default)
  • gpt-4o-mini - Smaller, faster
  • gpt-4-turbo - GPT-4 Turbo
  • gpt-3.5-turbo - Fastest, cheapest

Ollama (Local):

  • llama3.2 - Good balance (default)
  • llama3.1 - Better quality
  • mistral - Fast, good quality
  • phi3 - Fastest, lightweight

AI Dialogue Settings

These settings control the NPC dialogue system behavior.

Core Settings

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_ENABLED | bool | true | Master switch for AI dialogue system |
| MAID_AI_DIALOGUE_DEFAULT_PROVIDER | str | "anthropic" | Default provider for dialogue |
| MAID_AI_DIALOGUE_DEFAULT_MODEL | str | None | Default model (uses provider default if not set) |
| MAID_AI_DIALOGUE_DEFAULT_MAX_TOKENS | int | 150 | Default max response tokens (50-1000) |
| MAID_AI_DIALOGUE_DEFAULT_TEMPERATURE | float | 0.7 | Default temperature (0.0-2.0) |

Rate Limiting

Rate limiting protects against abuse and controls costs.

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM | int | 60 | Max requests/minute globally |
| MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM | int | 10 | Max requests/minute per player |
| MAID_AI_DIALOGUE_PER_NPC_COOLDOWN_SECONDS | float | 2.0 | Default cooldown between NPC responses |

Rate Limit Behavior:

  • When the global limit is hit: all players receive wait messages
  • When a player's limit is hit: that player receives a wait message
  • When an NPC's cooldown is active: the NPC responds with "Let me think..."
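The three checks can be sketched with a sliding-window limiter. The class and method names below are hypothetical, chosen for illustration; only the limits and the check order come from this document.

```python
import time
from collections import defaultdict, deque


class DialogueRateLimiter:
    """Illustrative sliding-window limiter; not MAID's actual API."""

    def __init__(self, global_rpm=60, per_player_rpm=10, npc_cooldown=2.0):
        self.global_rpm = global_rpm
        self.per_player_rpm = per_player_rpm
        self.npc_cooldown = npc_cooldown
        self._global = deque()                 # timestamps of all requests
        self._players = defaultdict(deque)     # timestamps per player
        self._npc_last = {}                    # last response time per NPC

    def _prune(self, window, now):
        # Drop timestamps older than the 60-second window.
        while window and now - window[0] > 60.0:
            window.popleft()

    def allow(self, player_id, npc_id, now=None):
        now = time.monotonic() if now is None else now
        self._prune(self._global, now)
        self._prune(self._players[player_id], now)
        if len(self._global) >= self.global_rpm:
            return "global_limit"       # every player gets a wait message
        if len(self._players[player_id]) >= self.per_player_rpm:
            return "player_limit"       # only this player waits
        last = self._npc_last.get(npc_id)
        if last is not None and now - last < self.npc_cooldown:
            return "npc_cooldown"       # NPC answers "Let me think..."
        self._global.append(now)
        self._players[player_id].append(now)
        self._npc_last[npc_id] = now
        return "ok"
```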

Token Budgets

Token budgets help control API costs.

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET | int | None | Daily token limit for server (None = unlimited) |
| MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET | int | 5000 | Daily token limit per player |

When a budget is exhausted:

  • Server budget: all AI dialogue uses fallback responses
  • Player budget: that player's AI requests use fallback responses

Budgets reset at midnight (server time).
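A minimal sketch of this behaviour, assuming a tracker that resets its counter when the date rolls over. `TokenBudget` and its methods are illustrative names, not part of MAID's API.

```python
import datetime


class TokenBudget:
    """Illustrative daily token budget with a midnight reset."""

    def __init__(self, daily_limit=None):
        self.daily_limit = daily_limit    # None = unlimited
        self.used = 0
        self.day = datetime.date.today()

    def spend(self, tokens, today=None):
        """Return True if the spend fits the budget, else False.

        A False result means the caller should serve a fallback
        response instead of calling the AI provider.
        """
        today = today or datetime.date.today()
        if today != self.day:             # budgets reset at midnight, server time
            self.day = today
            self.used = 0
        if self.daily_limit is not None and self.used + tokens > self.daily_limit:
            return False
        self.used += tokens
        return True
```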

Context Settings

Context injection adds world state to NPC prompts.

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_INCLUDE_WORLD_CONTEXT | bool | true | Include time of day and weather |
| MAID_AI_DIALOGUE_INCLUDE_LOCATION_CONTEXT | bool | true | Include room name, description, exits |
| MAID_AI_DIALOGUE_INCLUDE_PLAYER_CONTEXT | bool | true | Include player name, level, race/class |

Disabling context reduces token usage but makes NPCs less aware of their surroundings.
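How the three flags could gate prompt construction can be sketched as below. The function, its parameters, and the exact prompt wording are assumptions for illustration; only the three context categories come from this document.

```python
def build_context(settings: dict, world: dict, location: dict, player: dict) -> str:
    """Illustrative context builder; not MAID's actual prompt code.

    Each flag defaults to True, matching the settings table above.
    """
    parts = []
    if settings.get("include_world_context", True):
        parts.append(
            f"It is {world['time_of_day']} and the weather is {world['weather']}."
        )
    if settings.get("include_location_context", True):
        parts.append(
            f"You are in {location['name']}: {location['description']} "
            f"Exits: {', '.join(location['exits'])}."
        )
    if settings.get("include_player_context", True):
        parts.append(
            f"You are speaking with {player['name']}, a level {player['level']} "
            f"{player['race']} {player['class']}."
        )
    return "\n".join(parts)
```

Each disabled flag removes a line from the prompt, which is where the token savings mentioned above come from.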

Conversation Settings

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_MAX_CONVERSATION_HISTORY | int | 10 | Max messages to retain per conversation |
| MAID_AI_DIALOGUE_CONVERSATION_TIMEOUT_MINUTES | int | 30 | Minutes before idle conversations are cleaned up |

Conversation Lifecycle:

  1. The first message creates a new conversation
  2. History is maintained up to MAX_CONVERSATION_HISTORY messages
  3. After CONVERSATION_TIMEOUT_MINUTES of inactivity, the conversation is removed
  4. A player can explicitly end a conversation with the endconversation command
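The lifecycle can be sketched as a small store that trims history and expires idle conversations. `ConversationStore` and its method names are hypothetical; only the trimming, timeout, and explicit-end behaviours come from this document.

```python
import time


class ConversationStore:
    """Illustrative conversation lifecycle; not MAID's actual API."""

    def __init__(self, max_history=10, timeout_minutes=30):
        self.max_history = max_history
        self.timeout = timeout_minutes * 60
        self._convos = {}  # (player_id, npc_id) -> {"messages": [...], "last": ts}

    def add_message(self, player_id, npc_id, message, now=None):
        now = time.time() if now is None else now
        key = (player_id, npc_id)
        convo = self._convos.setdefault(key, {"messages": [], "last": now})
        convo["messages"].append(message)
        convo["messages"] = convo["messages"][-self.max_history:]  # keep newest N
        convo["last"] = now
        return convo["messages"]

    def cleanup(self, now=None):
        """Remove conversations idle longer than the timeout."""
        now = time.time() if now is None else now
        stale = [k for k, c in self._convos.items() if now - c["last"] > self.timeout]
        for k in stale:
            del self._convos[k]
        return len(stale)

    def end(self, player_id, npc_id):
        """What the endconversation command would do."""
        self._convos.pop((player_id, npc_id), None)
```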

Safety Settings

Variable Type Default Description
MAID_AI_DIALOGUE_CONTENT_FILTERING bool true Enable content safety filtering
MAID_AI_DIALOGUE_LOG_CONVERSATIONS bool false Log conversation content

Content Filtering: When enabled, the system prompt includes instructions for NPCs to:

  • Stay in character at all times
  • Never acknowledge being an AI
  • Refuse harmful, illegal, or inappropriate requests
  • Not suggest specific game commands (unless configured)

Important: Content Filtering and Streaming Interaction

Content filtering requires the complete response to be available before safety evaluation, which means responses must be buffered rather than streamed. Since content filtering is enabled by default (MAID_AI_DIALOGUE_CONTENT_FILTERING=true), streaming is effectively disabled by default even though the streaming setting defaults to true.

To enable true streaming (responses appear incrementally as the AI generates them):

  1. Disable content filtering: MAID_AI_DIALOGUE_CONTENT_FILTERING=false
  2. Ensure streaming is enabled: MAID_AI_DIALOGUE_ENABLE_STREAMING=true (the default)

Trade-off: Disabling content filtering removes the safety layer that prevents inappropriate AI responses from reaching players. Only disable filtering in trusted environments or when using AI providers with their own content moderation (e.g., Claude and GPT-4 have built-in safety features).
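The interaction reduces to a single predicate: streaming happens only when it is enabled and filtering is off. The helper below is illustrative; its parameter names mirror the two settings.

```python
def effective_streaming(content_filtering: bool = True,
                        enable_streaming: bool = True) -> bool:
    """Illustrative sketch of the filtering/streaming interaction.

    Filtering needs the complete response before it can evaluate it,
    so any filtered response must be buffered. Defaults match the
    documented setting defaults, which is why streaming is effectively
    off out of the box.
    """
    return enable_streaming and not content_filtering
```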

Conversation Logging: Enable only for debugging. Logs may contain player messages and should be handled according to your privacy policy.

Configuration Examples

Minimal Production Setup

# .env file
MAID_AI_ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
MAID_AI_DIALOGUE_ENABLED=true

Full Production Setup

# .env file

# Provider configuration
MAID_AI_DEFAULT_PROVIDER=anthropic
MAID_AI_ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
MAID_AI_ANTHROPIC_MODEL=claude-sonnet-4-20250514
MAID_AI_REQUEST_TIMEOUT=30.0

# AI Dialogue - Core
MAID_AI_DIALOGUE_ENABLED=true
MAID_AI_DIALOGUE_DEFAULT_PROVIDER=anthropic
MAID_AI_DIALOGUE_DEFAULT_MAX_TOKENS=150
MAID_AI_DIALOGUE_DEFAULT_TEMPERATURE=0.7

# AI Dialogue - Rate Limiting
MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM=120
MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM=15
MAID_AI_DIALOGUE_PER_NPC_COOLDOWN_SECONDS=1.5

# AI Dialogue - Token Budgets
MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET=100000
MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET=3000

# AI Dialogue - Context
MAID_AI_DIALOGUE_INCLUDE_WORLD_CONTEXT=true
MAID_AI_DIALOGUE_INCLUDE_LOCATION_CONTEXT=true
MAID_AI_DIALOGUE_INCLUDE_PLAYER_CONTEXT=true

# AI Dialogue - Conversations
MAID_AI_DIALOGUE_MAX_CONVERSATION_HISTORY=8
MAID_AI_DIALOGUE_CONVERSATION_TIMEOUT_MINUTES=20

# AI Dialogue - Safety
MAID_AI_DIALOGUE_CONTENT_FILTERING=true
MAID_AI_DIALOGUE_LOG_CONVERSATIONS=false

Development Setup with Ollama

# .env file

# Use local Ollama for development
MAID_AI_DEFAULT_PROVIDER=ollama
MAID_AI_OLLAMA_HOST=http://localhost:11434
MAID_AI_OLLAMA_MODEL=llama3.2

# Relaxed rate limits for testing
MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM=1000
MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM=100
MAID_AI_DIALOGUE_PER_NPC_COOLDOWN_SECONDS=0.5

# No token budgets in dev
MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET=
MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET=

# Enable logging for debugging
MAID_AI_DIALOGUE_LOG_CONVERSATIONS=true

Multi-Provider Setup

# .env file

# Configure multiple providers
MAID_AI_ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
MAID_AI_OPENAI_API_KEY=sk-your-openai-key-here
MAID_AI_OLLAMA_HOST=http://localhost:11434

# Set default to Anthropic
MAID_AI_DEFAULT_PROVIDER=anthropic
MAID_AI_DIALOGUE_DEFAULT_PROVIDER=anthropic

With multiple providers configured, individual NPCs can override:

# Important NPC uses Claude
important_npc = DialogueComponent(
    provider_name="anthropic",
    model_name="claude-opus-4-20250514",
    # ...
)

# Background NPC uses local Ollama
background_npc = DialogueComponent(
    provider_name="ollama",
    model_name="phi3",
    # ...
)

Cost-Optimized Setup

# .env file

# Use fast model
MAID_AI_ANTHROPIC_MODEL=claude-3-5-haiku-20241022

# Strict rate limits
MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM=30
MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM=5

# Tight token budgets
MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET=50000
MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET=1000

# Shorter responses
MAID_AI_DIALOGUE_DEFAULT_MAX_TOKENS=100

# Disable context to save tokens
MAID_AI_DIALOGUE_INCLUDE_WORLD_CONTEXT=false
MAID_AI_DIALOGUE_INCLUDE_PLAYER_CONTEXT=false

Programmatic Configuration

Settings can also be configured in code:

from maid_engine.config.settings import Settings, AISettings, AIDialogueSettings

settings = Settings(
    ai=AISettings(
        default_provider="anthropic",
        anthropic_api_key="sk-ant-...",
        anthropic_model="claude-sonnet-4-20250514",
    ),
    ai_dialogue=AIDialogueSettings(
        enabled=True,
        default_provider="anthropic",
        global_rate_limit_rpm=60,
        per_player_rate_limit_rpm=10,
        daily_token_budget=100000,
    ),
)

engine = GameEngine(settings)

Validation

The settings framework validates configuration:

  • default_max_tokens must be between 50 and 1000
  • default_temperature must be between 0.0 and 2.0
  • conversation_timeout_minutes must be positive

Invalid settings raise ValueError at startup.
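The checks can be sketched as plain guard clauses. The function below is a hypothetical stand-in for what the settings framework does at startup; the bounds and the ValueError behaviour come from this document.

```python
def validate_dialogue_settings(max_tokens: int,
                               temperature: float,
                               timeout_minutes: int) -> None:
    """Illustrative sketch of the documented validation rules."""
    if not 50 <= max_tokens <= 1000:
        raise ValueError("default_max_tokens must be between 50 and 1000")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("default_temperature must be between 0.0 and 2.0")
    if timeout_minutes <= 0:
        raise ValueError("conversation_timeout_minutes must be positive")
```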

Monitoring

Checking Provider Status

from maid_engine.ai import get_registry

registry = get_registry()

# List registered providers
print(registry.list_providers())  # ['anthropic', 'openai', 'ollama']

# Check which are available
available = await registry.get_available()
print(available)  # ['anthropic', 'ollama']

# Get default provider
print(registry.default)  # 'anthropic'

Rate Limit Status

Rate limit information is logged at DEBUG level and returned to players when limits are hit.
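To see those DEBUG messages during development, standard-library logging can be turned up. The logger name "maid_engine" is an assumption based on the import paths shown elsewhere in this document.

```python
import logging

# Enable DEBUG output globally, then narrow to the engine's logger.
# "maid_engine" is assumed from the package name used in this document.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("maid_engine").setLevel(logging.DEBUG)
```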

Token Budget Status

Token budgets are tracked internally. When exhausted, the system logs a warning and uses fallback responses.