AI Configuration Reference

This document provides a complete reference for configuring MAID's AI features, including LLM providers and the NPC dialogue system.

Overview

MAID's AI integration consists of two configuration areas:

  1. AI Provider Settings (MAID_AI_*) - Configure the underlying LLM providers
  2. AI Dialogue Settings (MAID_AI_DIALOGUE_*) - Configure NPC dialogue behavior

AI Provider Settings

These settings control the base AI/LLM infrastructure.

Environment Variables

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DEFAULT_PROVIDER | str | "anthropic" | Default provider for all AI features |
| MAID_AI_ANTHROPIC_API_KEY | str | None | Anthropic Claude API key |
| MAID_AI_ANTHROPIC_MODEL | str | "claude-sonnet-4-20250514" | Default Anthropic model |
| MAID_AI_OPENAI_API_KEY | str | None | OpenAI API key |
| MAID_AI_OPENAI_MODEL | str | "gpt-4o" | Default OpenAI model |
| MAID_AI_OLLAMA_HOST | str | "http://localhost:11434" | Ollama server URL |
| MAID_AI_OLLAMA_MODEL | str | "llama3.2" | Default Ollama model |
| MAID_AI_MAX_TOKENS | int | 500 | Default max tokens for completions |
| MAID_AI_TEMPERATURE | float | 0.7 | Default temperature for completions |
| MAID_AI_REQUEST_TIMEOUT | float | 30.0 | API request timeout in seconds |

Provider Selection

Providers are automatically registered based on available API keys:

  1. If MAID_AI_ANTHROPIC_API_KEY is set, Anthropic provider is registered
  2. If MAID_AI_OPENAI_API_KEY is set, OpenAI provider is registered
  3. If MAID_AI_OLLAMA_HOST is set (defaults to localhost), Ollama provider is registered
  4. If no providers are configured, a mock provider is used

The MAID_AI_DEFAULT_PROVIDER setting determines which provider is used by default.
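The registration order above can be sketched as follows. `register_providers` is a hypothetical helper written for illustration, not MAID's actual implementation; it only reflects the rules this section describes.

```python
def register_providers(env: dict) -> list[str]:
    """Illustrative sketch of the provider registration rules.

    Checks the environment variables named in this document, in the
    documented order. In practice MAID_AI_OLLAMA_HOST has a default
    value, so the Ollama provider is typically registered even when
    the variable is not set explicitly.
    """
    providers = []
    if env.get("MAID_AI_ANTHROPIC_API_KEY"):
        providers.append("anthropic")
    if env.get("MAID_AI_OPENAI_API_KEY"):
        providers.append("openai")
    if env.get("MAID_AI_OLLAMA_HOST"):
        providers.append("ollama")
    if not providers:
        providers.append("mock")  # fallback when nothing is configured
    return providers
```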

Available Models

Anthropic Claude:

  • claude-sonnet-4-20250514 - Latest Sonnet, best balance (default)
  • claude-opus-4-20250514 - Opus, highest quality
  • claude-3-5-haiku-20241022 - Haiku, fastest

OpenAI GPT:

  • gpt-4o - GPT-4o, best quality (default)
  • gpt-4o-mini - Smaller, faster
  • gpt-4-turbo - GPT-4 Turbo
  • gpt-3.5-turbo - Fastest, cheapest

Ollama (Local):

  • llama3.2 - Good balance (default)
  • llama3.1 - Better quality
  • mistral - Fast, good quality
  • phi3 - Fastest, lightweight

AI Dialogue Settings

These settings control the NPC dialogue system behavior.

Core Settings

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_ENABLED | bool | true | Master switch for AI dialogue system |
| MAID_AI_DIALOGUE_DEFAULT_PROVIDER | str | "anthropic" | Default provider for dialogue |
| MAID_AI_DIALOGUE_DEFAULT_MODEL | str | None | Default model (uses provider default if not set) |
| MAID_AI_DIALOGUE_DEFAULT_MAX_TOKENS | int | 150 | Default max response tokens (50-1000) |
| MAID_AI_DIALOGUE_DEFAULT_TEMPERATURE | float | 0.7 | Default temperature (0.0-2.0) |

Rate Limiting

Rate limiting protects against abuse and controls costs.

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM | int | 60 | Max requests/minute globally |
| MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM | int | 10 | Max requests/minute per player |
| MAID_AI_DIALOGUE_PER_NPC_COOLDOWN_SECONDS | float | 2.0 | Default cooldown between NPC responses |

Rate Limit Behavior:

  • When the global limit is hit: all players receive wait messages
  • When a player's limit is hit: that player receives a wait message
  • When an NPC's cooldown is active: the NPC responds with "Let me think..."
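The three checks can be sketched with a sliding-window limiter. The class and method names below are hypothetical, chosen for illustration; only the limits and the check order come from this document.

```python
import time
from collections import defaultdict, deque


class DialogueRateLimiter:
    """Illustrative sliding-window limiter; not MAID's actual API."""

    def __init__(self, global_rpm=60, per_player_rpm=10, npc_cooldown=2.0):
        self.global_rpm = global_rpm
        self.per_player_rpm = per_player_rpm
        self.npc_cooldown = npc_cooldown
        self._global = deque()                 # timestamps of all requests
        self._players = defaultdict(deque)     # timestamps per player
        self._npc_last = {}                    # last response time per NPC

    def _prune(self, window, now):
        # Drop timestamps older than the 60-second window.
        while window and now - window[0] > 60.0:
            window.popleft()

    def allow(self, player_id, npc_id, now=None):
        now = time.monotonic() if now is None else now
        self._prune(self._global, now)
        self._prune(self._players[player_id], now)
        if len(self._global) >= self.global_rpm:
            return "global_limit"       # every player gets a wait message
        if len(self._players[player_id]) >= self.per_player_rpm:
            return "player_limit"       # only this player waits
        last = self._npc_last.get(npc_id)
        if last is not None and now - last < self.npc_cooldown:
            return "npc_cooldown"       # NPC answers "Let me think..."
        self._global.append(now)
        self._players[player_id].append(now)
        self._npc_last[npc_id] = now
        return "ok"
```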

Token Budgets

Token budgets help control API costs.

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET | int | None | Daily token limit for server (None = unlimited) |
| MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET | int | 5000 | Daily token limit per player |

When a budget is exhausted:

  • Server budget: all AI dialogue uses fallback responses
  • Player budget: that player's AI requests use fallback responses

Budgets reset at midnight (server time).
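A minimal sketch of this behaviour, assuming a tracker that resets its counter when the date rolls over. `TokenBudget` and its methods are illustrative names, not part of MAID's API.

```python
import datetime


class TokenBudget:
    """Illustrative daily token budget with a midnight reset."""

    def __init__(self, daily_limit=None):
        self.daily_limit = daily_limit    # None = unlimited
        self.used = 0
        self.day = datetime.date.today()

    def spend(self, tokens, today=None):
        """Return True if the spend fits the budget, else False.

        A False result means the caller should serve a fallback
        response instead of calling the AI provider.
        """
        today = today or datetime.date.today()
        if today != self.day:             # budgets reset at midnight, server time
            self.day = today
            self.used = 0
        if self.daily_limit is not None and self.used + tokens > self.daily_limit:
            return False
        self.used += tokens
        return True
```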

Context Settings

Context injection adds world state to NPC prompts.

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_INCLUDE_WORLD_CONTEXT | bool | true | Include time of day and weather |
| MAID_AI_DIALOGUE_INCLUDE_LOCATION_CONTEXT | bool | true | Include room name, description, exits |
| MAID_AI_DIALOGUE_INCLUDE_PLAYER_CONTEXT | bool | true | Include player name, level, race/class |

Disabling context reduces token usage but makes NPCs less aware of their surroundings.
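How the three flags could gate prompt construction can be sketched as below. The function, its parameters, and the exact prompt wording are assumptions for illustration; only the three context categories come from this document.

```python
def build_context(settings: dict, world: dict, location: dict, player: dict) -> str:
    """Illustrative context builder; not MAID's actual prompt code.

    Each flag defaults to True, matching the settings table above.
    """
    parts = []
    if settings.get("include_world_context", True):
        parts.append(
            f"It is {world['time_of_day']} and the weather is {world['weather']}."
        )
    if settings.get("include_location_context", True):
        parts.append(
            f"You are in {location['name']}: {location['description']} "
            f"Exits: {', '.join(location['exits'])}."
        )
    if settings.get("include_player_context", True):
        parts.append(
            f"You are speaking with {player['name']}, a level {player['level']} "
            f"{player['race']} {player['class']}."
        )
    return "\n".join(parts)
```

Each disabled flag removes a line from the prompt, which is where the token savings mentioned above come from.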

Conversation Settings

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| MAID_AI_DIALOGUE_MAX_CONVERSATION_HISTORY | int | 10 | Max messages to retain per conversation |
| MAID_AI_DIALOGUE_CONVERSATION_TIMEOUT_MINUTES | int | 30 | Minutes before idle conversations are cleaned up |

Conversation Lifecycle:

  1. The first message creates a new conversation
  2. History is maintained up to MAX_CONVERSATION_HISTORY messages
  3. After CONVERSATION_TIMEOUT_MINUTES of inactivity, the conversation is removed
  4. A player can explicitly end a conversation with the endconversation command
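The lifecycle can be sketched as a small store that trims history and expires idle conversations. `ConversationStore` and its method names are hypothetical; only the trimming, timeout, and explicit-end behaviours come from this document.

```python
import time


class ConversationStore:
    """Illustrative conversation lifecycle; not MAID's actual API."""

    def __init__(self, max_history=10, timeout_minutes=30):
        self.max_history = max_history
        self.timeout = timeout_minutes * 60
        self._convos = {}  # (player_id, npc_id) -> {"messages": [...], "last": ts}

    def add_message(self, player_id, npc_id, message, now=None):
        now = time.time() if now is None else now
        key = (player_id, npc_id)
        convo = self._convos.setdefault(key, {"messages": [], "last": now})
        convo["messages"].append(message)
        convo["messages"] = convo["messages"][-self.max_history:]  # keep newest N
        convo["last"] = now
        return convo["messages"]

    def cleanup(self, now=None):
        """Remove conversations idle longer than the timeout."""
        now = time.time() if now is None else now
        stale = [k for k, c in self._convos.items() if now - c["last"] > self.timeout]
        for k in stale:
            del self._convos[k]
        return len(stale)

    def end(self, player_id, npc_id):
        """What the endconversation command would do."""
        self._convos.pop((player_id, npc_id), None)
```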

Safety Settings

Variable Type Default Description
MAID_AI_DIALOGUE_CONTENT_FILTERING bool true Enable content safety filtering
MAID_AI_DIALOGUE_LOG_CONVERSATIONS bool false Log conversation content

Content Filtering: When enabled, the system prompt includes instructions for NPCs to:

  • Stay in character at all times
  • Never acknowledge being an AI
  • Refuse harmful, illegal, or inappropriate requests
  • Not suggest specific game commands (unless configured)

Important: Content Filtering and Streaming Interaction

Content filtering requires the complete response to be available before safety evaluation, which means responses must be buffered rather than streamed. Since content filtering is enabled by default (MAID_AI_DIALOGUE_CONTENT_FILTERING=true), streaming is effectively disabled by default even though the streaming setting defaults to true.

To enable true streaming (responses appear incrementally as the AI generates them):

  1. Disable content filtering: MAID_AI_DIALOGUE_CONTENT_FILTERING=false
  2. Ensure streaming is enabled: MAID_AI_DIALOGUE_ENABLE_STREAMING=true (the default)

Trade-off: Disabling content filtering removes the safety layer that prevents inappropriate AI responses from reaching players. Only disable filtering in trusted environments or when using AI providers with their own content moderation (e.g., Claude and GPT-4 have built-in safety features).
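The interaction reduces to a single predicate: streaming happens only when it is enabled and filtering is off. The helper below is illustrative; its parameter names mirror the two settings.

```python
def effective_streaming(content_filtering: bool = True,
                        enable_streaming: bool = True) -> bool:
    """Illustrative sketch of the filtering/streaming interaction.

    Filtering needs the complete response before it can evaluate it,
    so any filtered response must be buffered. Defaults match the
    documented setting defaults, which is why streaming is effectively
    off out of the box.
    """
    return enable_streaming and not content_filtering
```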

Conversation Logging: Enable only for debugging. Logs may contain player messages and should be handled according to your privacy policy.

Configuration Examples

Minimal Production Setup

# .env file
MAID_AI_ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
MAID_AI_DIALOGUE_ENABLED=true

Full Production Setup

# .env file

# Provider configuration
MAID_AI_DEFAULT_PROVIDER=anthropic
MAID_AI_ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
MAID_AI_ANTHROPIC_MODEL=claude-sonnet-4-20250514
MAID_AI_REQUEST_TIMEOUT=30.0

# AI Dialogue - Core
MAID_AI_DIALOGUE_ENABLED=true
MAID_AI_DIALOGUE_DEFAULT_PROVIDER=anthropic
MAID_AI_DIALOGUE_DEFAULT_MAX_TOKENS=150
MAID_AI_DIALOGUE_DEFAULT_TEMPERATURE=0.7

# AI Dialogue - Rate Limiting
MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM=120
MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM=15
MAID_AI_DIALOGUE_PER_NPC_COOLDOWN_SECONDS=1.5

# AI Dialogue - Token Budgets
MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET=100000
MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET=3000

# AI Dialogue - Context
MAID_AI_DIALOGUE_INCLUDE_WORLD_CONTEXT=true
MAID_AI_DIALOGUE_INCLUDE_LOCATION_CONTEXT=true
MAID_AI_DIALOGUE_INCLUDE_PLAYER_CONTEXT=true

# AI Dialogue - Conversations
MAID_AI_DIALOGUE_MAX_CONVERSATION_HISTORY=8
MAID_AI_DIALOGUE_CONVERSATION_TIMEOUT_MINUTES=20

# AI Dialogue - Safety
MAID_AI_DIALOGUE_CONTENT_FILTERING=true
MAID_AI_DIALOGUE_LOG_CONVERSATIONS=false

Development Setup with Ollama

# .env file

# Use local Ollama for development
MAID_AI_DEFAULT_PROVIDER=ollama
MAID_AI_OLLAMA_HOST=http://localhost:11434
MAID_AI_OLLAMA_MODEL=llama3.2

# Relaxed rate limits for testing
MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM=1000
MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM=100
MAID_AI_DIALOGUE_PER_NPC_COOLDOWN_SECONDS=0.5

# No token budgets in dev
MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET=
MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET=

# Enable logging for debugging
MAID_AI_DIALOGUE_LOG_CONVERSATIONS=true

Multi-Provider Setup

# .env file

# Configure multiple providers
MAID_AI_ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
MAID_AI_OPENAI_API_KEY=sk-your-openai-key-here
MAID_AI_OLLAMA_HOST=http://localhost:11434

# Set default to Anthropic
MAID_AI_DEFAULT_PROVIDER=anthropic
MAID_AI_DIALOGUE_DEFAULT_PROVIDER=anthropic

With multiple providers configured, individual NPCs can override:

# Important NPC uses Claude
important_npc = DialogueComponent(
    provider_name="anthropic",
    model_name="claude-opus-4-20250514",
    # ...
)

# Background NPC uses local Ollama
background_npc = DialogueComponent(
    provider_name="ollama",
    model_name="phi3",
    # ...
)

Cost-Optimized Setup

# .env file

# Use fast model
MAID_AI_ANTHROPIC_MODEL=claude-3-5-haiku-20241022

# Strict rate limits
MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM=30
MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM=5

# Tight token budgets
MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET=50000
MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET=1000

# Shorter responses
MAID_AI_DIALOGUE_DEFAULT_MAX_TOKENS=100

# Disable context to save tokens
MAID_AI_DIALOGUE_INCLUDE_WORLD_CONTEXT=false
MAID_AI_DIALOGUE_INCLUDE_PLAYER_CONTEXT=false

Programmatic Configuration

Settings can also be configured in code:

from maid_engine.config.settings import Settings, AISettings, AIDialogueSettings

settings = Settings(
    ai=AISettings(
        default_provider="anthropic",
        anthropic_api_key="sk-ant-...",
        anthropic_model="claude-sonnet-4-20250514",
    ),
    ai_dialogue=AIDialogueSettings(
        enabled=True,
        default_provider="anthropic",
        global_rate_limit_rpm=60,
        per_player_rate_limit_rpm=10,
        daily_token_budget=100000,
    ),
)

engine = GameEngine(settings)

Validation

The settings framework validates configuration:

  • default_max_tokens must be between 50 and 1000
  • default_temperature must be between 0.0 and 2.0
  • conversation_timeout_minutes must be positive

Invalid settings raise ValueError at startup.
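The checks can be sketched as plain guard clauses. The function below is a hypothetical stand-in for what the settings framework does at startup; the bounds and the ValueError behaviour come from this document.

```python
def validate_dialogue_settings(max_tokens: int,
                               temperature: float,
                               timeout_minutes: int) -> None:
    """Illustrative sketch of the documented validation rules."""
    if not 50 <= max_tokens <= 1000:
        raise ValueError("default_max_tokens must be between 50 and 1000")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("default_temperature must be between 0.0 and 2.0")
    if timeout_minutes <= 0:
        raise ValueError("conversation_timeout_minutes must be positive")
```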

Monitoring

Checking Provider Status

from maid_engine.ai import get_registry

registry = get_registry()

# List registered providers
print(registry.list_providers())  # ['anthropic', 'openai', 'ollama']

# Check which are available
available = await registry.get_available()
print(available)  # ['anthropic', 'ollama']

# Get default provider
print(registry.default)  # 'anthropic'

Rate Limit Status

Rate limit information is logged at DEBUG level and returned to players when limits are hit.
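To see those DEBUG messages during development, standard-library logging can be turned up. The logger name "maid_engine" is an assumption based on the import paths shown elsewhere in this document.

```python
import logging

# Enable DEBUG output globally, then narrow to the engine's logger.
# "maid_engine" is assumed from the package name used in this document.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("maid_engine").setLevel(logging.DEBUG)
```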

Token Budget Status

Token budgets are tracked internally. When exhausted, the system logs a warning and uses fallback responses.