# AI Configuration Reference
This document provides a complete reference for configuring MAID's AI features, including LLM providers and the NPC dialogue system.
## Overview

MAID's AI integration consists of two configuration areas:

- **AI Provider Settings** (`MAID_AI_*`) - configure the underlying LLM providers
- **AI Dialogue Settings** (`MAID_AI_DIALOGUE_*`) - configure NPC dialogue behavior
## AI Provider Settings
These settings control the base AI/LLM infrastructure.
### Environment Variables

| Variable | Type | Default | Description |
|---|---|---|---|
| `MAID_AI_DEFAULT_PROVIDER` | `str` | `"anthropic"` | Default provider for all AI features |
| `MAID_AI_ANTHROPIC_API_KEY` | `str` | `None` | Anthropic Claude API key |
| `MAID_AI_ANTHROPIC_MODEL` | `str` | `"claude-sonnet-4-20250514"` | Default Anthropic model |
| `MAID_AI_OPENAI_API_KEY` | `str` | `None` | OpenAI API key |
| `MAID_AI_OPENAI_MODEL` | `str` | `"gpt-4o"` | Default OpenAI model |
| `MAID_AI_OLLAMA_HOST` | `str` | `"http://localhost:11434"` | Ollama server URL |
| `MAID_AI_OLLAMA_MODEL` | `str` | `"llama3.2"` | Default Ollama model |
| `MAID_AI_MAX_TOKENS` | `int` | `500` | Default max tokens for completions |
| `MAID_AI_TEMPERATURE` | `float` | `0.7` | Default temperature for completions |
| `MAID_AI_REQUEST_TIMEOUT` | `float` | `30.0` | API request timeout in seconds |
### Provider Selection

Providers are automatically registered based on available API keys:

- If `MAID_AI_ANTHROPIC_API_KEY` is set, the Anthropic provider is registered
- If `MAID_AI_OPENAI_API_KEY` is set, the OpenAI provider is registered
- If `MAID_AI_OLLAMA_HOST` is set (it defaults to localhost), the Ollama provider is registered
- If no providers are configured, a mock provider is used

The `MAID_AI_DEFAULT_PROVIDER` setting determines which provider is used by default.
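The registration rules above can be sketched as a small function. This is an illustrative sketch of the documented behavior, not MAID's actual internals; the function name and dict-based environment are assumptions for the example:

```python
def registered_providers(env: dict[str, str]) -> list[str]:
    """Sketch of MAID's provider-registration rules (illustrative only)."""
    providers = []
    if env.get("MAID_AI_ANTHROPIC_API_KEY"):
        providers.append("anthropic")
    if env.get("MAID_AI_OPENAI_API_KEY"):
        providers.append("openai")
    # Ollama is effectively always registered, since the host defaults to localhost.
    if env.get("MAID_AI_OLLAMA_HOST", "http://localhost:11434"):
        providers.append("ollama")
    # Fall back to the mock provider when nothing is configured.
    return providers or ["mock"]
```

Note that because `MAID_AI_OLLAMA_HOST` has a non-empty default, the mock provider is only used when the Ollama host is explicitly cleared and no API keys are set.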
### Available Models

**Anthropic Claude:**

- `claude-sonnet-4-20250514` - latest Sonnet, best balance (default)
- `claude-opus-4-20250514` - Opus, highest quality
- `claude-3-5-haiku-20241022` - Haiku, fastest

**OpenAI GPT:**

- `gpt-4o` - GPT-4o, best quality (default)
- `gpt-4o-mini` - smaller, faster
- `gpt-4-turbo` - GPT-4 Turbo
- `gpt-3.5-turbo` - fastest, cheapest

**Ollama (local):**

- `llama3.2` - good balance (default)
- `llama3.1` - better quality
- `mistral` - fast, good quality
- `phi3` - fastest, lightweight
## AI Dialogue Settings
These settings control the NPC dialogue system behavior.
### Core Settings

| Variable | Type | Default | Description |
|---|---|---|---|
| `MAID_AI_DIALOGUE_ENABLED` | `bool` | `true` | Master switch for the AI dialogue system |
| `MAID_AI_DIALOGUE_DEFAULT_PROVIDER` | `str` | `"anthropic"` | Default provider for dialogue |
| `MAID_AI_DIALOGUE_DEFAULT_MODEL` | `str` | `None` | Default model (uses the provider default if not set) |
| `MAID_AI_DIALOGUE_DEFAULT_MAX_TOKENS` | `int` | `150` | Default max response tokens (50-1000) |
| `MAID_AI_DIALOGUE_DEFAULT_TEMPERATURE` | `float` | `0.7` | Default temperature (0.0-2.0) |
### Rate Limiting

Rate limiting protects against abuse and controls costs.

| Variable | Type | Default | Description |
|---|---|---|---|
| `MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM` | `int` | `60` | Max requests/minute globally |
| `MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM` | `int` | `10` | Max requests/minute per player |
| `MAID_AI_DIALOGUE_PER_NPC_COOLDOWN_SECONDS` | `float` | `2.0` | Default cooldown between NPC responses |
**Rate limit behavior:**

- When the global limit is hit, all players receive wait messages
- When a player's limit is hit, that player receives a wait message
- When an NPC's cooldown is active, the NPC responds with "Let me think..."
### Token Budgets

Token budgets help control API costs.

| Variable | Type | Default | Description |
|---|---|---|---|
| `MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET` | `int` | `None` | Daily token limit for the server (`None` = unlimited) |
| `MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET` | `int` | `5000` | Daily token limit per player |
When a budget is exhausted:

- Server budget: all AI dialogue uses fallback responses
- Player budget: that player's AI requests use fallback responses
Budgets reset at midnight (server time).
### Context Settings

Context injection adds world state to NPC prompts.

| Variable | Type | Default | Description |
|---|---|---|---|
| `MAID_AI_DIALOGUE_INCLUDE_WORLD_CONTEXT` | `bool` | `true` | Include time of day and weather |
| `MAID_AI_DIALOGUE_INCLUDE_LOCATION_CONTEXT` | `bool` | `true` | Include room name, description, and exits |
| `MAID_AI_DIALOGUE_INCLUDE_PLAYER_CONTEXT` | `bool` | `true` | Include player name, level, and race/class |
Disabling context reduces token usage but makes NPCs less aware of their surroundings.
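To make the trade-off concrete, a hypothetical context builder might assemble the prompt prefix like this. The function, flag names, and world fields are illustrative assumptions, not MAID's internals:

```python
def build_context(flags: dict, world: dict) -> str:
    """Assemble an NPC prompt prefix from the enabled context flags (sketch)."""
    parts = []
    # Each flag mirrors one MAID_AI_DIALOGUE_INCLUDE_* setting; a disabled
    # flag drops its lines from every prompt, saving tokens per request.
    if flags.get("include_world_context", True):
        parts.append(f"It is {world['time_of_day']} and the weather is {world['weather']}.")
    if flags.get("include_location_context", True):
        parts.append(f"You are in {world['room_name']}. Exits: {', '.join(world['exits'])}.")
    if flags.get("include_player_context", True):
        parts.append(f"You are speaking with {world['player_name']}, "
                     f"a level {world['player_level']} {world['player_class']}.")
    return "\n".join(parts)
```

Every enabled line is sent with each request, so the token cost scales with conversation volume.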
### Conversation Settings

| Variable | Type | Default | Description |
|---|---|---|---|
| `MAID_AI_DIALOGUE_MAX_CONVERSATION_HISTORY` | `int` | `10` | Max messages to retain per conversation |
| `MAID_AI_DIALOGUE_CONVERSATION_TIMEOUT_MINUTES` | `int` | `30` | Minutes before idle conversations are cleaned up |
**Conversation lifecycle:**

1. The first message creates a new conversation
2. History is maintained up to `MAX_CONVERSATION_HISTORY` messages
3. After `CONVERSATION_TIMEOUT_MINUTES` of inactivity, the conversation is removed
4. A player can end a conversation explicitly with the `endconversation` command
### Safety Settings

| Variable | Type | Default | Description |
|---|---|---|---|
| `MAID_AI_DIALOGUE_CONTENT_FILTERING` | `bool` | `true` | Enable content safety filtering |
| `MAID_AI_DIALOGUE_LOG_CONVERSATIONS` | `bool` | `false` | Log conversation content |
**Content filtering:** When enabled, the system prompt includes instructions for NPCs to:

- Stay in character at all times
- Never acknowledge being an AI
- Refuse harmful, illegal, or inappropriate requests
- Not suggest specific game commands (unless configured)
**Important: content filtering and streaming interaction**

Content filtering requires the complete response to be available before safety evaluation, which means responses must be buffered rather than streamed. Since content filtering is enabled by default (`MAID_AI_DIALOGUE_CONTENT_FILTERING=true`), streaming is effectively disabled by default even though the streaming setting itself defaults to `true`.

To enable true streaming (responses appear incrementally as the AI generates them):

1. Disable content filtering: `MAID_AI_DIALOGUE_CONTENT_FILTERING=false`
2. Ensure streaming is enabled: `MAID_AI_DIALOGUE_ENABLE_STREAMING=true` (the default)

**Trade-off:** Disabling content filtering removes the safety layer that prevents inappropriate AI responses from reaching players. Only disable filtering in trusted environments or when using AI providers with their own content moderation (e.g., Claude and GPT-4 have built-in safety features).
**Conversation logging:** Enable only for debugging. Logs may contain player messages and should be handled according to your privacy policy.
## Configuration Examples

### Minimal Production Setup
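AI dialogue is enabled by default, so a minimal setup only needs a provider API key. The sketch below assumes all other settings fall back to the defaults documented above (placeholder key shown):

```shell
# .env file
# Everything else uses the documented defaults.
MAID_AI_ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
```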
### Full Production Setup

```shell
# .env file

# Provider configuration
MAID_AI_DEFAULT_PROVIDER=anthropic
MAID_AI_ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
MAID_AI_ANTHROPIC_MODEL=claude-sonnet-4-20250514
MAID_AI_REQUEST_TIMEOUT=30.0

# AI Dialogue - Core
MAID_AI_DIALOGUE_ENABLED=true
MAID_AI_DIALOGUE_DEFAULT_PROVIDER=anthropic
MAID_AI_DIALOGUE_DEFAULT_MAX_TOKENS=150
MAID_AI_DIALOGUE_DEFAULT_TEMPERATURE=0.7

# AI Dialogue - Rate Limiting
MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM=120
MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM=15
MAID_AI_DIALOGUE_PER_NPC_COOLDOWN_SECONDS=1.5

# AI Dialogue - Token Budgets
MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET=100000
MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET=3000

# AI Dialogue - Context
MAID_AI_DIALOGUE_INCLUDE_WORLD_CONTEXT=true
MAID_AI_DIALOGUE_INCLUDE_LOCATION_CONTEXT=true
MAID_AI_DIALOGUE_INCLUDE_PLAYER_CONTEXT=true

# AI Dialogue - Conversations
MAID_AI_DIALOGUE_MAX_CONVERSATION_HISTORY=8
MAID_AI_DIALOGUE_CONVERSATION_TIMEOUT_MINUTES=20

# AI Dialogue - Safety
MAID_AI_DIALOGUE_CONTENT_FILTERING=true
MAID_AI_DIALOGUE_LOG_CONVERSATIONS=false
```
### Development Setup with Ollama

```shell
# .env file

# Use local Ollama for development
MAID_AI_DEFAULT_PROVIDER=ollama
MAID_AI_OLLAMA_HOST=http://localhost:11434
MAID_AI_OLLAMA_MODEL=llama3.2

# Relaxed rate limits for testing
MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM=1000
MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM=100
MAID_AI_DIALOGUE_PER_NPC_COOLDOWN_SECONDS=0.5

# No token budgets in dev (empty value = unset)
MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET=
MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET=

# Enable logging for debugging
MAID_AI_DIALOGUE_LOG_CONVERSATIONS=true
```
### Multi-Provider Setup

```shell
# .env file

# Configure multiple providers
MAID_AI_ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
MAID_AI_OPENAI_API_KEY=sk-your-openai-key-here
MAID_AI_OLLAMA_HOST=http://localhost:11434

# Set default to Anthropic
MAID_AI_DEFAULT_PROVIDER=anthropic
MAID_AI_DIALOGUE_DEFAULT_PROVIDER=anthropic
```

With multiple providers configured, individual NPCs can override the default:

```python
# Important NPC uses Claude
important_npc = DialogueComponent(
    provider_name="anthropic",
    model_name="claude-opus-4-20250514",
    # ...
)

# Background NPC uses local Ollama
background_npc = DialogueComponent(
    provider_name="ollama",
    model_name="phi3",
    # ...
)
```
### Cost-Optimized Setup

```shell
# .env file

# Use a fast, inexpensive model
MAID_AI_ANTHROPIC_MODEL=claude-3-5-haiku-20241022

# Strict rate limits
MAID_AI_DIALOGUE_GLOBAL_RATE_LIMIT_RPM=30
MAID_AI_DIALOGUE_PER_PLAYER_RATE_LIMIT_RPM=5

# Tight token budgets
MAID_AI_DIALOGUE_DAILY_TOKEN_BUDGET=50000
MAID_AI_DIALOGUE_PER_PLAYER_DAILY_BUDGET=1000

# Shorter responses
MAID_AI_DIALOGUE_DEFAULT_MAX_TOKENS=100

# Disable context to save tokens
MAID_AI_DIALOGUE_INCLUDE_WORLD_CONTEXT=false
MAID_AI_DIALOGUE_INCLUDE_PLAYER_CONTEXT=false
```
## Programmatic Configuration

Settings can also be configured in code:

```python
from maid_engine.config.settings import Settings, AISettings, AIDialogueSettings

settings = Settings(
    ai=AISettings(
        default_provider="anthropic",
        anthropic_api_key="sk-ant-...",
        anthropic_model="claude-sonnet-4-20250514",
    ),
    ai_dialogue=AIDialogueSettings(
        enabled=True,
        default_provider="anthropic",
        global_rate_limit_rpm=60,
        per_player_rate_limit_rpm=10,
        daily_token_budget=100000,
    ),
)

engine = GameEngine(settings)
```
## Validation

The settings framework validates configuration:

- `default_max_tokens` must be between 50 and 1000
- `default_temperature` must be between 0.0 and 2.0
- `conversation_timeout_minutes` must be positive

Invalid settings raise `ValueError` at startup.
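The rules above amount to simple range checks at startup. The following is an illustrative sketch of that behavior, not MAID's actual validator (the function name is an assumption):

```python
def validate_dialogue_settings(max_tokens: int, temperature: float,
                               timeout_minutes: int) -> None:
    """Apply the documented range checks, raising ValueError on failure (sketch)."""
    if not 50 <= max_tokens <= 1000:
        raise ValueError("default_max_tokens must be between 50 and 1000")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("default_temperature must be between 0.0 and 2.0")
    if timeout_minutes <= 0:
        raise ValueError("conversation_timeout_minutes must be positive")
```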
## Monitoring

### Checking Provider Status

```python
from maid_engine.ai import get_registry

registry = get_registry()

# List registered providers
print(registry.list_providers())  # ['anthropic', 'openai', 'ollama']

# Check which are available (call from within an async context)
available = await registry.get_available()
print(available)  # ['anthropic', 'ollama']

# Get the default provider
print(registry.default)  # 'anthropic'
```
### Rate Limit Status
Rate limit information is logged at DEBUG level and returned to players when limits are hit.
### Token Budget Status
Token budgets are tracked internally. When exhausted, the system logs a warning and uses fallback responses.
## Related Documentation
- NPC Dialogue Guide - Creating and configuring AI-powered NPCs
- AI Provider Testing - Testing with different providers