Exception Handling Policy

This document establishes the exception handling policy for the MAID codebase. Following these guidelines ensures consistent error handling, easier debugging, and better reliability.

General Principles

  1. Prefer specific exceptions over broad catches - Catching specific exceptions makes code more predictable and easier to debug.

  2. Always have a plan for caught exceptions - Never silently swallow exceptions. At minimum, log them.

  3. Use custom exceptions for domain-specific errors - Create custom exception classes for errors that represent domain concepts.

  4. Preserve exception context - Use exception chaining (raise NewException from original) when re-raising.

  5. Document exceptions in docstrings - Use Raises: sections to document what exceptions a function may raise.
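As a minimal sketch of how principles 3 to 5 can combine in a single function: `parse_timeout` is a hypothetical helper, and `ConfigurationError` is defined locally as a stand-in, so neither should be read as MAID's actual API.

```python
class ConfigurationError(Exception):
    """Raised when configuration data is missing or malformed (stand-in)."""


def parse_timeout(raw: dict) -> float:
    """Return the timeout setting in seconds.

    Hypothetical helper used only to illustrate the principles above.

    Raises:
        ConfigurationError: If the key is missing or the value is not a number.
    """
    try:
        return float(raw["timeout"])
    except KeyError as e:
        # Domain-specific error, chained so the original KeyError stays visible
        raise ConfigurationError("Missing required config key: timeout") from e
    except (TypeError, ValueError) as e:
        # float() raises TypeError for None or lists, ValueError for bad strings
        raise ConfigurationError(f"Invalid timeout value: {raw['timeout']!r}") from e
```

Callers only need to handle `ConfigurationError`, and the docstring tells them so without reading the body.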

When to Use Specific Exceptions

Always prefer specific exception types in these contexts:

Data Validation and Parsing

# Good: Catch specific parsing errors
try:
    value = int(user_input)
except ValueError:
    await ctx.send("Please enter a valid number.")

# Good: Catch specific key errors
try:
    config = data["settings"]["timeout"]
except KeyError as e:
    raise ConfigurationError(f"Missing required config key: {e}") from e

Resource Operations

# Good: Catch specific file errors
try:
    content = path.read_text()
except FileNotFoundError:
    logger.warning(f"Config file not found: {path}")
    return default_config
except PermissionError as e:
    raise ConfigurationError(f"Cannot read config file: {path}") from e

Network Operations

# Good: Catch specific network errors
try:
    async with session.get(url) as response:
        return await response.json()
except aiohttp.ClientConnectionError:
    logger.warning(f"Connection failed to {url}")
    return None
except asyncio.TimeoutError:
    logger.warning(f"Request timed out: {url}")
    return None

Type Checking and Imports

# Good: Catch import errors for optional dependencies
try:
    import anthropic
    ANTHROPIC_AVAILABLE = True
except ImportError:
    ANTHROPIC_AVAILABLE = False
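An availability flag only helps if call sites check it. The sketch below shows one possible shape for that check. It uses a made-up module name, `maid_fancy_extension`, in place of a real optional dependency, and `fancy.render` is a hypothetical API; both are assumptions for illustration.

```python
import importlib

try:
    # "maid_fancy_extension" is a made-up name, assumed not to be installed
    fancy = importlib.import_module("maid_fancy_extension")
    FANCY_AVAILABLE = True
except ImportError:
    # ModuleNotFoundError is a subclass of ImportError, so it is covered here
    fancy = None
    FANCY_AVAILABLE = False


def describe_room(name: str) -> str:
    """Use the optional module when present, otherwise fall back to plain output."""
    if FANCY_AVAILABLE:
        return fancy.render(name)  # hypothetical API of the optional module
    return f"Room: {name}"
```

Checking the flag at the call site keeps the fallback path explicit and avoids an `AttributeError` on `None` later.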

When Broad Exception Handling is Acceptable

Broad except Exception handling is acceptable in specific, well-justified scenarios:

1. Top-Level Error Boundaries

At the outermost layer of request/connection handlers where unhandled exceptions would crash the server:

async def handle_connection(session: Session) -> None:
    """Top-level connection handler - must not let exceptions escape."""
    try:
        await process_session(session)
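    except Exception:
        # Log full traceback for debugging
        logger.exception("Unhandled error in connection handler")
        try:
            await session.send_line("An error occurred. Please try again.")
            await session.close()
        except Exception:
            # The connection may already be broken; nothing more to do
            logger.debug("Could not notify or close session", exc_info=True)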
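One property of `except Exception` matters at an async boundary: since Python 3.8, `asyncio.CancelledError` derives from `BaseException`, so a broad handler absorbs ordinary failures but still lets task cancellation propagate. The sketch below demonstrates this with stand-in coroutines (`guarded`, `fails`, `hangs` and `demo` are illustrative names, not part of MAID).

```python
import asyncio


async def guarded(work) -> str:
    """Error boundary that absorbs ordinary failures but not cancellation."""
    try:
        await work()
        return "ok"
    except Exception:
        return "handled"


async def fails() -> None:
    raise RuntimeError("boom")


async def hangs() -> None:
    await asyncio.sleep(3600)


async def demo() -> tuple:
    # An ordinary exception is absorbed by the boundary
    first = await guarded(fails)

    # Cancelling the task raises CancelledError inside guarded(); because it
    # derives from BaseException, `except Exception` lets it propagate
    task = asyncio.create_task(guarded(hangs))
    await asyncio.sleep(0)  # yield once so the task starts and suspends
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return first, task.cancelled()
```

Running `asyncio.run(demo())` shows the first call handled and the second task cancelled, which is why a bare `except:` or `except BaseException:` should not be used at these boundaries.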

2. Tick Loop Protection

The game tick loop must continue running even if individual systems fail:

async def tick(self, delta: float) -> None:
    """Process game tick - must be resilient to system failures."""
    for system in self._systems:
        try:
            await system.update(delta)
        except Exception:
            # Log but continue - one broken system shouldn't stop the game
            logger.exception(f"Error in system {system.__class__.__name__}")
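The isolation behavior can be checked with stand-in systems. In this sketch `CountingSystem` and `BrokenSystem` are hypothetical test doubles, and `tick` is written as a free function that returns a failure count, which is an assumption made so the result is easy to inspect.

```python
import asyncio
import logging

logger = logging.getLogger("tick_demo")


class CountingSystem:
    """Stand-in system that records how many times it was updated."""

    def __init__(self) -> None:
        self.updates = 0

    async def update(self, delta: float) -> None:
        self.updates += 1


class BrokenSystem:
    """Stand-in system that always fails."""

    async def update(self, delta: float) -> None:
        raise RuntimeError("simulated failure")


async def tick(systems: list, delta: float) -> int:
    """Run every system once and return how many of them failed."""
    failures = 0
    for system in systems:
        try:
            await system.update(delta)
        except Exception:
            # The try block sits inside the loop, so later systems still run
            failures += 1
            logger.exception("Error in system %s", type(system).__name__)
    return failures
```

With `[CountingSystem(), BrokenSystem(), CountingSystem()]`, both counting systems are updated once and one failure is reported. Placing the `try` inside the loop rather than around it is what makes this work.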

3. Cleanup and Shutdown Operations

During shutdown, we want to clean up as much as possible even if some operations fail:

async def shutdown(self) -> None:
    """Shutdown all systems - attempt all cleanup even if some fail."""
    for system in self._systems:
        try:
            await system.shutdown()
        except Exception:
            logger.warning(f"Error during {system.__class__.__name__} shutdown", exc_info=True)

4. Plugin and Content Pack Loading

Non-critical features should not prevent server startup:

# Non-critical feature - game can run without recipes
try:
    self._recipe_manager = RecipeManager(data_dir=recipe_path)
    count = await self._recipe_manager.load()
    logger.debug(f"Loaded {count} crafting recipes")
except ImportError:
    logger.debug("RecipeManager not available, crafting disabled")
except Exception:
    # Log full traceback for unexpected errors, but don't crash
    logger.exception("Failed to load crafting recipes")

5. Event Handler Isolation

Event handlers should be isolated to prevent one bad handler from breaking others:

async def emit(self, event: Event) -> None:
    """Emit event to all handlers - isolate handler failures."""
    for handler in self._handlers.get(type(event), []):
        try:
            await handler(event)
        except Exception:
            # getattr: callables such as functools.partial objects have no __name__
            logger.exception(f"Error in event handler {getattr(handler, '__name__', repr(handler))}")

6. Hot Reload and Recovery Operations

Recovery operations should report failure to the caller instead of raising, so the caller can fall back to another approach:

async def rollback(self) -> bool:
    """Attempt rollback - return False instead of raising if it fails."""
    try:
        await self._restore_modules()
        return True
    except Exception:
        logger.exception("Rollback failed")
        return False
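One way to combine several recovery approaches is to try them in order and stop at the first that succeeds. The sketch below assumes each approach is an async callable taking no arguments. `recover`, `restore_modules` and `reload_from_disk` are hypothetical names used only for illustration.

```python
import asyncio
import logging
from typing import Optional

logger = logging.getLogger("recovery_demo")


async def recover(strategies: list) -> Optional[str]:
    """Try each (name, strategy) pair in order; return the first name that succeeds."""
    for name, strategy in strategies:
        try:
            await strategy()
        except Exception:
            # Log the failure with traceback, then move on to the next approach
            logger.exception("Recovery strategy %r failed", name)
            continue
        return name
    return None  # every approach failed


async def restore_modules() -> None:
    raise RuntimeError("snapshot missing")  # simulated failure


async def reload_from_disk() -> None:
    return None  # simulated success
```

Returning the name of the successful strategy (or `None`) lets the caller decide whether a degraded recovery is acceptable.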

Custom Exception Hierarchy

MAID uses custom exceptions organized by subsystem. When creating new exceptions:

  1. Inherit from a domain-specific base if one exists
  2. Include relevant context in the exception message
  3. Document the exception with a docstring
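The three rules above might look like this when applied to a grid exception. Storing the coordinate as an attribute and the `place_room` helper are assumptions for this sketch, not the actual MAID definitions.

```python
class GridError(Exception):
    """Base exception for grid-related errors."""


class CoordinateOccupiedError(GridError):
    """Raised when trying to place a room at an occupied coordinate."""

    def __init__(self, x: int, y: int, z: int = 0) -> None:
        # Keeping the coordinate as an attribute is an assumption of this sketch
        self.coordinate = (x, y, z)
        super().__init__(f"Coordinate {self.coordinate} is already occupied")


def place_room(occupied: set, x: int, y: int) -> None:
    """Hypothetical helper: reserve a coordinate or raise."""
    if (x, y, 0) in occupied:
        raise CoordinateOccupiedError(x, y)
    occupied.add((x, y, 0))
```

Because the error inherits from the subsystem base, callers can catch `GridError` broadly or `CoordinateOccupiedError` specifically, and handlers can read `e.coordinate` instead of parsing the message.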

Engine Exceptions

# Grid system
class GridError(Exception):
    """Base exception for grid-related errors."""

class CoordinateOccupiedError(GridError):
    """Raised when trying to place a room at an occupied coordinate."""

class RoomNotFoundError(GridError):
    """Raised when a room is not found in the grid."""

Plugin Exceptions

class PluginRegistryError(Exception):
    """Base error for plugin registry operations."""

class PluginNotFoundError(PluginRegistryError):
    """Plugin was not found in the registry."""

class PluginInstallError(PluginRegistryError):
    """Error during plugin installation."""

Reload Exceptions

class HotReloadError(Exception):
    """Base exception for hot reload errors."""

class DependencyViolationError(HotReloadError):
    """Raised when a hot reload operation would violate dependencies."""

class MigrationError(HotReloadError):
    """Raised when a component migration fails during hot reload."""

Exception Chaining

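Always use explicit exception chaining when re-raising so the traceback marks the original error as the direct cause:

# Good: Explicit chaining records the original error as __cause__
try:
    data = json.loads(content)
except json.JSONDecodeError as e:
    raise ConfigurationError(f"Invalid JSON in config file: {path}") from e

# Bad: No explicit cause
try:
    data = json.loads(content)
except json.JSONDecodeError:
    # Python still attaches the original as implicit __context__, but the traceback
    # reads "During handling of the above exception, another exception occurred",
    # which looks like a bug in the handler rather than a deliberate translation
    raise ConfigurationError(f"Invalid JSON in config file: {path}")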
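The difference between the two forms can be inspected on the raised exception itself: `raise ... from e` sets `__cause__`, while raising inside a handler without `from` sets only the implicit `__context__`. In this sketch `ConfigurationError` is a local stand-in and the `load_*` and `capture` helpers are hypothetical.

```python
import json


class ConfigurationError(Exception):
    """Stand-in for the project's configuration error type."""


def load_explicit(content: str) -> dict:
    try:
        return json.loads(content)
    except json.JSONDecodeError as e:
        raise ConfigurationError("Invalid JSON") from e  # explicit cause


def load_implicit(content: str) -> dict:
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        raise ConfigurationError("Invalid JSON")  # implicit context only


def capture(loader) -> ConfigurationError:
    """Call loader with malformed JSON and return the error it raises."""
    try:
        loader("{not json")
    except ConfigurationError as err:
        return err
    raise AssertionError("loader did not raise")


explicit = capture(load_explicit)
implicit = capture(load_implicit)
```

Here `explicit.__cause__` is the `JSONDecodeError`, whereas `implicit.__cause__` is `None` and the original is reachable only through `implicit.__context__`. Tools and handlers that walk `__cause__` will therefore miss the original error in the second form.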

Logging in Exception Handlers

Use logger.exception() for Unexpected Errors

logger.exception() logs at ERROR level and automatically includes the full traceback, so call it only from inside an except block:

except Exception:
    logger.exception("Unexpected error during processing")

Use logger.warning() or logger.error() for Expected Errors

When you handle a known error condition:

except FileNotFoundError:
    logger.warning(f"Optional config file not found: {path}")

Include Relevant Context

except ValueError as e:
    logger.error(f"Invalid value for {param_name}: {e}")
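To see what each call actually records, the sketch below attaches an in-memory handler to a demo logger and compares the two records. `ListHandler` and the logger name are illustrative, not part of MAID.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger("logging_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

try:
    int("not a number")
except ValueError as e:
    # Expected error: message with context, no traceback attached
    logger.error("Invalid value for %s: %s", "timeout", e)
    # Unexpected error: same level, but the traceback is attached
    logger.exception("Unexpected error during processing")

error_record, exception_record = handler.records
```

Both records are logged at ERROR level; only `exception_record.exc_info` carries the exception type, value and traceback, while `error_record.exc_info` is `None`.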

Examples

Command Handler with Specific Exceptions

async def cmd_give(ctx: CommandContext, args: ParsedArguments) -> bool:
    """Give an item to another character."""
    try:
        item = await resolve_item(ctx, args["item"])
        target = await resolve_character(ctx, args["target"])
    except TargetNotFoundError as e:
        await ctx.send(str(e))
        return True
    except TargetAmbiguousError as e:
        await ctx.send(f"Which one? {e.candidates}")
        return True

    # Transfer the item
    try:
        await transfer_item(item, ctx.character, target)
    except InventoryFullError:
        await ctx.send(f"{target.name}'s inventory is full.")
        return True
    except ItemBoundError:
        await ctx.send("That item cannot be traded.")
        return True

    await ctx.send(f"You give {item.name} to {target.name}.")
    return True

System with Graceful Degradation

class WeatherSystem(System):
    """Weather system with graceful degradation."""

    async def update(self, delta: float) -> None:
        # Critical: Update weather state
        self._advance_weather(delta)

        # Non-critical: Generate atmospheric messages
        try:
            await self._broadcast_weather_effects()
        except Exception:
            logger.exception("Error broadcasting weather effects")
            # Continue - weather still works, just without messages

API Endpoint with Error Boundary

@router.post("/entities")
async def create_entity(request: CreateEntityRequest) -> EntityResponse:
    """Create a new entity - API error boundary."""
    try:
        entity = await entity_service.create(
            type=request.type,
            name=request.name,
            components=request.components,
        )
        return EntityResponse.from_entity(entity)
    except ValidationError as e:
        raise HTTPException(status_code=400, detail=str(e)) from e
    except DuplicateEntityError as e:
        raise HTTPException(status_code=409, detail=str(e)) from e
    except Exception as e:
        logger.exception("Unexpected error creating entity")
        raise HTTPException(status_code=500, detail="Internal server error") from e

Summary

| Scenario | Recommended Approach |
|---|---|
| Data validation | Specific exceptions (ValueError, TypeError) |
| File operations | Specific exceptions (FileNotFoundError, PermissionError) |
| Network operations | Specific exceptions + timeout handling |
| Optional imports | except ImportError |
| Top-level handlers | except Exception with logging |
| Tick loop systems | except Exception with logging |
| Shutdown/cleanup | except Exception with logging |
| Non-critical features | except Exception with logging |
| Event handlers | except Exception with logging |

When in doubt, prefer specific exceptions and escalate to broad handling only at well-defined boundaries.