Technical Debt Tracking
This document tracks known technical debt in the MAID codebase. Technical debt represents areas where we've made pragmatic trade-offs that should be addressed in future refactoring efforts.
Overview
Technical debt is not inherently bad - it often represents reasonable decisions made under constraints. This document helps us:
- Track known debt so it doesn't get forgotten
- Prioritize which items to address first
- Plan migration paths for future refactoring
- Onboard new contributors by explaining why certain patterns exist
Active Technical Debt Items
TD-001: Module-Level Singletons in API Layer
Status: Partially migrated (Phase 1 complete - app.state-aware getters added)
Priority: Medium
Affected Files:
- packages/maid-engine/src/maid_engine/api/auth.py
- packages/maid-engine/src/maid_engine/api/admin/websocket.py
- packages/maid-engine/src/maid_engine/api/admin/dashboard.py
- packages/maid-engine/src/maid_engine/api/admin/entities.py
- packages/maid-engine/src/maid_engine/api/v1/events_ws.py
Migration Progress (2026-02-04):
Phase 1 of the migration has been completed. All affected modules now have app.state-aware getter functions that:
1. Check request.app.state first for the dependency
2. Fall back to the module-level global if not found in app.state
This provides backwards compatibility while enabling new code to use proper dependency injection. The server initialization code (net/web/server.py) now stores auth components in app.state.
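The two-step lookup described above can be sketched as follows. This is a minimal illustration, not the code in the affected modules: the body of get_ws_manager_from_app() and the app.state attribute name ws_manager are assumptions, and SimpleNamespace objects stand in for a FastAPI Request so the sketch runs without the framework.

```python
from types import SimpleNamespace
from typing import Any, Optional


class WebSocketManager:
    """Stand-in for the real manager class (assumption for this sketch)."""


# Legacy module-level singleton, kept for backwards compatibility.
_ws_manager: Optional[WebSocketManager] = None


def get_websocket_manager() -> WebSocketManager:
    """Legacy getter: lazily create the module-level instance."""
    global _ws_manager
    if _ws_manager is None:
        _ws_manager = WebSocketManager()
    return _ws_manager


def get_ws_manager_from_app(request: Any) -> WebSocketManager:
    """Prefer the instance stored in app.state; fall back to the global."""
    manager = getattr(request.app.state, "ws_manager", None)
    if manager is not None:
        return manager
    return get_websocket_manager()


# Stand-in requests: one whose app.state holds a manager, one whose state is empty.
injected = WebSocketManager()
with_state = SimpleNamespace(
    app=SimpleNamespace(state=SimpleNamespace(ws_manager=injected))
)
without_state = SimpleNamespace(app=SimpleNamespace(state=SimpleNamespace()))
```

Calling get_ws_manager_from_app(with_state) returns the injected instance, while get_ws_manager_from_app(without_state) returns the legacy global, which is what lets old and new call sites coexist during the migration.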
New app.state-aware functions added:
- auth.py: _get_key_store_from_app(), _get_rate_limiter_from_app(), _get_auth_rate_limiter_from_app()
- websocket.py: get_ws_manager_from_app(), get_event_broadcaster_from_app()
- dashboard.py: get_event_buffer_from_app(), get_metrics_collector_from_app(), get_websocket_handler_from_app()
- entities.py: get_component_registry_from_app()
- events_ws.py: get_event_broadcaster_from_app()
Remaining work (Phases 2-4):
- Phase 2: Update remaining server initialization to store all singletons in app.state
- Phase 3: Update route functions to use Depends() with the new functions (partially done)
- Phase 4: Update tests to use app.dependency_overrides instead of patching globals
Description:
Several API modules use module-level singletons (global variables with lazy initialization) to manage stateful objects like WebSocket managers, event broadcasters, metrics collectors, and component registries.
Current Pattern:
# Module-level global variable
_ws_manager: WebSocketManager | None = None

def get_websocket_manager() -> WebSocketManager:
    """Get or create the global WebSocket manager."""
    global _ws_manager
    if _ws_manager is None:
        _ws_manager = WebSocketManager()
    return _ws_manager

def reset_websocket_manager() -> None:
    """Reset the global WebSocket manager. Useful for testing."""
    global _ws_manager
    _ws_manager = None
Why This Exists:
The singleton pattern was adopted for simplicity during initial development. It provides:
- Simple access to shared state from anywhere in the API layer
- Lazy initialization (objects created only when needed)
- Easy integration with FastAPI's dependency injection via Depends(get_websocket_manager)
Problems:
- Testing Difficulties
  - Tests must manually reset state between runs using reset_*() functions
  - State from one test can leak into another if fixtures aren't configured correctly
  - Makes it harder to reason about test isolation
- Parallel Test Execution
  - Tests that modify these globals cannot safely run in parallel
  - This limits test suite performance on multi-core machines
- Implicit Dependencies
  - The init pattern (e.g., init_auth()) requires careful setup/teardown
  - Dependencies between modules are hidden rather than explicit
  - Makes code harder to understand and maintain
- Single Instance Limitation
  - Cannot easily create multiple instances for different scenarios
  - Limits flexibility in testing edge cases
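To make the state-leak problem concrete, here is a small self-contained sketch. The WebSocketManager stand-in and the two scenario functions are hypothetical; they mimic two tests that share the module-level singleton, with the second one observing what the first left behind.

```python
from typing import Optional


class WebSocketManager:
    """Stand-in manager that only tracks connection ids (assumption)."""

    def __init__(self) -> None:
        self.connections: list = []


_ws_manager: Optional[WebSocketManager] = None


def get_websocket_manager() -> WebSocketManager:
    global _ws_manager
    if _ws_manager is None:
        _ws_manager = WebSocketManager()
    return _ws_manager


def reset_websocket_manager() -> None:
    global _ws_manager
    _ws_manager = None


def scenario_connect() -> None:
    """Plays the role of a test that registers one client."""
    get_websocket_manager().connections.append("client-1")


def observed_connection_count() -> int:
    """Plays the role of a later test that expects an empty manager."""
    return len(get_websocket_manager().connections)


scenario_connect()
leaked = observed_connection_count()  # sees the client left by scenario_connect
reset_websocket_manager()
clean = observed_connection_count()   # fresh instance after the explicit reset
```

Here leaked is 1 and clean is 0: the second scenario only sees an empty manager if something remembered to call the reset in between, which is exactly the coupling the reset fixtures exist to manage.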
Current Workaround:
We provide reset functions and a unified reset_all_api_state() function for testing:
# In test fixtures
import pytest

@pytest.fixture(autouse=True)
def reset_api_state():
    yield
    reset_all_api_state()  # Resets all API singletons
Individual reset functions are also available:
- reset_auth_singletons() - Auth module (key store, rate limiter)
- reset_admin_api_state() - All admin API singletons
- reset_dashboard_state() - Dashboard metrics and event buffer
- reset_websocket_state() - WebSocket manager and broadcaster
- reset_component_registry() - Component registry
- reset_events_ws_state() - v1 events WebSocket broadcaster
- reset_audit_log_store() - Middleware audit log store
Recommended Migration Path:
Use FastAPI's built-in dependency injection more fully:
Phase 1: App State Storage
from contextlib import asynccontextmanager
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Initialize state during startup
    app.state.ws_manager = WebSocketManager()
    app.state.key_store = APIKeyStore(...)
    app.state.rate_limiter = RateLimiter(...)
    app.state.metrics_collector = MetricsCollector()
    yield
    # Cleanup during shutdown
    await app.state.ws_manager.shutdown()

# Register the lifespan handler when constructing the app
app = FastAPI(lifespan=lifespan)
Phase 2: Dependency Functions
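from fastapi import Request
from starlette.requests import HTTPConnection

def get_ws_manager(conn: HTTPConnection) -> WebSocketManager:
    """Dependency that retrieves WebSocketManager from app state."""
    # HTTPConnection is the shared base of Request and WebSocket, so this
    # dependency also resolves inside WebSocket routes; a Request parameter
    # is only populated for HTTP routes.
    return conn.app.state.ws_manager

def get_key_store(request: Request) -> APIKeyStore:
    """Dependency that retrieves APIKeyStore from app state."""
    return request.app.state.key_store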
Phase 3: Use in Routes
from typing import Annotated
from fastapi import APIRouter, Depends, WebSocket

router = APIRouter()

@router.websocket("/ws")
async def websocket_endpoint(
    websocket: WebSocket,
    manager: Annotated[WebSocketManager, Depends(get_ws_manager)],
):
    await manager.connect(websocket)
Phase 4: Testing with Overrides
from unittest.mock import Mock
from fastapi.testclient import TestClient

def test_websocket_connection():
    app = create_test_app()
    mock_manager = Mock(spec=WebSocketManager)
    # Override dependency for testing
    app.dependency_overrides[get_ws_manager] = lambda: mock_manager
    client = TestClient(app)
    # Test with mock_manager
Benefits of Migration:
- Better Test Isolation - Each test can create its own app with fresh state
- Parallel Tests - No shared global state means safe parallel execution
- Explicit Dependencies - Dependencies are visible in function signatures
- Easy Mocking - Use app.dependency_overrides instead of patching globals
- Type Safety - Better IDE support and type checking
Migration Considerations:
- Requires updating all route functions to use Depends()
- Need to ensure app state is available in all contexts (background tasks, etc.)
- Some edge cases around startup order may need careful handling
- Should be done incrementally, one module at a time
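For the background-task consideration, one option is to resolve the dependency where the app is in scope and pass it to the task as an argument, so the task never needs a request. The sketch below illustrates that idea with stand-ins only: MetricsCollector here is a hypothetical minimal class, flush_metrics is a hypothetical job, and a SimpleNamespace plays the role of the FastAPI app.

```python
import asyncio
from types import SimpleNamespace


class MetricsCollector:
    """Stand-in collector that counts recorded events (assumption)."""

    def __init__(self) -> None:
        self.count = 0

    def record(self) -> None:
        self.count += 1


async def flush_metrics(collector: MetricsCollector, ticks: int) -> None:
    """Background job: receives its dependency explicitly, no request needed."""
    for _ in range(ticks):
        collector.record()
        await asyncio.sleep(0)  # yield to the event loop between ticks


async def main() -> int:
    # Stand-in for the FastAPI app; only the `state` attribute matters here.
    app = SimpleNamespace(
        state=SimpleNamespace(metrics_collector=MetricsCollector())
    )
    # Resolve the dependency while the app is in scope, then hand it to the task.
    task = asyncio.create_task(
        flush_metrics(app.state.metrics_collector, ticks=3)
    )
    await task
    return app.state.metrics_collector.count


recorded = asyncio.run(main())
```

Because the task holds a direct reference to the collector, it keeps working outside any request context, and a test can pass in a fake collector without touching globals.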
Estimated Effort: Medium (2-3 days per module)
References:
- FastAPI Lifespan Events: https://fastapi.tiangolo.com/advanced/events/
- FastAPI Testing with Overrides: https://fastapi.tiangolo.com/advanced/testing-dependencies/
- Python Dependency Injection Patterns: https://python-dependency-injector.ets-labs.org/
TD-002: Low Severity Issues from Implementation Review
Status: Backlog
Priority: Low
Source: Multi-agent implementation review (2026-02-04)
The following issues were identified during a comprehensive code review but are low priority and can be addressed as time permits:
Documentation (Opportunistic)
These minor documentation issues should be addressed opportunistically when working on related code:
- Documentation mismatches - Some docs may be slightly out of date
- API doc coverage gaps - Some endpoints missing OpenAPI descriptions
- Env var naming mismatches - Minor inconsistencies in environment variable naming
Recommended Approach: Address when working on related code, during refactoring sprints, or when the issue causes user-visible problems.
Note: These are not blocking issues and the system functions correctly. They represent polish improvements.
Resolved Technical Debt
Items that have been addressed are moved here for historical reference.
TD-002-8: PO Parser Escape Sequences (Resolved 2026-02-04)
Comprehensive escape sequence handling was already implemented. Added tests verifying handling of \n, \t, \r, \\, \", \xNN, \uXXXX, \UXXXXXXXX, octal escapes, and C-style escapes (\a, \b, \f, \v).
TD-002-9: Locale Handling Edge Cases (Resolved 2026-02-04)
Added comprehensive tests for locale normalization edge cases including empty strings, whitespace, script codes (zh-Hans, zh-Hant), unknown locales, mixed case handling, and multiple separators.
TD-002-10: conversation_count Property Lock (Resolved 2026-02-04)
The property needs no lock: in asyncio, only one coroutine runs at a time on a given event loop (cooperative multitasking), a synchronous property getter contains no await point at which another coroutine could interleave, and Python's len() on a dict is atomic in CPython. Added documentation explaining this and tests verifying the behavior.
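The reasoning can be illustrated with a short sketch. ConversationStore is a hypothetical class, not the one in the codebase; it only shows that many coroutines can mutate a dict across await points while a lock-free synchronous property still reads a consistent count.

```python
import asyncio


class ConversationStore:
    """Hypothetical store used only to illustrate the lock-free property."""

    def __init__(self) -> None:
        self._conversations: dict = {}

    @property
    def conversation_count(self) -> int:
        # No await here, so no other coroutine can run in the middle of the read.
        return len(self._conversations)

    async def add(self, key: str) -> None:
        await asyncio.sleep(0)  # yield to the loop, as real I/O would
        self._conversations[key] = []


async def main() -> int:
    store = ConversationStore()
    # 100 coroutines interleave at their await points on a single event loop.
    await asyncio.gather(*(store.add(f"c{i}") for i in range(100)))
    return store.conversation_count


count = asyncio.run(main())
```

Note that this argument covers a single event loop only; if the same object were shared across OS threads, the usual threading considerations would apply again.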
TD-002-11: Token Estimation Edge Cases (Resolved 2026-02-04)
Added comprehensive edge case tests for token estimation: Unicode text (Russian, Chinese, Arabic), emoji handling, mixed scripts, punctuation-only text, whitespace-only, single characters, newlines/tabs, very long text, and messages missing content keys.
TD-002-12: Callable Conditions in Persistence (Resolved 2026-02-04)
Not an issue: all conditions use serializable string enums (SpawnCondition, ResetCondition, etc.) rather than callables. Pydantic's model_dump_json() would fail loudly if non-serializable data were present. The architecture correctly separates serializable data models from runtime behavior.
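The difference between string enums and callables can be shown with a standard-library analogue. This sketch deliberately uses json.dumps rather than Pydantic's model_dump_json(), and the SpawnCondition members below are hypothetical, since the real enum values are not listed in this document.

```python
import json
from enum import Enum


class SpawnCondition(str, Enum):
    # Member names are hypothetical; the real enum's values are not shown here.
    ALWAYS = "always"
    NIGHT_ONLY = "night_only"


# A string-backed enum member serializes as its plain string value.
encoded = json.dumps({"condition": SpawnCondition.NIGHT_ONLY})

# A callable has no JSON representation, so serialization fails loudly.
try:
    json.dumps({"condition": lambda: True})
    callable_rejected = False
except TypeError:
    callable_rejected = True
```

The same principle is what the resolution above relies on: data that is stored as string enums round-trips cleanly, while an accidental callable would surface as an error at serialization time instead of being silently dropped.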
Contributing
When adding new technical debt documentation:
- Assign an ID - Use the format TD-XXX with incrementing numbers
- Document why - Explain why the pattern exists, not just that it's bad
- Provide workarounds - Help current developers work with the code as-is
- Plan migration - Include a concrete migration path with code examples
- Estimate effort - Give a rough estimate to help with prioritization
When resolving technical debt:
- Update this document - Move the item to the "Resolved Technical Debt" section
- Remove TODO comments - Clean up the source code markers
- Update tests - Remove any workarounds that are no longer needed
- Document in PR - Reference the tech debt item in your PR description
Last updated: 2025
See also: Exception Handling Policy | Style Guide