System Services Overview

ModelMesh Lite consists of cooperating runtime services that together implement capability-based routing, model lifecycle management, provider abstraction, and observability. This document covers the service dependency structure, initialization sequence, request flow, and service groupings. For conceptual foundations, see SystemConcept.md; for YAML configuration, see SystemConfiguration.md.

Service Dependency Diagram

ModelMesh (facade)
├── Router
│   ├── RoutingPipeline
│   │   ├── CapabilityResolver → CapabilityTree
│   │   ├── DeliveryFilter
│   │   ├── StateFilter
│   │   └── SelectionStrategy
│   ├── RetryPolicy
│   └── CapabilityPool[]
│       ├── Model[] → ModelState
│       ├── Provider[] → ProviderState
│       └── RotationPolicy
│           ├── DeactivationEvaluator
│           ├── RecoveryEvaluator
│           └── SelectionStrategy
├── ModelRegistry
├── ConnectorRegistry
├── OpenAIClient → Router
├── ProxyServer → Router
├── EventEmitter → ObservabilityConnector[]
├── RequestLogger → ObservabilityConnector[]
├── StatisticsCollector → ObservabilityConnector[]
├── SecretResolver → SecretStoreConnector
└── StateManager → StorageConnector

Initialization Sequence

The following steps execute in order when ModelMesh.initialize(config) is called:

1. Parse configuration – Load YAML or programmatic configuration into the internal MeshConfig structure.
2. Register connectors – Instantiate the ConnectorRegistry and load all built-in and custom connector packages.
3. Resolve secrets – Initialize the SecretResolver with the configured secret store connector. Resolve all ${secrets:name} references in the configuration.
4. Load persisted state – Initialize the StateManager with the configured storage connector. Call load() to restore ModelState, ProviderState, and pool memberships from the previous session.
5. Build capability tree – Construct the CapabilityTree from the default hierarchy plus any custom extensions declared in configuration.
6. Register models and providers – Populate the ModelRegistry with model definitions from configuration. Instantiate Provider wrappers for each configured provider.
7. Build capability pools – Create CapabilityPool instances for each configured pool, assign models based on capability node membership, and attach RotationPolicy components.
8. Wire the routing pipeline – Assemble the RoutingPipeline with default stages (CapabilityResolver, pool selection, DeliveryFilter, StateFilter, SelectionStrategy, RetryPolicy).
9. Initialize the router – Create the Router with the assembled pipeline and pool set.
10. Start observability services – Initialize EventEmitter, RequestLogger, and StatisticsCollector with their configured observability connectors.
11. Start background services – Launch discovery sync, health monitor probes, periodic state sync, and statistics flush timers.
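
The "Resolve secrets" step above can be pictured as a recursive substitution over the parsed configuration. The following is a minimal sketch, not the library's implementation: the function name resolve_secrets, the config keys (providers, api_key, timeout), the in-memory store, and the set of characters allowed in a secret name are all assumptions made for illustration. Only the ${secrets:name} reference syntax comes from this document.

```python
import re
from typing import Any, Callable

# Matches references of the form ${secrets:name}. The characters allowed in
# a name are an assumption for this sketch.
SECRET_REF = re.compile(r"\$\{secrets:([A-Za-z0-9_.\-]+)\}")


def resolve_secrets(node: Any, lookup: Callable[[str], str]) -> Any:
    """Return a copy of `node` with every ${secrets:name} reference replaced."""
    if isinstance(node, dict):
        return {key: resolve_secrets(value, lookup) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_secrets(item, lookup) for item in node]
    if isinstance(node, str):
        return SECRET_REF.sub(lambda match: lookup(match.group(1)), node)
    return node  # numbers, booleans, and None pass through unchanged


# Hypothetical in-memory store standing in for a SecretStoreConnector.
store = {"openai_key": "sk-example"}

# Hypothetical configuration fragment, already parsed from YAML.
config = {
    "providers": [
        {"name": "openai", "api_key": "${secrets:openai_key}", "timeout": 30},
    ],
}

resolved = resolve_secrets(config, store.__getitem__)
```

Returning a new structure rather than mutating the input keeps the unresolved configuration available for logging without exposing secret values. In this sketch an unknown secret name raises KeyError from the lookup; whether the real SecretResolver fails fast in the same way is not stated here.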

Request Flow

A typical synchronous completion request follows this path:

1. The application calls OpenAIClient.chat.completions.create(model="text-generation", messages=[...]) or sends an HTTP request to ProxyServer at POST /v1/chat/completions.
2. The virtual model name "text-generation" is passed to Router.complete() as the capability identifier.
3. The Router invokes RoutingPipeline.execute(), which runs each stage in sequence:
   - CapabilityResolver maps "text-generation" to matching CapabilityPool instances using the CapabilityTree.
   - Pool selection chooses the target pool (single match or priority-based).
   - DeliveryFilter excludes models that do not support the requested delivery mode (sync, streaming, or batch).
   - StateFilter excludes standby models and models from deactivated providers.
   - SelectionStrategy scores remaining candidates and selects the best model.
4. The Router sends the request to the selected Provider.execute(), which delegates to the underlying provider connector.
5. On success, the response flows back through the Router. RequestLogger records the request, StatisticsCollector buffers metrics, and ModelState is updated.
6. On failure, RetryPolicy determines whether to retry the same model (with backoff) or rotate to the next candidate. If deactivation thresholds are reached, the RotationPolicy moves the model to standby and EventEmitter publishes a model_deactivated event.
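
The filter, select, and retry-or-rotate behavior described above can be sketched in a few lines. This is an illustrative model only: the Candidate class, its fields, the route function, and the parameters max_retries and deactivate_after are hypothetical and do not reflect the actual ModelMesh Lite classes or signatures. Capability resolution, pool selection, backoff delays, and event publishing are left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical stand-in for a Model plus the parts of ModelState and
    # ProviderState that the filter stages read.
    name: str
    delivery_modes: frozenset
    score: float = 0.0
    standby: bool = False
    provider_active: bool = True
    failures: int = 0


def delivery_filter(candidates, mode):
    # Keep only models that support the requested delivery mode.
    return [c for c in candidates if mode in c.delivery_modes]


def state_filter(candidates):
    # Drop standby models and models whose provider is deactivated.
    return [c for c in candidates if not c.standby and c.provider_active]


def rank(candidates):
    # Simplest possible selection strategy: highest score first.
    return sorted(candidates, key=lambda c: c.score, reverse=True)


def route(pool, mode, send, max_retries=1, deactivate_after=3):
    """Try candidates in ranked order, retrying each before rotating."""
    for candidate in rank(state_filter(delivery_filter(pool, mode))):
        for _attempt in range(1 + max_retries):
            try:
                return send(candidate)
            except RuntimeError:
                candidate.failures += 1
                if candidate.failures >= deactivate_after:
                    # A model_deactivated event would be published here.
                    candidate.standby = True
                    break
    return None  # every candidate failed or was filtered out


# Hypothetical pool; model names and scores are made up.
pool = [
    Candidate("model-a", frozenset({"sync", "streaming"}), score=0.9),
    Candidate("model-b", frozenset({"sync"}), score=0.7),
    Candidate("model-c", frozenset({"batch"}), score=1.0),
    Candidate("model-d", frozenset({"sync"}), score=0.95, standby=True),
]


def flaky_send(candidate):
    # Simulated provider call: model-a always fails, the others succeed.
    if candidate.name == "model-a":
        raise RuntimeError("provider error")
    return f"response from {candidate.name}"


result = route(pool, "sync", flaky_send, max_retries=1, deactivate_after=2)
```

In this run, model-c is removed by the delivery filter and model-d by the state filter. model-a is tried twice, reaches the deactivation threshold, and is moved to standby, after which the request rotates to model-b and succeeds.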

Service Groupings

Group	Services	Purpose
Facade	ModelMesh	Library entry point; initializes and wires all subsystems
Routing	Router, RoutingPipeline, CapabilityResolver, DeliveryFilter, StateFilter, RetryPolicy	Request orchestration, pipeline stages, and retry logic
Pools & Models	CapabilityPool, Model, ProviderService, ModelState, ProviderState	Model grouping, runtime state, and provider abstraction
Rotation	RotationPolicyService, DeactivationEvaluator, RecoveryEvaluator, SelectionStrategy	Deactivation, recovery, and selection governance
Registries	ModelRegistry, ConnectorRegistry, CapabilityTree	Model catalogue, connector catalogue, capability hierarchy
External Interfaces	OpenAIClient, ProxyServer	Application-facing API surfaces
Observability	EventEmitter, RequestLogger, StatisticsCollector	Events, logging, and metrics
Infrastructure	SecretResolver, StateManager	Secret resolution and state persistence

Cross-References

SystemConcept.md – Conceptual architecture, design principles, and capability model
SystemConfiguration.md – Full YAML configuration reference
SystemServices.md – Consolidated service reference (source for individual docs)
ConnectorInterfaces.md – Connector API contracts