SuperLocalMemory
A Qualixar Research Initiative

SLM MCP Hub
Federated Model Context Protocol

A research investigation into federated MCP gateways. One hub process, every MCP server, every AI client. Tool schemas discovered on demand rather than preloaded. Cache, cost telemetry, and retrieval learning shared across concurrent sessions.

Three Compounding Problems

Each AI coding session that integrates with MCP inherits a per-session architecture that was designed for one-to-one client-to-server use. At cross-session scale the design produces three observable regressions.

01

Per-session process duplication

Each AI coding session spawns its own MCP server children. With N sessions × M MCPs the process count grows as O(N·M). At five sessions and thirty-six MCPs the resident process set reaches 180 processes and roughly 9 GB of memory — before any model call has been made.
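The arithmetic behind these figures is a quick back-of-envelope model; the ~50 MB per process is an average implied by the stated 180 processes ≈ 9 GB, not a measured constant:

```python
# Back-of-envelope model of MCP process growth.
# ~50 MB/process is an assumption implied by 180 procs ≈ 9 GB.
MB_PER_PROC = 50

def per_session(sessions: int, mcps: int) -> tuple[int, float]:
    """Each session spawns its own copy of every MCP server: O(N*M)."""
    procs = sessions * mcps
    return procs, procs * MB_PER_PROC / 1024  # (count, GB)

def hub(sessions: int, mcps: int) -> tuple[int, float]:
    """One hub owns each MCP server once, regardless of session count."""
    procs = mcps + 1  # M servers + 1 hub process
    return procs, procs * MB_PER_PROC / 1024

print(per_session(5, 36))  # (180, ~8.8 GB)
print(hub(5, 36))          # (37, ~1.8 GB)
```

The same model reproduces the 180 → 37 process and ~9 GB → ~1.9 GB figures reported below.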

02

Tool schema tax on the context window

Every MCP tool schema is preloaded into each session’s context window. 400+ tool definitions consume approximately 150K tokens; on a 200K-context model, 75% of the working set goes to tool schemas rather than task content.
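The context-budget arithmetic is straightforward; the per-schema token count below is inferred from the stated totals (150K tokens / 400 tools), not independently measured:

```python
# Rough context-budget math behind the "tool schema tax".
TOOLS = 400
TOKENS_PER_SCHEMA = 375        # assumption: 150K / 400 tools on average
CONTEXT_WINDOW = 200_000

schema_tokens = TOOLS * TOKENS_PER_SCHEMA   # 150_000
fraction = schema_tokens / CONTEXT_WINDOW   # 0.75
print(f"{schema_tokens} tokens ({fraction:.0%} of context)")
```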

03

Absence of cross-session learning signal

Independent sessions cannot share cache, cost ledgers, or retrieval signals. Every session re-computes the same lookups, re-pays the same API costs, and re-discovers the same failure modes. There is no shared substrate for learning.

Approach

A hub process owns the MCP server lifecycle. AI clients connect to the hub over HTTP and interact exclusively through three meta-tools. Tool schemas are retrieved on demand. Cache, cost telemetry, and optional retrieval learning are shared across every connected client.

hub__search_tools
Query the federated tool registry by name or description. Returns full input schemas on demand.
hub__call_tool
Invoke any tool on any connected MCP server. The hub handles routing, auth, and response normalization.
hub__list_servers
Enumerate connected servers and their declared tool counts. Used for discovery and health checks.
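A sketch of what the three meta-tool invocations might look like as MCP `tools/call` request bodies. Only the meta-tool names come from the project description; the argument shapes, server names, and wire details here are illustrative assumptions:

```python
# Sketch of the three meta-tool invocations as MCP tool-call payloads.
# Argument shapes and server/tool names are hypothetical; only the
# hub__* meta-tool names come from the project description.
import json

def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP-style JSON-RPC tools/call request body."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# 1. Find tools matching a query; full schemas come back on demand.
search = tool_call("hub__search_tools", {"query": "create github issue"})
# 2. Invoke a discovered tool; the hub routes it to the right server.
call = tool_call("hub__call_tool",
                 {"server": "github", "tool": "create_issue",
                  "args": {"title": "Bug report"}})
# 3. Enumerate connected servers for discovery and health checks.
servers = tool_call("hub__list_servers", {})

print(json.dumps(search, indent=2))
```

Because schemas arrive only in response to `hub__search_tools`, a session pays context cost for the handful of tools it actually uses instead of the full registry.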

Observed Characteristics

Measurements taken on a reference workload of five concurrent AI coding sessions with thirty-six MCP servers. Absolute values depend on the client, model, and specific MCP set in use; the relative shape is stable.

180 → 37
Processes (5 sessions, 36 MCPs)
~9 GB → ~1.9 GB
Aggregate RAM
150K
Tokens saved per session
3
Meta-tools replace 430+ tool schemas
AGPL v3
License
0
Cloud dependencies

Try the Research Artifact

The reference implementation is distributed on PyPI and npm. Source under AGPL v3. The artifact runs locally with no cloud dependency and is intended to be forked, studied, and critiqued.

# install
pip install slm-mcp-hub
# initialize config and import existing MCPs
slm-hub config init
slm-hub setup import ~/.claude.json
# start the hub
slm-hub start
Add the hub endpoint to your MCP client configuration; every participating session then shares the same tool surface, cache, and cost ledger through a single HTTP connection.
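As a rough illustration, a client-side entry for the hub might look like the following. The key names, port, and path are hypothetical; the real schema depends on your client (e.g. the shape of `~/.claude.json`):

```python
# Hypothetical MCP client configuration pointing at the hub.
# Key names, port, and path are illustrative assumptions, not the
# documented schema of any particular client.
import json

config = {
    "mcpServers": {
        "slm-hub": {
            "type": "http",
            "url": "http://127.0.0.1:8000/mcp",  # assumed hub endpoint
        }
    }
}
print(json.dumps(config, indent=2))
```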

Related Qualixar Initiatives

Part of the Qualixar research platform. Maintained by Varun Pratap Bhardwaj.