A research investigation into federated MCP gateways. One hub process, every MCP server, every AI client. Tool schemas discovered on demand rather than preloaded. Cache, cost telemetry, and retrieval learning shared across concurrent sessions.
Each AI coding session that integrates with MCP inherits a per-session architecture that was designed for one-to-one client-to-server use. At cross-session scale the design produces three observable regressions.
Each session spawns its own child MCP server processes. With N sessions and M MCP servers, the process count grows as O(N·M). At five sessions and thirty-six MCPs the resident process set reaches 180 processes and roughly 9 GB of memory, before any model call has been made.
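The O(N·M) growth above is simple arithmetic; a minimal sketch, where the ~50 MB average resident size per process is an assumption implied by the reported 180-process / ~9 GB measurement rather than a figure stated in the text:

```python
# Back-of-the-envelope model of per-session MCP process growth.
MB_PER_PROCESS = 50  # ASSUMPTION: implied by 180 processes / ~9 GB


def process_footprint(sessions: int, mcps: int) -> tuple[int, float]:
    """Return (process count, resident memory in GB) for N sessions x M MCPs."""
    procs = sessions * mcps              # every session spawns its own children: O(N*M)
    mem_gb = procs * MB_PER_PROCESS / 1024
    return procs, mem_gb


procs, mem = process_footprint(5, 36)    # the reference workload
# 5 sessions x 36 MCPs -> 180 processes, ~8.8 GB before any model call
```

A hub that owns the server lifecycle collapses the first factor: the process count becomes O(M) regardless of how many sessions connect.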
Every MCP tool schema is preloaded into each session’s context window. 400+ tool definitions consume approximately 150K tokens; on a 200K-context model, 75% of the window is spent on tool schemas rather than task content.
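The context-budget claim above follows directly from the stated figures; a worked version, where the ~375-token per-tool average is implied by the 150K / 400 ratio rather than measured:

```python
# Context-window budget with preloaded tool schemas (figures from the text).
SCHEMA_TOKENS = 150_000   # ~400 tool definitions preloaded up front
CONTEXT_TOKENS = 200_000  # total window on a 200K-context model
TOOL_COUNT = 400

schema_share = SCHEMA_TOKENS / CONTEXT_TOKENS   # fraction spent on schemas: 0.75
task_tokens = CONTEXT_TOKENS - SCHEMA_TOKENS    # what remains for the task: 50_000
per_tool = SCHEMA_TOKENS / TOOL_COUNT           # implied average: ~375 tokens/tool
```

With on-demand retrieval, only the handful of schemas a task actually touches ever enters the window; the other ~395 definitions cost nothing.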
Independent sessions cannot share cache, cost ledgers, or retrieval signals. Every session re-computes the same lookups, re-pays the same API costs, and re-discovers the same failure modes. There is no shared substrate for learning.
A hub process owns the MCP server lifecycle. AI clients connect to the hub over HTTP and interact exclusively through three meta-tools. Tool schemas are retrieved on demand. Cache, cost telemetry, and optional retrieval learning are shared across every connected client.
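A minimal sketch of the meta-tool surface described above. The meta-tool names (`list_tools`, `get_tool_schema`, `call_tool`), the in-memory registry, and the example `fs` server are all illustrative assumptions, not the reference implementation's actual API; a real hub would own child MCP server processes and proxy calls to them, with the shared cache and cost telemetry hooked into the call path.

```python
from typing import Any, Callable


class Hub:
    """Sketch of a federated gateway: clients interact only through three
    meta-tools, and full schemas are fetched on demand, never preloaded."""

    def __init__(self) -> None:
        # server -> tool -> (schema, handler). A real hub would proxy to
        # child MCP server processes instead of calling local handlers.
        self._registry: dict[str, dict[str, tuple[dict, Callable[..., Any]]]] = {}

    def register(self, server: str, tool: str, schema: dict,
                 handler: Callable[..., Any]) -> None:
        self._registry.setdefault(server, {})[tool] = (schema, handler)

    # --- the three meta-tools (names are hypothetical) ---
    def list_tools(self) -> list[str]:
        """Meta-tool 1: cheap name-only listing; no schemas enter the context."""
        return [f"{s}/{t}" for s, tools in self._registry.items() for t in tools]

    def get_tool_schema(self, server: str, tool: str) -> dict:
        """Meta-tool 2: retrieve a single tool schema on demand."""
        return self._registry[server][tool][0]

    def call_tool(self, server: str, tool: str, **kwargs: Any) -> Any:
        """Meta-tool 3: invoke through the hub; the shared cache and
        cost-telemetry hooks would sit on this path."""
        return self._registry[server][tool][1](**kwargs)


hub = Hub()
hub.register("fs", "read_file",
             {"type": "object", "properties": {"path": {"type": "string"}}},
             lambda path: f"<contents of {path}>")
hub.list_tools()                              # ['fs/read_file']
hub.call_tool("fs", "read_file", path="/tmp/x")
```

Because every client goes through the same three entry points, a session pays context cost only for the schemas it actually requests, and every call is observable at one chokepoint.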
Measurements taken on a reference workload of five concurrent AI coding sessions with thirty-six MCP servers. Absolute values depend on the client, model, and specific MCP set in use; the relative shape is stable.
The reference implementation is distributed on PyPI and npm. Source under AGPL v3. The artifact runs locally with no cloud dependency and is intended to be forked, studied, and critiqued.