The Model Context Protocol’s July 28, 2026 release candidate makes servers stateless at the protocol layer. Servers can no longer rely on the transport layer to carry state between tool calls. Every request will arrive with no built-in context, and your server will need to retrieve or reconstruct state on its own.

This transition started with the March 26, 2025 specification revision, which introduced the Streamable HTTP transport and began moving away from persistent connections. Each subsequent draft in the specification repository has moved further toward protocol-layer statelessness, with the July 28, 2026 release candidate removing Mcp-Session-Id and the initialize lifecycle entirely. Most MCP tutorials still assume stateful sessions. Here’s what actually changes, what breaks, and how to fix it without building and maintaining a persistence layer yourself.

What MCP Stateless Servers Mean for Conversation History

Before this change, an MCP server could hold session context in memory between tool calls. A user asks a question, your server stores the conversation, and the next tool call can reference what came before. The transport layer maintained session identity, so the server process could keep state in memory across the session lifecycle.

After the change, the protocol no longer manages session state for you, but removing the protocol-level session doesn’t mean your application has to be stateless. Each tool call arrives without built-in context from previous calls, and your server carries state explicitly. The maintainers recommend minting an explicit handle: accept a session identifier as a tool argument and use it to look up and persist context in a durable backend. Multi-turn agent conversations, user preferences, and tool call history all move to that external store.

In-memory session state was never durable to begin with; a server restart always wiped it. Persisting conversation history to an external store is something well-built servers should already be doing. The spec change makes it unavoidable rather than optional.
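To make the explicit-handle idea concrete, here is a minimal sketch. The `ContextStore` interface, `MapStore`, and `createServer` are hypothetical names invented for illustration, not part of the MCP SDK, and the `Map` only stands in for a durable backend so the example can run.

```typescript
// Hypothetical interface: any durable backend could sit behind it.
interface ContextStore {
  load(sessionId: string): Promise<string[]>;
  append(sessionId: string, entry: string): Promise<void>;
}

// Runnable stand-in for the sketch only. A Map is NOT durable.
class MapStore implements ContextStore {
  private readonly data = new Map<string, string[]>();

  async load(sessionId: string): Promise<string[]> {
    return [...(this.data.get(sessionId) ?? [])];
  }

  async append(sessionId: string, entry: string): Promise<void> {
    this.data.set(sessionId, [...(this.data.get(sessionId) ?? []), entry]);
  }
}

// A server instance holds nothing except a reference to the store.
function createServer(store: ContextStore) {
  return {
    async handle(sessionId: string, input: string): Promise<{ priorTurns: number }> {
      const history = await store.load(sessionId); // look up by explicit handle
      await store.append(sessionId, input); // persist before returning
      return { priorTurns: history.length };
    },
  };
}
```

Two instances created over the same store see the same history, which is roughly what a restart or a second replica looks like from the client's side.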

The practical impact hits three areas:

Multi-turn conversations: An agent that asks clarifying questions across multiple tool calls needs to retrieve the thread on each call rather than relying on the transport to maintain it.
User context: Preferences, permissions, and identity that were stored in session memory need an external home.
Tool call chains: Workflows where one tool call depends on the result of a previous one need a durable store to carry state forward, since server memory is no longer guaranteed to persist between calls.
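For the tool call chain case, a rough sketch of carrying one step's result to the next might look like this. `StepStore`, `planStep`, and `executeStep` are made-up names, and the in-memory store is only a placeholder for something durable.

```typescript
// Hypothetical key-value interface standing in for a durable backend.
interface StepStore {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

// Runnable stand-in; replace with a durable store in practice.
function createMemoryStepStore(): StepStore {
  const data = new Map<string, string>();
  return {
    get: async (key) => data.get(key),
    set: async (key, value) => {
      data.set(key, value);
    },
  };
}

const stepKey = (sessionId: string, step: string): string => `${sessionId}/${step}`;

// First tool call: persist the result under the session handle.
async function planStep(store: StepStore, sessionId: string, items: string[]): Promise<number> {
  await store.set(stepKey(sessionId, "plan"), JSON.stringify(items));
  return items.length;
}

// Second tool call: may run in a different process, so it reads the earlier result.
async function executeStep(store: StepStore, sessionId: string): Promise<string[]> {
  const raw = await store.get(stepKey(sessionId, "plan"));
  if (raw === undefined) {
    throw new Error(`No plan stored for session ${sessionId}`);
  }
  const items: string[] = JSON.parse(raw);
  return items.map((item) => `done: ${item}`);
}
```

Failing loudly when the earlier step is missing is a deliberate choice here: a silent empty result would hide the fact that the chain was called out of order or with the wrong handle.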

Filesystem, Redis, Postgres, and Vector DBs: Tradeoffs for MCP Persistence

Filesystem (JSON or text files): The simplest option; write each conversation to a file, one per session. No dependencies, no setup. But it doesn’t scale: concurrent writes risk corruption, there’s no way to search across sessions without scanning every file, and multi-tenant isolation comes down to careful path management. It works for local development or single-user tools, but becomes fragile the moment you need concurrency, durability guarantees, or queryability.
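If you do go the file route for local development, a sketch like the following softens two of those risks. It assumes Node.js, uses a hypothetical directory under the OS temp folder, and picks its own function names. Writing to a temp file and renaming keeps readers from seeing a half-written file on typical POSIX filesystems, but it does not serialize concurrent writers.

```typescript
import { mkdir, readFile, rename, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

type Message = { role: string; content: string };

// Hypothetical default location; pick something persistent for real use.
const DEFAULT_BASE_DIR = join(tmpdir(), "mcp-sessions");

// Reject ids that could escape the base directory (path traversal).
function sessionPath(tenantId: string, sessionId: string, baseDir = DEFAULT_BASE_DIR): string {
  const safe = /^[A-Za-z0-9_-]+$/;
  if (!safe.test(tenantId) || !safe.test(sessionId)) {
    throw new Error("Invalid tenant or session id");
  }
  return join(baseDir, tenantId, `${sessionId}.json`);
}

// A missing file means a new session, so return an empty history.
async function loadSession(file: string): Promise<Message[]> {
  try {
    return JSON.parse(await readFile(file, "utf8")) as Message[];
  } catch (err) {
    if ((err as { code?: string }).code === "ENOENT") return [];
    throw err;
  }
}

// Write to a temp file, then rename it over the target.
// Concurrent writers are not coordinated: the last rename wins.
async function saveSession(file: string, messages: Message[]): Promise<void> {
  await mkdir(dirname(file), { recursive: true });
  const tmp = `${file}.${process.pid}.${Date.now()}.tmp`;
  await writeFile(tmp, JSON.stringify(messages), "utf8");
  await rename(tmp, file);
}
```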

Redis or in-memory cache: Fast reads and writes, and Redis can persist data durably with RDB snapshots or AOF logging. But you still need to model conversations, build retrieval logic, and enforce tenant isolation yourself. Redis is optimized for caching and real-time data structures, not for the ordered, scoped, queryable history that conversation persistence requires. It’s a solid building block, but not a ready-made solution for this access pattern.
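Here is roughly what "model it yourself" means with Redis lists. The `RedisListClient` interface is a hypothetical slice whose method names follow node-redis v4 (`rPush`, `lRange`); treat the exact shape as an assumption and adapt it to your client. The key layout is also just one possible convention.

```typescript
// Minimal slice of a Redis client. Method names follow node-redis v4;
// the exact signatures are an assumption, so check your client's types.
interface RedisListClient {
  rPush(key: string, value: string): Promise<number>;
  lRange(key: string, start: number, stop: number): Promise<string[]>;
}

type Message = { role: string; content: string };

// Tenant comes first so one tenant's sessions share a key prefix.
// encodeURIComponent escapes ":" so an id cannot forge another tenant's key.
function conversationKey(tenantId: string, sessionId: string): string {
  return `conv:${encodeURIComponent(tenantId)}:${encodeURIComponent(sessionId)}`;
}

// RPUSH appends to the tail, so list order is insertion order.
async function appendMessage(
  redis: RedisListClient,
  tenantId: string,
  sessionId: string,
  message: Message,
): Promise<void> {
  await redis.rPush(conversationKey(tenantId, sessionId), JSON.stringify(message));
}

// LRANGE with negative indexes returns the last `limit` entries, oldest first.
async function recentMessages(
  redis: RedisListClient,
  tenantId: string,
  sessionId: string,
  limit = 50,
): Promise<Message[]> {
  if (limit < 1) return [];
  const raw = await redis.lRange(conversationKey(tenantId, sessionId), -limit, -1);
  return raw.map((entry) => JSON.parse(entry) as Message);
}
```

Even in this small sketch, serialization, key design, and tenant scoping are all your code, which is the maintenance cost the paragraph above describes.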

PostgreSQL or MySQL with a custom schema: This works, and many teams already run Postgres. But conversation persistence is ongoing infrastructure, not a one-time setup. You’ll write the initial schema, then maintain migrations as requirements evolve, build TTL and cleanup jobs, scope every query to the correct tenant, handle retries and connection health, and keep it all running. The initial code is straightforward; it’s the continuous maintenance that adds up.

Vector databases: Good for semantic search over content, but conversation history is ordered and structured. You need messages in sequence, scoped to sessions, with metadata filtering. A vector database solves a different problem. If you need both retrieval and persistence, you end up running two systems and keeping them in sync.

None of these are bad choices in the right context. But for conversation persistence specifically, the filesystem is too fragile, Redis is too low-level, a custom SQL schema is too much ongoing work, and a vector database is the wrong shape.

What MCP Servers Need for Conversation Storage

The requirements for MCP conversation persistence are specific:

Ordered message storage scoped to a session, user, or agent
Full conversation retrieval or a recent context window on demand
Multi-tenant isolation so one user’s data never leaks into another’s context
A simple API that takes a few lines to integrate, not a schema design project

These requirements describe a conversation database rather than a cache, a general-purpose relational database, or a vector store.
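One way to read those requirements is as an interface. The sketch below is an assumption about what such a contract could look like, not any product's actual API; `ConversationStore` and its in-memory implementation are hypothetical and exist only to show ordering, windowing, and tenant scoping in one place.

```typescript
type Role = "user" | "assistant" | "system";

interface StoredMessage {
  role: Role;
  content: string;
  seq: number; // position within the conversation, starting at 1
}

// Every operation is scoped by tenant AND session, never by session alone.
interface Scope {
  tenantId: string;
  sessionId: string;
}

interface ConversationStore {
  append(scope: Scope, role: Role, content: string): Promise<StoredMessage>;
  recent(scope: Scope, limit: number): Promise<StoredMessage[]>;
}

// Reference implementation for illustration only; it is not durable.
class InMemoryConversationStore implements ConversationStore {
  // tenantId -> sessionId -> ordered messages
  private readonly tenants = new Map<string, Map<string, StoredMessage[]>>();

  private thread(scope: Scope): StoredMessage[] {
    let sessions = this.tenants.get(scope.tenantId);
    if (!sessions) {
      sessions = new Map<string, StoredMessage[]>();
      this.tenants.set(scope.tenantId, sessions);
    }
    let messages = sessions.get(scope.sessionId);
    if (!messages) {
      messages = [];
      sessions.set(scope.sessionId, messages);
    }
    return messages;
  }

  async append(scope: Scope, role: Role, content: string): Promise<StoredMessage> {
    const messages = this.thread(scope);
    const message: StoredMessage = { role, content, seq: messages.length + 1 };
    messages.push(message);
    return message;
  }

  // Returns the last `limit` messages, oldest first.
  async recent(scope: Scope, limit: number): Promise<StoredMessage[]> {
    if (limit <= 0) return [];
    return this.thread(scope).slice(-limit);
  }
}
```

Nesting sessions under tenants, rather than concatenating the two ids into one key, means two tenants can reuse the same session id without any chance of collision.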

Persisting MCP Conversation History with DialogueDB

DialogueDB is a managed conversation database built for exactly this access pattern. Instead of designing schemas and standing up infrastructure, you store and retrieve conversation history through an API that already models the data the way your MCP server needs it.

Here’s the core of an MCP server in TypeScript that follows this pattern. The sessionId tool argument is the explicit handle the MCP maintainers recommend, and tenantId from the authenticated request context ensures each tenant’s data stays isolated. DialogueDB is the durable store behind both. generateResponse stands in for your own model call, and transport setup is omitted for brevity:

server.ts

TypeScript

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { DialogueDB } from "dialogue-db";
import { z } from "zod";

const db = new DialogueDB({ apiKey: process.env.DIALOGUEDB_API_KEY });
const server = new McpServer({ name: "my-agent", version: "1.0.0" });

server.tool(
  "ask",
  { question: z.string(), sessionId: z.string() },
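  async ({ question, sessionId }, { authInfo }) => {
    // tenantId comes from your authenticated context, never from the tool argument.
    // Assumption: your auth layer stores it under authInfo.extra; adapt as needed.
    const tenantId = authInfo?.extra?.tenantId;
    if (typeof tenantId !== "string") {
      throw new Error("No tenant in authenticated context");
    }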

    const dialogue = await db.getOrCreateDialogue({
      id: sessionId,
      namespace: tenantId,
    });

    const history = await dialogue.loadMessages({ limit: 50 });

    await dialogue.saveMessage({ role: "user", content: question });

    const answer = await generateResponse(history, question);

    await dialogue.saveMessage({ role: "assistant", content: answer });

    return { content: [{ type: "text", text: answer }] };
  }
);

Every tool call retrieves the recent conversation history from DialogueDB, processes the request with that context, and stores the result back. The server itself holds no state. Restart it, scale it horizontally, run it in a serverless function: the conversation persists regardless.
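The server code calls generateResponse without defining it. As a rough placeholder, it could be built like this, assuming history items have the shape `{ role, content }` and that you supply your own model call. `buildPrompt`, `makeGenerateResponse`, and the `Complete` type are hypothetical names for illustration.

```typescript
type ChatMessage = { role: string; content: string };

// Your model call goes here: takes a prompt, resolves to the model's text.
type Complete = (prompt: string) => Promise<string>;

// Flatten prior turns plus the new question into one prompt, oldest first.
function buildPrompt(history: ChatMessage[], question: string): string {
  const turns = history.map((m) => `${m.role}: ${m.content}`);
  return [...turns, `user: ${question}`].join("\n");
}

// Returns a function with the (history, question) shape the tool handler expects.
function makeGenerateResponse(complete: Complete) {
  return (history: ChatMessage[], question: string): Promise<string> =>
    complete(buildPrompt(history, question));
}
```

Injecting the model call keeps the handler testable: in tests you can pass a fake `complete` and assert on the prompt it receives.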

Notice what’s missing from that code: no schema definitions, no connection pooling, no session management logic, no TTL handling, no multi-tenant filtering queries. The namespace parameter isolates tenants automatically.

Custom Persistence vs. Managed Conversation Database

Compare the two approaches side by side. The first block shows what you build and maintain when rolling your own; the second shows the same operations with DialogueDB.

Rolling your own (~50 lines, plus tests and upkeep)

persistence.ts

TypeScript

import { Pool } from "pg";

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

// Migration: run before first use
// CREATE TABLE conversations (
//   id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
//   session_id TEXT NOT NULL,
//   tenant_id TEXT NOT NULL,
//   created_at TIMESTAMPTZ DEFAULT NOW(),
//   UNIQUE(session_id, tenant_id)
// );
// CREATE TABLE messages (
//   id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
//   conversation_id UUID REFERENCES conversations(id),
//   role TEXT NOT NULL,
//   content TEXT NOT NULL,
//   created_at TIMESTAMPTZ DEFAULT NOW()
// );
// CREATE INDEX idx_msgs ON messages(conversation_id, created_at);

async function getOrCreateConversation(
  sessionId: string, tenantId: string
) {
  const { rows } = await pool.query(
    `INSERT INTO conversations (session_id, tenant_id)
     VALUES ($1, $2)
     ON CONFLICT (session_id, tenant_id)
     DO UPDATE SET session_id = $1
     RETURNING id`,
    [sessionId, tenantId]
  );
  return rows[0].id;
}

async function saveMessage(
  convId: string, role: string, content: string
) {
  await pool.query(
    `INSERT INTO messages (conversation_id, role, content)
     VALUES ($1, $2, $3)`,
    [convId, role, content]
  );
}

async function loadMessages(convId: string, limit = 50) {
  const { rows } = await pool.query(
    `SELECT role, content FROM messages
     WHERE conversation_id = $1
     ORDER BY created_at DESC, id DESC LIMIT $2`,
    [convId, limit]
  );
  return rows.reverse();
}

// Plus: cleanup jobs, error handling, retries,
// connection health checks, migration runner...
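To give a sense of what one of those "plus" items involves, here is a sketch of a retention cleanup job against the schema above. The `Queryable` interface is a hypothetical minimal slice that pg's Pool is expected to fit (treat that as an assumption), and the retention policy itself is invented for illustration.

```typescript
// Minimal slice of a database client; assumed compatible with pg's Pool.
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rowCount: number | null }>;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Everything created before this instant is eligible for deletion.
function retentionCutoff(now: Date, retentionDays: number): Date {
  return new Date(now.getTime() - retentionDays * DAY_MS);
}

// Returns the number of messages removed.
async function purgeExpired(
  db: Queryable,
  retentionDays: number,
  now: Date = new Date(),
): Promise<number> {
  const cutoff = retentionCutoff(now, retentionDays);

  // Delete old messages first so the foreign key never blocks the second step.
  const deleted = await db.query(`DELETE FROM messages WHERE created_at < $1`, [cutoff]);

  // Then remove old conversations that no longer have any messages.
  await db.query(
    `DELETE FROM conversations c
     WHERE c.created_at < $1
       AND NOT EXISTS (
         SELECT 1 FROM messages m WHERE m.conversation_id = c.id
       )`,
    [cutoff],
  );

  return deleted.rowCount ?? 0;
}
```

You would still need to schedule this, monitor it, and decide whether the two deletes should run in a single transaction.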

With DialogueDB (3 calls, complete)

persistence.ts

TypeScript

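import { DialogueDB } from "dialogue-db";

const db = new DialogueDB({
  apiKey: process.env.DIALOGUEDB_API_KEY,
});

// sessionId is the tool argument; tenantId comes from the authenticated context.
async function recordQuestion(
  sessionId: string, tenantId: string, question: string
) {
  const dialogue = await db.getOrCreateDialogue({
    id: sessionId,
    namespace: tenantId,
  });

  const history = await dialogue.loadMessages({ limit: 50 });
  await dialogue.saveMessage({ role: "user", content: question });
  return history;
}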

Three API calls replace the schema, the connection pool, the migration files, the cleanup jobs, and the tenant isolation logic. The conversation data is also searchable across sessions, which matters when agents need to reference past interactions.

Getting Started with MCP Conversation Persistence

Servers built against the current spec continue to work; the transition supports both protocol versions, so existing servers won’t break when the spec finalizes on July 28. But if you’re adopting the new spec or building new servers, conversation persistence is worth setting up now. Install the SDK, grab an API key from the DialogueDB dashboard, and add the lines above to your MCP server. The entire integration takes about five minutes.

For the full setup, start with the quickstart guide. For a broader look at conversation persistence options across frameworks, see conversation persistence in TypeScript agent frameworks.

Your server’s persistence layer shouldn’t be the hard part.

MCP Is Going Stateless: How to Handle Conversation Persistence
