# Replicate

> Run any open-source model on Replicate from inside an AI agent.

[Canonical HTML page](https://top-mcps.com/mcp/replicate) · [server.json](https://top-mcps.com/mcp/replicate.json) · [methodology](https://top-mcps.com/about/methodology)

## Install

### Claude Desktop — `claude_desktop_config.json`

Paste under mcpServers. Fully quit and reopen Claude after editing.

```json
{
  "mcpServers": {
    "replicate": {
      "command": "npx",
      "args": [
        "-y",
        "@replicate/mcp"
      ],
      "env": {
        "REPLICATE_API_TOKEN": "${REPLICATE_API_TOKEN}"
      }
    }
  }
}
```

### Claude Code — `CLI or .mcp.json`

Run from your repo. To share with your team, add `--scope project` so the server is written to `.mcp.json`, then commit that file.

```shell
# export REPLICATE_API_TOKEN=YOUR_API_TOKEN
claude mcp add replicate -- npx -y @replicate/mcp
```

### Cursor — `.cursor/mcp.json`

Global path: ~/.cursor/mcp.json. Reload window after editing.

```json
{
  "mcpServers": {
    "replicate": {
      "command": "npx",
      "args": [
        "-y",
        "@replicate/mcp"
      ],
      "env": {
        "REPLICATE_API_TOKEN": "${REPLICATE_API_TOKEN}"
      }
    }
  }
}
```

### VS Code — `.vscode/mcp.json`

VS Code uses the "servers" key (not "mcpServers").

```jsonc
{
  "servers": {
    "replicate": {
      "command": "npx",
      "args": [
        "-y",
        "@replicate/mcp"
      ],
      "env": {
        "REPLICATE_API_TOKEN": "${REPLICATE_API_TOKEN}"
      }
    }
  }
}
```

### Windsurf — `~/.codeium/windsurf/mcp_config.json`

Open via Cascade → hammer icon → Configure.

```json
{
  "mcpServers": {
    "replicate": {
      "command": "npx",
      "args": [
        "-y",
        "@replicate/mcp"
      ],
      "env": {
        "REPLICATE_API_TOKEN": "${REPLICATE_API_TOKEN}"
      }
    }
  }
}
```

### Cline — `cline_mcp_settings.json`

Open via the Cline sidebar → MCP Servers → Edit.

```json
{
  "mcpServers": {
    "replicate": {
      "command": "npx",
      "args": [
        "-y",
        "@replicate/mcp"
      ],
      "env": {
        "REPLICATE_API_TOKEN": "${REPLICATE_API_TOKEN}"
      }
    }
  }
}
```

### Continue — `~/.continue/config.json`

Continue uses modelContextProtocolServers with a transport block.

```json
{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "transport": {
          "type": "stdio",
          "command": "npx",
          "args": [
            "-y",
            "@replicate/mcp"
          ],
          "env": {
            "REPLICATE_API_TOKEN": "${REPLICATE_API_TOKEN}"
          }
        }
      }
    ]
  }
}
```

### Codex CLI — `~/.codex/config.toml`

Codex uses TOML. Each server is a [mcp_servers.<name>] subtable.

```toml
# ~/.codex/config.toml
[mcp_servers.replicate]
command = "npx"
args = [
  "-y",
  "@replicate/mcp",
]
env = { REPLICATE_API_TOKEN = "${REPLICATE_API_TOKEN}" }
```

### Zed — `~/.config/zed/settings.json`

Zed calls them "context_servers". Settings live-reload on save.

```jsonc
{
  "context_servers": {
    "replicate": {
      "command": {
        "path": "npx",
        "args": [
          "-y",
          "@replicate/mcp"
        ]
      },
      "env": {
        "REPLICATE_API_TOKEN": "${REPLICATE_API_TOKEN}"
      }
    }
  }
}
```

### ChatGPT — `ChatGPT → Apps directory`

Replicate doesn't ship a hosted HTTPS endpoint today, and ChatGPT supports remote MCP servers only. To use this server in ChatGPT, deploy it to a public HTTPS URL first (e.g. via Cloudflare Workers or Vercel) or wait for an official remote build.


## At a glance

- **Maintainer:** Replicate
- **Transport:** stdio
- **Auth model:** API key
- **Required secrets:** REPLICATE_API_TOKEN
- **Supported clients:** Claude, Cursor, any MCP-compatible client
- **License:** MIT
- **Language:** TypeScript
- **Latest version:** latest
- **Last verified:** 2026-05-26
- **Score:** 56/100 (rubric 2026-04 — see https://top-mcps.com/about/methodology)
- **Source:** https://github.com/replicate/mcp

## Security & scope

- **Access scope:** network
- **Sandbox:** All inference runs on Replicate's infra. The API token grants account-level access to your billing and predictions.
- **Gotchas:**
  - Tokens grant full-account access; dedicate one token to this MCP server and rotate it periodically.
  - Async predictions stay in your account history — clean up sensitive inputs.
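
Because the token carries account-level access, it can help to run a small preflight check before launching the server. The sketch below is illustrative only: `read_replicate_token` is a hypothetical helper, not part of `@replicate/mcp`, and the placeholder check assumes that some clients may pass the literal string `${REPLICATE_API_TOKEN}` through without expanding it, which you should verify for your client.

```python
import os
from typing import Mapping


def read_replicate_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the token, or raise if it is missing or an unexpanded placeholder."""
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("REPLICATE_API_TOKEN is not set")
    if token.startswith("${") and token.endswith("}"):
        # Assumption: the client passed the config string through literally.
        raise RuntimeError("REPLICATE_API_TOKEN looks like an unexpanded placeholder")
    return token


# Usage with an explicit mapping instead of the real environment.
token = read_replicate_token({"REPLICATE_API_TOKEN": "example-token"})
print("token loaded, length:", len(token))
```

The check never prints the token itself, only its length, so it is safe to leave in a startup script.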

## Quick answer

**What it does.** Wraps Replicate's prediction API: list models, get model schema, create a prediction, poll status, retrieve output. Supports both synchronous and async modes.
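
The create, poll, retrieve flow can be sketched as a small polling loop. This is a minimal illustration, not the server's actual implementation: `wait_for_prediction` and `get_prediction` are hypothetical names, the status strings are assumed to match the ones Replicate's API reports for predictions, and the HTTP call is replaced by a fake so the example runs offline.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal states for a prediction.
TERMINAL_STATES = {"succeeded", "failed", "canceled"}


def wait_for_prediction(
    get_prediction: Callable[[str], Dict[str, Any]],
    prediction_id: str,
    interval_s: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the prediction reaches a terminal state or polls run out."""
    for _ in range(max_polls):
        prediction = get_prediction(prediction_id)
        if prediction["status"] in TERMINAL_STATES:
            return prediction
        sleep(interval_s)
    raise TimeoutError(
        f"prediction {prediction_id} still running after {max_polls} polls"
    )


# Fake status source standing in for a real HTTP request.
_states = iter(["starting", "processing", "succeeded"])


def fake_get(prediction_id: str) -> Dict[str, Any]:
    status = next(_states)
    output = "https://example.com/out.png" if status == "succeeded" else None
    return {"id": prediction_id, "status": status, "output": output}


result = wait_for_prediction(fake_get, "abc123", sleep=lambda s: None)
print(result["status"], result["output"])
```

Injecting `get_prediction` and `sleep` keeps the loop testable and makes the timeout behavior explicit, which matters for long-running video or audio models.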

**Best for:**
- Image generation
- Video generation
- Audio processing
- Calling long-tail open-source models
- Prototyping with new releases

**Not for:**
- High-volume production
- Local-only workflows
- Latency-critical inference

## Description

The Replicate MCP exposes Replicate's catalog of thousands of open-source models — image, video, audio, language — as MCP tools. The agent picks a model by name or shortcut, submits inputs, polls status, and gets back outputs without touching the Replicate REST API.

## Why it matters

Replicate is a low-effort way to run open-source models you do not want to host yourself. With the MCP server, an agent can call SDXL, Llama, Whisper, or any of the long-tail community models without leaving the conversation.

## Key features

- Thousands of models
- Per-call pricing
- Async predictions
- Model schema discovery
- Streaming for LLMs

## FAQ

### How is it priced?

Billing is per second of GPU time, at a rate that varies by model. Image models are typically a few cents per generation; video and large LLMs cost more. Replicate's site lists per-model rates.
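
As a rough worked example of per-second billing, the cost of one run is the run time multiplied by the model's rate. The rate below is a made-up placeholder, not a real Replicate price; look up the actual figure on the model's page.

```python
def estimate_cost_usd(run_seconds: float, rate_per_second_usd: float) -> float:
    """Estimate the cost of one prediction billed per second of GPU time."""
    if run_seconds < 0 or rate_per_second_usd < 0:
        raise ValueError("run time and rate must be non-negative")
    return run_seconds * rate_per_second_usd


# Hypothetical rate, for illustration only.
HYPOTHETICAL_RATE_USD_PER_S = 0.001

single_run = estimate_cost_usd(8, HYPOTHETICAL_RATE_USD_PER_S)
batch_of_100 = 100 * single_run
print(f"one run: ${single_run:.4f}, 100 runs: ${batch_of_100:.2f}")
```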

### Can it stream LLM tokens?

Yes, for models that support streaming. The MCP exposes a streaming mode that emits chunks as they arrive.

### Does it auto-pick the right model?

No. You (or the agent) pass the model as `owner/name`. Use the `list_models` tool to discover available models.
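
Since the model must be named explicitly, an agent-side wrapper may want to validate the reference before submitting it. The sketch below is a hypothetical helper, not part of the MCP server; it assumes references take the form `owner/name`, optionally followed by `:version`.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    owner: str
    name: str
    version: Optional[str]


def parse_model_ref(ref: str) -> ModelRef:
    """Split 'owner/name' or 'owner/name:version' into its parts."""
    path, _, version = ref.strip().partition(":")
    owner, sep, name = path.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(
            f"expected 'owner/name' or 'owner/name:version', got {ref!r}"
        )
    return ModelRef(owner, name, version or None)


# The model names here are examples only.
print(parse_model_ref("stability-ai/sdxl"))
print(parse_model_ref("some-owner/some-model:abc123"))
```

Rejecting a malformed reference locally saves a round trip and gives the agent a clearer error to act on.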

## Changelog

- **2026-05-26** — Refreshed install snippets and fact sheet; verified for 2026.
- **2025-02-01** — Initial directory listing.
