Case Study

PocketScout

Binding Site Intelligence

Role: Creator & Developer
Platform: MCP Server (Claude, Cursor, etc.)
Stack: Python · FastMCP · gemmi · httpx
6 Tools · 5 APIs · Live MCP Server

Overview

An open-source MCP server that aggregates structural, chemical, and literature data to evaluate druggable binding pockets on protein targets — filling the gap between “I have a target” and “I'm running RFdiffusion.”

Six domain-informed tools compose across five public APIs (UniProt, RCSB PDB, AlphaFold DB, ChEMBL, PubMed) to give Claude the scientific knowledge it needs to reason about binding sites, ligand history, cross-species conservation, and recent literature.

What is MCP?

Model Context Protocol (MCP) is how AI assistants connect to external tools and data sources. PocketScout provides six specialized tools that give Claude capabilities it doesn't have natively — the ability to query protein databases, assess binding site druggability, and evaluate competitive landscapes in real time.

“The best MCP tools don't just fetch data — they interpret it, so the model can reason about what matters.”
— Design principle
The Problem

The gap between target and design

Before any computational design work begins, scientists spend hours to days manually gathering information across half a dozen browser tabs. They check UniProt for protein function, browse PDB for structures, search ChEMBL for prior art, read papers for allosteric insights — then synthesize it all in their heads.

This reconnaissance step is where campaigns quietly go wrong. A scientist picks the obvious binding site without checking that 200 compounds have already failed there. They miss an allosteric pocket described in a recent paper. They don't realize the binding site residues aren't conserved in mouse until their in vivo model fails six months later.

01. Fragmented Data
Binding site data is scattered across PDB, UniProt, PubMed, and ChEMBL, each with its own formats, access patterns, and required expertise.
02. Expert Translation
Raw crystallographic data requires structural biology expertise to interpret. Resolution, B-factors, and ligand occupancy carry little meaning for a general LLM unless something translates them.
03. No Composition
Even when a researcher knows the steps, there’s no tool that composes the full assessment workflow into a single, reasoned evaluation.
The Gap

The life sciences MCP ecosystem is thin — most servers are hackathon-quality wrappers around REST APIs with no error handling, no scientific context in the tool descriptions, and no thought about how tools compose into real workflows. The ecosystem has plumbing. What it lacks is domain expertise.

Why MCP

Tool descriptions as prompt engineering

Claude selects and sequences tools based on their docstrings and parameter descriptions. Every description in PocketScout encodes scientific reasoning — not just what the tool returns, but when to use it, what the results mean in context, and how they gate downstream decisions.
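
As a rough illustration of this idea, a tool docstring can carry sequencing and interpretation hints alongside the plain description. The sketch below is hypothetical and is not PocketScout's actual source: the parameter name, the return shape, and the "how to read" guidance are assumptions, and in a FastMCP server the function would additionally be registered as a tool.

```python
def get_binding_sites(pdb_id: str) -> dict:
    """Map known ligand-binding pockets for one PDB entry.

    WHEN TO CALL: after characterize_target, once the biological
    context of the target is established.

    WHAT YOU GET: one record per site, each with a druggability
    assessment and modality recommendations.

    HOW TO READ IT (illustrative guidance, an assumption): a site with
    many prior ligands is validated but may be crowded, so check
    get_ligand_history before recommending it.
    """
    # Placeholder body: a real tool would query RCSB PDB here.
    return {"pdb_id": pdb_id.upper(), "sites": []}


if __name__ == "__main__":
    # The docstring is what the model sees when choosing tools.
    print(get_binding_sites.__doc__)
```

The point of the sketch is the docstring, not the body: the ordering hint and the interpretation hint are the parts a model can act on.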

Descriptions Guide Reasoning
A get_binding_sites tool that says "get binding sites" is useless. One that says "call this AFTER characterize_target, each site includes druggability assessment and modality recommendations" actually helps Claude reason like a medicinal chemist.
Composition Reflects Workflow
The six tools mirror how expert drug hunters actually evaluate targets: establish biological context → assess structural coverage → map known pockets → evaluate competitive landscape → check translatability → search literature.
Pre-Computed Interpretations
A raw list of ChEMBL bioactivities is hard for Claude to reason about. A structured competitive landscape assessment — "crowded, 200+ compounds with sub-micromolar activity" — lets Claude synthesize meaningfully.
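
A minimal sketch of what such a pre-computed interpretation could look like, assuming the input is one best potency value (in nM) per tested compound. The 200-compound cutoff echoes the example quoted above; the 20-compound cutoff, the function name, and the output shape are assumptions for illustration, not PocketScout's actual rules.

```python
from dataclasses import dataclass


@dataclass
class LandscapeSummary:
    label: str  # crowded, moderate, emerging, or untargeted
    note: str   # one-line summary the model can quote directly


def classify_landscape(potencies_nm: list[float]) -> LandscapeSummary:
    """Collapse raw potencies into a label plus a short summary.

    Thresholds are illustrative assumptions, not validated cutoffs.
    """
    total = len(potencies_nm)
    # Sub-micromolar means a potency below 1000 nM.
    potent = sum(1 for p in potencies_nm if p < 1000.0)
    if total == 0:
        label = "untargeted"
    elif potent >= 200:
        label = "crowded"
    elif potent >= 20:
        label = "moderate"
    else:
        label = "emerging"
    note = (
        f"{label}: {total} compounds tested, "
        f"{potent} with sub-micromolar activity"
    )
    return LandscapeSummary(label, note)


if __name__ == "__main__":
    print(classify_landscape([50.0] * 250).note)
```

Returning the label together with the counts that produced it lets the model cite the evidence rather than only the verdict.
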
Architecture

Six tools, one assessment

Built on FastMCP 3.0 with async API clients for UniProt, RCSB PDB, AlphaFold DB, ChEMBL, and PubMed. Each tool queries one or more public APIs and returns pre-interpreted, LLM-ready results.
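
The "one tool, several APIs" pattern can be sketched with plain asyncio. The two fetch functions below are stubs that stand in for real HTTP clients and make no network calls; their names and return shapes are assumptions, not PocketScout's code.

```python
import asyncio


async def fetch_uniprot(accession: str) -> dict:
    # Stub standing in for an HTTP request to UniProt.
    await asyncio.sleep(0)
    return {"source": "UniProt", "accession": accession}


async def fetch_alphafold(accession: str) -> dict:
    # Stub standing in for an HTTP request to AlphaFold DB.
    await asyncio.sleep(0)
    return {"source": "AlphaFold DB", "accession": accession}


async def characterize_target_sketch(accession: str) -> dict:
    # Run both lookups concurrently; gather returns results in
    # the same order as its arguments.
    uniprot, alphafold = await asyncio.gather(
        fetch_uniprot(accession),
        fetch_alphafold(accession),
    )
    return {
        "accession": accession,
        "sources": [uniprot["source"], alphafold["source"]],
    }


if __name__ == "__main__":
    # Q8NBP7 is the PCSK9 accession used elsewhere in this case study.
    print(asyncio.run(characterize_target_sketch("Q8NBP7")))
```

Issuing the calls concurrently keeps a multi-API tool roughly as slow as its slowest upstream service rather than the sum of all of them.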

Target Intel: characterize_target
Structure: get_related_structures, get_binding_sites
Chemistry: get_ligand_history
Context: check_conservation, search_target_literature
Orchestration: binding_site_assessment
characterize_target
Protein function, family classification, subcellular location, disease associations, plus AlphaFold structure confidence to flag regions where predicted binding sites can’t be trusted.
UniProt · AlphaFold DB
get_related_structures
All experimental structures for the target across PDB, sorted by resolution, with ligand inventories. A target with 200 co-crystal structures is a fundamentally different design problem than one with a single cryo-EM map.
RCSB PDB
get_binding_sites
Known pockets from co-crystallized ligands, filtered for crystallization artifacts (45 common buffer/cryo compounds), classified as orthosteric, allosteric, cofactor, or PPI — with modality-specific druggability notes.
RCSB PDB
get_ligand_history
Bioactivity landscape from ChEMBL — how many compounds tested, potency distribution, clinical candidates — classified as crowded, moderate, emerging, or untargeted.
ChEMBL
check_conservation
Compares binding site residues between human and mouse orthologs. This is the step most scientists skip, and the one that most often causes late-stage preclinical failures.
UniProt
search_target_literature
Focused PubMed searches for structural biology and drug design papers — allosteric mechanisms, resistance mutations, cryptic sites from MD simulations.
PubMed
binding_site_assessment
Orchestration prompt that composes all six tools in sequence and instructs Claude to synthesize a ranked recommendation: which binding regions are most promising for de novo design, what are the trade-offs, what modality fits each site, and what are the key risks.
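
The artifact filtering described for get_binding_sites can be sketched as a lookup against a set of ligand codes. The eight PDB chemical component codes below are a small illustrative subset (glycerol, ethylene glycol, sulfate, and similar); the 45-compound list mentioned above is not reproduced here, and the function name and input list are hypothetical.

```python
# Illustrative subset of common buffer and cryoprotectant codes.
# The full 45-compound list is not shown in this case study.
CRYSTALLIZATION_ARTIFACTS = frozenset(
    {"GOL", "EDO", "SO4", "PO4", "DMS", "ACT", "PEG", "MPD"}
)


def split_ligands(het_codes: list[str]) -> tuple[list[str], list[str]]:
    """Split ligand codes into (kept, dropped_as_artifact).

    Codes are normalized to upper case and de-duplicated,
    preserving first-seen order.
    """
    seen: set[str] = set()
    kept: list[str] = []
    dropped: list[str] = []
    for raw in het_codes:
        code = raw.strip().upper()
        if code in seen:
            continue
        seen.add(code)
        if code in CRYSTALLIZATION_ARTIFACTS:
            dropped.append(code)
        else:
            kept.append(code)
    return kept, dropped


if __name__ == "__main__":
    # Made-up ligand inventory for demonstration only.
    print(split_ligands(["gdp", "GOL", "SO4", "GDP", "MG"]))
```

Returning the dropped codes as well as the kept ones makes the filtering auditable: the model can mention what was excluded and why.
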
Design Decisions

What I chose not to build

The hardest design decisions in PocketScout were about what to leave out. Each decision reflects a principle about what MCP tools should and shouldn't do.

Why Not Pocket Prediction?
Tools like fpocket and P2Rank computationally predict novel binding sites, but they require CPU/GPU infrastructure that doesn’t fit the MCP model of lightweight API-based tools. PocketScout focuses on known binding intelligence from experimental data. Pocket prediction belongs in a separate compute-oriented server — clear boundaries, clear scope.
Simplified Conservation
Full multi-species multiple sequence alignment is computationally expensive and error-prone. Human vs. mouse sequence comparison using context-aware residue matching covers the most critical preclinical translatability question and handles indels between orthologs. The limitation is documented; the pragmatic choice is intentional.
Pre-Computed Interpretations
Rather than returning a raw resolution value (2.1 Å) and expecting the LLM to know what it means, PocketScout returns "well-resolved (2.1Å)." The tool does the domain translation so the model can focus on reasoning.
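
A minimal sketch of this kind of domain translation, assuming three resolution bands. Only the "well-resolved (2.1Å)" output comes from the text above; the 2.5 Å and 3.5 Å cutoffs, the other two labels, and the function name are assumptions.

```python
def describe_resolution(resolution_angstrom: float) -> str:
    """Turn a numeric resolution into a label the model can reason with.

    Band cutoffs are illustrative assumptions, not PocketScout's values.
    """
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be positive")
    if resolution_angstrom <= 2.5:
        label = "well-resolved"
    elif resolution_angstrom <= 3.5:
        label = "moderately resolved"
    else:
        label = "poorly resolved"
    # Keep the raw number so no information is lost.
    return f"{label} ({resolution_angstrom:.1f}Å)"


if __name__ == "__main__":
    print(describe_resolution(2.1))  # well-resolved (2.1Å)
```

Keeping the number inside the label means the interpretation is added on top of the data rather than replacing it.
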
Try It Live

Connect in 30 seconds

PocketScout runs as a live MCP server. You can connect it to Claude right now.

Setup Steps
1. Open claude.ai
2. Go to Settings → Connectors
3. Click "Add custom connector"
4. Paste the server URL: https://pocketscout-mcp.up.railway.app/mcp

PocketScout assessment of KRAS G12C (PDB 6OIM) running in Claude

Where PocketScout wins

Prompts where live data, coordinate-level computation, or current database queries matter — not general knowledge.

1. Novel / recent structure

Assess the binding landscape of PDB 8QXB for designing a protein binder to block viral entry.

Claude won’t know structures deposited after its training cutoff. PocketScout pulls the actual ligands, contacts, and resolution from PDB in real time.

2. Construct coverage trap

I want to design a binder targeting the ATP site of EGFR using PDB 1M17. Is this feasible for a protein therapeutic?

Vanilla Claude will say "yes, here’s the ATP pocket." PocketScout flags that 1M17 is the intracellular kinase domain — unreachable by a protein binder that can’t cross the membrane.

3. Competitive landscape

Is PCSK9 (Q8NBP7) already too crowded for a new binder program, or is there room to differentiate?

Claude has general knowledge, but PocketScout pulls current ChEMBL bioactivity counts, clinical candidates, and potency distributions to give a quantitative answer.

4. Allosteric vs orthosteric

PDB 6OIM has multiple bound ligands. Which pocket is allosteric and which is the active site? Where should I target a de novo binder?

PocketScout computes residue contacts from coordinates, classifies cofactor vs non-cofactor sites, checks overlap, and flags the allosteric pocket automatically.

5. Cross-species conservation

I’m designing a binder against the PD-L1 interface on PDB 5J89. Will the key contact residues be conserved in mouse for preclinical testing?

Claude will guess. PocketScout pulls actual human and mouse ortholog sequences from UniProt and checks conservation at the exact binding site residues, handling indels correctly.
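
One way to check conservation at specific residues while tolerating indels is to match each human residue by its flanking sequence rather than by its position number. The sketch below is one possible reading of "context-aware residue matching", not PocketScout's implementation; the window size, shift limit, match threshold, and both toy sequences are assumptions.

```python
def match_site_residue(human, mouse, pos, flank=3, max_shift=5):
    """Find the mouse residue matching human residue `pos` (1-based).

    Scores candidate mouse positions by how many flanking residues
    agree, so a nearby insertion or deletion does not break the match.
    Returns (mouse_position, mouse_residue), or None if the flanking
    context cannot be located.
    """
    center = pos - 1
    if not 0 <= center < len(human):
        raise ValueError("position outside human sequence")
    best = None  # (score, -abs(shift), mouse_index)
    for shift in range(-max_shift, max_shift + 1):
        m_center = center + shift
        if not 0 <= m_center < len(mouse):
            continue
        score = 0
        for offset in range(-flank, flank + 1):
            if offset == 0:
                continue  # score the context, not the site residue itself
            h, m = center + offset, m_center + offset
            if 0 <= h < len(human) and 0 <= m < len(mouse):
                score += human[h] == mouse[m]
        candidate = (score, -abs(shift), m_center)
        if best is None or candidate > best:
            best = candidate
    # Require at least `flank` matching neighbours (an assumed threshold).
    if best is None or best[0] < flank:
        return None
    return best[2] + 1, mouse[best[2]]


def check_conservation(human, mouse, site_positions):
    """Report each binding-site position as conserved, substituted, or unaligned."""
    report = {}
    for pos in site_positions:
        match = match_site_residue(human, mouse, pos)
        if match is None:
            report[pos] = "unaligned"
            continue
        human_res, mouse_res = human[pos - 1], match[1]
        report[pos] = (
            "conserved" if mouse_res == human_res else f"{human_res}->{mouse_res}"
        )
    return report


if __name__ == "__main__":
    # Toy sequences: the mouse copy has a two-residue insertion after
    # position 5 and an F-to-Y substitution further along.
    human_seq = "MKTAYIAKQRQISFVKSHFSRQ"
    mouse_seq = "MKTAYGGIAKQRQISYVKSHFSRQ"
    print(check_conservation(human_seq, mouse_seq, [3, 10, 14]))
    # {3: 'conserved', 10: 'conserved', 14: 'F->Y'}
```

A naive same-index comparison would misreport position 10 in this toy case, because the insertion shifts every downstream mouse residue by two.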

Open Source
github.com/Proprius-Labs/pocketscout-mcp
How I Work

Building at the intersection

PocketScout started as an observation about where drug discovery workflows break down, and became a working piece of infrastructure in Anthropic's MCP ecosystem. The tool design — the workflow sequence, the tool descriptions, the competitive landscape classification, the conservation check that most scientists skip — comes from understanding how domain experts actually think, not from wrapping APIs.

I identified that the gap in the life sciences MCP ecosystem isn't more database connectors; it's scientifically informed tool composition that reflects real decision-making. PocketScout is a working demonstration of that thesis: six tools, one prompt, deployed and usable today through Claude.

“The best tools don't just expose APIs. They encode domain knowledge into the interface between human intent and machine capability.”