MCP server reference

What your agent can call

The full surface of the WinDAGs MCP server. 551 specialist skills, zero API keys for the read-only tools, on-demand reference loading. Install it once with brew install and any MCP-speaking client gets the toolset.

shell
$ brew install curiositech/windags/windags && windags init

Then in Claude Code: claude mcp add windags -- windags-mcp. Other clients: point your MCP config at windags-mcp over stdio.

Shipping

windags_skill_graft (Shipping)

Full SKILL.md bodies for the top N + adjacency descriptions + per-skill asset manifest.

Runs the same six-stage cascade as skill_search: lexical + semantic + RRF + cross-encoder rerank + per-user attribution k-NN + cross-user global priors. It then loads three things: the actual SKILL.md bodies from disk for the top N primaries (~10–12K tokens for top-4); 4 adjacent catalog entries (name + description, ~500 tokens) for awareness; and a per-skill asset manifest (references, scripts, templates, examples, with paths + sizes) whose entries the agent can fetch with windags_skill_reference. This is the tool measured in the bench article.

Inputs
  • task (string): What the user is trying to do
  • count (number, default 4): How many primary skills to graft (full body)
Returns

{ primary: [{ id, name, body, references[] }], adjacencies: [{ id, name, description }], task_summary }

Example
windags_skill_graft({ task: "set up stripe webhook with retries" })
Cost: Full cascade runs locally, with embeddings via Transformers.js and the model cached after first use. Zero API keys, no phone-home.
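
The RRF stage of the cascade merges the lexical and semantic rankings into one list. The following is a minimal sketch of reciprocal rank fusion under the standard formulation (each list contributes 1 / (k + rank)); the function name, the constant k = 60, and the sample rankings are assumptions for illustration, not WinDAGs internals.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of skill IDs into one list.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, skill_id in enumerate(ranking, start=1):
            scores[skill_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))

# Sample rankings, invented for illustration.
lexical = ["stripe-webhooks", "api-architect", "technical-writer"]
semantic = ["stripe-webhooks", "data-pipeline-engineer", "api-architect"]
fused = reciprocal_rank_fusion([lexical, semantic])
```

A skill that appears in both lists accumulates two contributions, so it tends to outrank a skill that appears high in only one list.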

windags_skill_reference (Shipping)

Load one reference file from a skill on demand.

Skills carry references/, scripts/, templates/, diagrams/, and examples/ directories. The graft response includes a manifest (paths + sizes), and the agent calls this tool when it decides a specific reference is worth pulling into context. Returns the file contents as text. If the path is wrong, returns the actual file list so the agent can pivot.

Inputs
  • skill_id (string): ID from a previous graft response
  • file_path (string): Path within the skill, e.g., references/oauth-flow-types.md
Returns

{ contents } on success, or { error, available_paths } if the path doesn't exist.

Example
windags_skill_reference({ skill_id: "stripe-webhooks", file_path: "references/idempotency-keys.md" })
Cost: ~10 ms, local read. Zero API keys.
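
Because a wrong path returns the real file list, a client can recover in one extra call. The sketch below shows one way to do that; `call_tool` is a hypothetical stand-in for whatever MCP client function you use, and the stub server and file contents are invented for illustration.

```python
import difflib

def load_reference(call_tool, skill_id, file_path):
    """Fetch a reference file, retrying once with the closest listed path."""
    result = call_tool("windags_skill_reference",
                       {"skill_id": skill_id, "file_path": file_path})
    if "contents" in result:
        return result["contents"]
    # Path was wrong: the response lists what actually exists, so pivot.
    candidates = difflib.get_close_matches(
        file_path, result.get("available_paths", []), n=1)
    if not candidates:
        return None
    retry = call_tool("windags_skill_reference",
                      {"skill_id": skill_id, "file_path": candidates[0]})
    return retry.get("contents")

# Stub standing in for a real MCP client (hypothetical, for illustration).
FILES = {"references/idempotency-keys.md": "# Idempotency keys"}

def fake_call_tool(name, args):
    path = args["file_path"]
    if path in FILES:
        return {"contents": FILES[path]}
    return {"error": "not found", "available_paths": sorted(FILES)}

# The requested path has a typo ("key" instead of "keys").
text = load_reference(fake_call_tool, "stripe-webhooks",
                      "references/idempotency-key.md")
```

Fuzzy matching is only one possible policy; an agent could just as well read available_paths and choose for itself.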

windags_history (Shipping)

Recent /next-move predictions for context.

Returns the last N /next-move predictions from your local history (~/.windags/history/). Useful for resuming work, comparing predictions, or letting an agent see what was recently planned without re-running.

Inputs
  • limit (number, default 10): How many recent predictions to return
Returns

Array of prediction summaries with timestamp, prompt, top skill, and acceptance state.

Cost: ~2 ms, local SQLite read. Zero API keys.

windags_skill_search_batch (Shipping)

N cascade searches in one round-trip — designed for DAG planners.

When a planner is materializing a multi-node DAG, calling skill_search per node costs N MCP round-trips. This tool takes an array of {query, limit} and runs them in parallel, returning results in the same order. Same six-stage cascade as skill_search per query — bundled+user catalogs, BM25 + MiniLM + RRF + cross-encoder + per-user attribution k-NN + cross-user global priors. Hard cap of 20 queries per call to bound payload size.

Inputs
  • queries ({ query: string; limit?: number }[]): Array of search queries (max 20)
Returns

{ batch_size, results: [{ query, stage, total_matches, skills }] } — preserves input order.

Example
windags_skill_search_batch({ queries: [{ query: 'caching', limit: 5 }, { query: 'docker', limit: 5 }] })
Cost: Same per-query cost as skill_search; N queries run in parallel after first model load.
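
A planner with more than 20 nodes has to split its queries across several calls. The sketch below chunks the list and concatenates the results so input order is preserved end to end; `call_tool` is again a hypothetical client function, and the stub only mimics the documented response shape.

```python
def chunk_queries(queries, max_per_call=20):
    """Split queries into consecutive batches of at most max_per_call."""
    return [queries[i:i + max_per_call]
            for i in range(0, len(queries), max_per_call)]

def search_all(call_tool, queries):
    """Run every query via batched calls, keeping the original order."""
    results = []
    for batch in chunk_queries(queries):
        response = call_tool("windags_skill_search_batch", {"queries": batch})
        # Each response preserves input order, so extending keeps it global.
        results.extend(response["results"])
    return results

# Stub client (hypothetical): echoes each query with an empty skills list.
def fake_call_tool(name, args):
    qs = args["queries"]
    assert len(qs) <= 20, "server-side cap of 20 queries per call"
    return {"batch_size": len(qs),
            "results": [{"query": q["query"], "skills": []} for q in qs]}

queries = [{"query": f"node-{i}", "limit": 5} for i in range(45)]
results = search_all(fake_call_tool, queries)
```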

windags_skill_graft_batch (Shipping)

N grafts in one round-trip for materializing a whole DAG up-front.

Same shape as skill_search_batch but for graft — array of {task, count} returns the full SKILL.md bodies + adjacencies + asset manifests for each task in parallel. Per-task primary count is capped at 3 to bound payload size. When you need a deeper reference for a specific node, call windags_skill_reference. Hard cap of 20 tasks per call.

Inputs
  • tasks ({ task: string; count?: number }[]): Array of graft requests (max 20, count capped at 3 each)
Returns

{ batch_size, results: [{ task, primary, adjacencies, cascade, reasoning }] }

Example
windags_skill_graft_batch({ tasks: [{ task: 'design REST API' }, { task: 'add caching layer' }] })
Cost: Same per-task cost as skill_graft; tasks run in parallel after first model load.

windags_node_requirements (Shipping)

Per-node specs a DAG planner needs — allowed-tools, pairs-with, and provider-native model IDs.

Given an array of skill IDs, returns each skill's allowed-tools (from frontmatter), pairs-with relationships, suggested model_tier (fast/balanced/powerful, a heuristic by category), and a list of provider-native model IDs that match the tier. Critically, the model IDs are real, provider-accepted strings (`gpt-5.4-nano`, `llama-3.1-8b-instant`, `claude-haiku-4-5-20251001`), not abstract Anthropic-only labels. This fixes the bug where DAGs emitted bare "haiku" / "sonnet" strings that were rejected with HTTP 400 by OpenAI, Groq, and other providers. The full all_tier_options block is included so a planner can pick a different provider per node if needed.

Inputs
  • skill_ids (string[]): Up to 50 skill IDs to look up
Returns

{ requirements: [{ skill_id, name, category, allowed_tools, pairs_with, recommended_model_tier, model_options }], tier_legend, all_tier_options }

Example
windags_node_requirements({ skill_ids: ['api-architect', 'data-pipeline-engineer'] })
Cost: ~5 ms after catalog warm-up. No external calls.

windags_validate_dag (Shipping)

Schema-check a candidate DAG before you save it.

A DAG planner emits JSON; this tool validates it against the PredictedDAG schema (waves, nodes, dependencies, premortem, confidence, problem_classification). Returns either { valid: true } plus a summary, or { valid: false, errors: ["path.to.field: message", ...] }. The schema is permissive on most fields (defaults are applied) but strict on shape, so a malformed wave or a missing required node field surfaces clearly. Use this during multi-step planning to catch problems before they reach the executor.

Inputs
  • dag (unknown): Any JSON-shaped object claimed to be a PredictedDAG
Returns

{ valid: true, title, topology, wave_count, node_count, confidence } | { valid: false, errors: string[] }

Example
windags_validate_dag({ dag: { title: 'my plan', waves: [{ nodes: [{ id: 'r', skill_id: 'research-craft', role_description: 'Research' }] }] } })
Cost: ~2 ms. No external calls. Pure zod validation against the canonical schema.
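
Since each error is a "path.to.field: message" string, a planner can group failures by field before deciding what to repair. The sketch below assumes only that documented string format; the helper name and the sample error messages are invented for illustration.

```python
def summarize_validation(result):
    """Turn a validate_dag response into (ok, {path: [messages]})."""
    if result.get("valid"):
        return True, {}
    by_path = {}
    for entry in result.get("errors", []):
        # Errors are documented as "path.to.field: message".
        path, sep, message = entry.partition(": ")
        if not sep:  # no separator: keep the whole string under a blank path
            path, message = "", entry
        by_path.setdefault(path, []).append(message)
    return False, by_path

# Example response; the error strings are invented for illustration.
response = {"valid": False,
            "errors": ["waves.0.nodes.0.skill_id: Required",
                       "confidence: Expected number, received string"]}
ok, problems = summarize_validation(response)
```

Grouping by path lets a planner regenerate only the offending node or wave instead of the whole DAG.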

windags_estimate_cost (Shipping)

Per-node + total cost estimate during planning, surfaced before the executor's runtime cost gate.

Given the planned nodes (id, skillIds, description, dependencies, optional model + referenceFileCount) plus a defaultModel tier, returns the predicted total USD, total tokens, and a per-node breakdown. The estimator is character-based (no tokenizer) and calibrated to Anthropic Claude tier pricing, so treat the result as a planning-time order-of-magnitude figure, not a billing prediction. The point is to surface a rough cost during predict, before the runtime cost gate fires at execution time. Reads bundled SKILL.md bodies from disk to estimate prompt size.

Inputs
  • nodes ({ id, skillIds[], description, dependencies[], model?, referenceFileCount? }[]): DAG nodes in dependency order
  • defaultModel ("haiku" | "sonnet" | "opus"): Tier used when a node doesn't specify one
Returns

{ total_cost_usd, total_tokens, node_count, default_model, per_node: [{ node_id, input_tokens, output_tokens, total_tokens, cost_usd }] }

Example
windags_estimate_cost({ nodes: [{ id: 'r', skillIds: ['research-craft'], description: 'Research', dependencies: [] }, { id: 'w', skillIds: ['technical-writer'], description: 'Write', dependencies: ['r'], model: 'sonnet' }], defaultModel: 'haiku' })
Cost: ~10 ms. Local skill body reads + arithmetic. No external calls.
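
To make "char-based estimator" concrete, here is a rough sketch of how such an estimate can be computed. Everything numeric in it is an assumption: the 4-characters-per-token heuristic, the price table, and the per-node character counts are placeholders, not WinDAGs' calibration or current provider pricing.

```python
# Placeholder USD prices per million tokens (input, output). These numbers
# are illustrative assumptions, not WinDAGs' calibration or current pricing.
PRICES = {"haiku": (1.0, 5.0), "sonnet": (3.0, 15.0), "opus": (15.0, 75.0)}
CHARS_PER_TOKEN = 4  # rough heuristic used in place of a tokenizer

def estimate_node(prompt_chars, expected_output_chars, model):
    """Estimate tokens and USD for one node from character counts."""
    input_tokens = -(-prompt_chars // CHARS_PER_TOKEN)  # ceiling division
    output_tokens = -(-expected_output_chars // CHARS_PER_TOKEN)
    in_price, out_price = PRICES[model]
    cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return {"input_tokens": input_tokens, "output_tokens": output_tokens,
            "total_tokens": input_tokens + output_tokens, "cost_usd": cost}

def estimate_dag(nodes, default_model):
    """Sum per-node estimates; nodes without a model use default_model."""
    per_node = [estimate_node(n["prompt_chars"], n["output_chars"],
                              n.get("model", default_model)) for n in nodes]
    return {"total_tokens": sum(p["total_tokens"] for p in per_node),
            "total_cost_usd": sum(p["cost_usd"] for p in per_node),
            "per_node": per_node}

# Hypothetical nodes: the second one overrides the default tier.
nodes = [{"prompt_chars": 40_000, "output_chars": 4_000},
         {"prompt_chars": 20_000, "output_chars": 8_000, "model": "sonnet"}]
estimate = estimate_dag(nodes, "haiku")
```

The real tool derives prompt size from the SKILL.md bodies on disk; this sketch takes character counts as direct inputs to stay self-contained.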

Preview

On the roadmap. We're publishing the design now so you can argue with the API surface before it lands. Open an issue at curiositech/windags-skills.

windags_run_skill_script (Preview)

Execute a script bundled with a skill (locally, sandboxed) and return its output.

Many skills carry executable scripts (compliance checks, code generators, lint rules). Today an agent has to inline the script into context to use it. This tool runs the script on the user's machine in a subprocess sandbox (no network unless --remote=allow, time + output caps), returning only stdout/stderr/exit. The agent gets the result without burning context on the script body. Local-only by design — the MCP server is what runs on the user's machine; api.windags.ai never sees the user's files.

Inputs
  • skill_id (string): Skill that owns the script
  • script_path (string): Relative path within the skill, e.g., scripts/lint.py
  • args (string[]): Arguments passed to the script
  • stdin (string?): Optional stdin payload
Returns

{ stdout, stderr, exit_code, duration_ms, truncated }

Cost: Local subprocess. Sandboxed: 30s wall clock, 1 MB output cap, no network by default.
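
The time and output caps can be illustrated with a plain subprocess wrapper. This is a sketch of the caps only, under the assumption that the result mirrors the documented return shape; it is not a sandbox, it does not restrict network or filesystem access, and it says nothing about how the eventual tool will be implemented.

```python
import subprocess
import sys
import time

def run_capped(argv, stdin_text=None, timeout_s=30, max_chars=1_000_000):
    """Run argv with a wall-clock timeout and an output cap.

    Illustrates the caps only: output is fully buffered before being
    truncated, and nothing here isolates the child process.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(argv, input=stdin_text, capture_output=True,
                              text=True, timeout=timeout_s)
        stdout, stderr, code = proc.stdout, proc.stderr, proc.returncode
    except subprocess.TimeoutExpired:
        # The child is killed on timeout; report no exit code.
        stdout, stderr, code = "", f"timed out after {timeout_s}s", None
    truncated = len(stdout) > max_chars or len(stderr) > max_chars
    return {"stdout": stdout[:max_chars], "stderr": stderr[:max_chars],
            "exit_code": code,
            "duration_ms": int((time.monotonic() - start) * 1000),
            "truncated": truncated}

# Run a tiny inline script with a deliberately small cap.
result = run_capped([sys.executable, "-c", "print('x' * 50)"], max_chars=10)
```

The cap here counts characters, which only approximates a byte limit such as 1 MB for non-ASCII output.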

windags_list_skill_assets (Preview)

Index the references/, scripts/, templates/, examples/ for a grafted skill.

Graft already returns a per-skill manifest, but this lets the agent re-list assets later in the conversation without re-grafting. Returns paths + sizes + a one-line peek so the agent can decide what's worth pulling via skill_reference. Also surfaces script entry points (scripts/lint.py, scripts/migrate.sh) so the agent knows what's runnable via run_skill_script.

Inputs
  • skill_id (string): Skill to index
  • kind ("references" | "scripts" | "templates" | "examples" | "all", default "all"): Filter by asset kind
Returns

Array of { path, kind, bytes, peek }.

Cost: ~5 ms, local fs walk. Zero API keys.
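
The described index amounts to a filesystem walk that records path, kind, size, and a first-line peek. The sketch below is one plausible shape for it, assuming the directory names from the description; the function name, the choice of the first line as the peek, and the throwaway files are assumptions for illustration.

```python
from pathlib import Path
import tempfile

KINDS = ("references", "scripts", "templates", "examples")

def list_assets(skill_dir, kind="all"):
    """Walk a skill directory and describe each asset file."""
    root = Path(skill_dir)
    wanted = KINDS if kind == "all" else (kind,)
    assets = []
    for sub in wanted:
        folder = root / sub
        if not folder.is_dir():
            continue
        for path in sorted(p for p in folder.rglob("*") if p.is_file()):
            with path.open(encoding="utf-8", errors="replace") as fh:
                peek = fh.readline().strip()  # first line as the one-line peek
            assets.append({"path": path.relative_to(root).as_posix(),
                           "kind": sub,
                           "bytes": path.stat().st_size,
                           "peek": peek})
    return assets

# Build a throwaway skill directory to exercise the walk.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "references").mkdir()
    (Path(tmp) / "scripts").mkdir()
    (Path(tmp) / "references" / "idempotency-keys.md").write_text(
        "# Idempotency keys\nDetails...\n", encoding="utf-8")
    (Path(tmp) / "scripts" / "lint.py").write_text(
        "print('ok')\n", encoding="utf-8")
    all_assets = list_assets(tmp)
    scripts_only = list_assets(tmp, kind="scripts")
```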

windags_pairs_with (Preview)

Graph traversal: given a skill, return the skills it pairs with (frontmatter pairs-with field).

Skills declare their natural partners in YAML frontmatter (`pairs-with: [{skill: data-pipeline-engineer, reason: ...}]`). When the agent realizes a grafted skill needs reinforcement, this tool returns the partners without a fresh search.

Inputs
  • skill_id (string): The skill to traverse from
  • depth (number, default 1): 1 = direct partners, 2 = partners-of-partners
Returns

Array of { id, name, reason, distance }.

Cost: ~3 ms, in-process graph walk.
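
The depth parameter suggests a breadth-first walk over pairs-with edges. The sketch below shows that traversal on an in-memory graph; the graph contents and reasons are hypothetical, and the name field from the documented return shape is omitted for brevity.

```python
from collections import deque

def pairs_with(graph, skill_id, depth=1):
    """Breadth-first walk over pairs-with edges up to `depth` hops."""
    seen = {skill_id}
    queue = deque([(skill_id, 0)])
    found = []
    while queue:
        current, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand past the requested depth
        for edge in graph.get(current, []):
            partner = edge["skill"]
            if partner in seen:
                continue  # skip the start skill and already-found partners
            seen.add(partner)
            found.append({"id": partner, "reason": edge["reason"],
                          "distance": dist + 1})
            queue.append((partner, dist + 1))
    return found

# Hypothetical frontmatter data; the reasons are invented for illustration.
GRAPH = {
    "api-architect": [
        {"skill": "data-pipeline-engineer", "reason": "feeds the API"}],
    "data-pipeline-engineer": [
        {"skill": "api-architect", "reason": "consumes pipeline output"},
        {"skill": "technical-writer", "reason": "documents the pipeline"}],
}
direct = pairs_with(GRAPH, "api-architect", depth=1)
two_hops = pairs_with(GRAPH, "api-architect", depth=2)
```

Tracking visited skills keeps mutual pairs-with declarations from producing duplicates or looping back to the starting skill.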