model-serving-api-builder - Skill Dossier

Deploy ML models as production APIs with vLLM, TGI, ONNX Runtime, batching, autoscaling, and GPU optimization. Activate on: model serving, deploy LLM, vLLM setup, inference API, GPU serving. NOT for: model training (ai-engineer), prompt engineering (prompt-engineer).

AI & Machine Learning
#model-serving #vllm #inference #gpu-optimization #api

Allowed Tools

Read, Write, Edit, Bash(python:*, pip:*, npm:*, npx:*)

Coming in Spring 2026 Beta

WinDAGs will match this skill automatically. Then ask:

"Use model-serving-api-builder to help me build..."
"Use model-serving-api-builder to help me build a model-serving system"
"I need expert help deploying ML models as production APIs with vLLM, TGI..."
"Orchestrate model-serving-api-builder with ai-engineer: ai-engineer builds the model; this skill deploys it as a scalable API"