model-serving-api-builder

Deploy ML models as production APIs with vLLM, TGI, ONNX Runtime, batching, autoscaling, and GPU optimization. Activate on: model serving, deploy LLM, vLLM setup, inference API, GPU serving. NOT for: model training (ai-engineer), prompt engineering (prompt-engineer).
AI & Machine Learning
#model-serving #vllm #inference #gpu-optimization #api
Allowed Tools
Read, Write, Edit, Bash(python:*, pip:*, npm:*, npx:*)
Coming in Spring 2026 Beta
WinDAGs will match this skill automatically. Then ask:
"Use model-serving-api-builder to help me build..."