senior-ml-engineer

ML engineering skill for productionizing models, building MLOps pipelines, and integrating LLMs. Covers model deployment, feature stores, drift monitoring, RAG systems, and cost optimization. Use when the user asks about deploying ML models to production, setting up MLOps infrastructure (MLflow, Kubeflow, Kubernetes, Docker), monitoring model performance or drift, building RAG pipelines, or integrating LLM APIs with retry logic and cost controls. Focused on production and operational concerns rather than model research or initial training.
mdnaimul22/human-skills · ★ 2 · AI & Automation · score 78

Install: claude install-skill mdnaimul22/human-skills

# Senior ML Engineer

Production ML engineering patterns for model deployment, MLOps infrastructure, and LLM integration.

---

## Table of Contents

- [Model Deployment Workflow](#model-deployment-workflow)
- [MLOps Pipeline Setup](#mlops-pipeline-setup)
- [LLM Integration Workflow](#llm-integration-workflow)
- [RAG System Implementation](#rag-system-implementation)
- [Model Monitoring](#model-monitoring)
- [Reference Documentation](#reference-documentation)
- [Tools](#tools)

---

## Model Deployment Workflow

Deploy a trained model to production with monitoring:

1. Export the model to a standardized format (ONNX, TorchScript, SavedModel)
2. Package the model with its dependencies in a Docker container
3. Deploy to the staging environment
4. Run integration tests against staging
5. Deploy a canary (5% of traffic) to production
6. Monitor latency and error rates for 1 hour
7. Promote to full production if metrics pass
8. **Validation:** p95 latency < 100 ms, error rate < 0.1%

### Container Template

```dockerfile
FROM python:3.11-slim
# Run from /app so that "src.server:app" resolves as a module path
WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY model/ /app/model/
COPY src/ /app/src/

# python:3.11-slim does not ship curl, so probe /health with the standard library
HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')" || exit 1
EXPOSE 8080
CMD ["uvicorn", "src.server:app", "--host", "0.0.0.0", "--port", "8080"]
```

### Serving Options

| Option | Latency | Throughput | Use Case |
|--------|---------|------------|----------|
| FastAPI + Uvicorn | Low | Medium | REST APIs, small models |
| Triton Inference Ser