senior-ml-engineer

ML engineering skill for productionizing models, building MLOps pipelines, and integrating LLMs. Covers model deployment, feature stores, drift monitoring, RAG systems, and cost optimization. Use when the user asks about deploying ML models to production, setting up MLOps infrastructure (MLflow, Kubeflow, Kubernetes, Docker), monitoring model performance or drift, building RAG pipelines, or integrating LLM APIs with retry logic and cost controls. Focused on production and operational concerns rather than model research or initial training.

AI & Automation · 16,782 stars · 2,310 forks · Updated 3 days ago · MIT

Quality Score: 96/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%)
Documentation (15%): 100
Issue Health (10%)
License (10%): 100
Description (5%): 100

Skill Content

# Senior ML Engineer

Production ML engineering patterns for model deployment, MLOps infrastructure, and LLM integration.

---

## Table of Contents

- [Model Deployment Workflow](#model-deployment-workflow)
- [MLOps Pipeline Setup](#mlops-pipeline-setup)
- [LLM Integration Workflow](#llm-integration-workflow)
- [RAG System Implementation](#rag-system-implementation)
- [Model Monitoring](#model-monitoring)
- [Reference Documentation](#reference-documentation)
- [Tools](#tools)

---

## Model Deployment Workflow

Deploy a trained model to production with monitoring:

1. Export the model to a standardized format (ONNX, TorchScript, SavedModel)
2. Package the model with its dependencies in a Docker container
3. Deploy to the staging environment
4. Run integration tests against staging
5. Deploy a canary (5% of traffic) to production
6. Monitor latency and error rates for 1 hour
7. Promote to full production if the metrics pass
8. **Validation:** p95 latency < 100 ms, error rate < 0.1%

### Container Template

```dockerfile
FROM python:3.11-slim
# Run from /app so that "src.server:app" resolves against /app/src
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY model/ /app/model/
COPY src/ /app/src/
# python:3.11-slim does not ship curl, so probe /health with the standard library
HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')" || exit 1
EXPOSE 8080
CMD ["uvicorn", "src.server:app", "--host", "0.0.0.0", "--port", "8080"]
```

### Serving Options

| Option | Latency | Throughput | Use Case |
|--------|---------|------------|----------|
| FastAPI + Uvicorn | Low | Medium | REST APIs, small models |
| Triton Inference Ser...
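
The validation gate in step 8 of the Model Deployment Workflow (p95 latency < 100 ms, error rate < 0.1%) can be expressed as a small promotion check. The following is a minimal sketch, not the skill's own implementation: the names `CanaryStats`, `p95`, and `should_promote` are hypothetical, the percentile uses the nearest-rank method as an assumption, and it assumes one latency sample is recorded per canary request, including failed ones.

```python
from dataclasses import dataclass


@dataclass
class CanaryStats:
    """Hypothetical container for metrics collected during the canary window."""
    latencies_ms: list  # one latency sample per canary request
    error_count: int    # number of those requests that failed


def p95(values):
    """Return the 95th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("no latency samples")
    ordered = sorted(values)
    # Integer form of ceil(0.95 * n); avoids floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def should_promote(stats, max_p95_ms=100.0, max_error_rate=0.001):
    """Return True only if both thresholds from the workflow are met."""
    total = len(stats.latencies_ms)
    if total == 0:
        # No canary traffic observed: do not promote on missing evidence.
        return False
    error_rate = stats.error_count / total
    return p95(stats.latencies_ms) < max_p95_ms and error_rate < max_error_rate
```

A caller would collect samples for the monitoring window, build a `CanaryStats`, and promote only when `should_promote` returns `True`. Both comparisons are strict, matching the `<` in the validation step, so an error rate of exactly 0.1% does not pass.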

Details

Author: alirezarezvani
Repository: alirezarezvani/claude-skills
Created: 7 months ago
Last Updated: 3 days ago
Language: Python
License: MIT

Integrates with

OpenAI (AI) · Anthropic (AI) · Docker (Infrastructure) · Kubernetes (Infrastructure) · FastAPI (Backend) · REST API (API)
