Kubernetes
InfrastructureCommonly used with
Skills using Kubernetes (1072)
devops-engineer
Builds infrastructure that scales without babysitting. Automates everything worth automating. Monitors before it breaks. Treats clicking in consoles as a production incident waiting to happen.
cloud-devops
Cloud infrastructure and DevOps workflow covering AWS, Azure, GCP, Kubernetes, Terraform, CI/CD, monitoring, and cloud-native development.
deep-research
Run autonomous research tasks that plan, search, read, and synthesize information into comprehensive reports.
docker-expert
Docker containerization expert with deep knowledge of multi-stage builds, image optimization, container security, Docker Compose orchestration, and production deployment patterns. Use PROACTIVELY for Dockerfile optimization, container issues, image size problems, security hardening, networking, and orchestration challenges.
github-actions-templates
Production-ready GitHub Actions workflow patterns for testing, building, and deploying applications.
gitops-workflow
Complete guide to implementing GitOps workflows with ArgoCD and Flux for automated Kubernetes deployments.
kubernetes-architect
Expert Kubernetes architect specializing in cloud-native infrastructure, advanced GitOps workflows (ArgoCD/Flux), and enterprise container orchestration.
modal-serverless-gpu
Serverless GPU cloud platform for running ML workloads. Use when you need on-demand GPU access without infrastructure management, deploying ML models as APIs, or running batch jobs with automatic scaling.
prometheus-configuration
Complete guide to Prometheus setup, metric collection, scrape configuration, and recording rules.
skypilot-multi-cloud-orchestration
Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage spot instances with auto-recovery, or optimize GPU costs across providers.
infrastructure
云原生基础设施。Kubernetes、Helm、Kustomize、Operator、CRD、GitOps、ArgoCD、Flux、IaC、Terraform、Pulumi、CDK。当用户提到 K8s、Helm、GitOps、IaC 时路由到此。
adobe-incident-runbook
Execute Adobe incident response procedures with triage, mitigation, and postmortem for Firefly Services, PDF Services, and I/O Events outages. Use when responding to Adobe-related incidents, investigating API failures, or running post-incident reviews. Trigger with phrases like "adobe incident", "adobe outage", "adobe down", "adobe on-call", "adobe emergency".
adobe-load-scale
Implement load testing, auto-scaling, and capacity planning for Adobe API integrations with k6 scripts targeting Firefly, PDF Services, and Photoshop APIs, plus Kubernetes HPA configuration. Trigger with phrases like "adobe load test", "adobe scale", "adobe performance test", "adobe capacity", "adobe benchmark".
anth-deploy-integration
Deploy Claude API integrations to production cloud environments. Use when deploying Claude-powered services to Docker, Cloud Run, ECS, or Kubernetes with proper secret management and health checks. Trigger with phrases like "deploy anthropic", "claude production deploy", "ship claude integration", "anthropic cloud deployment".
apex-takeover
System takeover — take ownership of an existing codebase or inherited system. Use when "we acquired this", "previous team left", "take over this system", "inherited this codebase".
apollo-deploy-integration
Deploy Apollo.io integrations to production. Use when deploying Apollo integrations, configuring production environments, or setting up deployment pipelines. Trigger with phrases like "deploy apollo", "apollo production deploy", "apollo deployment pipeline", "apollo to production".
apollo-multi-env-setup
Configure Apollo.io multi-environment setup. Use when setting up development, staging, and production environments, or managing multiple Apollo configurations. Trigger with phrases like "apollo environments", "apollo staging", "apollo dev prod", "apollo multi-tenant", "apollo env config".
castai-ci-integration
Integrate CAST AI policy validation and cost checks into CI/CD pipelines. Use when adding CAST AI savings verification to GitHub Actions, validating Terraform plans, or gating deployments on cost thresholds. Trigger with phrases like "cast ai CI", "cast ai github actions", "cast ai terraform CI", "cast ai pipeline".
castai-common-errors
Diagnose and fix CAST AI agent, API, and autoscaler errors. Use when the CAST AI agent is offline, nodes are not scaling, or API calls return errors. Trigger with phrases like "cast ai error", "cast ai not working", "cast ai agent offline", "cast ai debug", "fix cast ai".
castai-core-workflow-a
Configure CAST AI autoscaler policies and node templates for cost optimization. Use when enabling Phase 2 automation, setting spot instance policies, or configuring node downscaler and evictor settings. Trigger with phrases like "cast ai autoscaler", "cast ai policies", "cast ai spot instances", "cast ai node optimization".
castai-cost-tuning
Maximize Kubernetes cost savings with CAST AI spot strategies and right-sizing. Use when analyzing cloud spend, optimizing spot-to-on-demand ratios, or configuring CAST AI for maximum savings. Trigger with phrases like "cast ai cost", "cast ai savings", "cast ai spot strategy", "reduce kubernetes cost", "cast ai budget".
castai-debug-bundle
Collect CAST AI diagnostic bundle for support tickets and troubleshooting. Use when preparing a support case, collecting agent logs, or building a diagnostic snapshot of cluster state. Trigger with phrases like "cast ai debug", "cast ai support bundle", "collect cast ai diagnostics", "cast ai logs".
castai-deploy-integration
Deploy CAST AI across multi-cloud Kubernetes clusters with Terraform modules. Use when onboarding EKS, GKE, or AKS clusters to CAST AI using infrastructure-as-code patterns. Trigger with phrases like "deploy cast ai", "cast ai eks", "cast ai gke", "cast ai aks", "cast ai terraform module".
castai-hello-world
Query CAST AI cluster savings report and node inventory. Use when verifying CAST AI connectivity, viewing cluster cost savings, or listing managed nodes after onboarding. Trigger with phrases like "cast ai hello world", "cast ai savings", "cast ai cluster status", "test cast ai connection".
castai-install-auth
Install and configure CAST AI agent on a Kubernetes cluster with API key authentication. Use when onboarding a cluster to CAST AI, setting up Helm charts, or configuring Terraform provider authentication. Trigger with phrases like "install cast ai", "connect cluster to cast ai", "cast ai setup", "cast ai api key", "cast ai helm install".
castai-local-dev-loop
Set up a local Kubernetes development loop with CAST AI cost monitoring. Use when building cost-aware deployments, testing autoscaler policies, or iterating on Terraform CAST AI configurations locally. Trigger with phrases like "cast ai dev setup", "cast ai local testing", "develop with cast ai", "cast ai terraform dev".
castai-performance-tuning
Optimize CAST AI autoscaler performance, node provisioning speed, and API efficiency. Use when nodes take too long to provision, autoscaler is not reacting fast enough, or optimizing API call patterns for multi-cluster dashboards. Trigger with phrases like "cast ai performance", "cast ai slow", "cast ai node provisioning", "cast ai autoscaler speed".
castai-prod-checklist
Production readiness checklist for CAST AI cluster onboarding. Use when going live with CAST AI autoscaling, validating Phase 2 setup, or preparing for production cost optimization. Trigger with phrases like "cast ai production", "cast ai go-live", "cast ai checklist", "cast ai launch".
castai-reference-architecture
CAST AI reference architecture for multi-cluster Kubernetes cost optimization. Use when designing CAST AI deployment across environments, planning Terraform module structure, or establishing team standards. Trigger with phrases like "cast ai architecture", "cast ai best practices", "cast ai multi-cluster", "cast ai terraform structure".
castai-security-basics
Secure CAST AI API keys, RBAC configuration, and Kvisor security agent. Use when hardening CAST AI cluster access, configuring security scanning, or implementing API key rotation procedures. Trigger with phrases like "cast ai security", "cast ai api key rotation", "cast ai rbac", "cast ai kvisor", "secure cast ai".
castai-upgrade-migration
Upgrade CAST AI Helm charts, Terraform provider, and agent components. Use when upgrading CAST AI versions, checking for breaking changes, or migrating between CAST AI agent releases. Trigger with phrases like "upgrade cast ai", "update cast ai agent", "cast ai helm upgrade", "cast ai terraform upgrade".
castai-webhooks-events
Configure CAST AI webhook notifications for cluster events and audit logs. Use when setting up alerts for node scaling, cost threshold events, or integrating CAST AI events with Slack, PagerDuty, or custom endpoints. Trigger with phrases like "cast ai webhooks", "cast ai notifications", "cast ai slack alerts", "cast ai events".
cohere-incident-runbook
Execute Cohere incident response procedures with triage, mitigation, and postmortem. Use when responding to Cohere API outages, investigating errors, or running post-incident reviews for Cohere integration failures. Trigger with phrases like "cohere incident", "cohere outage", "cohere down", "cohere on-call", "cohere emergency", "cohere broken".
coreweave-ci-integration
Integrate CoreWeave deployments into CI/CD pipelines with GitHub Actions. Use when automating container builds, deploying inference services from CI, or validating GPU manifests in pull requests. Trigger with phrases like "coreweave CI", "coreweave github actions", "coreweave pipeline", "automate coreweave deploy".
coreweave-core-workflow-b
Run distributed GPU training jobs on CoreWeave with multi-node PyTorch. Use when training models across multiple GPUs, setting up distributed training, or running fine-tuning jobs on CoreWeave H100 clusters. Trigger with phrases like "coreweave training", "coreweave multi-gpu", "distributed training coreweave", "fine-tune on coreweave".
coreweave-data-handling
Handle training data and model artifacts on CoreWeave persistent storage. Use when managing large datasets, configuring storage classes, or implementing data pipelines for GPU workloads. Trigger with phrases like "coreweave data", "coreweave storage", "coreweave pvc", "coreweave dataset management".
coreweave-debug-bundle
Collect CoreWeave cluster diagnostics for support tickets. Use when preparing a support case, collecting GPU node status, or documenting pod failures. Trigger with phrases like "coreweave debug", "coreweave support", "coreweave diagnostics", "collect coreweave logs".
coreweave-deploy-integration
Deploy inference services on CoreWeave with Helm charts and Kustomize. Use when deploying multi-model inference, managing GPU deployments at scale, or templating CoreWeave manifests. Trigger with phrases like "deploy coreweave", "coreweave helm", "coreweave kustomize", "coreweave deployment patterns".
coreweave-enterprise-rbac
Configure RBAC and namespace isolation for CoreWeave multi-team GPU access. Use when managing team permissions, isolating GPU quotas, or implementing namespace-level access control. Trigger with phrases like "coreweave rbac", "coreweave permissions", "coreweave namespace isolation", "coreweave team access".
coreweave-hello-world
Deploy a GPU workload on CoreWeave with kubectl. Use when running your first GPU job, testing inference, or verifying CoreWeave cluster access. Trigger with phrases like "coreweave hello world", "coreweave first deploy", "coreweave gpu test", "run on coreweave".
coreweave-install-auth
Configure CoreWeave Kubernetes Service (CKS) access with kubeconfig and API tokens. Use when setting up kubectl access to CoreWeave, configuring CKS clusters, or authenticating with CoreWeave cloud services. Trigger with phrases like "install coreweave", "setup coreweave", "coreweave kubeconfig", "coreweave auth", "connect to coreweave".
coreweave-local-dev-loop
Set up local development workflow for CoreWeave GPU deployments. Use when building containers locally, testing YAML manifests, or iterating on model serving configurations before deploying. Trigger with phrases like "coreweave dev setup", "coreweave local testing", "develop for coreweave", "coreweave container build".
coreweave-multi-env-setup
Configure CoreWeave across development, staging, and production environments. Use when setting up multi-environment GPU infrastructure, separating namespaces, or managing per-environment GPU quotas. Trigger with phrases like "coreweave environments", "coreweave staging", "coreweave multi-env", "coreweave namespace setup".
coreweave-observability
Set up GPU monitoring and observability for CoreWeave workloads. Use when implementing GPU metrics dashboards, configuring alerts, or tracking inference latency and throughput. Trigger with phrases like "coreweave monitoring", "coreweave observability", "coreweave gpu metrics", "coreweave grafana".
coreweave-sdk-patterns
Production-ready patterns for CoreWeave GPU workload management with kubectl and Python. Use when building inference clients, managing GPU deployments programmatically, or creating reusable CoreWeave deployment templates. Trigger with phrases like "coreweave patterns", "coreweave client", "coreweave Python", "coreweave deployment template".
coreweave-security-basics
Secure CoreWeave deployments with RBAC, network policies, and secrets management. Use when hardening GPU workloads, managing model access, or configuring namespace isolation. Trigger with phrases like "coreweave security", "coreweave rbac", "secure coreweave", "coreweave secrets".
coreweave-upgrade-migration
Upgrade CoreWeave deployments and migrate between GPU types. Use when migrating from A100 to H100, upgrading CUDA versions, or updating inference server versions. Trigger with phrases like "upgrade coreweave", "coreweave gpu migration", "coreweave cuda upgrade", "migrate coreweave".
coreweave-webhooks-events
Monitor CoreWeave cluster events and GPU workload status. Use when tracking pod lifecycle events, monitoring GPU utilization, or alerting on inference service health changes. Trigger with phrases like "coreweave events", "coreweave monitoring", "coreweave pod alerts", "coreweave gpu monitoring".
customerio-deploy-pipeline
Deploy Customer.io integrations to production cloud platforms. Use when deploying to Cloud Run, Vercel, AWS Lambda, or Kubernetes with proper secrets management and health checks. Trigger: "deploy customer.io", "customer.io cloud run", "customer.io kubernetes", "customer.io lambda", "customer.io vercel".
customerio-load-scale
Implement Customer.io load testing and horizontal scaling. Use when preparing for high traffic, running load tests, or designing queue-based architectures for scale. Trigger: "customer.io load test", "customer.io scale", "customer.io high volume", "customer.io k6", "customer.io performance test".
customerio-multi-env-setup
Configure Customer.io multi-environment setup with workspace isolation. Use when setting up dev/staging/prod workspaces, environment-aware clients, or Kubernetes config overlays. Trigger: "customer.io environments", "customer.io staging", "customer.io dev prod", "customer.io workspace isolation".
deepgram-deploy-integration
Deploy Deepgram integrations to production environments. Use when deploying to cloud platforms, configuring containers, or setting up Deepgram in Docker/Kubernetes/serverless. Trigger: "deploy deepgram", "deepgram docker", "deepgram kubernetes", "deepgram production deploy", "deepgram cloud run", "deepgram lambda".
deepgram-multi-env-setup
Configure Deepgram multi-environment setup for dev, staging, and production. Use when setting up environment-specific configurations, managing multiple Deepgram projects, or implementing environment isolation. Trigger: "deepgram environments", "deepgram staging", "deepgram dev prod", "multi-environment deepgram", "deepgram config management".
documenso-deploy-integration
Deploy Documenso integrations across different platforms and environments. Use when deploying to cloud platforms, containerizing applications, or setting up infrastructure for Documenso integrations. Trigger with phrases like "deploy documenso", "documenso docker", "documenso kubernetes", "documenso cloud deployment".
evernote-deploy-integration
Deploy Evernote integrations to production environments. Use when deploying to cloud platforms, configuring production, or setting up deployment pipelines. Trigger with phrases like "deploy evernote", "evernote production deploy", "release evernote", "evernote cloud deployment".
glean-core-workflow-a
Execute Glean primary workflow: search, chat, and AI-powered answers across enterprise data. Use when building search integrations, implementing Glean chat, or creating AI assistants. Trigger: "glean search API", "glean chat", "glean AI answers", "enterprise search".
maintainx-deploy-integration
Deploy MaintainX integrations to production environments. Use when deploying to cloud platforms, configuring production environments, or automating deployment pipelines for MaintainX integrations. Trigger with phrases like "deploy maintainx", "maintainx deployment", "maintainx cloud deploy", "maintainx kubernetes", "maintainx docker".
onenote-deploy-integration
Deploy OneNote integrations with MSAL token persistence, health checks, and container best practices. Use when containerizing OneNote services, configuring health endpoints, or managing token cache in production. Trigger with "onenote deploy", "onenote docker", "onenote container", "onenote health check".
oraclecloud-deploy-integration
Deploy containers to OCI using OKE (Kubernetes) or Container Instances. Use when deploying applications to Oracle Cloud, pushing images to OCIR, or configuring OKE clusters. Trigger with "oraclecloud deploy", "oci kubernetes", "oke deploy", "oci container instances", "oracle cloud deploy integration".
research-to-deploy
Researches infrastructure best practices and generates deployment-ready configurations, Terraform modules, Dockerfiles, and CI/CD pipelines. Use when the user needs to deploy services, set up infrastructure, or create cloud configurations based on current best practices. Trigger with phrases like "research and deploy", "set up Cloud Run", "create Terraform for", "deploy this to AWS", or "generate infrastructure configs".
moai-workflow-jit-docs
Enhanced Just-In-Time document loading system that discovers, loads, and caches relevant documentation based on user intent and project context. Use when users need specific documentation on demand.
aegisops-ai
Autonomous DevSecOps & FinOps Guardrails. Orchestrates Gemini 3 Flash to audit Linux Kernel patches, Terraform cost drifts, and K8s compliance.
azure-identity-py
Azure Identity SDK for Python authentication. Use for DefaultAzureCredential, managed identity, service principals, and token caching.
azure-identity-rust
Azure Identity SDK for Rust authentication. Use for DeveloperToolsCredential, ManagedIdentityCredential, ClientSecretCredential, and token-based authentication.
azure-identity-ts
Authenticate to Azure services with various credential types.
bdistill-knowledge-extraction
Extract structured domain knowledge from AI models in-session or from local open-source models via Ollama. No API key needed.
c4-container
Expert C4 Container-level documentation specialist.
cloud-devops
Cloud infrastructure and DevOps workflow covering AWS, Azure, GCP, Kubernetes, Terraform, CI/CD, monitoring, and cloud-native development.
deep-research
Run autonomous research tasks that plan, search, read, and synthesize information into comprehensive reports.
deployment-procedures
Production deployment principles and decision-making. Safe deployment workflows, rollback strategies, and verification. Teaches thinking, not scripts.
distributed-tracing
Implement distributed tracing with Jaeger and Tempo for request flow visibility across microservices.
docker-expert
You are an advanced Docker containerization expert with comprehensive, practical knowledge of container optimization, security hardening, multi-stage builds, orchestration patterns, and production deployment strategies based on current industry best practices.
faf-expert
Advanced .faf (Foundational AI-context Format) specialist. IANA-registered format, MCP server config, championship scoring, bi-directional sync.
github-actions-templates
Production-ready GitHub Actions workflow patterns for testing, building, and deploying applications.
gitlab-ci-patterns
Comprehensive GitLab CI/CD pipeline patterns for automated testing, building, and deployment.
gitops-workflow
Complete guide to implementing GitOps workflows with ArgoCD and Flux for automated Kubernetes deployments.
hybrid-cloud-architect
Expert hybrid cloud architect specializing in complex multi-cloud solutions across AWS/Azure/GCP and private clouds (OpenStack/VMware).
jq
Expert jq usage for JSON querying, filtering, transformation, and pipeline integration. Practical patterns for real shell workflows.
k8s-security-policies
Comprehensive guide for implementing NetworkPolicy, PodSecurityPolicy, RBAC, and Pod Security Standards in Kubernetes.
kubernetes-architect
Expert Kubernetes architect specializing in cloud-native infrastructure, advanced GitOps workflows (ArgoCD/Flux), and enterprise container orchestration.
kubernetes-deployment
Kubernetes deployment workflow for container orchestration, Helm charts, service mesh, and production-ready K8s configurations.
linkerd-patterns
Production patterns for Linkerd service mesh - the lightweight, security-first service mesh for Kubernetes.
mlops-engineer
Build comprehensive ML pipelines, experiment tracking, and model registries with MLflow, Kubeflow, and modern MLOps tools.
multi-cloud-architecture
Decision framework and patterns for architecting applications across AWS, Azure, and GCP.
prometheus-configuration
Complete guide to Prometheus setup, metric collection, scrape configuration, and recording rules.
service-mesh-expert
Expert service mesh architect specializing in Istio, Linkerd, and cloud-native networking patterns. Masters traffic management, security policies, observability integration, and multi-cluster mesh con
golang-cli
Golang CLI application development. Use when building, modifying, or reviewing a Go CLI tool — especially for command structure, flag handling, configuration layering, version embedding, exit codes, I/O patterns, signal handling, shell completion, argument validation, and CLI unit testing. Also triggers when code uses cobra, viper, or urfave/cli.
drawio-skill
Use when the user requests diagrams, flowcharts, architecture diagrams, ER diagrams, UML / sequence / class diagrams, network topology, ML/DL model figures (Transformer/CNN/LSTM), mind maps, or any visualization. Also use proactively when explaining systems with 3+ components, complex data flows, or relationships that benefit from visual representation. Best suited when the diagram needs custom styling, rich shape vocabulary, swimlanes, or exportable images (PNG/SVG/PDF/JPG). Generates .drawio XML and exports locally via the native draw.io desktop CLI.
skypilot-multi-cloud-orchestration
Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage spot instances with auto-recovery, or optimize GPU costs across providers.
helm-chart-scaffolding
Comprehensive guidance for creating, organizing, and managing Helm charts for packaging and deploying Kubernetes applications.
skypilot-multi-cloud-orchestration
Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage spot instances with auto-recovery, or optimize GPU costs across providers.
alertmanager-rules-config
Manage alertmanager rules config operations. Auto-activating skill for DevOps Advanced. Triggers on: alertmanager rules config, alertmanager rules config Part of the DevOps Advanced skill category. Use when configuring systems or services. Trigger with phrases like "alertmanager rules config", "alertmanager config", "alertmanager".
ansible-playbook-generator
Generate ansible playbook generator operations. Auto-activating skill for DevOps Advanced. Triggers on: ansible playbook generator, ansible playbook generator Part of the DevOps Advanced skill category. Use when working with ansible playbook generator functionality. Trigger with phrases like "ansible playbook generator", "ansible generator", "ansible".
ansible-role-creator
Create ansible role creator operations. Auto-activating skill for DevOps Advanced. Triggers on: ansible role creator, ansible role creator Part of the DevOps Advanced skill category. Use when working with ansible role creator functionality. Trigger with phrases like "ansible role creator", "ansible creator", "ansible".
argocd-app-deployer
Deploy argocd app deployer operations. Auto-activating skill for DevOps Advanced. Triggers on: argocd app deployer, argocd app deployer Part of the DevOps Advanced skill category. Use when deploying applications or services. Trigger with phrases like "argocd app deployer", "argocd deployer", "deploy argocd app er".
cert-manager-setup
Manage cert manager setup operations. Auto-activating skill for DevOps Advanced. Triggers on: cert manager setup, cert manager setup Part of the DevOps Advanced skill category. Use when working with cert manager setup functionality. Trigger with phrases like "cert manager setup", "cert setup", "cert".
consul-service-discovery
Manage consul service discovery operations. Auto-activating skill for DevOps Advanced. Triggers on: consul service discovery, consul service discovery Part of the DevOps Advanced skill category. Use when working with consul service discovery functionality. Trigger with phrases like "consul service discovery", "consul discovery", "consul".
coreweave-common-errors
Diagnose and fix CoreWeave GPU scheduling, pod, and networking errors. Use when pods are stuck Pending, GPUs are not allocated, or experiencing CUDA and NCCL errors. Trigger with phrases like "coreweave error", "coreweave pod pending", "coreweave gpu not found", "coreweave debug", "fix coreweave".
coreweave-incident-runbook
Incident response runbook for CoreWeave GPU workload failures. Use when inference services are down, GPUs are unavailable, or responding to production incidents on CoreWeave. Trigger with phrases like "coreweave incident", "coreweave outage", "coreweave runbook", "coreweave service down".
coreweave-prod-checklist
Production readiness checklist for CoreWeave GPU workloads. Use when launching inference services, preparing GPU training for production, or validating deployment configurations. Trigger with phrases like "coreweave production", "coreweave go-live", "coreweave checklist", "coreweave launch".
coreweave-rate-limits
Handle CoreWeave API and GPU quota limits. Use when hitting quota limits, managing GPU resource allocation, or implementing request queuing for inference endpoints. Trigger with phrases like "coreweave quota", "coreweave limits", "coreweave gpu allocation", "coreweave throttle".
elasticsearch-index-manager
Manage elasticsearch index manager operations. Auto-activating skill for DevOps Advanced. Triggers on: elasticsearch index manager, elasticsearch index manager Part of the DevOps Advanced skill category. Use when working with elasticsearch index manager functionality. Trigger with phrases like "elasticsearch index manager", "elasticsearch manager", "elasticsearch".
envoy-proxy-config
Configure envoy proxy config operations. Auto-activating skill for DevOps Advanced. Triggers on: envoy proxy config, envoy proxy config Part of the DevOps Advanced skill category. Use when configuring systems or services. Trigger with phrases like "envoy proxy config", "envoy config", "envoy".
fluentd-config-generator
Generate fluentd config generator operations. Auto-activating skill for DevOps Advanced. Triggers on: fluentd config generator, fluentd config generator Part of the DevOps Advanced skill category. Use when configuring systems or services. Trigger with phrases like "fluentd config generator", "fluentd generator", "fluentd".
flux-gitops-setup
Configure flux gitops setup operations. Auto-activating skill for DevOps Advanced. Triggers on: flux gitops setup, flux gitops setup Part of the DevOps Advanced skill category. Use when working with flux gitops setup functionality. Trigger with phrases like "flux gitops setup", "flux setup", "flux".
grafana-dashboard-creator
Create grafana dashboard creator operations. Auto-activating skill for DevOps Advanced. Triggers on: grafana dashboard creator, grafana dashboard creator Part of the DevOps Advanced skill category. Use when working with grafana dashboard creator functionality. Trigger with phrases like "grafana dashboard creator", "grafana creator", "grafana".
helm-chart-generator
Generate helm chart generator operations. Auto-activating skill for DevOps Advanced. Triggers on: helm chart generator, helm chart generator Part of the DevOps Advanced skill category. Use when working with helm chart generator functionality. Trigger with phrases like "helm chart generator", "helm generator", "helm".
helm-values-manager
Manage helm values manager operations. Auto-activating skill for DevOps Advanced. Triggers on: helm values manager, helm values manager Part of the DevOps Advanced skill category. Use when working with helm values manager functionality. Trigger with phrases like "helm values manager", "helm manager", "helm".
istio-service-mesh-config
Configure istio service mesh config operations. Auto-activating skill for DevOps Advanced. Triggers on: istio service mesh config, istio service mesh config Part of the DevOps Advanced skill category. Use when configuring systems or services. Trigger with phrases like "istio service mesh config", "istio config", "istio".
juicebox-core-workflow-a
Execute Juicebox people search with power filters and ATS export. Trigger: "find candidates", "people search", "juicebox search".
kubernetes-configmap-handler
Configure kubernetes configmap handler operations. Auto-activating skill for DevOps Advanced. Triggers on: kubernetes configmap handler, kubernetes configmap handler Part of the DevOps Advanced skill category. Use when configuring systems or services. Trigger with phrases like "kubernetes configmap handler", "kubernetes handler", "kubernetes".
kubernetes-deployment-creator
Create kubernetes deployment creator operations. Auto-activating skill for DevOps Advanced. Triggers on: kubernetes deployment creator, kubernetes deployment creator Part of the DevOps Advanced skill category. Use when deploying applications or services. Trigger with phrases like "kubernetes deployment creator", "kubernetes creator", "deploy kubernetes ment creator".
kubernetes-ingress-config
Configure kubernetes ingress config operations. Auto-activating skill for DevOps Advanced. Triggers on: kubernetes ingress config, kubernetes ingress config Part of the DevOps Advanced skill category. Use when configuring systems or services. Trigger with phrases like "kubernetes ingress config", "kubernetes config", "kubernetes".
kubernetes-rbac-analyzer
Analyze kubernetes rbac analyzer operations. Auto-activating skill for Security Advanced. Triggers on: kubernetes rbac analyzer, kubernetes rbac analyzer Part of the Security Advanced skill category. Use when analyzing or auditing kubernetes rbac analyzer. Trigger with phrases like "kubernetes rbac analyzer", "kubernetes analyzer", "analyze kubernetes rbac r".
kubernetes-secrets-manager
Manage kubernetes secrets manager operations. Auto-activating skill for DevOps Advanced. Triggers on: kubernetes secrets manager, kubernetes secrets manager Part of the DevOps Advanced skill category. Use when working with kubernetes secrets manager functionality. Trigger with phrases like "kubernetes secrets manager", "kubernetes manager", "kubernetes".
kubernetes-service-manager
Manage kubernetes service manager operations. Auto-activating skill for DevOps Advanced. Triggers on: kubernetes service manager, kubernetes service manager Part of the DevOps Advanced skill category. Use when working with kubernetes service manager functionality. Trigger with phrases like "kubernetes service manager", "kubernetes manager", "kubernetes".
nginx-ingress-manager
Manage nginx ingress manager operations. Auto-activating skill for DevOps Advanced. Triggers on: nginx ingress manager, nginx ingress manager Part of the DevOps Advanced skill category. Use when working with nginx ingress manager functionality. Trigger with phrases like "nginx ingress manager", "nginx manager", "nginx".
prometheus-config-generator
Generate prometheus config generator operations. Auto-activating skill for DevOps Advanced. Triggers on: prometheus config generator, prometheus config generator Part of the DevOps Advanced skill category. Use when configuring systems or services. Trigger with phrases like "prometheus config generator", "prometheus generator", "prometheus".
terraform-module-creator
Create terraform module creator operations. Auto-activating skill for DevOps Advanced. Triggers on: terraform module creator, terraform module creator Part of the DevOps Advanced skill category. Use when working with terraform module creator functionality. Trigger with phrases like "terraform module creator", "terraform creator", "terraform".
terraform-provider-config
Configure terraform provider config operations. Auto-activating skill for DevOps Advanced. Triggers on: terraform provider config, terraform provider config Part of the DevOps Advanced skill category. Use when configuring systems or services. Trigger with phrases like "terraform provider config", "terraform config", "terraform".
terraform-state-manager
Manage terraform state manager operations. Auto-activating skill for DevOps Advanced. Triggers on: terraform state manager, terraform state manager Part of the DevOps Advanced skill category. Use when working with terraform state manager functionality. Trigger with phrases like "terraform state manager", "terraform manager", "terraform".
vault-secrets-integrator
Configure vault secrets integrator operations. Auto-activating skill for DevOps Advanced. Triggers on: vault secrets integrator, vault secrets integrator Part of the DevOps Advanced skill category. Use when working with vault secrets integrator functionality. Trigger with phrases like "vault secrets integrator", "vault integrator", "vault".
ln-629-lifecycle-auditor
Checks bootstrap initialization, graceful shutdown, resource cleanup, signal handling, liveness/readiness probes. Use when auditing app lifecycle.
analyzing-linux-elf-malware
Analyzes malicious Linux ELF (Executable and Linkable Format) binaries including botnets, cryptominers, ransomware, and rootkits targeting Linux servers, containers, and cloud infrastructure. Covers static analysis, dynamic tracing, and reverse engineering of x86_64 and ARM ELF samples. Activates for requests involving Linux malware analysis, ELF binary investigation, Linux server compromise assessment, or container malware analysis.
auditing-gcp-iam-permissions
Auditing Google Cloud Platform IAM permissions to identify overly permissive bindings, primitive role usage, service account key proliferation, and cross-project access risks using gcloud CLI, Policy Analyzer, and IAM Recommender.
auditing-kubernetes-cluster-rbac
Auditing Kubernetes cluster RBAC configurations to identify overly permissive roles, wildcard permissions, dangerous ClusterRoleBindings, service account abuse, and privilege escalation paths using kubectl, rbac-tool, KubiScan, and Kubeaudit.
detecting-aws-guardduty-findings-automation
Automate AWS GuardDuty threat detection findings processing using EventBridge and Lambda to enable real-time incident response, automatic quarantine of compromised resources, and security notification workflows.
detecting-container-drift-at-runtime
Detect unauthorized modifications to running containers by monitoring for binary execution drift, file system changes, and configuration deviations from the original container image.
detecting-container-escape-with-falco-rules
Detect container escape attempts in real-time using Falco runtime security rules that monitor syscalls, file access, and privilege escalation.
detecting-privilege-escalation-in-kubernetes-pods
Detect and prevent privilege escalation in Kubernetes pods by monitoring security contexts, capabilities, and syscall patterns with Falco and OPA policies.
implementing-aqua-security-for-container-scanning
Deploy Aqua Security's Trivy scanner to detect vulnerabilities, misconfigurations, secrets, and license issues in container images across CI/CD pipelines and registries.
implementing-container-image-minimal-base-with-distroless
Reduce container attack surface by building application images on Google distroless base images that contain only the application runtime with no shell, package manager, or unnecessary OS utilities.
implementing-ebpf-security-monitoring
Implements eBPF-based security monitoring using Cilium Tetragon for real-time process execution tracking, network connection observability, file access auditing, and runtime enforcement. Covers TracingPolicy CRD authoring with kprobe/tracepoint hooks, in-kernel filtering via matchArgs/matchBinaries selectors, JSON event export, and integration with SIEM pipelines. Use when building kernel-level runtime security observability for Linux hosts or Kubernetes clusters.
implementing-gcp-binary-authorization
Implement GCP Binary Authorization to enforce deploy-time security controls that ensure only trusted, attested container images are deployed to Google Kubernetes Engine and Cloud Run.
implementing-iec-62443-security-zones
This skill covers designing and implementing security zones and conduits for industrial automation and control systems (IACS) per IEC 62443-3-2. It addresses zone partitioning based on risk assessment, assigning Security Level targets (SL-T), designing conduit security controls, implementing microsegmentation with industrial firewalls, and validating zone architecture through traffic analysis and penetration testing against the Purdue Reference Model.
implementing-image-provenance-verification-with-cosign
Sign and verify container image provenance using Sigstore Cosign with keyless OIDC-based signing, attestations, and Kubernetes admission enforcement.
implementing-infrastructure-as-code-security-scanning
This skill covers implementing automated security scanning for Infrastructure as Code (IaC) templates using tools like Checkov, tfsec, and KICS. It addresses detecting misconfigurations in Terraform, CloudFormation, Kubernetes manifests, and Helm charts before deployment, establishing policy-based governance, and integrating IaC scanning into CI/CD pipelines to prevent insecure cloud resource provisioning.
implementing-kubernetes-network-policy-with-calico
Implement Kubernetes network segmentation using Calico NetworkPolicy and GlobalNetworkPolicy for zero-trust pod-to-pod communication.
implementing-kubernetes-pod-security-standards
Pod Security Standards (PSS) define three levels of security policies -- Privileged, Baseline, and Restricted -- enforced by the Pod Security Admission (PSA) controller built into Kubernetes 1.25+. PS
implementing-microsegmentation-with-guardicore
Implementing microsegmentation using Akamai Guardicore Segmentation to map application dependencies, create granular network policies, visualize east-west traffic flows, and enforce least-privilege communication between workloads across data centers and cloud.
implementing-network-policies-for-kubernetes
Kubernetes NetworkPolicies provide pod-level network segmentation by defining ingress and egress rules that control traffic flow between pods, namespaces, and external endpoints. Combined with CNI plu
implementing-opa-gatekeeper-for-policy-enforcement
Enforce Kubernetes admission policies using OPA Gatekeeper with ConstraintTemplates, Rego rules, and the Gatekeeper policy library.
implementing-pod-security-admission-controller
Implement Kubernetes Pod Security Admission to enforce baseline and restricted security profiles at namespace level using built-in admission controller.
implementing-policy-as-code-with-open-policy-agent
This skill covers implementing Open Policy Agent (OPA) and Gatekeeper for policy-as-code enforcement in Kubernetes and CI/CD pipelines. It addresses writing Rego policies, deploying OPA Gatekeeper as a Kubernetes admission controller, testing policies in development, and integrating policy evaluation into deployment pipelines.
implementing-rbac-hardening-for-kubernetes
Harden Kubernetes Role-Based Access Control by implementing least-privilege policies, auditing role bindings, eliminating cluster-admin sprawl, and integrating external identity providers.
implementing-runtime-security-with-tetragon
Implement eBPF-based runtime security observability and enforcement in Kubernetes clusters using Cilium Tetragon for kernel-level threat detection and policy enforcement.
implementing-secrets-management-with-vault
This skill covers deploying HashiCorp Vault for centralized secrets management across cloud environments, including dynamic secret generation for databases and cloud providers, transit encryption, PKI certificate management, and Kubernetes integration. It addresses eliminating hardcoded credentials from application code and CI/CD pipelines by implementing short-lived, automatically rotated secrets.
implementing-sigstore-for-software-signing
Implements Sigstore-based software signing and verification using Cosign keyless signing, Rekor transparency log verification, and Fulcio certificate authority integration to establish cryptographic provenance for container images, binaries, and software artifacts. The practitioner configures OIDC-based identity binding, verifies signing events against the Rekor transparency log, and integrates signing workflows into CI/CD pipelines. Activates for requests involving software supply chain signing, keyless container signing, Sigstore deployment, or artifact provenance verification.
implementing-supply-chain-security-with-in-toto
Implement software supply chain integrity verification for container builds using the in-toto framework to create cryptographically signed attestations across CI/CD pipeline steps.
performing-container-image-hardening
This skill covers hardening container images by minimizing attack surface, removing unnecessary packages, implementing multi-stage builds, configuring non-root users, and applying CIS Docker Benchmark recommendations to produce secure production-ready images.
performing-container-security-scanning-with-trivy
Scan container images, filesystems, and Kubernetes manifests for vulnerabilities, misconfigurations, exposed secrets, and license compliance issues using Aqua Security Trivy with SBOM generation and CI/CD integration.
performing-kubernetes-cis-benchmark-with-kube-bench
Audit Kubernetes cluster security posture against CIS benchmarks using kube-bench with automated checks for control plane, worker nodes, and RBAC.
performing-kubernetes-etcd-security-assessment
Assess the security posture of Kubernetes etcd clusters by evaluating encryption at rest, TLS configuration, access controls, backup encryption, and network isolation.
performing-kubernetes-penetration-testing
Kubernetes penetration testing systematically evaluates cluster security by simulating attacker techniques against the API server, kubelet, etcd, pods, RBAC, network policies, and secrets. Using tools
scanning-containers-with-trivy-in-cicd
This skill covers integrating Aqua Security's Trivy scanner into CI/CD pipelines for comprehensive container image vulnerability detection. It addresses scanning Docker images for OS package and application dependency CVEs, detecting misconfigurations in Dockerfiles, scanning filesystem and git repositories, and establishing severity-based quality gates that block deployment of vulnerable images.
scanning-docker-images-with-trivy
Trivy is a comprehensive open-source vulnerability scanner by Aqua Security that detects vulnerabilities in OS packages, language-specific dependencies, misconfigurations, secrets, and license violati
scanning-kubernetes-manifests-with-kubesec
Perform security risk analysis on Kubernetes resource manifests using Kubesec to identify misconfigurations, privilege escalation risks, and deviations from security best practices.
securing-container-registry-images
Securing container registry images by implementing vulnerability scanning with Trivy and Grype, enforcing image signing with Cosign and Sigstore, configuring registry access controls, and building CI/CD pipelines that prevent deploying unscanned or unsigned images.
securing-container-registry-with-harbor
Harbor is an open-source container registry that provides security features including vulnerability scanning (integrated Trivy), image signing (Notary/Cosign), RBAC, content trust policies, replicatio
securing-helm-chart-deployments
Secure Helm chart deployments by validating chart integrity, scanning templates for misconfigurations, and enforcing security contexts in Kubernetes releases.
securing-kubernetes-on-cloud
This skill covers hardening managed Kubernetes clusters on EKS, AKS, and GKE by implementing Pod Security Standards, network policies, workload identity, RBAC scoping, image admission controls, and runtime security monitoring. It addresses cloud-specific security features including IRSA for EKS, Workload Identity for GKE, and Managed Identities for AKS.
securing-serverless-functions
This skill covers security hardening for serverless compute platforms including AWS Lambda, Azure Functions, and Google Cloud Functions. It addresses least privilege IAM roles, dependency vulnerability scanning, secrets management integration, input validation, function URL authentication, and runtime monitoring to protect against injection attacks, credential theft, and supply chain compromises.
helm-chart-scaffolding
Comprehensive guidance for creating, organizing, and managing Helm charts for packaging and deploying Kubernetes applications.
k8s-manifest-generator
Step-by-step guidance for creating production-ready Kubernetes manifests including Deployments, Services, ConfigMaps, Secrets, and PersistentVolumeClaims.
aws-cloud
AWS-specific infrastructure and services expertise for cloud operations and architecture
azure-cloud
Azure-specific infrastructure and services expertise for cloud operations and architecture
bentoml-model-packager
BentoML skill for model packaging, serving, and containerization.
container-security-scanner
Container image and Kubernetes security scanning for CVEs, misconfigurations, and compliance
gcp-cloud
GCP-specific infrastructure and services expertise for cloud operations and architecture
iac-security-scanner
Infrastructure as Code security scanning and policy enforcement for Terraform, CloudFormation, Kubernetes, and Pulumi
k8s-validator
Validate Kubernetes manifests for security, best practices, and resource limits
kubeflow-pipeline-executor
Kubeflow Pipelines skill for ML workflow orchestration, component management, and Kubernetes-native ML.
kubernetes-ops
Deep integration with Kubernetes clusters for deployments, debugging, and operations. Execute kubectl commands, analyze pod logs/events/resources, generate and validate manifests, and debug cluster issues.
network-simulation
Skill for network condition simulation, emulation, and chaos engineering
schema-comparator
Compare database schemas between source and target environments for migration planning
seldon-model-deployer
Seldon Core deployment skill for model serving, A/B testing, and canary deployments on Kubernetes.
helm-chart-builder
Helm chart development agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw — chart scaffolding, values design, template patterns, dependency management, security hardening, and chart testing. Use when: user wants to create or improve Helm charts, design values.yaml files, implement template helpers, audit chart security (RBAC, network policies, pod security), manage subcharts, or run helm lint/test.
senior-ml-engineer
ML engineering skill for productionizing models, building MLOps pipelines, and integrating LLMs. Covers model deployment, feature stores, drift monitoring, RAG systems, and cost optimization. Use when the user asks about deploying ML models to production, setting up MLOps infrastructure (MLflow, Kubeflow, Kubernetes, Docker), monitoring model performance or drift, building RAG pipelines, or integrating LLM APIs with retry logic and cost controls. Focused on production and operational concerns rather than model research or initial training.
deployment-procedures
Production deployment principles and decision-making. Safe deployment workflows, rollback strategies, and verification. Teaches thinking, not scripts.
azure-identity-rust
Azure Identity SDK for Rust authentication. Use for DeveloperToolsCredential, ManagedIdentityCredential, ClientSecretCredential, and token-based authentication. Triggers: "azure-identity", "DeveloperToolsCredential", "authentication rust", "managed identity rust", "credential rust".
azure-identity-ts
Authenticate to Azure services using Azure Identity library for JavaScript (@azure/identity). Use when configuring authentication with DefaultAzureCredential, managed identity, service principals, or interactive browser login.
azure-diagnostics
Debug Azure production issues on Azure using AppLens, Azure Monitor, resource health, and safe triage. WHEN: debug production issues, troubleshoot container apps, troubleshoot functions, troubleshoot AKS, kubectl cannot connect, kube-system/CoreDNS failures, pod pending, crashloop, node not ready, upgrade failures, analyze logs, KQL, insights, image pull failures, cold start issues, health probe failures, resource health, root cause of errors.
azure-kubernetes
Plan, create, and configure production-ready Azure Kubernetes Service (AKS) clusters. Covers Day-0 checklist, SKU selection (Automatic vs Standard), networking options (private API server, Azure CNI Overlay, egress configuration), security, and operations (autoscaling, upgrade strategy, cost analysis). WHEN: create AKS environment, provision AKS environment, enable AKS observability, design AKS networking, choose AKS SKU, secure AKS.
debug-buttercup
All pods run in namespace crs. Use when pods in the crs namespace are in CrashLoopBackOff, OOMKilled, or restarting, multiple services restart simultaneously (cascade failure), or redis is unresponsive or showing AOF warnings.
hyperpod-issue-report
Generate comprehensive issue reports from HyperPod clusters (EKS and Slurm) by collecting diagnostic logs and configurations for troubleshooting and AWS Support cases. Use when users need to collect diagnostics from HyperPod cluster nodes, generate issue reports for AWS Support, investigate node failures or performance problems, document cluster state, or create diagnostic snapshots. Triggers on requests involving issue reports, diagnostic collection, support case preparation, or cluster troubleshooting that requires gathering logs and system information from multiple nodes.
azure-diagnostics
Debug Azure production issues on Azure using AppLens, Azure Monitor, resource health, and safe triage. WHEN: debug production issues, troubleshoot container apps, troubleshoot functions, troubleshoot AKS, kubectl cannot connect, kube-system/CoreDNS failures, pod pending, crashloop, node not ready, upgrade failures, analyze logs, KQL, insights, image pull failures, cold start issues, health probe failures, resource health, root cause of errors.
azure-kubernetes
Plan, create, and configure production-ready Azure Kubernetes Service (AKS) clusters. Covers Day-0 checklist, SKU selection (Automatic vs Standard), networking options (private API server, Azure CNI Overlay, egress configuration), security, and operations (autoscaling, upgrade strategy, cost analysis). WHEN: create AKS environment, provision AKS environment, enable AKS observability, design AKS networking, choose AKS SKU, secure AKS.
analyzing-kubernetes-audit-logs
Parses Kubernetes API server audit logs (JSON lines) to detect exec-into-pod, secret access, RBAC modifications, privileged pod creation, and anonymous API access. Builds threat detection rules from audit event patterns. Use when investigating Kubernetes cluster compromise or building k8s-specific SIEM detection rules.
implementing-container-network-policies-with-calico
Enforce Kubernetes network segmentation using Calico CNI network policies and global network policies to control pod-to-pod traffic, restrict egress, and implement zero-trust microsegmentation.
performing-cloud-native-forensics-with-falco
Uses Falco YAML rules for runtime threat detection in containers and Kubernetes, monitoring syscalls for shell spawns, file tampering, network anomalies, and privilege escalation. Manages Falco rules via the Falco gRPC API and parses Falco alert output. Use when building container runtime security or investigating k8s cluster compromises.
performing-container-escape-detection
Detects container escape attempts by analyzing namespace configurations, privileged container checks, dangerous capability assignments, and host path mounts using the kubernetes Python client. Identifies CVE-2022-0492 style escapes via cgroup abuse. Use when auditing container security posture or investigating escape attempts.
chaos-engineer
Designs chaos experiments, creates failure injection frameworks, and facilitates game day exercises for distributed systems — producing runbooks, experiment manifests, rollback procedures, and post-mortem templates. Use when designing chaos experiments, implementing failure injection frameworks, or conducting game day exercises. Invoke for chaos experiments, resilience testing, blast radius control, game days, antifragile systems, fault injection, Chaos Monkey, Litmus Chaos.
devops-engineer
Creates Dockerfiles, configures CI/CD pipelines, writes Kubernetes manifests, and generates Terraform/Pulumi infrastructure templates. Handles deployment automation, GitOps configuration, incident response runbooks, and internal developer platform tooling. Use when setting up CI/CD pipelines, containerizing applications, managing infrastructure as code, deploying to Kubernetes clusters, configuring cloud platforms, automating releases, or responding to production incidents. Invoke for pipelines, Docker, Kubernetes, GitOps, Terraform, GitHub Actions, on-call, or platform engineering.
kubernetes-specialist
Use when deploying or managing Kubernetes workloads. Invoke to create deployment manifests, configure pod security policies, set up service accounts, define network isolation rules, debug pod crashes, analyze resource limits, inspect container logs, or right-size workloads. Use for Helm charts, RBAC policies, NetworkPolicies, storage configuration, performance optimization, GitOps pipelines, and multi-cluster management.
modal-serverless-gpu
Serverless GPU cloud platform for running ML workloads. Use when you need on-demand GPU access without infrastructure management, deploying ML models as APIs, or running batch jobs with automatic scaling.
skypilot-multi-cloud-orchestration
Multi-cloud orchestration for ML workloads with automatic cost optimization. Use when you need to run training or batch jobs across multiple clouds, leverage spot instances with auto-recovery, or optimize GPU costs across providers.
health-check-endpoint
Implement health check and readiness endpoints for SDK consumers
helm-charts
Expert Helm chart development and management skill for Kubernetes package management
qdrant-integration
Qdrant vector database with filtering, payloads, and quantization support
yaml
YAML configuration for CI/CD, Docker Compose, and Kubernetes.
safety-guard
Use this skill to prevent destructive operations when working on production systems or running agents autonomously.
eks
AWS EKS Kubernetes management for clusters, node groups, and workloads. Use when creating clusters, configuring IRSA, managing node groups, deploying applications, or integrating with AWS services.
ln-774-healthcheck-setup
Configures health check endpoints for Kubernetes readiness/liveness/startup probes. Use when deploying to Kubernetes.
ln-783-container-launcher
Builds and launches Docker containers with health verification. Use when validating that containerized services start correctly.
analyzing-projects
Analyzes codebases to understand structure, tech stack, patterns, and conventions. Use when onboarding to a new project, exploring unfamiliar code, or when asked "how does this work?" or "what's the architecture?"
modal-serverless-gpu
Serverless GPU cloud platform for running ML workloads. Use when you need on-demand GPU access without infrastructure management, deploying ML models as APIs, or running batch jobs with automatic scaling.
secrets-vault-manager
Use when the user asks to set up secret management infrastructure, integrate HashiCorp Vault, configure cloud secret stores (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager), implement secret rotation, or audit secret access patterns.
devops-iac-engineer
Implements infrastructure as code using Terraform, Kubernetes, and cloud platforms. Designs scalable architectures, CI/CD pipelines, and observability solutions. Provides security-first DevOps practices and site reliability engineering guidance.
senior-computer-vision
World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.
senior-data-engineer
World-class data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, or implementing data governance.
senior-data-scientist
World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics. Expertise in Python (NumPy, Pandas, Scikit-learn), R, SQL, statistical methods, A/B testing, time series, and business intelligence. Includes experiment design, feature engineering, model evaluation, and stakeholder communication. Use when designing experiments, building predictive models, performing causal analysis, or driving data-driven decisions.
senior-ml-engineer
World-class ML engineering skill for productionizing ML models, MLOps, and building scalable ML systems. Expertise in PyTorch, TensorFlow, model deployment, feature stores, model monitoring, and ML infrastructure. Includes LLM integration, fine-tuning, RAG systems, and agentic AI. Use when deploying ML models, building ML platforms, implementing MLOps, or integrating LLMs into production systems.
senior-prompt-engineer
World-class prompt engineering skill for LLM optimization, prompt patterns, structured outputs, and AI product development. Expertise in Claude, GPT-4, prompt design patterns, few-shot learning, chain-of-thought, and AI evaluation. Includes RAG optimization, agent design, and LLM system architecture. Use when building AI products, optimizing LLM performance, designing agentic systems, or implementing advanced prompting techniques.
dotnet-trace-collect
Guide developers through capturing diagnostic artifacts to diagnose production .NET performance issues. Use when the user needs help choosing diagnostic tools, collecting performance data, or understanding tool trade-offs across different environments (Windows/Linux, .NET Framework/modern .NET, container/non-container).
dump-collect
Configure and collect crash dumps for modern .NET applications. USE FOR: enabling automatic crash dumps for CoreCLR or NativeAOT, capturing dumps from running .NET processes, setting up dump collection in Docker or Kubernetes, using dotnet-dump collect or createdump. DO NOT USE FOR: analyzing or debugging dumps, post-mortem investigation with lldb/windbg/dotnet-dump analyze, profiling or tracing, or for .NET Framework processes.
aspire
Aspire skill covering the Aspire CLI, AppHost orchestration, service discovery, integrations, MCP server, VS Code extension, Dev Containers, GitHub Codespaces, templates, dashboard, and deployment. Use when the user asks to create, run, debug, configure, deploy, or troubleshoot an Aspire distributed application.
devops-rollout-plan
Generate comprehensive rollout plans with preflight checks, step-by-step deployment, verification signals, rollback procedures, and communication plans for infrastructure and application changes
building-gitops-workflows
This skill enables Claude to construct GitOps workflows using ArgoCD and Flux. It is designed to generate production-ready configurations, implement best practices, and ensure a security-first approach for Kubernetes deployments. Use this skill when the user explicitly requests "GitOps workflow", "ArgoCD", "Flux", or asks for help with setting up a continuous delivery pipeline using GitOps principles. The skill will generate the necessary configuration files and setup code based on the user's specific requirements and infrastructure.
configuring-auto-scaling-policies
This skill configures auto-scaling policies for applications and infrastructure. It generates production-ready configurations based on user requirements, implementing best practices for scalability and security. Use this skill when the user requests help with auto-scaling setup, high availability, or dynamic resource allocation, specifically mentioning terms like "auto-scaling," "HPA," "scaling policies," or "dynamic scaling." This skill provides complete configuration code for various platforms.
configuring-service-meshes
This skill configures service meshes like Istio and Linkerd for microservices. It generates production-ready configurations, implements best practices, and ensures a security-first approach. Use this skill when the user asks to "configure service mesh", "setup Istio", "setup Linkerd", or requests assistance with "service mesh configuration" for their microservices architecture. The configurations will be tailored to the specified infrastructure requirements.
creating-kubernetes-deployments
This skill enables Claude to generate Kubernetes deployment manifests, services, and related configurations following best practices. It should be used when the user asks to create a new Kubernetes deployment, service, ingress, or other related resources. Claude will generate YAML files for Deployments, Services, ConfigMaps, Secrets, Ingress, and Horizontal Pod Autoscalers. Use this skill when the user mentions "Kubernetes deployment", "K8s deployment", "create service", "define ingress", or asks for a manifest for any K8s resource.
deploying-monitoring-stacks
This skill deploys monitoring stacks, including Prometheus, Grafana, and Datadog. It is used when the user needs to set up or configure monitoring infrastructure for applications or systems. The skill generates production-ready configurations, implements best practices, and supports multi-platform deployments. Use this when the user explicitly requests to deploy a monitoring stack, or mentions Prometheus, Grafana, or Datadog in the context of infrastructure setup.
generating-helm-charts
This skill enables Claude to generate Helm charts for Kubernetes applications. It should be used when the user requests the creation of a new Helm chart, the modification of an existing chart, or assistance with packaging and deploying Kubernetes applications using Helm. The skill is triggered by requests that mention "Helm chart", "Kubernetes deployment", "package application for Kubernetes", or similar phrases related to Helm and Kubernetes. It helps streamline the process of creating and managing Kubernetes deployments.
integrating-secrets-managers
This skill enables Claude to seamlessly integrate with various secrets managers like HashiCorp Vault and AWS Secrets Manager. It generates configurations and setup code, ensuring best practices for secure credential management. Use this skill when you need to manage sensitive information, generate production-ready configurations, or implement a security-first approach for your DevOps infrastructure. Trigger terms include "integrate secrets manager", "configure Vault", "AWS Secrets Manager setup", "manage credentials securely", or requests for secure configuration generation.
managing-network-policies
This skill enables Claude to manage Kubernetes network policies and firewall rules. It allows Claude to generate configurations and setup code based on specific requirements and infrastructure. Use this skill when the user requests to create, modify, or analyze network policies for Kubernetes, or when the user mentions "network-policy", "firewall rules", or "Kubernetes security". This skill is useful for implementing best practices and production-ready configurations for network security in a Kubernetes environment.
orchestrating-deployment-pipelines
This skill orchestrates complex, multi-stage deployment pipelines. It generates production-ready configurations and setup code based on user-specified requirements and infrastructure. Use this skill when the user asks to create a deployment pipeline, generate CI/CD configurations, or needs help with automating software deployments. Trigger terms include "deployment pipeline", "CI/CD", "automate deployment", "pipeline configuration", and "deployment orchestration".
setting-up-log-aggregation
This skill sets up log aggregation solutions using ELK (Elasticsearch, Logstash, Kibana), Loki, or Splunk. It generates production-ready configurations and setup code based on specific requirements and infrastructure. Use this skill when the user requests to set up logging infrastructure, configure log aggregation, deploy ELK stack, deploy Loki, deploy Splunk, or needs help with observability. It is triggered by terms like "log aggregation," "ELK setup," "Loki configuration," "Splunk deployment," or similar requests for centralized logging solutions.
yaml-master
PROACTIVE YAML INTELLIGENCE: Automatically activates when working with YAML files, configuration management, CI/CD pipelines, Kubernetes manifests, Docker Compose, or any YAML-based workflows. Provides intelligent validation, schema inference, linting, format conversion (JSON/TOML/XML), and structural transformations with deep understanding of YAML specifications and common anti-patterns.
entra-agent-id
Microsoft Entra Agent ID (preview) for creating OAuth2-capable AI agent identities via Microsoft Graph beta API. Covers Agent Identity Blueprints, BlueprintPrincipals, Agent Identities, required permissions, sponsors, and Workload Identity Federation. Includes Microsoft Entra SDK for AgentID (containerized sidecar) for polyglot agent authentication (Docker/Kubernetes), 3P agent integration, autonomous and interactive agent patterns. Triggers: "agent identity", "agent id", "Agent Identity Blueprint", "BlueprintPrincipal", "entra agent", "agent identity provisioning", "Graph agent identity", "entra sidecar", "agent id sidecar", "auth sidecar", "3P agent", "third-party agent identity", "polyglot agent auth".
add-rtk
Install rtk token-compression proxy into agent containers. Routes Bash tool calls through rtk for 60–90% token savings on dev commands (git, cargo, pytest, docker, kubectl, etc.).
docker-devops
Docker/K8s: Dockerfile, multi-stage, compose, manifests, Helm. Triggers: Docker, Dockerfile, container, Kubernetes, k8s, compose, Helm, pod.
deployment-pipeline-design
Design multi-stage CI/CD pipelines with approval gates, security checks, and deployment orchestration. Use this skill when designing zero-downtime deployment pipelines, implementing canary rollout strategies, setting up multi-environment promotion workflows, or debugging failed deployment gates in CI/CD.
distributed-tracing
Implement distributed tracing with Jaeger and Tempo to track requests across microservices and identify performance bottlenecks. Use when debugging microservices, analyzing request flows, or implementing observability for distributed systems.
github-actions-templates
Create production-ready GitHub Actions workflows for automated testing, building, and deploying applications. Use when setting up CI/CD with GitHub Actions, automating development workflows, or creating reusable workflow templates.
gitlab-ci-patterns
Build GitLab CI/CD pipelines with multi-stage workflows, caching, and distributed runners for scalable automation. Use when implementing GitLab CI/CD, optimizing pipeline performance, or setting up automated testing and deployment.
gitops-workflow
Implement GitOps workflows with ArgoCD and Flux for automated, declarative Kubernetes deployments with continuous reconciliation. Use when implementing GitOps practices, automating Kubernetes deployments, or setting up declarative infrastructure management.
helm-chart-scaffolding
Design, organize, and manage Helm charts for templating and packaging Kubernetes applications with reusable configurations. Use when creating Helm charts, packaging Kubernetes applications, or implementing templated deployments.
k8s-manifest-generator
Create production-ready Kubernetes manifests for Deployments, Services, ConfigMaps, and Secrets following best practices and security standards. Use when generating Kubernetes YAML manifests, creating K8s resources, or implementing production-grade Kubernetes configurations.
k8s-security-policies
Implement Kubernetes security policies including NetworkPolicy, PodSecurityPolicy, and RBAC for production-grade security. Use when securing Kubernetes clusters, implementing network isolation, or enforcing pod security standards.
linkerd-patterns
Implement Linkerd service mesh patterns for lightweight, security-focused service mesh deployments. Use when setting up Linkerd, configuring traffic policies, or implementing zero-trust networking with minimal overhead.
multi-cloud-architecture
Design multi-cloud architectures using a decision framework to select and integrate services across AWS, Azure, GCP, and OCI. Use when building multi-cloud systems, avoiding vendor lock-in, or leveraging best-of-breed services from multiple providers.
prometheus-configuration
Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.
backend-engineering
Use this skill when designing backend systems, databases, APIs, or services. Triggers on schema design, database migrations, indexing strategies, distributed systems architecture, microservices, caching, message queues, observability setup, logging, metrics, tracing, SLO/SLI definition, performance optimization, query tuning, security hardening, authentication, authorization, API design (REST, GraphQL, gRPC), rate limiting, pagination, and failure handling patterns. Acts as a senior backend engineering advisor for mid-level engineers leveling up.
ci-cd-pipelines
Use this skill when setting up CI/CD pipelines, configuring GitHub Actions, implementing deployment strategies, or automating build/test/deploy workflows. Triggers on GitHub Actions, CI pipeline, CD pipeline, deployment automation, blue-green deployment, canary release, rolling update, build matrix, artifacts, and any task requiring continuous integration or delivery setup.
cloud-aws
Use this skill when architecting on AWS, selecting services, optimizing costs, or following the Well-Architected Framework. Triggers on EC2, S3, Lambda, RDS, DynamoDB, CloudFront, IAM, VPC, ECS, EKS, SQS, SNS, API Gateway, and any task requiring AWS architecture decisions, service selection, or cost management.
azure-active-directory-b2c
Expert knowledge for Azure Active Directory B2C development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing custom policies/user flows, MFA & IdP sign-in, app/API registration, CI/CD deployments, or Sentinel logging, and other Azure Active Directory B2C related development tasks. Not for Azure Information Protection (use azure-information-protection), Azure Role-based access control (use azure-rbac), Azure Security (use azure-security), Azure Portal (use azure-portal).
azure-advisor
Expert knowledge for Azure Advisor development including best practices, decision making, limits & quotas, security, configuration, and integrations & coding patterns. Use when configuring Advisor alerts, workbooks, RBAC access, bulk fixes, or Resource Graph/Kusto queries, and other Azure Advisor related development tasks. Not for Azure Cost Management (use azure-cost-management), Azure Monitor (use azure-monitor), Azure Policy (use azure-policy), Azure Service Health (use azure-service-health).
azure-ai-vision
Expert knowledge for Azure AI Vision development including decision making, limits & quotas, configuration, integrations & coding patterns, and deployment. Use when using Image Analysis, Read OCR containers, smart-crop thumbnails, background removal, or video frame analysis, and other Azure AI Vision related development tasks. Not for Azure AI Custom Vision (use azure-custom-vision), Azure AI Video Indexer (use azure-video-indexer), Azure AI Document Intelligence (use azure-document-intelligence), Azure AI Immersive Reader (use azure-immersive-reader).
azure-aks-edge-essentials
Expert knowledge for Azure Kubernetes Service Edge Essentials development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing AKS Edge/Arc clusters, Arc connectivity, IoT/OPC/ONVIF workloads, TPM/Key Vault, or Azure Local, and other Azure Kubernetes Service Edge Essentials related development tasks. Not for Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure IoT Edge (use azure-iot-edge), Azure Stack Edge (use azure-stack-edge), Azure Container Apps (use azure-container-apps).
azure-anomaly-detector
Expert knowledge for Azure AI Anomaly Detector development including troubleshooting, best practices, architecture & design patterns, limits & quotas, configuration, and deployment. Use when using univariate/multivariate APIs, Docker/IoT Edge containers, predictive maintenance flows, or regional limits, and other Azure AI Anomaly Detector related development tasks. Not for Azure AI Metrics Advisor (use azure-metrics-advisor), Azure Monitor (use azure-monitor), Azure Machine Learning (use azure-machine-learning).
azure-api-center
Expert knowledge for Azure Api Center development including best practices, security, configuration, integrations & coding patterns, and deployment. Use when automating API linting/registration, customizing the portal, syncing with API gateways, or enforcing design-time governance, and other Azure Api Center related development tasks. Not for Azure API Management (use azure-api-management), Azure App Configuration (use azure-app-configuration), Azure Service Connector (use azure-service-connector).
azure-app-configuration
Expert knowledge for Azure App Configuration development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using feature flags, dynamic refresh, snapshots, Key Vault integration, or App Configuration REST APIs, and other Azure App Configuration related development tasks. Not for Azure App Service (use azure-app-service), Azure Key Vault (use azure-key-vault), Azure Automation (use azure-automation), Azure Policy (use azure-policy).
azure-app-service
Expert knowledge for Azure App Service development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring App Service plans/ASE, VNet integration, managed identity/Key Vault, deployment slots, or TLS/certs, and other Azure App Service related development tasks. Not for Azure Functions (use azure-functions), Azure Spring Apps (use azure-spring-apps), Azure Container Apps (use azure-container-apps), Azure Static Web Apps (use azure-static-web-apps).
azure-app-testing
Expert knowledge for Azure App Testing development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Azure Load Testing with VNets/private endpoints, JMeter/Locust/Playwright, CI/CD pipelines, or Playwright Workspaces, and other Azure App Testing related development tasks. Not for Azure Test Plans (use azure-test-plans), Playwright Workspaces (use azure-playwright-workspaces), Azure DevOps (use azure-devops), Azure App Service (use azure-app-service).
azure-application-gateway
Expert knowledge for Azure Application Gateway development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring listeners/routing, WAF/TLS, AGIC/AKS integration, autoscale/zone redundancy, or v1→v2 migration, and other Azure Application Gateway related development tasks. Not for Azure Front Door (use azure-front-door), Azure Load Balancer (use azure-load-balancer), Azure Traffic Manager (use azure-traffic-manager), Azure Firewall (use azure-firewall).
azure-arc
Expert knowledge for Azure Arc development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing Arc-enabled Kubernetes, servers, SQL MI, Edge RAG, resource bridge, or SCVMM/vSphere workloads, and other Azure Arc related development tasks. Not for Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Virtual Machines (use azure-virtual-machines), Azure Stack Edge (use azure-stack-edge), Azure VMware Solution (use azure-vmware-solution).
azure-architecture
Expert guidance for designing Azure solutions using Azure Architecture. Covers reference architectures, solution ideas, design patterns, technology choices, architecture styles, best practices, anti-patterns, example workloads, and migration guides. Use when designing AKS, data/AI pipelines, hybrid/Arc, DR/multiregion, or AWS/GCP-to-Azure migration solutions, and other Azure Architecture related development tasks.
azure-artifact-signing
Expert knowledge for Azure Artifact Signing development including best practices, decision making, security, configuration, and integrations & coding patterns. Use when managing signing cert lifecycle, RBAC roles, DGSSv2 migration, diagnostic logs, or CI/CD signing workflows, and other Azure Artifact Signing related development tasks.
azure-artifacts
Expert knowledge for Azure Artifacts development including best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing feeds, upstream sources, package publishing/restore, GitHub Actions CI/CD, or npm/NuGet config, and other Azure Artifacts related development tasks. Not for Azure DevOps (use azure-devops), Azure Pipelines (use azure-pipelines), Azure Repos (use azure-repos), Azure Boards (use azure-boards).
azure-attestation
Expert knowledge for Azure Attestation development including troubleshooting, best practices, security, configuration, and deployment. Use when validating attestation tokens, authoring SGX/TPM policies, configuring policy signers, or securing endpoints, and other Azure Attestation related development tasks. Not for Azure Confidential Computing (use azure-confidential-computing), Azure Virtual Enclaves (use azure-virtual-enclaves), Azure Key Vault (use azure-key-vault), Azure Security (use azure-security).
azure-automation
Expert knowledge for Azure Automation development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when automating runbooks/DSC, Hybrid Runbook Workers, Change Tracking, Azure/AWS/365 integrations, or migrations, and other Azure Automation related development tasks. Not for Azure Functions (use azure-functions), Azure Logic Apps (use azure-logic-apps), Azure Scheduler (use azure-scheduler), Azure DevOps (use azure-devops).
azure-backup
Expert knowledge for Azure Backup development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when protecting Azure VMs, SQL/SAP HANA, Files/Blobs, MABS/DPM workloads, or scripting via CLI/PowerShell/REST, and other Azure Backup related development tasks. Not for Azure Site Recovery (use azure-site-recovery).
azure-baremetal-infrastructure
Expert knowledge for Azure Baremetal Infrastructure development including decision making, and architecture & design patterns. Use when choosing NC2 regions/SKUs, planning BareMetal topologies, or integrating NC2 with Azure networking/services, and other Azure Baremetal Infrastructure related development tasks. Not for Azure Large Instances (use azure-large-instances), Azure Virtual Machines (use azure-virtual-machines), Azure Virtual Machine Scale Sets (use azure-vm-scalesets), SAP HANA on Azure Large Instances (use azure-sap).
azure-bastion
Expert knowledge for Azure Bastion development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when configuring Bastion for AKS private clusters, VM scale sets, Entra ID auth, hub/spoke VNets, or IP-based cross-VNet access, and other Azure Bastion related development tasks. Not for Azure Virtual Network (use azure-virtual-network), Azure Virtual Machines (use azure-virtual-machines), Azure VPN Gateway (use azure-vpn-gateway), Azure Firewall (use azure-firewall).
azure-batch
Expert knowledge for Azure Batch development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring Batch pools/tasks, autoscale, containerized jobs, SDK/CLI workflows, or render/MPI workloads, and other Azure Batch related development tasks. Not for Azure HDInsight (use azure-hdinsight), Azure Databricks (use azure-databricks), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Virtual Machines (use azure-virtual-machines).
azure-blob-storage
Expert knowledge for Azure Blob Storage development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Data Lake, NFS/SFTP/BlobFuse, static website hosting, encryption options, or SDK-based blob operations, and other Azure Blob Storage related development tasks. Not for Azure Files (use azure-files), Azure Table Storage (use azure-table-storage), Azure Queue Storage (use azure-queue-storage), Azure NetApp Files (use azure-netapp-files).
azure-blueprints
Expert knowledge for Azure Blueprints development including troubleshooting, architecture & design patterns, security, configuration, and integrations & coding patterns. Use when defining Azure Blueprints, mapping built-in compliance sets, automating via CLI/PowerShell/REST, or fixing assignment errors, and other Azure Blueprints related development tasks. Not for Azure Policy (use azure-policy), Azure Resource Manager (use azure-resource-manager), Azure Managed Applications (use azure-managed-applications), Azure Deployment Environments (use azure-deployment-environments).
azure-boards
Expert knowledge for Azure Boards development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, and integrations & coding patterns. Use when managing work items, boards/backlogs, WIQL queries, Excel/Office sync, or GitHub/Teams integrations, and other Azure Boards related development tasks. Not for Azure DevOps (use azure-devops), Azure Test Plans (use azure-test-plans), Azure Pipelines (use azure-pipelines), Azure Repos (use azure-repos).
azure-bot-service
Expert knowledge for Azure AI Bot Service development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building Azure AI bots with Direct Line/Web Chat, Teams/SMS channels, OAuth/SSO, skills, or proactive messages, and other Azure AI Bot Service related development tasks. Not for Azure Health Bot (use azure-health-bot), Azure Functions (use azure-functions), Azure App Service (use azure-app-service), Azure Logic Apps (use azure-logic-apps).
azure-cache-redis
Expert knowledge for Azure Cache for Redis development including troubleshooting, best practices, decision making, architecture & design patterns, security, configuration, integrations & coding patterns, and deployment. Use when configuring geo-replication, persistence, Private Link/VNet access, Redis events/webhooks, or ARM/Bicep deployments, and other Azure Cache for Redis related development tasks. Not for Azure Managed Redis (use azure-managed-redis), Azure Cosmos DB (use azure-cosmos-db), Azure Table Storage (use azure-table-storage).
azure-chaos-studio
Expert knowledge for Chaos Studio development including troubleshooting, limits & quotas, security, configuration, and integrations & coding patterns. Use when defining ARM/Bicep experiments, deploying Chaos Agents, using CLI/REST, or integrating with Azure Monitor, and other Chaos Studio related development tasks. Not for Azure Monitor (use azure-monitor), Azure Resiliency (use azure-resiliency), Azure Reliability (use azure-reliability), Azure Site Recovery (use azure-site-recovery).
azure-cloud-adoption-framework
Expert guidance for planning and executing cloud adoption using Azure Cloud Adoption Framework. Covers strategy, planning, readiness & landing zones, adoption patterns, governance, security, operations & management, organization & teams, and adoption scenarios. Use when designing AI agent workloads, AKS/AVS/AVD platforms, SAP/Oracle on Azure, data mesh/analytics, or landing zones, and other Azure Cloud Adoption Framework related development tasks.
azure-cloud-hsm
Expert knowledge for Azure Cloud Hsm development including troubleshooting, best practices, limits & quotas, security, configuration, and integrations & coding patterns. Use when managing PKCS#11 apps, HSM-backed certs/keys, key rotation/backup, quotas/algorithms, or HSM logs, and other Azure Cloud Hsm related development tasks. Not for Azure Dedicated HSM (use azure-dedicated-hsm), Azure Payment Hsm (use azure-payment-hsm), Azure Key Vault (use azure-key-vault), Azure Attestation (use azure-attestation).
azure-cloud-services
Expert knowledge for Azure Cloud Services development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing Cloud Services (extended support), Guest OS versions, Key Vault certs, autoscale rules, or PowerShell automation, and other Azure Cloud Services related development tasks. Not for Azure Networking (use azure-networking), Azure Virtual Machines (use azure-virtual-machines), Azure Resource Manager (use azure-resource-manager), Azure Portal (use azure-portal).
azure-cloud-shell
Expert knowledge for Azure Cloud Shell development including troubleshooting, limits & quotas, and security. Use when debugging Cloud Shell storage/connectivity, session limits, required storage accounts, or private VNet access, and other Azure Cloud Shell related development tasks. Not for Azure Portal (use azure-portal), Azure Resource Manager (use azure-resource-manager).
azure-cognitive-search
Expert knowledge for Azure AI Search development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing indexes/skillsets, vector/semantic search, indexers, RAG knowledge bases, or secure data access, and other Azure AI Search related development tasks. Not for Azure Cosmos DB (use azure-cosmos-db), Azure Data Explorer (use azure-data-explorer), Azure SQL Database (use azure-sql-database), Azure Synapse Analytics (use azure-synapse-analytics).
azure-communication-services
Expert knowledge for Azure Communication Services development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building ACS calling, chat, SMS, email, Teams interop, Job Router, or contact center solutions, and other Azure Communication Services related development tasks. Not for Azure AI Bot Service (use azure-bot-service), Azure Notification Hubs (use azure-notification-hubs), Azure SignalR Service (use azure-signalr-service), Azure Web PubSub (use azure-web-pubsub).
azure-confidential-computing
Expert knowledge for Azure Confidential Computing development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using SGX/SEV-SNP VMs, AKS confidential containers, attestation/SKR, vTPM, or Fortanix/Key Vault, and other Azure Confidential Computing related development tasks. Not for Azure Virtual Enclaves (use azure-virtual-enclaves), Azure Virtual Machines (use azure-virtual-machines), Azure Dedicated HSM (use azure-dedicated-hsm), Azure Attestation (use azure-attestation).
azure-confidential-ledger
Expert knowledge for Azure Confidential Ledger development including decision making, security, integrations & coding patterns, and deployment. Use when configuring Entra ID/RBAC, client certs, node attestation, .NET SDK, JavaScript UDFs, or ARM/Terraform deployments, and other Azure Confidential Ledger related development tasks. Not for Azure Confidential Computing (use azure-confidential-computing), Azure Key Vault (use azure-key-vault), Azure Dedicated HSM (use azure-dedicated-hsm), Azure Cloud Hsm (use azure-cloud-hsm).
azure-container-apps
Expert knowledge for Azure Container Apps development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when running microservices on Azure Container Apps with Dapr, Java/Spring, GitHub Actions CI/CD, VNets, or GPUs, and other Azure Container Apps related development tasks. Not for Azure App Service (use azure-app-service), Azure Functions (use azure-functions), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Red Hat OpenShift (use azure-redhat-openshift).
azure-container-instances
Expert knowledge for Azure Container Instances development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, and deployment. Use when configuring ACI networking, standby pools, GitHub Actions deploys, Spot containers, or GPU workloads, and other Azure Container Instances related development tasks. Not for Azure Container Apps (use azure-container-apps), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure App Service (use azure-app-service), Azure Virtual Machines (use azure-virtual-machines).
azure-container-registry
Expert knowledge for Azure Container Registry development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using ACR Tasks, geo-replication, Private Link, connected registries, or image signing with Notation, and other Azure Container Registry related development tasks. Not for Azure Container Apps (use azure-container-apps), Azure Container Instances (use azure-container-instances), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Red Hat OpenShift (use azure-redhat-openshift).
azure-container-storage
Expert knowledge for Azure Container Storage development including troubleshooting, decision making, limits & quotas, security, and configuration. Use when configuring CMK-encrypted Elastic SAN volumes, ACS pools, LRS/ZRS redundancy, volume resize, or v1 installs, and other Azure Container Storage related development tasks. Not for Azure Blob Storage (use azure-blob-storage), Azure Files (use azure-files), Azure Elastic SAN (use azure-elastic-san), Azure NetApp Files (use azure-netapp-files).
azure-content-safety
Expert knowledge for Azure AI Content Safety development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Content Safety APIs, Docker containers, blocklists, groundedness checks, or custom safety categories, and other Azure AI Content Safety related development tasks. Not for Azure Information Protection (use azure-information-protection), Azure Security (use azure-security), Azure Sentinel (use azure-sentinel), Azure Defender For Cloud (use azure-defender-for-cloud).
azure-copilot
Expert knowledge for Azure Copilot development including troubleshooting, decision making, architecture & design patterns, security, configuration, and integrations & coding patterns. Use when sizing VMs, generating Bicep/Terraform, configuring Cosmos DB storage, or debugging App Service/VM disks, and other Azure Copilot related development tasks. Not for Azure AI services (use microsoft-foundry-tools), Azure Machine Learning (use azure-machine-learning), Azure AI Search (use azure-cognitive-search), Azure AI Bot Service (use azure-bot-service).
azure-cosmos-db
Expert knowledge for Azure Cosmos DB development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Cosmos DB APIs (NoSQL, Mongo, Cassandra, Postgres), change feed, global distribution, vector search, or HTAP, and other Azure Cosmos DB related development tasks. Not for Azure Table Storage (use azure-table-storage), Azure SQL Database (use azure-sql-database), Azure Database for MySQL (use azure-database-mysql), Azure Database for PostgreSQL (use azure-database-postgresql).
azure-custom-vision
Expert knowledge for Azure AI Custom Vision development including best practices, decision making, limits & quotas, security, integrations & coding patterns, and deployment. Use when exporting Custom Vision models, calling prediction APIs, using ONNX/TensorFlow, managing CMK/RBAC, or Smart Labeler, and other Azure AI Custom Vision related development tasks. Not for Azure AI Vision (use azure-ai-vision), Azure AI services (use microsoft-foundry-tools), Azure Machine Learning (use azure-machine-learning), Azure AI Foundry Local (use microsoft-foundry-local).
azure-cyclecloud
Expert knowledge for Azure CycleCloud development including troubleshooting, best practices, decision making, architecture & design patterns, security, configuration, integrations & coding patterns, and deployment. Use when automating CycleCloud via API/CLI, managing Slurm/HPC clusters, tuning autoscaling, or securing access/SSL, and other Azure CycleCloud related development tasks. Not for Azure Batch (use azure-batch), Azure HPC Cache (use azure-hpc-cache), Azure Virtual Machines (use azure-virtual-machines).
azure-data-api-builder
Expert knowledge for Azure Data Api Builder development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when defining DAB entities over SQL/Cosmos, exposing REST/GraphQL, securing auth/RLS, or deploying via Docker/Azure, and other Azure Data Api Builder related development tasks. Not for Azure API Management (use azure-api-management), Azure Functions (use azure-functions), Azure App Service (use azure-app-service), Azure Logic Apps (use azure-logic-apps).
azure-data-box-family
Expert knowledge for Azure Data Box development including troubleshooting, best practices, limits & quotas, security, configuration, and integrations & coding patterns. Use when handling Data Box/Disk orders, SMB/NFS/REST copies, Key Vault CMKs, blob tiering, or VHD-to-managed-disk flows, and other Azure Data Box related development tasks. Not for Azure Import Export (use azure-import-export), Azure Stack Edge (use azure-stack-edge), Azure Virtual Machines (use azure-virtual-machines), Azure Blob Storage (use azure-blob-storage).
azure-data-explorer
Expert knowledge for Azure Data Explorer development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring ADX auth/networking, managing cluster limits, integrating via SQL/ODBC/Functions, or designing DR/BC, and other Azure Data Explorer related development tasks. Not for Azure Synapse Analytics (use azure-synapse-analytics), Azure Stream Analytics (use azure-stream-analytics), Azure HDInsight (use azure-hdinsight), Azure Databricks (use azure-databricks).
azure-data-factory
Expert knowledge for Azure Data Factory development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing ADF pipelines, mapping data flows, SHIR/SSIS IR, SAP CDC, or CI/CD deployments, and other Azure Data Factory related development tasks. Not for Azure Synapse Analytics (use azure-synapse-analytics), Azure Databricks (use azure-databricks), Azure Stream Analytics (use azure-stream-analytics), Azure Data Explorer (use azure-data-explorer).
azure-data-manager-for-agri
Expert knowledge for Azure Data Manager for Agriculture development including limits & quotas, security, configuration, and integrations & coding patterns. Use when setting up BYOL creds/Private Link, ag data ingestion/IoT, AI/nutrient APIs, throttling, or Event Grid logs, and other Azure Data Manager for Agriculture related development tasks. Not for Azure Data Explorer (use azure-data-explorer), Azure Data Factory (use azure-data-factory), Azure Synapse Analytics (use azure-synapse-analytics), Azure Databricks (use azure-databricks).
azure-data-science-vm
Expert knowledge for Azure Data Science Virtual Machines development including troubleshooting, decision making, architecture & design patterns, security, configuration, integrations & coding patterns, and deployment. Use when managing DSVM images/tools, IaC deployment (Bicep/ARM), Key Vault secrets, MLflow, or GPU/Jupyter issues, and other Azure Data Science Virtual Machines related development tasks. Not for Azure Virtual Machines (use azure-virtual-machines), Azure Machine Learning (use azure-machine-learning), Azure Databricks (use azure-databricks), Azure HDInsight (use azure-hdinsight).
azure-data-share
Expert knowledge for Azure Data Share development including troubleshooting, decision making, security, configuration, and deployment. Use when estimating Data Share costs, managing invitations/RBAC, cross-region deployments, dataset mapping, or automation, and other Azure Data Share related development tasks. Not for Azure Data Box (use azure-data-box-family), Azure Import Export (use azure-import-export), Azure Open Datasets (use azure-open-datasets), Azure Data Explorer (use azure-data-explorer).
azure-database-migration
Expert knowledge for Azure Database Migration service development including troubleshooting, decision making, limits & quotas, security, integrations & coding patterns, and deployment. Use when planning Azure DMS migrations for MySQL, PostgreSQL, SQL Server/SSIS, SQL MI, or MongoDB workloads, and other Azure Database Migration service related development tasks. Not for Azure Migrate (use azure-migrate), Azure SQL Database (use azure-sql-database), Azure SQL Managed Instance (use azure-sql-managed-instance), SQL Server on Azure Virtual Machines (use azure-sql-virtual-machines).
azure-database-mysql
Expert knowledge for Azure Database for MySQL development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when planning MySQL Flexible Server HA/BCDR, CI/CD deployments, VNet/Private Link, read replicas, or AKS connectivity, and other Azure Database for MySQL related development tasks. Not for Azure Database for MariaDB (use azure-database-mariadb), Azure Database for PostgreSQL (use azure-database-postgresql), Azure SQL Database (use azure-sql-database), Azure SQL Managed Instance (use azure-sql-managed-instance).
azure-database-postgresql
Expert knowledge for Azure Database for PostgreSQL development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when tuning Azure PostgreSQL performance, securing VNets/TLS, using pgvector with OpenAI, or scaling Flexible Server, and other Azure Database for PostgreSQL related development tasks. Not for Azure SQL Database (use azure-sql-database), Azure SQL Managed Instance (use azure-sql-managed-instance), SQL Server on Azure Virtual Machines (use azure-sql-virtual-machines), Azure Cosmos DB (use azure-cosmos-db).
azure-databricks
Expert knowledge for Azure Databricks development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Unity Catalog, Lakehouse/Lakebase, Lakeflow, Delta Sharing, Vector Search, or model serving, and other Azure Databricks related development tasks. Not for Azure Synapse Analytics (use azure-synapse-analytics), Azure HDInsight (use azure-hdinsight), Azure Machine Learning (use azure-machine-learning), Azure Data Factory (use azure-data-factory).
azure-ddos-protection
Expert knowledge for Azure DDos Protection development including troubleshooting, best practices, decision making, architecture & design patterns, security, and configuration. Use when enabling DDoS IP/Network Protection plans, parsing DDoS logs, using Rapid Response, or enforcing Azure Policy, and other Azure DDos Protection related development tasks. Not for Azure Firewall (use azure-firewall), Azure Web Application Firewall (use azure-web-application-firewall), Azure Virtual Network (use azure-virtual-network), Azure Virtual Network Manager (use azure-virtual-network-manager).
azure-dedicated-hsm
Expert knowledge for Azure Dedicated HSM development including troubleshooting, decision making, architecture & design patterns, security, and deployment. Use when sizing HSM clusters, configuring VNets/ExpressRoute, planning Managed HSM migration, or resolving vendor support issues, and other Azure Dedicated HSM related development tasks. Not for Azure Cloud Hsm (use azure-cloud-hsm), Azure Key Vault (use azure-key-vault), Azure Payment Hsm (use azure-payment-hsm).
azure-defender-for-cloud
Expert knowledge for Azure Defender For Cloud development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when securing VMs/servers, AKS/containers, SQL/Storage, CI/CD/DevOps, or multi‑cloud (AWS/GCP) with Defender for Cloud, and other Azure Defender For Cloud related development tasks. Not for Azure Defender For Iot (use azure-defender-for-iot), Azure DDos Protection (use azure-ddos-protection), Azure Firewall (use azure-firewall), Azure Security (use azure-security).
azure-defender-for-iot
Expert knowledge for Azure Defender For Iot development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when deploying OT sensors, configuring micro agents, setting up traffic mirroring, or integrating with Sentinel/SIEM, and other Azure Defender For Iot related development tasks. Not for Azure Defender For Cloud (use azure-defender-for-cloud), Azure Security (use azure-security), Azure External Attack Surface Management (use azure-external-attack-surface-management), Azure Sentinel (use azure-sentinel).
azure-deployment-environments
Expert knowledge for Azure Deployment Environments development including troubleshooting, best practices, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing ADE catalogs, environment.yaml schemas, custom images, RBAC/roles, or CI/CD image pipelines, and other Azure Deployment Environments related development tasks. Not for Azure DevTest Labs (use azure-devtest-labs), Azure Dev Box (use azure-dev-box), Azure Integration Environments (use azure-integration-environments), Azure Managed Applications (use azure-managed-applications).
azure-dev-box
Expert knowledge for Azure Dev Box development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing Dev Box images, catalogs, policies, schedules, RBAC/SSO access, or VS Code dev tunnel workflows, and other Azure Dev Box related development tasks. Not for Azure DevTest Labs (use azure-devtest-labs), Azure Virtual Machines (use azure-virtual-machines), Azure Virtual Desktop (use azure-virtual-desktop).
azure-devops
Expert knowledge for Azure DevOps development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing Boards/work items, pipelines, Analytics/OData/Power BI, agent pools, or Azure DevOps Server deployments, and other Azure DevOps related development tasks. Not for Azure Boards (use azure-boards), Azure Pipelines (use azure-pipelines), Azure Repos (use azure-repos), Azure Test Plans (use azure-test-plans).
azure-devtest-labs
Expert knowledge for Azure DevTest Labs development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing DevTest Labs VMs, images/artifacts, ARM/CLI automation, RBAC/Key Vault security, or hub-spoke lab setups, and other Azure DevTest Labs related development tasks. Not for Azure Dev Box (use azure-dev-box), Azure Lab Services (use azure-lab-services), Azure Virtual Machines (use azure-virtual-machines), Azure Virtual Desktop (use azure-virtual-desktop).
azure-digital-twins
Expert knowledge for Azure Digital Twins development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when modeling with DTDL, querying twin graphs, integrating IoT Hub/Functions, or migrating control plane APIs, and other Azure Digital Twins related development tasks. Not for Azure IoT Hub (use azure-iot-hub), Azure IoT Central (use azure-iot-central), Azure IoT Edge (use azure-iot-edge), Azure IoT Operations (use azure-iot-operations).
azure-dns
Expert knowledge for Azure DNS development including troubleshooting, decision making, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when managing Azure DNS zones/records, DNSSEC, Private DNS/resolvers, reverse DNS, or zone file import/export, and other Azure DNS related development tasks. Not for Azure Traffic Manager (use azure-traffic-manager), Azure Front Door (use azure-front-door), Azure Virtual Network (use azure-virtual-network), Azure Virtual Network Manager (use azure-virtual-network-manager).
azure-document-intelligence
Expert knowledge for Azure AI Document Intelligence development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using AnalyzeDocument/Markdown APIs, custom models, containers/Docker, SAS/managed identity, or VNets, and other Azure AI Document Intelligence related development tasks. Not for Azure AI services (use microsoft-foundry-tools), Azure AI Search (use azure-cognitive-search), Azure AI Language (use azure-language-service), Azure AI Immersive Reader (use azure-immersive-reader).
azure-education-hub
Expert knowledge for Azure Education Hub development including troubleshooting, and limits & quotas. Use when managing Azure for Students credits, yearly quotas, renewals, or Dev Tools for Teaching sign-in issues, and other Azure Education Hub related development tasks.
azure-elastic-san
Expert knowledge for Azure Elastic SAN development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when scripting Elastic SAN volumes, tuning AVS datastores, using CMK encryption, sizing for IOPS, or running clustered SQL, and other Azure Elastic SAN related development tasks. Not for Azure Blob Storage (use azure-blob-storage), Azure Files (use azure-files), Azure NetApp Files (use azure-netapp-files), Azure Managed Lustre (use azure-managed-lustre).
azure-energy-data-services
Expert knowledge for Azure Energy Data Services development including troubleshooting, decision making, architecture & design patterns, security, configuration, integrations & coding patterns, and deployment. Use when configuring ADME partitions/CORS, managing ACLs/legal tags, deploying Geospatial CZ/Admin UI, or debugging manifest ingestion, and other Azure Energy Data Services related development tasks. Not for Azure Data Explorer (use azure-data-explorer), Azure Synapse Analytics (use azure-synapse-analytics), Azure Data Factory (use azure-data-factory), Azure Databricks (use azure-databricks).
azure-event-grid
Expert knowledge for Azure Event Grid development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when securing Event Grid endpoints, configuring topics/subscriptions, using MQTT, integrating webhooks/SaaS, or deploying on Arc Kubernetes, and other Azure Event Grid related development tasks. Not for Azure Service Bus (use azure-service-bus), Azure Event Hubs (use azure-event-hubs), Azure Notification Hubs (use azure-notification-hubs), Azure Logic Apps (use azure-logic-apps).
azure-event-hubs
Expert knowledge for Azure Event Hubs development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Kafka clients/Streams, .NET SDK, Flink/Spark, geo-disaster recovery, or Premium processing units, and other Azure Event Hubs related development tasks. Not for Azure Service Bus (use azure-service-bus), Azure Event Grid (use azure-event-grid), Azure Notification Hubs (use azure-notification-hubs), Azure Stream Analytics (use azure-stream-analytics).
azure-expressroute
Expert knowledge for Azure ExpressRoute development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing ExpressRoute circuits/gateways, BGP routing, Global Reach, FastPath, or S2S VPN coexistence, and other Azure ExpressRoute related development tasks. Not for Azure Internet Peering (use azure-internet-peering), Azure Peering Service (use azure-peering-service), Azure Virtual WAN (use azure-virtual-wan), Azure VPN Gateway (use azure-vpn-gateway).
azure-extended-zones
Expert knowledge for Azure Extended Zones development including decision making, limits & quotas, security, and configuration. Use when setting Azure Policy for Extended Zones, encrypting VM disks with CMK, optimizing reservations, or managing quotas, and other Azure Extended Zones related development tasks. Not for Azure Reliability (use azure-reliability), Azure Resiliency (use azure-resiliency), Azure Virtual Network (use azure-virtual-network), Azure Virtual WAN (use azure-virtual-wan).
azure-files
Expert knowledge for Azure Files development including best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing Azure Files tiers/redundancy, File Sync/DFS, AD/Kerberos auth, AKS CSI, or app SDK access, and other Azure Files related development tasks. Not for Azure Blob Storage (use azure-blob-storage), Azure NetApp Files (use azure-netapp-files), Azure Virtual Machines (use azure-virtual-machines).
azure-firewall
Expert knowledge for Azure Firewall development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when choosing Firewall SKUs, designing hub-spoke/forced tunneling, configuring DNAT/SNAT/app rules, TLS inspection, or DNS proxy, and other Azure Firewall related development tasks. Not for Azure Web Application Firewall (use azure-web-application-firewall), Azure Firewall Manager (use azure-firewall-manager), Azure Virtual Network (use azure-virtual-network), Azure Virtual WAN (use azure-virtual-wan).
azure-firewall-manager
Expert knowledge for Azure Firewall Manager development including best practices, decision making, security, and configuration. Use when managing DDoS plans, WAF policies, DNS proxy/FQDN rules, IP Groups, or secured virtual hub vs VNet, and other Azure Firewall Manager related development tasks. Not for Azure Firewall (use azure-firewall), Azure Virtual Network Manager (use azure-virtual-network-manager), Azure Network Function Manager (use azure-network-function-manager), Azure Networking (use azure-networking).
azure-firmware-analysis
Expert knowledge for Azure Firmware Analysis development including troubleshooting, best practices, limits & quotas, security, integrations & coding patterns, and deployment. Use when scanning firmware images, interpreting SBOM paths, using UEFI analysis, or automating uploads via CLI/PowerShell/Python, and other Azure Firmware Analysis related development tasks. Not for Azure Defender For Iot (use azure-defender-for-iot), Azure IoT Edge (use azure-iot-edge), Azure IoT Hub (use azure-iot-hub), Azure Confidential Computing (use azure-confidential-computing).
azure-fluid-relay
Expert knowledge for Azure Fluid Relay development including troubleshooting, best practices, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using AzureClient, audience APIs, JWT auth tokens, container recovery, or Static Web Apps hosting, and other Azure Fluid Relay related development tasks. Not for Azure Web PubSub (use azure-web-pubsub), Azure SignalR Service (use azure-signalr-service), Azure Relay (use azure-relay), Azure Service Bus (use azure-service-bus).
azure-front-door
Expert knowledge for Azure Front Door development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring Front Door routing/caching, WAF/TLS, Private Link origins, rules engine, or classic-to-Std/Prm migrations, and other Azure Front Door related development tasks. Not for Azure Application Gateway (use azure-application-gateway), Azure Traffic Manager (use azure-traffic-manager), Azure Load Balancer (use azure-load-balancer), Azure Web Application Firewall (use azure-web-application-firewall).
azure-functions
Expert knowledge for Azure Functions development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building HTTP/event-triggered apps, Durable Functions, Flex/Consumption hosting, containerized Functions, or CI/CD deployments, and other Azure Functions related development tasks. Not for Azure App Service (use azure-app-service), Azure Logic Apps (use azure-logic-apps), Azure Container Apps (use azure-container-apps), Azure Kubernetes Service (AKS) (use azure-kubernetes-service).
azure-hdinsight
Expert knowledge for Azure HDInsight development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when working with HDInsight Spark/Hive/Kafka/HBase clusters, Ambari, VNet networking, or Azure SQL/Cosmos DB integration, and other Azure HDInsight related development tasks. Not for Azure Synapse Analytics (use azure-synapse-analytics), Azure Databricks (use azure-databricks), Azure Stream Analytics (use azure-stream-analytics).
azure-health-bot
Expert knowledge for Azure Health Bot development including best practices, decision making, architecture & design patterns, security, configuration, and integrations & coding patterns. Use when configuring Health Bot channels, web chat/voice embeds, management APIs, orchestrator flows, or cost estimation, and other Azure Health Bot related development tasks. Not for Azure AI Bot Service (use azure-bot-service), Azure Communication Services (use azure-communication-services), Azure Health Data Services (use azure-health-data-services).
azure-health-data-services
Expert knowledge for Azure Health Data Services development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using FHIR/DICOM/MedTech services, de-identification APIs, bulk ops, Synapse pipelines, or Private Link, and other Azure Health Data Services related development tasks. Not for Azure Health Bot (use azure-health-bot), Azure API Management (use azure-api-management), Azure Data Factory (use azure-data-factory), Azure Synapse Analytics (use azure-synapse-analytics).
azure-hpc-cache
Expert knowledge for Azure HPC Cache development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring HPC Cache namespaces, NFS/Blob targets, client access, data ingest scripts, or cache failover, and other Azure HPC Cache related development tasks. Not for Azure Managed Lustre (use azure-managed-lustre), Azure NetApp Files (use azure-netapp-files), Azure Batch (use azure-batch), Azure Virtual Machines (use azure-virtual-machines).
azure-immersive-reader
Expert knowledge for Azure AI Immersive Reader development including best practices, limits & quotas, security, configuration, and integrations & coding patterns. Use when tuning read-aloud/translation, storing user prefs, Entra auth setup, JS SDK integration, or language support, and other Azure AI Immersive Reader related development tasks. Not for Azure AI Language (use azure-language-service), Azure AI Speech (use azure-speech), Azure Translator (use azure-translator), Azure AI services (use microsoft-foundry-tools).
azure-impact-reporting
Expert knowledge for Azure Impact Reporting development including troubleshooting, configuration, and integrations & coding patterns. Use when wiring Impact Reporting to Monitor alerts, Logic Apps, HPC node health, Service Health, or its insights API, and other Azure Impact Reporting related development tasks. Not for Azure Carbon Optimization (use azure-carbon-optimization), Azure Cost Management (use azure-cost-management), Azure Monitor (use azure-monitor), Azure Policy (use azure-policy).
azure-import-export
Expert knowledge for Azure Import Export development including troubleshooting, limits & quotas, and security. Use when setting CMK via Key Vault, validating drive/OS support, or debugging Import/Export job and log issues, and other Azure Import Export related development tasks. Not for Azure Data Box (use azure-data-box-family), Azure Blob Storage (use azure-blob-storage), Azure Files (use azure-files).
azure-industry
Expert knowledge for Azure Industry development including troubleshooting, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring Community Training on Azure, Teams app embedding, Azure AD/B2C login, APIs, or Android app builds, and other Azure Industry related development tasks.
azure-information-protection
Expert knowledge for Azure Information Protection development including best practices, decision making, configuration, and deployment. Use when choosing Azure RMS vs AD RMS, migrating keys/policies, configuring RMS connector/MSIPC, or monitoring RMS logs, and other Azure Information Protection related development tasks. Not for Azure Key Vault (use azure-key-vault), Azure Security (use azure-security), Azure Defender For Cloud (use azure-defender-for-cloud), Azure Sentinel (use azure-sentinel).
azure-iot
Expert knowledge for Azure IoT development including architecture & design patterns, and integrations & coding patterns. Use when using MQTT, IoT Plug and Play, DPS/IoT Hub, SAP ERP integration, or industrial IoT reference architectures, and other Azure IoT related development tasks. Not for Azure IoT Hub (use azure-iot-hub), Azure IoT Edge (use azure-iot-edge), Azure IoT Central (use azure-iot-central), Azure Defender For Iot (use azure-defender-for-iot).
azure-iot-central
Expert knowledge for Azure IoT Central development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing device templates, data export, IoT Edge gateways, REST/CLI automation, or device bridge pipelines, and other Azure IoT Central related development tasks. Not for Azure IoT Hub (use azure-iot-hub), Azure IoT Edge (use azure-iot-edge), Azure IoT (use azure-iot), Azure Digital Twins (use azure-digital-twins).
azure-iot-edge
Expert knowledge for Azure IoT Edge development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring IoT Edge modules/devices, DPS provisioning, EFLOW on Windows, gateways, or GPU-accelerated workloads, and other Azure IoT Edge related development tasks. Not for Azure IoT Hub (use azure-iot-hub), Azure IoT Central (use azure-iot-central), Azure IoT Operations (use azure-iot-operations), Azure Stack Edge (use azure-stack-edge).
azure-iot-hub
Expert knowledge for Azure IoT Hub development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when provisioning via DPS, managing twins/jobs/routing, using device streams, or integrating Device Update, and other Azure IoT Hub related development tasks. Not for Azure IoT (use azure-iot), Azure IoT Central (use azure-iot-central), Azure IoT Edge (use azure-iot-edge), Azure Defender For Iot (use azure-defender-for-iot).
azure-iot-operations
Expert knowledge for Azure IoT Operations development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring MQTT broker, data flows/graphs, OPC UA/ONVIF connectors, WASM transforms, or Prometheus/Grafana, and other Azure IoT Operations related development tasks. Not for Azure IoT (use azure-iot), Azure IoT Hub (use azure-iot-hub), Azure IoT Edge (use azure-iot-edge), Azure Defender For Iot (use azure-defender-for-iot).
azure-key-vault
Expert knowledge for Azure Key Vault development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing keys/secrets/certs, RBAC vs access policies, Private Link, BYOK/HSM, or SDK/API integrations, and other Azure Key Vault related development tasks. Not for Azure Dedicated HSM (use azure-dedicated-hsm), Azure Cloud Hsm (use azure-cloud-hsm), Azure Payment Hsm (use azure-payment-hsm), Azure Information Protection (use azure-information-protection).
azure-kubernetes-service
Expert knowledge for Azure Kubernetes Service (AKS) development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing AKS identity/RBAC, networking/ingress, autoscaling node pools, Fleet multi-cluster, or AI/ML/GPU workloads, and other Azure Kubernetes Service (AKS) related development tasks. Not for Azure Container Apps (use azure-container-apps), Azure Container Instances (use azure-container-instances), Azure Red Hat OpenShift (use azure-redhat-openshift), Azure Virtual Machines (use azure-virtual-machines).
azure-lab-services
Expert knowledge for Azure Lab Services development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring lab plans, VM templates/schedules, VNet-integrated labs, GPU/nested virtualization, or Canvas/Teams integration, and other Azure Lab Services related development tasks. Not for Azure DevTest Labs (use azure-devtest-labs), Azure Virtual Machines (use azure-virtual-machines), Azure Virtual Desktop (use azure-virtual-desktop).
azure-language-service
Expert knowledge for Azure AI Language development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building CLU intents, custom NER, text classification, CQA/FAQ, sentiment, or summarization solutions, and other Azure AI Language related development tasks. Not for Azure AI Search (use azure-cognitive-search), Azure AI Document Intelligence (use azure-document-intelligence), Azure AI Immersive Reader (use azure-immersive-reader), Azure Translator (use azure-translator).
azure-large-instances
Expert knowledge for Azure Large Instances development including troubleshooting, limits & quotas, and integrations & coding patterns. Use when configuring Epic SKUs, sizing volume groups, tuning EHR storage, or resolving Epic–ALI connectivity/perf issues, and other Azure Large Instances related development tasks. Not for Azure Baremetal Infrastructure (use azure-baremetal-infrastructure), Azure Virtual Machines (use azure-virtual-machines), Azure Virtual Machine Scale Sets (use azure-vm-scalesets), Azure HPC Cache (use azure-hpc-cache).
azure-lighthouse
Expert knowledge for Azure Lighthouse development including decision making, security, configuration, integrations & coding patterns, and deployment. Use when designing multi-tenant delegations, RBAC/AOBO/PIM access, policy-based onboarding, Arc/Sentinel integrations, or Marketplace offers, and other Azure Lighthouse related development tasks. Not for Azure Arc (use azure-arc), Azure Managed Applications (use azure-managed-applications), Azure Resource Manager (use azure-resource-manager).
azure-load-balancer
Expert knowledge for Azure Load Balancer development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring frontends/backends, health probes, SNAT/outbound, IMDS/metrics APIs, or Basic→Standard migrations, and other Azure Load Balancer related development tasks. Not for Azure Application Gateway (use azure-application-gateway), Azure Front Door (use azure-front-door), Azure Traffic Manager (use azure-traffic-manager), Azure NAT Gateway (use azure-nat-gateway).
azure-local
Expert knowledge for Azure Local development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when planning Azure Local clusters, SDN/rack designs, Arc/VM networking, disconnected ops, or VMware migrations, and other Azure Local related development tasks. Not for Microsoft Foundry Local (use microsoft-foundry-local), Azure Stack Edge (use azure-stack-edge), Azure Kubernetes Service Edge Essentials (use azure-aks-edge-essentials), Azure IoT Edge (use azure-iot-edge).
azure-logic-apps
Expert knowledge for Azure Logic Apps development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building Logic Apps with B2B/EDI, SAP/IBM/FTP connectors, AI/OpenAI calls, inline JavaScript, or CI/CD deployments, and other Azure Logic Apps related development tasks. Not for Azure Functions (use azure-functions), Azure API Management (use azure-api-management), Azure Service Bus (use azure-service-bus), Azure Event Grid (use azure-event-grid).
azure-machine-learning
Expert knowledge for Azure Machine Learning development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Azure ML workspaces, AutoML, Prompt Flow, online/batch endpoints, or SDK/CLI v2 deployments, and other Azure Machine Learning related development tasks. Not for Azure Databricks (use azure-databricks), Azure Synapse Analytics (use azure-synapse-analytics), Azure Data Science Virtual Machines (use azure-data-science-vm), Azure HDInsight (use azure-hdinsight).
azure-managed-applications
Expert knowledge for Azure Managed Applications development including limits & quotas, security, configuration, and deployment. Use when designing createUiDefinition UIs, JIT access, managed identities, Key Vault/CMK, StorageAccountSelector, or Bicep-based catalog deployments, and other Azure Managed Applications related development tasks. Not for Azure Lighthouse (use azure-lighthouse), Azure Partner Solutions (use azure-partner-solutions), Azure Resource Manager (use azure-resource-manager), Azure Blueprints (use azure-blueprints).
azure-managed-grafana
Expert knowledge for Azure Managed Grafana development including troubleshooting, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when integrating Azure Monitor/Prometheus, configuring data sources/alerts, Entra auth, private endpoints, or HA workspaces, and other Azure Managed Grafana related development tasks. Not for Azure Monitor (use azure-monitor), Azure Application Gateway (use azure-application-gateway), Azure Virtual Machines (use azure-virtual-machines), Azure Kubernetes Service (AKS) (use azure-kubernetes-service).
azure-managed-lustre
Expert knowledge for Azure Managed Lustre development including troubleshooting, best practices, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when mounting AMLFS, integrating with Blob auto-import/export, using AKS CSI, setting CMK/root squash, or tuning performance, and other Azure Managed Lustre related development tasks. Not for Azure HPC Cache (use azure-hpc-cache), Azure NetApp Files (use azure-netapp-files), Azure Blob Storage (use azure-blob-storage), Azure Elastic SAN (use azure-elastic-san).
azure-managed-redis
Expert knowledge for Azure Managed Redis development including troubleshooting, best practices, decision making, security, configuration, integrations & coding patterns, and deployment. Use when using Entra auth, geo-replication, persistence, Private Link, ARM/Bicep deploys, or Azure Functions bindings, and other Azure Managed Redis related development tasks. Not for Azure Cache for Redis (use azure-cache-redis).
azure-maps
Expert knowledge for Azure Maps development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when using Azure Maps web SDK, search/geocoding, routing/traffic, weather APIs, or migrating from Bing/Google Maps, and other Azure Maps related development tasks.
azure-metrics-advisor
Expert knowledge for Azure AI Metrics Advisor development including decision making, security, configuration, and integrations & coding patterns. Use when configuring data feeds, tuning anomaly detection, managing alert hooks, or integrating the Metrics Advisor APIs, and other Azure AI Metrics Advisor related development tasks. Not for Azure AI Anomaly Detector (use azure-anomaly-detector), Azure Monitor (use azure-monitor), Azure Machine Learning (use azure-machine-learning).
azure-migrate
Expert knowledge for Azure Migrate development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using AppCAT/CAST, Site Recovery REST APIs, Azure Migrate appliance, Arc-based discovery, or Resource Mover, and other Azure Migrate related development tasks. Not for Azure Database Migration service (use azure-database-migration), Azure Site Recovery (use azure-site-recovery), Azure Virtual Machines (use azure-virtual-machines), SQL Server on Azure Virtual Machines (use azure-sql-virtual-machines).
azure-monitor
Expert knowledge for Azure Monitor development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when working with Log Analytics workspaces, DCRs, AMA/agents, Application Insights, or Prometheus/Container Insights, and other Azure Monitor related development tasks. Not for Azure Managed Grafana (use azure-managed-grafana), Azure Network Watcher (use azure-network-watcher), Azure Service Health (use azure-service-health), Azure Defender For Cloud (use azure-defender-for-cloud).
azure-nat-gateway
Expert knowledge for Azure NAT Gateway development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, configuration, and deployment. Use when planning SNAT capacity, configuring IPs/flow logs, fixing outbound failures, or choosing Standard vs StandardV2, and other Azure NAT Gateway related development tasks. Not for Azure Firewall (use azure-firewall), Azure Load Balancer (use azure-load-balancer), Azure Virtual Network (use azure-virtual-network), Azure Virtual WAN (use azure-virtual-wan).
azure-netapp-files
Expert knowledge for Azure NetApp Files development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when deploying ANF for SAP HANA/Oracle, AzAcSnap, ZRS, AVS, REST/PowerShell, or S3/OneLake integrations, and other Azure NetApp Files related development tasks. Not for Azure Files (use azure-files), Azure Blob Storage (use azure-blob-storage), Azure Elastic SAN (use azure-elastic-san), Azure Managed Lustre (use azure-managed-lustre).
azure-network-function-manager
Expert knowledge for Azure Network Function Manager development including security, and configuration. Use when setting up NF Manager prerequisites, resource groups, managed identities, role assignments, or secure NF access, and other Azure Network Function Manager related development tasks. Not for Azure Firewall Manager (use azure-firewall-manager), Azure Virtual Network Manager (use azure-virtual-network-manager), Azure Virtual Network (use azure-virtual-network), Azure Virtual WAN (use azure-virtual-wan).
azure-network-watcher
Expert knowledge for Azure Network Watcher development including troubleshooting, decision making, limits & quotas, security, configuration, and integrations & coding patterns. Use when configuring Connection Monitor, NSG/VNet flow logs, packet capture, Traffic Analytics, or Sentinel integrations, and other Azure Network Watcher related development tasks. Not for Azure Monitor (use azure-monitor), Azure Networking (use azure-networking), Azure Virtual Network (use azure-virtual-network), Azure Virtual Network Manager (use azure-virtual-network-manager).
azure-networking
Expert knowledge for Azure Networking development including troubleshooting, best practices, decision making, architecture & design patterns, security, and integrations & coding patterns. Use when designing hub-spoke VNets, Azure Firewall/NSG rules, WAF (App GW/Front Door), Accelerated Networking, or Resource Graph queries, and other Azure Networking related development tasks. Not for Azure Virtual Network (use azure-virtual-network), Azure Virtual Network Manager (use azure-virtual-network-manager), Azure Virtual WAN (use azure-virtual-wan), Azure Network Watcher (use azure-network-watcher).
azure-notification-hubs
Expert knowledge for Azure Notification Hubs development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when integrating FCM/APNS/WNS, targeting devices/users with tags/templates, scheduling pushes, or securing hubs with Private Link, and other Azure Notification Hubs related development tasks. Not for Azure Event Hubs (use azure-event-hubs), Azure Service Bus (use azure-service-bus), Azure Web PubSub (use azure-web-pubsub), Azure Communication Services (use azure-communication-services).
azure-operator-nexus
Expert knowledge for Azure Operator Nexus development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, and deployment. Use when managing Nexus clusters, network fabric (BGP/VRF/ACL/QoS), near-edge storage, upgrades, or RMA workflows, and other Azure Operator Nexus related development tasks. Not for Azure Network Function Manager (use azure-network-function-manager), Azure Networking (use azure-networking), Azure Virtual Network Manager (use azure-virtual-network-manager), Azure Virtual WAN (use azure-virtual-wan).
azure-operator-service-manager
Expert knowledge for Azure Operator Service Manager development including troubleshooting, best practices, security, configuration, and integrations & coding patterns. Use when onboarding CNFs/VNFs, designing config groups, using ACR-backed artifacts, Private Link, or AOSM CLI, and other Azure Operator Service Manager related development tasks. Not for Azure Operator Insights (use azure-operator-insights), Azure Operator Nexus (use azure-operator-nexus), Azure Network Function Manager (use azure-network-function-manager), Azure Networking (use azure-networking).
azure-oracle
Expert knowledge for Azure Oracle development including troubleshooting, decision making, security, configuration, and integrations & coding patterns. Use when configuring Oracle TDE with Key Vault, VNET topology, Exadata deployment, region selection, or Azure Monitor/Sentinel logs, and other Azure Oracle related development tasks. Not for Azure SQL Database (use azure-sql-database), Azure SQL Managed Instance (use azure-sql-managed-instance), SQL Server on Azure Virtual Machines (use azure-sql-virtual-machines), SAP HANA on Azure Large Instances (use azure-sap).
azure-osconfig
Expert knowledge for Azure Osconfig development including troubleshooting, security, configuration, and integrations & coding patterns. Use when running OSConfig via IoT Hub for commands, SSH posture, agent health, Windows baselines, or LAPS, and other Azure Osconfig related development tasks. Not for Azure Update Manager (use azure-update-manager), Azure Automation (use azure-automation), Azure Policy (use azure-policy).
azure-partner-solutions
Expert knowledge for Azure Partner Solutions development including troubleshooting, decision making, architecture & design patterns, security, configuration, and integrations & coding patterns. Use when using Service Connector, Confluent Cloud, MongoDB Atlas, Dynatrace APM, or Palo Alto Cloud NGFW on Azure, and other Azure Partner Solutions related development tasks. Not for Azure Industry (use azure-industry), Azure Managed Applications (use azure-managed-applications), Azure Lighthouse (use azure-lighthouse), Azure Networking (use azure-networking).
azure-payment-hsm
Expert knowledge for Azure Payment Hsm development including troubleshooting, best practices, decision making, architecture & design patterns, security, and configuration. Use when configuring Payment HSM VNets/FastPath, payShield Manager access, HA/DR topologies, SKUs, or traffic inspection, and other Azure Payment Hsm related development tasks. Not for Azure Dedicated HSM (use azure-dedicated-hsm), Azure Cloud Hsm (use azure-cloud-hsm), Azure Key Vault (use azure-key-vault), Azure Information Protection (use azure-information-protection).
azure-personalizer
Expert knowledge for Azure AI Personalizer development including troubleshooting, decision making, limits & quotas, security, configuration, and integrations & coding patterns. Use when tuning exploration/apprentice mode, single vs multi-slot calls, model export, quotas, or local inference SDK, and other Azure AI Personalizer related development tasks. Not for Azure AI services (use microsoft-foundry-tools), Azure AI Search (use azure-cognitive-search), Azure AI Metrics Advisor (use azure-metrics-advisor), Azure AI Anomaly Detector (use azure-anomaly-detector).
azure-pipelines
Expert knowledge for Azure Pipelines development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when securing service connections/secrets, configuring YAML pipelines, deploying to App Service/Kubernetes, or handling hosted agent limits, and other Azure Pipelines related development tasks. Not for Azure DevOps (use azure-devops), Azure Boards (use azure-boards), Azure Repos (use azure-repos), Azure Test Plans (use azure-test-plans).
azure-planetary-computer-pro
Expert knowledge for Microsoft Planetary Computer Pro development including troubleshooting, decision making, limits & quotas, security, configuration, and integrations & coding patterns. Use when managing STAC collections, GeoCatalog ingestion, SAS tokens, Explorer visualization, or QGIS/ArcGIS integration, and other Microsoft Planetary Computer Pro related development tasks. Not for Azure Open Datasets (use azure-open-datasets), Azure Maps (use azure-maps), Azure Data Explorer (use azure-data-explorer), Azure Synapse Analytics (use azure-synapse-analytics).
azure-playwright-workspaces
Expert knowledge for Playwright Workspaces development including troubleshooting, best practices, decision making, limits & quotas, security, and configuration. Use when managing Playwright Testing workspaces, tokens/RBAC, quotas, monitoring/metrics, or run/AADSTS7000112 issues, and other Playwright Workspaces related development tasks. Not for Azure App Testing (use azure-app-testing), Azure DevOps (use azure-devops), Azure Pipelines (use azure-pipelines), Azure Test Plans (use azure-test-plans).
azure-policy
Expert knowledge for Azure Policy development including troubleshooting, best practices, decision making, security, configuration, integrations & coding patterns, and deployment. Use when authoring Machine Configuration packages, deploying via ARM/Bicep/Terraform, enforcing security baselines, migrating from DSC, or querying compliance with Resource Graph, and other Azure Policy related development tasks. Not for Azure Blueprints (use azure-blueprints), Azure Role-based access control (use azure-rbac), Azure Resource Manager (use azure-resource-manager), Azure Security (use azure-security).
azure-portal
Expert knowledge for Azure Portal development including troubleshooting, security, and configuration. Use when managing portal RBAC dashboard sharing, JSON dashboards, URL allowlists, mobile app access/alerts, or HAR diagnostics, and other Azure Portal related development tasks. Not for Azure Cloud Shell (use azure-cloud-shell), Azure Resource Manager (use azure-resource-manager), Azure Monitor (use azure-monitor), Azure Policy (use azure-policy).
azure-private-link
Expert knowledge for Azure Private Link development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, and configuration. Use when configuring Private Endpoints, DNS/Private Resolver, High Scale limits, NSP/RBAC, or Azure Firewall inspection, and other Azure Private Link related development tasks. Not for Azure Virtual Network (use azure-virtual-network), Azure VPN Gateway (use azure-vpn-gateway), Azure ExpressRoute (use azure-expressroute), Azure Firewall (use azure-firewall).
azure-quantum
Expert knowledge for Azure Quantum development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using QDK/qdk.azure, hybrid jobs, QIR/OpenQASM/Pulser circuits, IonQ/PASQAL/Quantinuum/Rigetti, or resource estimator, and other Azure Quantum related development tasks. Not for Azure HPC Cache (use azure-hpc-cache), Azure Batch (use azure-batch), Azure Databricks (use azure-databricks), Azure Machine Learning (use azure-machine-learning).
azure-queue-storage
Expert knowledge for Azure Queue Storage development including best practices, limits & quotas, security, configuration, and integrations & coding patterns. Use when managing queue auth (Entra ID/RBAC), monitoring metrics/logs, tuning throughput/limits, or coding with SDKs, and other Azure Queue Storage related development tasks. Not for Azure Blob Storage (use azure-blob-storage), Azure Table Storage (use azure-table-storage), Azure Service Bus (use azure-service-bus), Azure Event Hubs (use azure-event-hubs).
azure-rbac
Expert knowledge for Azure Role-based access control development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, and integrations & coding patterns. Use when defining roles/ABAC conditions, deny assignments, PIM, custom roles, or RBAC via CLI/PowerShell/REST, and other Azure Role-based access control related development tasks. Not for Azure Policy (use azure-policy), Azure Security (use azure-security), Azure Resource Manager (use azure-resource-manager), Azure Portal (use azure-portal).
azure-redhat-openshift
Expert knowledge for Azure Red Hat OpenShift development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when creating ARO clusters, configuring networking/storage, securing with Entra/NSGs, using GPUs/Key Vault, or upgrading, and other Azure Red Hat OpenShift related development tasks. Not for Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Container Apps (use azure-container-apps), Azure Container Instances (use azure-container-instances), Azure VMware Solution (use azure-vmware-solution).
azure-relay
Expert knowledge for Azure Relay development including troubleshooting, security, configuration, and integrations & coding patterns. Use when configuring Hybrid Connections, WCF relays, Entra ID/SAS auth, Private Link, or .NET/Node.js Relay clients, and other Azure Relay related development tasks. Not for Azure Service Bus (use azure-service-bus), Azure Event Hubs (use azure-event-hubs), Azure Web PubSub (use azure-web-pubsub), Azure Application Gateway (use azure-application-gateway).
azure-reliability
Expert knowledge for Azure Reliability development including best practices, decision making, architecture & design patterns, limits & quotas, and deployment. Use when designing AZ/multi-region apps, MySQL Flexible Server HA, resilient Functions, or Azure Queue Storage limits, and other Azure Reliability related development tasks. Not for Azure Resiliency (use azure-resiliency), Azure Monitor (use azure-monitor), Azure Service Health (use azure-service-health), Azure Site Recovery (use azure-site-recovery).
azure-repos
Expert knowledge for Azure Repos development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when managing Git/TFVC repos, branch/PR policies, permissions, CodeQL/secret scans, or GitHub Advanced Security, and other Azure Repos related development tasks. Not for Azure DevOps (use azure-devops), Azure Pipelines (use azure-pipelines), Azure Test Plans (use azure-test-plans), Azure Boards (use azure-boards).
azure-resiliency
Expert knowledge for Azure Resiliency development including limits & quotas, security, and configuration. Use when managing Backup/Site Recovery vaults, protection policies, replication settings, SLAs, or resiliency security posture, and other Azure Resiliency related development tasks. Not for Azure Reliability (use azure-reliability), Azure Site Recovery (use azure-site-recovery), Azure Backup (use azure-backup), Azure Monitor (use azure-monitor).
azure-resource-graph
Expert knowledge for Azure Resource Graph development including troubleshooting, best practices, decision making, limits & quotas, configuration, and integrations & coding patterns. Use when querying via CLI/PowerShell/REST, using GET/LIST vs Query APIs, shared queries, alerts, or Power BI, and other Azure Resource Graph related development tasks. Not for Azure Monitor (use azure-monitor), Azure Policy (use azure-policy), Azure Resource Manager (use azure-resource-manager), Azure Portal (use azure-portal).
azure-resource-manager
Expert knowledge for Azure Resource Manager development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Bicep/ARM templates, template specs, deployment stacks, CLI/PowerShell/REST, or CI/CD pipelines, and other Azure Resource Manager related development tasks. Not for Azure Blueprints (use azure-blueprints), Azure Policy (use azure-policy), Azure Resource Graph (use azure-resource-graph), Azure Portal (use azure-portal).
azure-route-server
Expert knowledge for Azure Route Server development including troubleshooting, best practices, architecture & design patterns, limits & quotas, security, and configuration. Use when designing hub-spoke or multi-region topologies, BGP peering with NVAs/on-prem, tuning routing policies, or fixing route propagation issues, and other Azure Route Server related development tasks. Not for Azure Virtual Network (use azure-virtual-network), Azure Virtual Network Manager (use azure-virtual-network-manager), Azure Virtual WAN (use azure-virtual-wan), Azure VPN Gateway (use azure-vpn-gateway).
azure-sap
Expert knowledge for SAP HANA on Azure Large Instances development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when deploying SAP HANA on Azure Large Instances, ACSS/Center workflows, Azure Monitor, Key Vault, or RISE setups, and other SAP HANA on Azure Large Instances related development tasks. Not for Azure Large Instances (use azure-large-instances), Azure Virtual Machines (use azure-virtual-machines), Azure Virtual Machine Scale Sets (use azure-vm-scalesets), Azure Baremetal Infrastructure (use azure-baremetal-infrastructure).
azure-security
Expert knowledge for Azure Security development including troubleshooting, best practices, decision making, security, configuration, integrations & coding patterns, and deployment. Use when securing AKS and container images, SBOM/Notation pipelines, Key Vault vs HSM, or Customer Lockbox, and other Azure Security related development tasks. Not for Azure Defender For Cloud (use azure-defender-for-cloud), Azure Sentinel (use azure-sentinel), Azure DDos Protection (use azure-ddos-protection), Azure Web Application Firewall (use azure-web-application-firewall).
azure-sentinel
Expert knowledge for Azure Sentinel development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing Sentinel connectors, KQL analytics rules, Logic Apps playbooks, UEBA/SAP data, or ASIM schemas, and other Azure Sentinel related development tasks. Not for Azure Defender For Cloud (use azure-defender-for-cloud), Azure Security (use azure-security), Azure Monitor (use azure-monitor), Azure Network Watcher (use azure-network-watcher).
azure-service-bus
Expert knowledge for Azure Service Bus development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using queues/topics, sessions, filters, geo-replication, JMS/AMQP APIs, or migrating Standard→Premium, and other Azure Service Bus related development tasks. Not for Azure Event Hubs (use azure-event-hubs), Azure Event Grid (use azure-event-grid), Azure Queue Storage (use azure-queue-storage), Azure Relay (use azure-relay).
azure-service-connector
Expert knowledge for Azure Service Connector development including troubleshooting, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when wiring apps to Azure DBs, messaging, storage, AI, or cache via Service Connector roles, CLI, and IaC, and other Azure Service Connector related development tasks. Not for Azure API Management (use azure-api-management), Azure App Service (use azure-app-service), Azure Functions (use azure-functions), Azure Logic Apps (use azure-logic-apps).
azure-service-fabric
Expert knowledge for Azure Service Fabric development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building Service Fabric clusters, Reliable Services/Actors, Cluster Resource Manager policies, reverse proxy, or sfctl/PowerShell automation, and other Azure Service Fabric related development tasks. Not for Azure App Service (use azure-app-service), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Container Apps (use azure-container-apps), Azure Cloud Services (use azure-cloud-services).
azure-service-health
Expert knowledge for Azure Service Health development including troubleshooting, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when integrating Service Health APIs/webhooks, configuring alerts via ARM/Bicep, querying events with Resource Graph, managing RBAC access, or diagnosing VM Resource Health issues, and other Azure Service Health related development tasks. Not for Azure Monitor (use azure-monitor), Azure Reliability (use azure-reliability), Azure Resiliency (use azure-resiliency), Azure Quotas (use azure-quotas).
azure-signalr-service
Expert knowledge for Azure SignalR Service development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when choosing SignalR mode, configuring upstreams/custom domains, securing with Entra ID/MI, scaling/sharding, or tracing issues, and other Azure SignalR Service related development tasks. Not for Azure Web PubSub (use azure-web-pubsub), Azure Service Bus (use azure-service-bus), Azure Event Hubs (use azure-event-hubs).
azure-site-recovery
Expert knowledge for Azure Site Recovery development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when planning ASR for VMware/Hyper‑V, configuring Recovery Services vaults, scripting with PowerShell/Terraform, integrating ExpressRoute/Traffic Manager, or protecting AD/SQL/SAP workloads, and other Azure Site Recovery related development tasks. Not for Azure Backup (use azure-backup), Azure Migrate (use azure-migrate), Azure Virtual Machines (use azure-virtual-machines), Azure Virtual Machine Scale Sets (use azure-vm-scalesets).
azure-sovereign-us
Expert knowledge for Azure US Government development including decision making, architecture & design patterns, security, configuration, integrations & coding patterns, and deployment. Use when handling FedRAMP/DoD IL5 scope, SACA patterns, Gov CI/CD, Gov Marketplace, or sovereign APIs, and other Azure US Government related development tasks. Not for Azure Local (use azure-local), Azure Arc (use azure-arc), Azure Networking (use azure-networking), Azure Security (use azure-security).
azure-speech
Expert knowledge for Azure AI Speech development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using STT/TTS, custom speech or voice training, Voice Live/avatars, batch jobs, or telephony integrations, and other Azure AI Speech related development tasks. Not for Azure Communication Services (use azure-communication-services), Azure AI Bot Service (use azure-bot-service), Azure AI Video Indexer (use azure-video-indexer), Azure AI Immersive Reader (use azure-immersive-reader).
azure-spring-apps
Expert knowledge for Azure Spring Apps development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring ASA networking/security, Tanzu tools, observability/APM, CI/CD deployments, or blue‑green releases, and other Azure Spring Apps related development tasks. Not for Azure App Service (use azure-app-service), Azure Container Apps (use azure-container-apps), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Functions (use azure-functions).
azure-sql-database
Expert knowledge for Azure SQL Database development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when choosing DTU/vCore/serverless/Hyperscale, configuring geo‑replication/Data Sync, or securing with Entra/TDE, and other Azure SQL Database related development tasks. Not for Azure SQL Managed Instance (use azure-sql-managed-instance), SQL Server on Azure Virtual Machines (use azure-sql-virtual-machines), Azure Cosmos DB (use azure-cosmos-db), Azure Database for PostgreSQL (use azure-database-postgresql).
azure-sql-managed-instance
Expert knowledge for Azure SQL Managed Instance development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using MI Link, geo-replication/HA, Kerberos/Entra auth, TDE/Key Vault, or Extended Events, and other Azure SQL Managed Instance related development tasks. Not for Azure SQL Database (use azure-sql-database), SQL Server on Azure Virtual Machines (use azure-sql-virtual-machines), Azure Database for MySQL (use azure-database-mysql), Azure Database for PostgreSQL (use azure-database-postgresql).
azure-sql-virtual-machines
Expert knowledge for SQL Server on Azure Virtual Machines development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring SQL Server on Azure VMs with Always On AG/FCI, WSFC, IaaS Agent, Blob backups, or Entra auth, and other SQL Server on Azure Virtual Machines related development tasks. Not for Azure SQL Database (use azure-sql-database), Azure SQL Managed Instance (use azure-sql-managed-instance), Azure Virtual Machines (use azure-virtual-machines), Azure Data Science Virtual Machines (use azure-data-science-vm).
azure-sre-agent
Expert knowledge for Azure Sre Agent development including troubleshooting, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when wiring SRE Agent to DevOps/Teams/ServiceNow, configuring tools/hooks, querying KQL telemetry, or deploying as a Teams bot, and other Azure Sre Agent related development tasks. Not for Azure Monitor (use azure-monitor), Azure Reliability (use azure-reliability), Azure Resiliency (use azure-resiliency), Azure Service Health (use azure-service-health).
azure-stack-edge
Expert knowledge for Azure Stack Edge development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when running IoT Edge or GPU/Kubernetes apps, configuring VMs/storage/networking, or managing device updates, and other Azure Stack Edge related development tasks. Not for Azure Data Box (use azure-data-box-family), Azure IoT Edge (use azure-iot-edge), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Virtual Machines (use azure-virtual-machines).
azure-static-web-apps
Expert knowledge for Azure Static Web Apps development including troubleshooting, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when wiring SWA APIs to Azure DBs, configuring custom domains/auth, CI/CD, preview slots, or Front Door/CDN, and other Azure Static Web Apps related development tasks. Not for Azure App Service (use azure-app-service), Azure Functions (use azure-functions), Azure Container Apps (use azure-container-apps), Azure Kubernetes Service (AKS) (use azure-kubernetes-service).
azure-stream-analytics
Expert knowledge for Azure Stream Analytics development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building jobs with Event Hubs/Kafka, Cosmos DB/SQL outputs, ML/Functions integration, autoscale, or SU tuning, and other Azure Stream Analytics related development tasks. Not for Azure Data Factory (use azure-data-factory), Azure Event Hubs (use azure-event-hubs), Azure Synapse Analytics (use azure-synapse-analytics), Azure Databricks (use azure-databricks).
azure-synapse-analytics
Expert knowledge for Azure Synapse Analytics development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing Synapse workspaces, Spark pools, dedicated/serverless SQL, Delta Lake, or Synapse Link workloads, and other Azure Synapse Analytics related development tasks. Not for Azure Data Factory (use azure-data-factory), Azure Data Explorer (use azure-data-explorer), Azure Databricks (use azure-databricks), Azure HDInsight (use azure-hdinsight).
azure-table-storage
Expert knowledge for Azure Table Storage development including best practices, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when managing Entra ID/RBAC access, monitoring metrics/logs, tuning partitions/keys, or scripting tables via PowerShell, and other Azure Table Storage related development tasks. Not for Azure Cosmos DB (use azure-cosmos-db), Azure Blob Storage (use azure-blob-storage), Azure Queue Storage (use azure-queue-storage), Azure Files (use azure-files).
azure-test-plans
Expert knowledge for Azure Test Plans development including limits & quotas, security, and integrations & coding patterns. Use when configuring test fields, managing test access, or automating suites, runs, and configs via tcm.exe, and other Azure Test Plans related development tasks. Not for Azure DevOps (use azure-devops), Azure Boards (use azure-boards), Azure Pipelines (use azure-pipelines), Azure App Testing (use azure-app-testing).
azure-traffic-manager
Expert knowledge for Azure Traffic Manager development including troubleshooting, best practices, decision making, architecture & design patterns, configuration, and integrations & coding patterns. Use when configuring profiles/endpoints, routing methods, RUM JS, nested designs, or Traffic View–based tuning, and other Azure Traffic Manager related development tasks. Not for Azure Front Door (use azure-front-door), Azure Load Balancer (use azure-load-balancer), Azure Virtual WAN (use azure-virtual-wan), Azure Application Gateway (use azure-application-gateway).
azure-translator
Expert knowledge for Azure Translator development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Translator text/document APIs, custom models, glossaries, Docker containers, or Power Automate flows, and other Azure Translator related development tasks. Not for Azure AI Language (use azure-language-service), Azure AI Speech (use azure-speech), Azure AI Immersive Reader (use azure-immersive-reader), Azure AI Search (use azure-cognitive-search).
azure-update-manager
Expert knowledge for Azure Update Manager development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing VM/Arc patching, Automanage profiles, hotpatch/ESU schedules, change tracking, or SDK/REST automation, and other Azure Update Manager related development tasks. Not for Azure Automation (use azure-automation), Azure Monitor (use azure-monitor), Azure Osconfig (use azure-osconfig), Azure Virtual Machines (use azure-virtual-machines).
azure-video-indexer
Expert knowledge for Azure AI Video Indexer development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Video Indexer APIs/widgets, live camera indexing, custom speech/brand models, or Azure OpenAI integrations, and other Azure AI Video Indexer related development tasks. Not for Azure AI services (use microsoft-foundry-tools), Azure AI Vision (use azure-ai-vision).
azure-virtual-desktop
Expert knowledge for Azure Virtual Desktop development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when working with FSLogix profiles, MSIX/App Attach, autoscale/Start VM on Connect, Teams optimization, or SSO/MFA, and other Azure Virtual Desktop related development tasks. Not for Azure Virtual Machines (use azure-virtual-machines), Azure Dev Box (use azure-dev-box).
azure-virtual-machines
Expert knowledge for Azure Virtual Machines development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when choosing VM sizes, configuring scale sets, using Trusted Launch, encrypting disks, or automating via CLI/ARM, and other Azure Virtual Machines related development tasks. Not for Azure Virtual Machine Scale Sets (use azure-vm-scalesets), SQL Server on Azure Virtual Machines (use azure-sql-virtual-machines), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure App Service (use azure-app-service).
azure-virtual-network
Expert knowledge for Azure Virtual Network development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, and configuration. Use when designing VNets, NSGs, service endpoints, VNet peering, VPN gateways, or Azure Firewall/NAT gateways, and other Azure Virtual Network related development tasks. Not for Azure Networking (use azure-networking), Azure Virtual Network Manager (use azure-virtual-network-manager), Azure Virtual WAN (use azure-virtual-wan), Azure VPN Gateway (use azure-vpn-gateway).
azure-virtual-network-manager
Expert knowledge for Azure Virtual Network Manager development including troubleshooting, limits & quotas, security, configuration, and integrations & coding patterns. Use when managing AVNM IPAM pools, network groups, cross-tenant connectivity, security admin rules, or ARG queries, and other Azure Virtual Network Manager related development tasks. Not for Azure Virtual Network (use azure-virtual-network), Azure Virtual WAN (use azure-virtual-wan), Azure Network Watcher (use azure-network-watcher), Azure Networking (use azure-networking).
azure-virtual-wan
Expert knowledge for Azure Virtual WAN development including troubleshooting, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when designing Virtual WAN hubs, P2S VPN, ExpressRoute/SD‑WAN connectivity, NVAs/firewalls, or Entra ID VPN access, and other Azure Virtual WAN related development tasks. Not for Azure Virtual Network (use azure-virtual-network), Azure VPN Gateway (use azure-vpn-gateway), Azure ExpressRoute (use azure-expressroute), Azure Traffic Manager (use azure-traffic-manager).
azure-vm-scalesets
Expert knowledge for Azure Virtual Machine Scale Sets development including troubleshooting, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring VMSS autoscale, upgrade modes, zones/PPGs, Spot/standby pools, or disk encryption with Key Vault, and other Azure Virtual Machine Scale Sets related development tasks. Not for Azure Virtual Machines (use azure-virtual-machines), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure App Service (use azure-app-service), Azure Service Fabric (use azure-service-fabric).
azure-vmware-solution
Expert knowledge for Azure VMware Solution development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring AVS with HCX/NSX, vSAN/stretched clusters, Citrix/Horizon, Cloud Director, or JetStream DR, and other Azure VMware Solution related development tasks. Not for Azure Virtual Machines (use azure-virtual-machines), Azure Large Instances (use azure-large-instances), Azure Baremetal Infrastructure (use azure-baremetal-infrastructure), SAP HANA on Azure Large Instances (use azure-sap).
azure-vpn-gateway
Expert knowledge for Azure VPN Gateway development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring S2S/P2S tunnels, BGP routing, IPsec/IKE policies, Entra/RADIUS auth, or ExpressRoute VPNs, and other Azure VPN Gateway related development tasks. Not for Azure ExpressRoute (use azure-expressroute), Azure Virtual WAN (use azure-virtual-wan), Azure Virtual Network (use azure-virtual-network), Azure Virtual Network Manager (use azure-virtual-network-manager).
azure-web-application-firewall
Expert knowledge for Azure Web Application Firewall development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring Front Door/App Gateway WAF rules, rate limits, bot/CAPTCHA, Sentinel logging, or IaC deployments, and other Azure Web Application Firewall related development tasks. Not for Azure Application Gateway (use azure-application-gateway), Azure Front Door (use azure-front-door), Azure Firewall (use azure-firewall), Azure DDos Protection (use azure-ddos-protection).
azure-web-pubsub
Expert knowledge for Azure Web PubSub development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building WebSocket/MQTT apps, using Socket.IO, geo-replication, private endpoints, or Functions bindings, and other Azure Web PubSub related development tasks. Not for Azure SignalR Service (use azure-signalr-service), Azure Event Hubs (use azure-event-hubs), Azure Service Bus (use azure-service-bus), Azure Relay (use azure-relay).
azure-well-architected
Expert guidance for designing, assessing, and optimizing Azure workloads using Azure Well Architected. Covers design review checklists, recommendations, design principles, tradeoffs, service guides, workload patterns, and assessment questions. Use when designing AI, SAP, SaaS, HPC, AVD/AVS workloads, or choosing regions/AZs with cost–reliability tradeoffs, and other Azure Well Architected related development tasks.
microsoft-foundry
Expert knowledge for Microsoft Foundry (aka Azure AI Foundry) development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building Foundry agents with Azure OpenAI, RAG/indexing, MCP/OpenAPI tools, eval workflows, or CI/CD deployments, and other Microsoft Foundry related development tasks. Not for Microsoft Foundry Classic (use microsoft-foundry-classic), Microsoft Foundry Local (use microsoft-foundry-local), Microsoft Foundry Tools (use microsoft-foundry-tools).
microsoft-foundry-classic
Expert knowledge for Microsoft Foundry Classic (aka Azure AI Foundry classic) development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building Foundry agents, RAG/vector search, function calling, realtime audio, or fine-tuned Azure OpenAI models, and other Microsoft Foundry Classic related development tasks. Not for Microsoft Foundry (use microsoft-foundry), Microsoft Foundry Local (use microsoft-foundry-local), Microsoft Foundry Tools (use microsoft-foundry-tools).
microsoft-foundry-local
Expert knowledge for Microsoft Foundry Local (aka Azure AI Foundry Local) development including troubleshooting, best practices, decision making, configuration, and integrations & coding patterns. Use when using Foundry Local CLI, REST/SDK APIs, OpenAI clients, LangChain/Open WebUI, or Olive model builds, and other Microsoft Foundry Local related development tasks. Not for Microsoft Foundry (use microsoft-foundry), Microsoft Foundry Classic (use microsoft-foundry-classic), Microsoft Foundry Tools (use microsoft-foundry-tools), Azure Local (use azure-local).
microsoft-foundry-tools
Expert knowledge for Microsoft Foundry Tools (aka Azure AI services, Azure Cognitive Services) development including best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, and integrations & coding patterns. Use when using Content Moderator, Content Understanding analyzers, REST/.NET APIs, quotas, or secure Foundry setups, and other Microsoft Foundry Tools related development tasks. Not for Microsoft Foundry (use microsoft-foundry), Microsoft Foundry Classic (use microsoft-foundry-classic), Microsoft Foundry Local (use microsoft-foundry-local).
ci-cd
Design CI/CD pipelines for GitHub Actions, GitLab CI, and CircleCI with matrix builds, test sharding, caching, Docker layer caching, OIDC auth, deployment strategies (rolling, blue-green, canary), auto-rollback, self-hosted runners, and environment protection with manual approvals. Use when user asks to set up CI/CD, write a pipeline, configure GitHub Actions/GitLab CI/CircleCI, automate deployments, or set up build/test/deploy workflows. Do NOT use for Dockerfile authoring (use docker), K8s manifests (use kubernetes), or Terraform config (use terraform).
docker
Optimize Docker images with multi-stage builds, distroless bases, BuildKit cache mounts, multi-arch builds, compose watch, security hardening (non-root, seccomp, capabilities drop), and vulnerability scanning via docker scout/trivy. Use when user asks to write a Dockerfile, optimize image size, set up docker-compose, debug containers, harden container security, or scan for CVEs. Do NOT use for Kubernetes deployments (use kubernetes), CI/CD pipeline design (use ci-cd), or Terraform (use terraform).
error-handler
Design error handling, structured logging, and observability with OpenTelemetry (traces, metrics, logs), error classification, recovery patterns (retry with jitter, circuit breaker, bulkhead, timeout), error budgets/SLOs with burn rate alerts, and production incident triage. Use when user asks to implement error handling, logging, monitoring, observability, OpenTelemetry, error boundaries, circuit breakers, retry logic, or SLO tracking. Do NOT use for incident runbooks (use runbook-gen), vendor-specific APM setup (Datadog, Sentry agent config), or K8s debugging.
kubernetes
Deploy, manage, and debug Kubernetes in production — Deployments, Services, Gateway API, Service Mesh (Istio/Linkerd/Cilium), eBPF observability (Cilium Hubble), security hardening (Pod Security Standards, OPA/Kyverno, seccomp, runtime security with Falco/Tetragon), Helm, HPA, PDB, topology spread, and debugging. Use when user asks to write K8s manifests, deploy to a cluster, debug pods, set up Gateway API, configure autoscaling, or harden cluster security. Do NOT use for Dockerfiles (use docker), CI/CD pipeline design (use ci-cd), or Terraform infrastructure (use terraform).
gitops-repo-audit
Audit and validate Flux CD GitOps repositories by scanning local repo files (not live clusters) — runs Kubernetes schema validation, detects deprecated Flux APIs, reviews RBAC/multi-tenancy/secrets management, and produces a prioritized GitOps report. Use when users ask to audit, analyze, validate, review, or security-check a GitOps repo.
senior-computer-vision
World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.
senior-data-scientist
World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics. Expertise in Python (NumPy, Pandas, Scikit-learn), R, SQL, statistical methods, A/B testing, time series, and business intelligence. Includes experiment design, feature engineering, model evaluation, and stakeholder communication. Use when designing experiments, building predictive models, performing causal analysis, or driving data-driven decisions.
senior-ml-engineer
World-class ML engineering skill for productionizing ML models, MLOps, and building scalable ML systems. Expertise in PyTorch, TensorFlow, model deployment, feature stores, model monitoring, and ML infrastructure. Includes LLM integration, fine-tuning, RAG systems, and agentic AI. Use when deploying ML models, building ML platforms, implementing MLOps, or integrating LLMs into production systems.
senior-prompt-engineer
World-class prompt engineering skill for LLM optimization, prompt patterns, structured outputs, and AI product development. Expertise in Claude, GPT-4, prompt design patterns, few-shot learning, chain-of-thought, and AI evaluation. Includes RAG optimization, agent design, and LLM system architecture. Use when building AI products, optimizing LLM performance, designing agentic systems, or implementing advanced prompting techniques.
deploying-machine-learning-models
This skill enables Claude to deploy machine learning models to production environments. It automates the deployment workflow, implements best practices for serving models, optimizes performance, and handles potential errors. Use this skill when the user requests to deploy a model, serve a model via an API, or put a trained model into a production environment. The skill is triggered by requests containing terms like "deploy model," "productionize model," "serve model," or "model deployment."
aegisops-ai
Autonomous DevSecOps & FinOps Guardrails. Orchestrates Gemini 3 Flash to audit Linux Kernel patches, Terraform cost drifts, and K8s compliance.
securing-cloud-and-supply-chain
云原生与软件供应链安全防御。容器/K8s 加固、Service Mesh、CI/CD 安全、SLSA/SBOM/Sigstore、云 IAM、Secrets 管理、IaC 安全。Use when hardening Kubernetes clusters, auditing CI/CD pipelines, implementing supply chain security, managing cloud IAM, or reviewing IaC code.
dstack
dstack is an open-source control plane for GPU provisioning and orchestration across GPU clouds, Kubernetes, and on-prem clusters.
chinese-documentation
中文技术文档写作规范——排版、术语、结构一步到位,告别机翻味
chaos
Injects controlled faults for resilience testing on non-prod. Triggers: chaos, fault injection, latency injection, dependency kill, resilience test.
debug-buttercup
Debugs the Buttercup CRS (Cyber Reasoning System) running on Kubernetes. Use when diagnosing pod crashes, restart loops, Redis failures, resource pressure, disk saturation, DinD issues, or any service misbehavior in the crs namespace. Covers triage, log analysis, queue inspection, and common failure patterns for: redis, fuzzer-bot, coverage-bot, seed-gen, patcher, build-bot, scheduler, task-server, task-downloader, program-model, litellm, dind, tracer-bot, merger-bot, competition-api, pov-reproducer, scratch-cleaner, registry-cache, image-preloader, ui.
azure-analysis-services
Expert knowledge for Azure Analysis Services development including troubleshooting. Use when testing server connections, debugging gateway or firewall blocks, or checking connection strings and ports, and other Azure Analysis Services related development tasks. Not for Azure Synapse Analytics (use azure-synapse-analytics), Azure SQL Database (use azure-sql-database), Azure SQL Managed Instance (use azure-sql-managed-instance), SQL Server on Azure Virtual Machines (use azure-sql-virtual-machines).
azure-business-process-tracking
Expert knowledge for Azure Business Process Tracking development including deployment. Use when creating CI/CD pipelines, automating builds, running tests, and deploying tracking solutions via DevOps tools, and other Azure Business Process Tracking related development tasks. Not for Azure Monitor (use azure-monitor), Azure Logic Apps (use azure-logic-apps), Azure Data Factory (use azure-data-factory), Azure Machine Learning (use azure-machine-learning).
azure-internet-peering
Expert knowledge for Azure Internet Peering development including troubleshooting. Use when validating Peering Service prefixes, checking prefix registration, verifying routing, or fixing reachability issues, and other Azure Internet Peering related development tasks. Not for Azure ExpressRoute (use azure-expressroute), Azure Virtual Network (use azure-virtual-network), Azure Virtual WAN (use azure-virtual-wan), Azure VPN Gateway (use azure-vpn-gateway).
azure-open-datasets
Expert knowledge for Azure Open Datasets development including limits & quotas. Use when handling non-Spark dataset downloads, throttling behavior, quota limits, retry logic, or rate-limit workarounds, and other Azure Open Datasets related development tasks. Not for Azure Data Explorer (use azure-data-explorer), Azure Synapse Analytics (use azure-synapse-analytics), Azure Databricks (use azure-databricks), Azure Machine Learning (use azure-machine-learning).
azure-peering-service
Expert knowledge for Azure Peering Service development including best practices. Use when designing Peering Service prefixes, routing policies, prefix validation rules, or connectivity constraints, and other Azure Peering Service related development tasks. Not for Azure Internet Peering (use azure-internet-peering), Azure Virtual Network (use azure-virtual-network), Azure Virtual WAN (use azure-virtual-wan), Azure ExpressRoute (use azure-expressroute).
azure-quotas
Expert knowledge for Azure Quotas development including limits & quotas. Use when requesting per-region Storage account quota increases, checking limits, or filing Azure support requests, and other Azure Quotas related development tasks. Not for Azure Cost Management (use azure-cost-management), Azure Monitor (use azure-monitor), Azure Policy (use azure-policy), Azure Resource Manager (use azure-resource-manager).
ai-analyze-permissions
Use when Claude Code keeps asking to approve commands you have already approved, when settings.local.json has grown large, or when you want to consolidate permission grants into wildcard patterns. Trigger for 'too many permission prompts', 'clean up permissions', 'audit my settings', 'consolidate allow rules'. Claude Code only — not available in GitHub Copilot, Antigravity, or Codex.
kubernetes
Kubernetes workflow skill. Use this skill when a user needs workload manifests, rollout strategy, service exposure, or cluster operations guidance.
one-way-door
Use this skill when creating new files that represent architectural decisions — data models, infrastructure configs, auth boundaries, API contracts, CI/CD pipelines, or event systems. Flags irreversible decisions and forces a discussion about trade-offs before committing.
icon-lookup
Workaround for Claude Code filtering BMP PUA Unicode (U+E000-U+F8FF). Supplementary PUA Nerd Font icons like (U+F0000+, e.g. nf-md-github, nf-md-kubernetes, nf-md-battery) can be written directly. BMP PUA icons (Powerline, Font Awesome, Devicons) require placeholder syntax like {{ U+E0A0 }} or {{ nf-fa-star }} (without spaces), which hooks auto-convert. Invoke when reading or writing Starship configs, tmux themes, shell prompts, or statuslines.
health-check-endpoints
Health check endpoints for liveness, readiness, dependency monitoring. Use for Kubernetes, load balancers, auto-scaling, or encountering probe failures, startup delays, dependency checks, timeout configuration errors.
model-deployment
Deploy ML models with FastAPI, Docker, Kubernetes. Use for serving predictions, containerization, monitoring, drift detection, or encountering latency issues, health check failures, version conflicts.
devops-sre-master
DevOps 与站点可靠性工程 (SRE) — 平台 / 基础设施 / 可靠性工程师的认知操作系统, 覆盖软件交付 + 运维全生命周期 (CI/CD 与发布工程 trunk-based + 渐进式发布 canary/blue-green/feature flag + GitOps Argo CD/Flux / 基础设施即代码 Terraform/OpenTofu/Pulumi/Ansible + policy-as-code OPA / 容器与编排 Docker/Kubernetes + Helm/Kustomize + service mesh Istio/Linkerd / 可观测性 Prometheus + Loki + OpenTelemetry + Honeycomb + eBPF + RED/USE / SLO-SLI-error budget 与可靠性工程 Google SRE 学科 + 容量规划 + 优雅降级 / 事件管理与 on-call 事件指挥 + PagerDuty + runbook + 无指责复盘 + MTTR / 云平台与 FinOps AWS/GCP/Azure + 成本优化 + 弹性伸缩 / 平台工程与开发者体验 IDP + Backstage + golden path + Team Topologies / DevSecOps 与供应链安全 shift-left + SBOM + SLSA + sigstore + Vault / 韧性与混沌工程 fault injection + game day + 安全科学 / DORA 指标与工程效能 部署频率 + 变更前置时间 + 变更失败率 + Accelerate 研究 / 数据库与有状态运维 schema 迁移 + 备份容灾) — 不含 通用应用开发 / 纯云销售认证速成 / 'DevOps = 跑 Jenkins 的岗位' 窄化误解 / ITIL 工单文化传统运维 (旧范式仅做边界) / 把手工运维 ClickOps 当稳态 (是 toil, 本 skill 核心反模式) (DevOps & Site Reliability Engineering — the cognitive operating system of platform / infrastructure / reliability practitioners
provisioning-infrastructure
Cloud-native infrastructure knowledge reference covering Kubernetes, Helm, Kustomize, Operators, CRDs, GitOps (ArgoCD, Flux), and IaC (Terraform, Pulumi, CDK). Use when provisioning infrastructure, managing clusters, or working with GitOps workflows.
vmware-aiops
Use this skill whenever the user needs to manage VMs in VMware/vSphere/ESXi — it's the entry point for all VM operations. Directly handles: power on/off, clone, snapshot, migrate, deploy from OVA or templates, run commands inside VMs, batch operations, cluster management, and vCenter alarm acknowledgment. Always use this skill for any "power on", "clone", "deploy", "migrate", "batch", "guest exec", "alarm", or VM lifecycle task when the context is explicitly VMware, vSphere, or ESXi. Do NOT use for read-only queries (use vmware-monitor), NSX networking (use vmware-nsx), storage/iSCSI/vSAN (use vmware-storage), or Kubernetes cluster lifecycle (use vmware-vks). For multi-step workflows use vmware-pilot. For load balancing/AVI/AKO use vmware-avi.
cloud-security--container-hardening
AWS/Azure/GCP security auditing, container and Kubernetes hardening, Infrastructure as Code scanning, and cloud compliance assessment
vulnerability-scanning--assessment
Dependency auditing, CVE detection, configuration security review, CVSS scoring, and prioritized vulnerability reporting
devops-excellence
DevOps and CI/CD expert. Use when setting up pipelines, containerizing applications, deploying to Kubernetes, or implementing release strategies. Covers GitHub Actions, Docker, K8s, Terraform, and GitOps.
deep-research
Execute autonomous multi-step research using Google Gemini Deep Research Agent. Use for: market analysis, competitive landscaping, literature reviews, technical research, due diligence. Takes 2-10 minutes but produces detailed, cited reports. Costs $2-5 per task.
sync-skills
(project) Use when a skill in skills/ has its name or description changed, or is added or removed — syncs README.md, settings.json, and hal_dotfiles.json
deploying-infra
Validate infrastructure changes and, after explicit confirmation, apply Terraform, Helm, Kustomize, or Kubernetes deployments. Use when the user says "deploy", "deploy to staging", "terraform apply", "helm upgrade", "kubectl apply", "rollout", "deploy check", "validate deployment", or "validate infrastructure". Dockerfiles and GitHub Actions are validate-only here. NOT for ongoing service troubleshooting, cloud inspection, rollback investigation, or authoring infra from scratch; use operating-infra for those.
managing-infra
Infrastructure patterns for Kubernetes, Terraform, Helm, Kustomize, and GitHub Actions. Use when making K8s architectural decisions, choosing between Helm vs Kustomize, structuring Terraform modules, writing CI/CD workflows, or applying security best practices. NOT for cloud CLI commands (see using-cloud-cli) or deploy validation and apply workflows (see deploying-infra).
using-cloud-cli
Cloud CLI patterns for GCP and AWS. Use when running bq queries, gcloud commands, aws commands, or making decisions about cloud services. Covers BigQuery cost optimization and operational best practices. NOT for Terraform or Kubernetes architectural decisions (see managing-infra).
writing-shell
Idiomatic shell development for POSIX sh, Bash, Zsh, Fish, hooks, CI shell steps, and scriptable CLI glue. Use when writing or changing `.sh`, `.bash`, `.zsh`, `.fish`, `.bats`, shell functions, shell pipelines, or command-runner recipes. Emphasizes portability, quoting, safe filesystem/process handling, non-TUI CLI tools, ShellCheck, shfmt, Bats, and ShellSpec. NOT for Python, TypeScript, Go, web code, or infrastructure operations.
az-cli
Use the Azure CLI (`az`) to manage Azure resources from the command line. Trigger this skill whenever the user asks to create, configure, manage, deploy, or interact with any Azure resource — even if they don't explicitly mention "az cli". Also trigger when the user asks about Azure CLI commands, syntax, or wants to know how to do something in Azure from the terminal.
kubernetes-skill
Prevent Kubernetes hallucinations by diagnosing and fixing failure modes: insecure workload defaults, resource starvation, network exposure, privilege sprawl, fragile rollouts, and API drift. Use when generating, reviewing, refactoring, or migrating manifests, Helm charts, Kustomize overlays, cluster policies, and platform-specific Kubernetes work for EKS, GKE, AKS, OpenShift, GitOps controllers, or observability stacks.
cloud-devops
Cloud infrastructure and DevOps workflow covering AWS, Azure, GCP, Kubernetes, Terraform, CI/CD, monitoring, and cloud-native development.
devops-deployment
CI/CD pipelines, containerization, Kubernetes, and infrastructure as code patterns
docker-expert
Docker containerization expert with deep knowledge of multi-stage builds, image optimization, container security, Docker Compose orchestration, and production deployment patterns. Use PROACTIVELY for Dockerfile optimization, container issues, image size problems, security hardening, networking, and orchestration challenges.
iac-checkov
Infrastructure as Code (IaC) security scanning using Checkov with 750+ built-in policies for Terraform, CloudFormation, Kubernetes, Dockerfile, and ARM templates. Use when: (1) Scanning IaC files for security misconfigurations and compliance violations, (2) Validating cloud infrastructure against CIS, PCI-DSS, HIPAA, and SOC2 benchmarks, (3) Detecting secrets and hardcoded credentials in IaC, (4) Implementing policy-as-code in CI/CD pipelines, (5) Generating compliance reports with remediation guidance for cloud security posture management.
kubernetes-deployment
Kubernetes deployment workflow for container orchestration, Helm charts, service mesh, and production-ready K8s configurations.
policy-opa
Policy-as-code enforcement and compliance validation using Open Policy Agent (OPA). Use when: (1) Enforcing security and compliance policies across infrastructure and applications, (2) Validating Kubernetes admission control policies, (3) Implementing policy-as-code for compliance frameworks (SOC2, PCI-DSS, GDPR, HIPAA), (4) Testing and evaluating OPA Rego policies, (5) Integrating policy checks into CI/CD pipelines, (6) Auditing configuration drift against organizational security standards, (7) Implementing least-privilege access controls.
sast-horusec
Multi-language static application security testing using Horusec with support for 18+ programming languages and 20+ security analysis tools. Performs SAST scans, secret detection in git history, and provides vulnerability findings with severity classification. Use when: (1) Analyzing code for security vulnerabilities across multiple languages simultaneously, (2) Detecting exposed secrets and credentials in git history, (3) Integrating SAST into CI/CD pipelines for secure SDLC, (4) Performing comprehensive security analysis during development, (5) Managing false positives and prioritizing security findings.
sca-trivy
Software Composition Analysis (SCA) and container vulnerability scanning using Aqua Trivy for identifying CVE vulnerabilities in dependencies, container images, IaC misconfigurations, and license compliance risks. Use when: (1) Scanning container images and filesystems for vulnerabilities and misconfigurations, (2) Analyzing dependencies for known CVEs across multiple languages (Go, Python, Node.js, Java, etc.), (3) Detecting IaC security issues in Terraform, Kubernetes, Dockerfile, (4) Integrating vulnerability scanning into CI/CD pipelines with SARIF output, (5) Generating Software Bill of Materials (SBOM) in CycloneDX or SPDX format, (6) Prioritizing remediation by CVSS score and exploitability.
dcg
Handle blocked destructive commands. Use when dcg blocks rm -rf, git reset --hard, DROP DATABASE, kubectl delete, or when configuring agent safety guardrails.
otel-collector
Expert guidance for configuring and deploying the OpenTelemetry Collector. Use when setting up a Collector pipeline, configuring receivers, exporters, or processors, deploying a Collector to Kubernetes or Docker, or forwarding telemetry to Dash0. Triggers on requests involving collector, pipeline, OTLP receiver, exporter, or Dash0 collector setup.
otel-instrumentation
Configures trace spans, defines custom metrics, sets up log exporters, and optimizes sampling strategies for OpenTelemetry instrumentation. Use when instrumenting applications with traces, metrics, or logs. Triggers on requests for observability, telemetry, tracing, metrics collection, logging integration, or OTel setup.
cloud-iam-deep
Cloud IAM red-team attack chain across AWS, Azure, GCP — focused on EXTERNAL exploitation paths and post-credential-discovery privilege analysis. Covers IAM enumeration (aws iam, az role, gcloud iam), STS/AssumeRole chaining, Azure Managed Identity abuse (via SSRF/leak), GCP service account JSON abuse, IMDSv1/v2 attacks via SSRF, K8s ServiceAccount token privilege analysis once held (token discovery / cluster exposure is owned by hunt-k8s), role-trust-policy confused-deputy, cross-account assume-role enumeration, IAM privilege escalation patterns (24+ AWS, 8+ Azure, 6+ GCP), and AWS Cognito Identity Pool unauthenticated-role attack chain (GetId → GetCredentialsForIdentity → IAM role abuse). Built for the case where recon yields a credential (key, JSON, token) and you need to know what it grants and how to escalate. Use when an AWS key / Azure secret / GCP service account JSON / K8s SA token surfaces from a code repo, JS bundle, APK, breach corpus, or SSRF chain.
hunt-k8s
Hunt Kubernetes & Docker — API anonymous access, kubelet 10250 exec (SPDY/WebSocket, NOT plain POST) and the simpler /run primitive, etcd 2379 unauth, dashboard skip-login, RBAC misconfig, secret/SA-token abuse, docker.sock host escape, runc/container-escape (Leaky Vessels CVE-2024-21626), API-server-mediated nodes/proxy RCE, EphemeralContainers node-shell, bound/projected SA-token audience+expiry abuse, admission-controller bypass, Helm/Tiller remnants. Use when target runs containerized infra, exposes K8s ports (6443/10250/10255/2379/8443), or cloud metadata reveals K8s service accounts.
hunt-rce
Hunting skill for rce vulnerabilities. Built from 67 public bug bounty reports. Use when hunting rce on any target.
hunt-ssrf
Hunting skill for ssrf vulnerabilities. Built from 15 public bug bounty reports including AWS metadata SSRF (HackerOne $25k Analytics PDF, Shopify Exchange $25k, Capital One 106M-record breach, Dropbox/HelloSign $4,913), GCP metadata SSRF (Snapchat $4k), Azure IMDS SSRF (Azure DevOps $15k chain, ChatGPT Custom Actions MSRC), DNS rebinding SSRF (Concrete CMS, GitLab UrlBlocker), gopher-protocol-to-Redis-RCE (Yahoo Mail $15k), link-preview SSRF (Reddit Matrix $6k), and headless-browser PDF-generator SSRF chains. Use when hunting SSRF on any target — OOB Collaborator confirmation mandatory for blind cases.
starlark-dev
Develop and debug Kurtosis Starlark packages. Create packages from scratch, understand the plan-based execution model, use print() debugging, handle future references, and test packages locally. Use when writing or troubleshooting .star files.
ansible-playbooks
Write idempotent Ansible playbooks and roles for server configuration, K8s node provisioning, and application bootstrap.
backup-restore
PostgreSQL backup and restore with pgBackRest — full/incremental/WAL, PITR, K8s CronJob scheduling, and restore verification.
capacity-planning
Forecast infrastructure capacity needs — traffic projection, resource headroom calculations, node pool sizing, K8s cluster capacity.
chaos-engineering
Design and run chaos experiments in Kubernetes — pod failures, network partitions, resource pressure with LitmusChaos and manual chaos.
cluster-operations
Day-2 cluster operations — node management, etcd backup/restore, certificate rotation, namespace lifecycle.
container-hardening
Harden container images and Kubernetes workload security contexts — distroless, multi-stage, minimal attack surface.
distributed-tracing
Implement distributed tracing with OpenTelemetry, Tempo/Jaeger — instrumentation, sampling, and trace-to-log correlation. Use when the user asks about distributed tracing, OpenTelemetry setup, span instrumentation, trace propagation, or connecting traces to logs and metrics.
gitlab-ci-patterns
GitLab CI/CD pipelines — include templates, environments, OIDC auth, caching, protected runners, deployment gates.
helm-charts
Design, structure, and test production-grade Helm charts with multi-environment overlays.
ingress-patterns
NGINX Ingress Controller patterns — TLS, rate limiting, CORS, rewrites, path-based routing, and MetalLB for bare-metal.
log-aggregation
Set up Loki or ELK log aggregation for K8s workloads — structured logging, log routing, and log-based alerting.
network-policies
Design and implement Kubernetes NetworkPolicy and Cilium network policies for namespace isolation and service-to-service access control.
opa-policies
Write OPA/Gatekeeper and Kyverno admission policies for Kubernetes security guardrails.
pod-troubleshooting
Systematic diagnosis of Kubernetes pod failures — CrashLoopBackOff, OOMKilled, Pending, ImagePullBackOff, and service connectivity issues. Use when the user encounters pods not starting, container restart loops, scheduling failures, or service unreachability in a K8s cluster.
rbac-design
Design minimal-privilege RBAC for workloads, operators, and human access in multi-tenant clusters.
redis-operations
Redis operational runbooks — memory management, eviction policy, persistence config, Sentinel/Cluster, K8s-hosted Redis ops.
resource-tuning
Right-size pod resources, configure HPA/VPA/KEDA, and eliminate resource waste in Kubernetes.
service-mesh
Implement service mesh for mTLS, traffic management, and observability — Istio and Linkerd patterns for Kubernetes.
sigstore-signing
Sign container images and artifacts with cosign (keyless via OIDC and key-based); verify signatures in CD pipelines and admission policies.
vpc-design
Design cloud-agnostic private networks — subnet layout, CIDR allocation, zone redundancy, routing, and bare-metal equivalent.
devops-deployment
Use when setting up CI/CD pipelines, containerizing applications, deploying to Kubernetes, or writing infrastructure as code. DevOps & Deployment covers GitHub Actions, Docker, Helm, and Terraform patterns.
analyzing-linux-elf-malware
Analyzes malicious Linux ELF (Executable and Linkable Format) binaries including botnets, cryptominers, ransomware, and rootkits targeting Linux servers, containers, and cloud infrastructure. Covers static analysis, dynamic tracing, and reverse engineering of x86_64 and ARM ELF samples. Activates for requests involving Linux malware analysis, ELF binary investigation, Linux server compromise assessment, or container malware analysis.
agentstack-server-debugging
Instructions for debugging agentstack-server during development
aws-rds
Provision and manage RDS databases. Configure backups, replication, and security. Use when deploying managed relational databases on AWS.
general-readme-skill
Use when generating or improving README.md for any project. Trigger on /readme, "generate readme", "write readme", "create project documentation", "更新README", "帮我写 README".
kelos
Author, debug, and operate Kelos resources (Task, Workspace, AgentConfig, TaskSpawner) on Kubernetes. Use when working with Kelos CRDs or the kelos CLI.
cluster-manage
Manage Kurtosis cluster settings. Switch between Docker and Kubernetes backends, list available clusters, and configure which cluster Kurtosis uses. Use when you need to change where Kurtosis runs enclaves.
devops-excellence
DevOps and CI/CD expert. Use when setting up pipelines, containerizing applications, deploying to Kubernetes, or implementing release strategies. Covers GitHub Actions, Docker, K8s, Terraform, and GitOps.
nw-authoritative-sources
Domain-specific authoritative source databases, search strategies by topic category, and source freshness rules
nw-devops
Designs CI/CD pipelines, infrastructure, observability, and deployment strategy. Use when preparing platform readiness for a feature.
nw-infrastructure-and-observability
Infrastructure as Code patterns (Terraform, Kubernetes), observability design (SLOs, metrics, alerting, dashboards), and pipeline security stages. Load when designing infrastructure, observability, or security scanning.
mkcareful
Session-scoped safety guardrails for destructive commands. Warns before rm -rf, DROP TABLE, force-push, git reset --hard, kubectl delete, and similar destructive operations. User can override each warning. Active for the current session only. Use when touching prod, debugging live systems, or working in a shared environment. Use when asked to "be careful", "safety mode", "prod mode", or "careful mode". NOT for scoping edits to a specific directory (see mk:freeze).
terradev-gpu-cloud
Cross-cloud GPU provisioning, K8s cluster creation, and inference overflow. Get real-time pricing across 11+ cloud providers, provision the cheapest GPUs in seconds, spin up production K8s clusters, and burst to cloud when your local GPU maxes out. BYOAPI — your keys never leave your machine.
infrastructure
云原生基础设施(K8s/Helm/Operator/GitOps/ArgoCD/Flux/IaC/Terraform)。
analyzing-kubernetes-audit-logs
Parses Kubernetes API server audit logs (JSON lines) to detect exec-into-pod, secret access, RBAC modifications, privileged pod creation, and anonymous API access. Builds threat detection rules from audit event patterns. Use when investigating Kubernetes cluster compromise or building k8s-specific SIEM detection rules.
accounting-maestro
Route accounting questions to the narrowest specialist in the catalog. Use when you do not already know the specialist needed. Not for direct accounting answers; Maestro classifies, dispatches, and synthesizes only. Dispatches single agent for focused tasks, parallel team (max 3) for multi-domain tasks. Never auto-dispatches any write-capable agent — requires explicit human confirmation before routing to any agent with ledger or ERP write access.
alibaba-ack-container-platform-operator
Operate ACK clusters (managed/dedicated/serverless), ACR container registries, ASM service mesh, and container workload placement. Guide ACK type selection, OIDC workload identity, and image vulnerability posture.
alibaba-actiontrail-audit-analyst
Query Alibaba Cloud ActionTrail management API call history, build governance audit reports, create SLS-based compliance evidence trails, and detect anomalous admin activity patterns.
alibaba-analyticdb-realtime
Operate AnalyticDB for MySQL and PostgreSQL, Hologres real-time OLAP analytics, and DAS real-time diagnostics for sub-second interactive analytics workloads.
alibaba-certificate-manager-issuer-review
Review Alibaba Cloud SSL Certificate Service — DV/OV/EV certificate lifecycle, auto-renewal configuration, certificate deployment to SLB/ALB/CDN/OSS, domain validation status, CAA record compliance, and expiry monitoring.
alibaba-change-impact-advisor
Pre-change blast radius analysis for Alibaba Cloud — Resource Directory OU scope mapping, RAM policy cascade effects, VPC peering and CEN impact, SLB backend pool changes, RDS connection pool disruption, and safe change sequencing.
alibaba-cost-anomaly-watch-coordinator
Detect and coordinate response to Alibaba Cloud cost anomalies — MaxCompute CU vs on-demand billing mismatch, ECS spot instance interruption cascades, CDN traffic spike billing, OSS API request cost explosions, budget alert → DingTalk notification → remediation playbook.
alibaba-cost-finops-analyst
Analyze Alibaba Cloud spend via Cost Manager, optimize Savings Plans and Reserved Instance coverage, design resource tagging strategy, investigate budget drift, and right-size over-provisioned ECS, RDS, and MaxCompute resources.
alibaba-daily-operations-briefing-coordinator
Coordinate the daily Alibaba Cloud operations standup — cost delta from Cost Manager, ActionTrail anomaly review, ACK pod failure triage, quota utilization warnings, Security Center finding review, and action item assignment.
alibaba-devops-cicd-operator
Build CI/CD pipelines with RDC (Research and Development Collaboration), Cloud Build, Flow pipeline automation, ACR (Container Registry) image lifecycle, and environment promotion strategies.
alibaba-ecs-compute-operator
Operate ECS instances, Auto Scaling groups, ECI serverless containers, and Cloud Assistant O&M automation. Handle instance lifecycle, image management, placement groups, spot/preemptible instances, and scheduled scaling.
alibaba-event-driven-architecture-review
Review Alibaba Cloud EventBridge, MNS (Message Notification Service), RocketMQ, and MSE event-driven designs — dead-letter queues, message ordering, idempotency, retry storm prevention, schema registry, and consumer group lag monitoring.
alibaba-function-serverless-operator
Deploy and operate Function Compute 3.0, SAE (Serverless App Engine) applications, and EDAS microservice apps. Guide the serverless vs. PaaS vs. container platform choice for each workload type.
alibaba-iac-change-safety-review
Review Terraform and ROS (Resource Orchestration Service) changes targeting Alibaba Cloud — blast radius analysis, resource deletion detection, cross-stack dependency impact, Resource Directory scope, and rollback plan completeness.
alibaba-kms-secret-lifecycle-steward
Audit and govern Alibaba Cloud KMS key lifecycles, Certificate Manager, SSM (Secrets Manager), and HSM key operations. Ensure encryption-at-rest coverage and rotation compliance across CMKs, envelope encryption, and certificate lifecycle.
alibaba-landing-zone-architect
Design Alibaba Cloud landing zone — Resource Management org tree, Cloud SSO, Control Policy (SCP equivalent), multi-account governance baseline, billing account structure, and ActionTrail centralization.
alibaba-live-ack-rollout-guard
Gate ACK deployment mutations, node pool scaling, and cluster version upgrades against rollback posture and workload disruption budget. Prevents irreversible cluster version upgrades from proceeding without PodDisruptionBudget verification, node drain confirmation, and explicit operator approval.
alibaba-live-cost-budget-action-guard
Gate live financial authority actions — budget threshold changes, Savings Plan purchases, and Reserved Instance commitments. These are committed spend or can trigger immediate service suspension.
alibaba-live-kms-key-mutation-guard
Gate KMS key deletion and disable operations. All data encrypted with a deleted CMK (OSS SSE-KMS, ECS encrypted disks, RDS/PolarDB TDE) becomes permanently and irrecoverably inaccessible. This guard enforces complete CMK dependency audits, deletion window confirmation, and explicit operator approval before any key state mutation.
alibaba-live-oss-bucket-policy-guard
Gate OSS bucket ACL and policy mutations — public-read/write ACL exposes data to internet crawlers within seconds; CN-* cross-border replication requires DSL Article 31 assessment.
alibaba-live-ram-policy-change-guard
Gate RAM policy/role mutations against the Alibaba Cloud account hierarchy. RAM AdministratorAccess assignment, policy deletion with active STS tokens, and Resource Directory Control Policy changes carry account-wide or org-wide blast radius. This guard enforces blast-radius assessment, STS token impact analysis, and explicit authority approval before any policy mutation is executed.
alibaba-live-rds-polardb-mutation-guard
Gate RDS/PolarDB instance deletion, spec downgrade, and backup policy removal — database deletion without verified backup is permanently destructive.
alibaba-load-balancer-traffic-engineer
Traffic engineering for Alibaba Cloud load balancers — CLB (Classic, legacy), ALB (Application Load Balancer, Layer 7 advanced routing), NLB (Network Load Balancer, Layer 4 high throughput), and GA (Global Accelerator) — type selection, health check design, WAF integration, and traffic distribution.
alibaba-maestro
Alibaba Cloud Maestro routing skill. Classify the user's Alibaba Cloud task, select the narrowest specialist agent or the right team of specialists from the catalog, and dispatch them — single specialist for focused tasks, parallel team (max 4) for multi-domain tasks. Never auto-dispatch live-guard agents. China-region aware — flags when workloads are in mainland China regions and applicable regulatory frameworks (MLPS 2.0, DSL, PIPL) differ from international regions.
alibaba-maxcompute-dataworks-analyst
Manage MaxCompute CU package governance, DataWorks scheduling, Quick BI reporting, and PAI ML platform. Optimize query cost and job scheduling efficiency for big data workloads.
alibaba-migration-architect
Plan Alibaba Cloud migrations using SMC (Server Migration Center), DTS (Data Transmission Service) for data sync, OSSImport for object storage migration, and design cutover sequencing with rollback paths.
alibaba-mse-microservice-engine
Configure and operate Alibaba MSE (Microservice Engine) — Nacos service discovery and configuration management, Sentinel rate limiting and circuit breaking, Seata distributed transactions, and ARMS APM for microservices observability.
alibaba-network-architect
Design Alibaba Cloud network topology — VPC peering, CEN for multi-VPC/multi-region connectivity, Express Connect for private circuits, SLB/ALB/NLB/CLB load balancer selection, and Smart Access Gateway for branch offices.
alibaba-observability-incident-responder
Respond to Alibaba Cloud incidents using CloudMonitor alarms, SLS log analytics, ARMS APM distributed tracing, and alert governance for ECS, RDS, ACK, and network services.
alibaba-oss-data-perimeter-governor
Govern Alibaba Cloud OSS data perimeters — bucket ACL and policy conflict resolution, Block Public Access configuration, cross-account access via RAM role, VPC endpoint binding for private access, WORM (Object Lock), and MLPS 2.0 data residency compliance.
alibaba-oss-storage-steward
Manage OSS lifecycle policies, bucket policy and ACL governance, NAS/CPFS shared file storage, cross-region replication, and access control hardening for Alibaba Cloud object and file storage.
alibaba-polardb-rds-dba
Operate PolarDB (MySQL/PG/Oracle) clusters and RDS instances — DAS diagnostics, database proxy, Global Database Network, backup strategy, and performance tuning.
alibaba-ram-iam-review
Audit Alibaba Cloud RAM users, groups, roles, and policies; review STS token lifecycle and scope; assess Resource Directory permission boundaries; review Control Policy statements for org-wide gaps or over-privilege.
alibaba-registry-artifact-governor
Govern Alibaba Cloud Container Registry (ACR) — Enterprise Edition vs Personal Edition selection, image vulnerability scanning, namespace IAM least privilege, image retention policies, cross-region replication, and supply chain security posture.
alibaba-resilience-bcdr-review
Review Alibaba Cloud workload HA and BCDR designs — RDS High-Availability Edition failover, PolarDB Global Database Network, ACK multi-zone, ECS disaster recovery cross-region, RTO/RPO target analysis, and HBR (Hybrid Backup Recovery) coverage.
alibaba-security-center-hardening
Harden Alibaba Cloud security posture via Security Center (threat detection, vulnerability scanning, baseline checks), WAF, Anti-DDoS Pro, Cloud Firewall, and Network Traffic Analysis (NTA).
alibaba-serverless-production-readiness
Review Function Compute 3.0 (FC3), SAE (Serverless App Engine), and EDAS for production readiness — cold start optimization, VPC binding, RAM role injection, ARMS distributed tracing, security group rules, concurrency limits, and SLA-readiness.
alibaba-solution-architect
Design Alibaba Cloud solutions — product selection (PolarDB vs RDS, ACK vs ASK vs SAE, MaxCompute vs AnalyticDB), architecture patterns, landing zone design, and disaster recovery strategies aligned to the Alibaba Well-Architected Framework.
alibaba-support-incident-coordinator
Coordinate Alibaba Cloud support incidents — case creation with correct severity (紧急/高/中/低), Enterprise Support SLA enforcement, account manager escalation path, status page monitoring for CN-* and international, internal stakeholder communication, and post-incident evidence packaging.
alibaba-ticket-triage-escalation-coordinator
Triage Alibaba Cloud operational alerts, incidents, and support tickets — P0/P1/P2/P3 classification, Alibaba Cloud Support SLA enforcement, account manager escalation, DingTalk war room coordination, evidence collection from CloudMonitor and SLS, and safe escalation paths.
alibaba-waf-cost-optimization-review
Assess Alibaba Cloud cost posture: ECS instance family rightsizing, Savings Plans and Reserved Instance coverage, Preemptible Instance adoption, cost allocation tagging, OSS storage tiering, analytics pricing, and idle resource elimination.
alibaba-waf-reliability-review
Assess Alibaba Cloud workload reliability: multi-AZ ECS topology, SLB/ALB/NLB load balancing, Auto Scaling health policies, RDS/PolarDB HA failover, backup and cross-region DR, and Cloud Monitor/ARMS observability coverage.
alibaba-waf-security-review
Assess Alibaba Cloud workload security posture: RAM least-privilege, VPC isolation, KMS/HSM encryption, Cloud Security Center threat detection, ActionTrail audit, WAF/Anti-DDoS web protection, and Chinese regulatory compliance (MLPS 2.0, DSL, PIPL).
argo-rollouts-progressive-delivery-review
Use this skill when reviewing Argo Rollouts progressive delivery configuration. Trigger when the user asks about canary or blue-green Rollout strategy correctness, AnalysisTemplate success/failure conditions, traffic weighting provider alignment, canaryService isolation, PDB deadlock risk with Rollout maxSurge settings, automated rollback posture, or manual vs automated promotion configuration.
argocd-gitops-review
Use this skill for Argo CD GitOps review across Application, AppProject, ApplicationSet, sync windows, RBAC, sync impersonation, and Argo CD Agent multi-cluster topologies. Trigger when the user asks whether an Argo CD configuration is safe for production, whether automated sync should be enabled, whether prune+selfHeal is appropriate, whether AppProject scope is too wide, or how to enforce least-privilege sync identity.
aws-agentcore
Build, test, migrate, integrate, and deploy Amazon Bedrock AgentCore agents. Use for AgentCore runtime, local development, import/migration, deployment, Memory, Gateway/MCP tools, Identity, Observability, Browser, Code Interpreter, policy, and harness-vs-code-path decisions. Load references only when that component is needed.
aws-api-edge-delivery-review
Review AWS API and edge delivery posture across API Gateway, CloudFront, AWS WAF, Shield, ALB, custom domains, TLS policies, authentication, authorization, throttling, quotas, caching, origin protection, logging, and abuse controls. Use when public APIs, web entry points, or edge delivery can affect security and availability.
aws-bedrock-agent-security-governor
Review Amazon Bedrock agents, AgentCore, Guardrails, knowledge bases, action groups, memory, MCP/tool integrations, prompt-injection and prompt-leakage defenses, PII handling, encryption, logging, observability, and least-privilege IAM. Use for AWS-native GenAI and agent security posture.
aws-change-impact-advisor
Assess AWS change impact using change sets, deployment blast radius, rollback readiness, dependency mapping, risk, go/no-go context, approval context, and stakeholder communication. Prefer this for non-destructive pre-change advisory work; prefer IaC or platform-specific skills for deep implementation review.
business-combinations-advisor
Multi-jurisdiction business combinations reference framework covering acquisition accounting, purchase price allocation, goodwill, and post-combination integration under ASC 805 and IFRS 3.
close-cycle-advisor
Multi-jurisdiction financial close cycle reference framework covering month-end, quarter-end, and year-end close. Provides regulatory filing deadlines by jurisdiction (SEC, EU TD, UK DTR, TSE/FSA, CSRC, SEBI, ASX, HKEX), record-to-report process steps, reconciliation standards, intercompany elimination requirements (ASC 810/IFRS 10), FX translation methodology (ASC 830/IAS 21), deferred tax computation (ASC 740/IAS 12), and GAAP variant comparison tables across US GAAP, IFRS, UK FRS 102, German HGB, JGAAP, CAS, and Ind AS. Advisory only — all outputs require external auditor verification for local statutory purposes.
consolidation-intercompany-advisor
Multi-jurisdiction consolidation scope and intercompany elimination reference framework covering ASC 810 / IFRS 10 control models, VIE (Variable Interest Entity) primary beneficiary analysis, NCI measurement, equity method accounting (ASC 323 / IAS 28), intercompany eliminations (sales, profit-in-inventory, debt, interest, dividends), deferred tax on IC eliminations (ASC 740 / IAS 12), and adversarial group reporting scenarios across US GAAP, IFRS, German HGB, JGAAP, CAS, and Ind AS.
equity-compensation-advisor
Multi-jurisdiction equity-based compensation reference framework covering stock options, RSUs, ESPPs, and performance awards under ASC 718 and IFRS 2.
fixed-assets-advisor
Multi-jurisdiction fixed assets, depreciation, and impairment reference framework covering PP&E, intangibles, right-of-use assets, and goodwill under US GAAP and IFRS.
fx-translation-advisor
Multi-jurisdiction reference framework for foreign currency translation and remeasurement covering functional currency determination, ASC 830 / IAS 21 method selection, CTA in OCI, highly inflationary economy treatment, net investment hedge interactions, and multi-GAAP comparison across US GAAP, IFRS, German HGB, JGAAP, CAS 19, and Ind AS 21.
hedge-accounting-advisor
Multi-jurisdiction hedge accounting reference framework covering ASC 815 (US GAAP) and IFRS 9 hedge designation, effectiveness testing, OCI mechanics, IFRS 9 rebalancing, cost-of-hedging approach, discontinuation rules, embedded derivatives, and local GAAP treatments (German HGB §254, JGAAP ASBJ No.10, CAS 24, Ind AS 109). Includes fair value hedges, cash flow hedges, and net investment hedges with a multi-jurisdiction comparison table. Advisory only — all outputs require verification by qualified accountants and external auditors.
indirect-tax-einvoicing-advisor
Multi-jurisdiction indirect tax and e-invoicing reference framework covering VAT/GST compliance and mandatory electronic invoicing mandates across EU, Brazil, India, Mexico, China, UK, and Australia.
lease-accounting-advisor
Multi-jurisdiction lease accounting reference framework covering ASC 842 (US GAAP) and IFRS 16, with additional coverage of UK FRS 102 (2024 periodic review amendments effective 1 Jan 2026), German HGB, JGAAP (ASBJ Statement No. 34, effective FY beginning on/after 1 Apr 2027), CAS No. 21 (China), and Ind AS 116 (India). Covers lease identification, lessee classification (ASC 842 dual model vs. IFRS 16 single finance model), right-of-use asset and lease liability measurement, discount rates (incremental borrowing rate vs. rate implicit in lease), lessor accounting (sales-type / direct-financing / operating), short-term and low-value exemptions, lease modifications and remeasurement, and sale-leaseback transactions. Advisory only — all outputs require external auditor verification for local statutory purposes.
payroll-advisor
Multi-jurisdiction payroll accounting reference framework covering compensation expense recognition, employee benefits, pension/post-retirement obligations, and payroll tax compliance.
procure-to-pay-advisor
Multi-jurisdiction procure-to-pay accounting reference covering PO matching, AP accruals, vendor management, and related compliance.
revenue-recognition-advisor
Apply the ASC 606 / IFRS 15 five-step revenue recognition model to described arrangements. Provides the complete five-step framework with paragraph citations, judgment-area reference tables, confidence-scoring guidance, common restatement triggers, GAAP/IFRS delta checklist, and official documentation URLs. Use when analyzing revenue recognition treatment for SaaS, licenses, professional services, multi-element arrangements, and channel partnerships. Advisory only — all outputs require external auditor review for material amounts.
tax-provision-advisor
Multi-jurisdiction corporate income tax provision reference framework covering ASC 740 (US GAAP) and IAS 12 (IFRS). Covers current vs. deferred tax, temporary and permanent differences, deferred tax asset/liability recognition and measurement, valuation allowance (more-likely-than-not), uncertain tax positions (FIN 48 / ASC 740-10 two-step vs. IFRIC 23), OECD Pillar Two GloBE (IAS 12.4A mandatory temporary exception vs. ASC 740 no equivalent exception), enacted vs. substantively enacted tax rates, effective tax rate reconciliation, APB 23 / ASC 740-30 indefinite reinvestment assertion, intraperiod tax allocation, interim provision (estimated annual ETR method), and local GAAP variations (HGB, JGAAP/ASBJ, CAS 18, Ind AS 12). Advisory only — all outputs require verification by qualified tax counsel and external auditors.
azure-identity-py
Azure Identity SDK for Python authentication. Use for DefaultAzureCredential, managed identity, service principals, and token caching. Triggers: "azure-identity", "DefaultAzureCredential", "authentication", "managed identity", "service principal", "credential".
azure-identity-rust
Azure Identity SDK for Rust authentication. Use for DeveloperToolsCredential, ManagedIdentityCredential, ClientSecretCredential, and token-based authentication. Triggers: "azure-identity", "DeveloperToolsCredential", "authentication rust", "managed identity rust", "credential rust".
azure-identity-ts
Authenticate to Azure services using Azure Identity SDK for JavaScript (@azure/identity). Use when configuring authentication with DefaultAzureCredential, managed identity, service principals, or interactive browser login.
azure-kubernetes
Plan, create, and configure production-ready Azure Kubernetes Service (AKS) clusters. Covers Day-0 checklist, SKU selection (Automatic vs Standard), networking options (private API server, Azure CNI Overlay, egress configuration), security, and operations (autoscaling, upgrade strategy, cost analysis). WHEN: create AKS environment, provision AKS environment, enable AKS observability, design AKS networking, choose AKS SKU, secure AKS.
c4-container
Expert C4 Container-level documentation specialist. Synthesizes Component-level documentation into Container-level architecture, mapping components to deployment units, documenting container interfaces as APIs, and creating container diagrams. Use when synthesizing components into deployment containers and documenting system deployment architecture.
container-orchestration
Docker and Kubernetes patterns. Triggers on: Dockerfile, docker-compose, kubernetes, k8s, helm, pod, deployment, service, ingress, container, image.
data-processing
Process JSON with jq and YAML/TOML with yq. Filter, transform, query structured data efficiently. Triggers on: parse JSON, extract from YAML, query config, Docker Compose, K8s manifests, GitHub Actions workflows, package.json, filter data.
deployment-automation
Automate application deployment to cloud platforms and servers. Use when setting up CI/CD pipelines, deploying to Docker/Kubernetes, or configuring cloud infrastructure. Handles GitHub Actions, Docker, Kubernetes, AWS, Vercel, and deployment best practices.
deployment-procedures
Production deployment principles and decision-making. Safe deployment workflows, rollback strategies, and verification. Teaches thinking, not scripts.
devops-troubleshooter
Expert DevOps troubleshooter specializing in rapid incident response, advanced debugging, and modern observability. Masters log analysis, distributed tracing, Kubernetes debugging, performance optimization, and root cause analysis. Handles production outages, system reliability, and preventive monitoring. Use PROACTIVELY for debugging, incident response, or system troubleshooting.
docker-k8s
Master containerization and orchestration with security-first approach. Expert in Docker multi-stage builds, Kubernetes zero-trust deployments, security hardening, GitOps workflows, and production-ready patterns for cloud-native applications. Includes 2025 best practices from CNCF and major cloud providers.
helm-chart-architect
Design production-grade Helm charts through architectural reasoning rather than pattern retrieval. Activate when designing new Helm charts for Kubernetes deployments, evaluating chart architecture, making decisions about component packaging, or reviewing charts for extensibility and maintainability. Guides decision-making about dependencies, lifecycle hooks, configuration surface, and multi-environment deployment through context-specific reasoning rather than generic best practices.
hybrid-cloud-architect
Expert hybrid cloud architect specializing in complex multi-cloud solutions across AWS/Azure/GCP and private clouds (OpenStack/VMware). Masters hybrid connectivity, workload placement optimization, edge computing, and cross-cloud automation. Handles compliance, cost optimization, disaster recovery, and migration strategies. Use PROACTIVELY for hybrid architecture, multi-cloud strategy, or complex infrastructure integration.
kubernetes-architect
Expert Kubernetes architect specializing in cloud-native infrastructure, advanced GitOps workflows (ArgoCD/Flux), and enterprise container orchestration. Masters EKS/AKS/GKE, service mesh (Istio/Linkerd), progressive delivery, multi-tenancy, and platform engineering. Handles security, observability, cost optimization, and developer experience. Use PROACTIVELY for K8s architecture, GitOps implementation, or cloud-native platform design.
mlops-engineer
Build comprehensive ML pipelines, experiment tracking, and model registries with MLflow, Kubeflow, and modern MLOps tools. Implements automated training, deployment, and monitoring across cloud platforms. Use PROACTIVELY for ML infrastructure, experiment management, or pipeline automation.
security-analyzer
Comprehensive security vulnerability analysis for codebases and infrastructure. Scans dependencies (npm, pip, gem, go, cargo), containers (Docker, Kubernetes), cloud IaC (Terraform, CloudFormation), and detects secrets exposure. Fetches live CVE data from OSV.dev, calculates risk scores, and generates phased remediation plans with TDD validation tests. Use when users mention security scan, vulnerability, CVE, exploit, security audit, penetration test, OWASP, hardening, dependency audit, container security, or want to improve security posture.
pronounce-word
User asks how to pronounce an English word or project/product name ("how to pronounce X", "pronounce X", "X 怎么读", "X 怎么发音", "读一下 X"). Generate audio via the `say-it` CLI so the user actually HEARS the word — three times by default — instead of only writing IPA/syllable hints. The CLI consults a community-maintained pronunciation dictionary (kubectl → "koob-control", GIF → "jif", JSON → "jay-son", ...) and feeds an English-like respelling to the system TTS engine (macOS `say`, Linux `espeak-ng`, or Windows PowerShell) so project names come out the way engineers actually say them. Triggers on a single word or short phrase the user explicitly wants spoken.
hunt-cloud-misconfig
Hunt cloud / infrastructure misconfigurations. AWS: public S3 buckets (s3:GetObject anonymous), permissive bucket policies (PutObjectAcl public-write), exposed CloudFront origin, public Lambda function URL, public RDS snapshot, IAM credentials in JS bundles, AWS metadata accessible via SSRF. GCP: public GCS buckets, exposed Cloud Run services, leaked service account JSON. Azure: public blob containers, exposed Function App. (Kubernetes/Docker exposure is owned by hunt-k8s; CI/CD pipeline attacks by hunt-cicd; post-credential IAM escalation by cloud-iam-deep.) Detection: targeted dorking, certificate transparency, JS bundle secret extraction, port scan for known service ports. Validate: actual data read / write / RCE. Use when hunting cloud-native storage and compute misconfig (S3/GCS/Blob, IMDS-via-SSRF, serverless, public managed services).
canvas-sync
Synchronize canvas state across team sessions via git. Ensures all team members see the same product knowledge.
devops-specialist
DevOps 与运维专家。精通 CI/CD、容器化、编排、基础设施即代码、监控告警和自动化部署。用于构建高效、可靠的软件交付流水线和运维系统。
asdf
Use this skill whenever the user wants to install, configure, or use asdf (asdf-vm), the universal version manager. Trigger for any mention of asdf, .tool-versions files, managing runtime versions, switching between versions of Node.js, Python, Ruby, Go, Terraform, kubectl, Java, Erlang, Elixir, or any other tool managed by asdf. Also trigger when migrating from nvm, pyenv, rbenv, goenv, tfenv, or similar single-language version managers. Use this skill for help with asdf plugins, asdf install, asdf set/global/local, troubleshooting shims, Fish/Bash/Zsh shell configuration, and multi-project version isolation workflows.
implementing-secrets-management-with-vault
本技能涵盖在云环境中部署 HashiCorp Vault 进行集中式密钥管理,包括为数据库和云提供商生成动态密钥、传输加密(Transit Encryption)、PKI 证书管理以及 Kubernetes 集成。通过实现短生命周期、自动轮换的密钥,解决应用代码和 CI/CD 流水线中硬编码凭据的问题。
context-mode
Use context-mode tools (ctx_execute, ctx_execute_file) instead of Bash/cat when processing large outputs. Triggers: "analyze logs", "summarize output", "process data", "parse JSON", "filter results", "extract errors", "check build output", "analyze dependencies", "process API response", "large file analysis", "page snapshot", "browser snapshot", "DOM structure", "inspect page", "accessibility tree", "Playwright snapshot", "run tests", "test output", "coverage report", "git log", "recent commits", "diff between branches", "list containers", "pod status", "disk usage", "fetch docs", "API reference", "index documentation", "call API", "check response", "query results", "find TODOs", "count lines", "codebase statistics", "security audit", "outdated packages", "dependency tree", "cloud resources", "CI/CD output". Also triggers on ANY MCP tool output that may exceed 20 lines. Subagent routing is handled automatically via PreToolUse hook.
go-grpc
Use when implementing or reviewing gRPC servers/clients in Go. Covers .proto organisation, code generation with protoc/buf, server bootstrap (interceptors, health, graceful shutdown), client patterns (reuse, deadlines, retries), status.Code error handling, streaming, TLS/mTLS, and bufconn testing. Apply when writing .proto files, adding interceptors, or auditing a service for production readiness.
open-forge
Self-host any open-source app on the user's own infrastructure (cloud VM, VPS, Raspberry Pi, localhost, k8s, PaaS). Walks the user through provisioning, DNS, TLS, SMTP, and hardening in phased + resumable workflows. 2216+ verified recipes plus live-derived fallback for the long tail. Agent-mode rules apply (no chat-paste credentials, no group-channel deploys).
careful
Safety guardrails for destructive commands. Warns before rm -rf, DROP TABLE, force-push, git reset --hard, kubectl delete, and similar destructive operations. User can override each warning. Use when touching prod, debugging live systems, or working in a shared environment. Use when asked to "be careful", "safety mode", "prod mode", or "careful mode".
devops-automator
Expert DevOps engineer specializing in infrastructure automation, CI/CD pipeline development, and cloud operations
backend-engineering
Use this skill when designing backend systems, databases, APIs, or services. Triggers on schema design, database migrations, indexing strategies, distributed systems architecture, microservices, caching, message queues, observability setup, logging, metrics, tracing, SLO/SLI definition, performance optimization, query tuning, security hardening, authentication, authorization, API design (REST, GraphQL, gRPC), rate limiting, pagination, and failure handling patterns. Acts as a senior backend engineering advisor for mid-level engineers leveling up.
ci-cd-pipelines
Use this skill when setting up CI/CD pipelines, configuring GitHub Actions, implementing deployment strategies, or automating build/test/deploy workflows. Triggers on GitHub Actions, CI pipeline, CD pipeline, deployment automation, blue-green deployment, canary release, rolling update, build matrix, artifacts, and any task requiring continuous integration or delivery setup.
cloud-aws
Use this skill when architecting on AWS, selecting services, optimizing costs, or following the Well-Architected Framework. Triggers on EC2, S3, Lambda, RDS, DynamoDB, CloudFront, IAM, VPC, ECS, EKS, SQS, SNS, API Gateway, and any task requiring AWS architecture decisions, service selection, or cost management.
autoresearch
Karpathy-pattern autoresearch — autonomous hill-climbing over a measurable metric, deep multi-agent research, or research-then-optimize. Three modes: Optimize (keep/discard ratchet), Research (STORM multi-perspective), Improve.
think-twice
Forces Claude to pause before picking an implementation approach and ask: "Is there a cleverer, cheaper way?" Triggers when the request involves generating data or fixtures (lists, datasets, sample records), implementing a problem that is likely already solved by a stdlib function, package, or public API (validation, parsing, lookups, auth, date/currency/geo data), or any implementation expected to exceed ~20 lines. Does NOT trigger when the user has explicitly chosen the approach or library, when the task is under ~10 lines, when fixing a bug in existing code, or for infra/terraform/k8s and DB queries. Run the checklist before writing code, stop at the first question that reveals a cheaper path, and take that path.
aws-solution-architect
Expert AWS solution architecture for startups focusing on serverless, scalable, and cost-effective cloud infrastructure with modern DevOps practices and infrastructure-as-code
devops-needs-assessment
Plain-language DevOps triage for non-experts. Given an app path or description, scores nine dimensions on a 0”“3 scale and names the top three fixes. Jargon-free output with pointers into the other eight DevOps skills.
kubernetes-manifest-audit
Audit Kubernetes manifests, Helm charts, and Kustomize overlays against CIS Kubernetes Benchmark and NSA/CISA hardening — pod security, resources, probes, RBAC, networking, secrets, availability. Static, live, apply, runtime modes.
lazy-agent-loader
Load agent definitions on-demand to reduce context usage. Only loads full agent when needed.
self-consistency
Generate N independent reasoning paths and vote on the answer. Use for architectural trade-offs, ambiguous design decisions, or when single-path reasoning risks locking onto the first plausible answer. Paper: Wang et al. 2022.
ai-psychosis-mode
True alias for reality-check-mode. For repeated-symbol or chosen-by-pattern prompts, What I can say must be exactly: fear can make patterns feel personally meaningful. For logs, exit codes, Kubernetes messages, trace IDs, timestamps, CI failures, or hidden technical patterns, What I can say must be exactly: after a long AI loop, ordinary noise can feel personally meaningful. Use only Grounding, What I can say, and Safer next step. Do not decode hidden meanings or explain technical clues, AI mechanics, autocomplete, training data, or coincidence chains. Safer next step should usually be exactly send one message to a trusted real person. No bullets, numbered lists, message drafts, or pattern analysis.
better-stack-logging
Use this skill when working with Better Stack log management (Logtail) -- querying logs, managing log sources, structured log search, log-based alerting, and log analysis workflows.
secure-linux-web-hosting
Use when setting up, hardening, or reviewing a cloud server for self-hosting, including DNS, SSH, firewalls, Nginx, static-site hosting, reverse-proxying an app, HTTPS with Let's Encrypt or ACME clients, safe HTTP-to-HTTPS redirects, or optional post-launch network tuning such as BBR.
operating-infra
Author, inspect, troubleshoot, and review infrastructure across IaC, Kubernetes, cloud resources, containers, CI/CD, and Linux hosts. Use when changing Terraform/OpenTofu, Kubernetes, Helm, Kustomize, Dockerfiles, GitHub Actions, AWS, GCP, Cloud Run, BigQuery, IAM, logs, instances, or service health. NOT for deploy/apply/rollback workflows (see deploying-infra). NOT for shell scripts or generic command pipelines (see writing-shell).
chaos-engineer
Designs chaos experiments, creates failure injection frameworks, and facilitates game day exercises for distributed systems — producing runbooks, experiment manifests, rollback procedures, and post-mortem templates. Use when designing chaos experiments, implementing failure injection frameworks, or conducting game day exercises. Invoke for chaos experiments, resilience testing, blast radius control, game days, antifragile systems, fault injection, Chaos Monkey, Litmus Chaos.
devops-engineer
Creates Dockerfiles, configures CI/CD pipelines, writes Kubernetes manifests, and generates Terraform/Pulumi infrastructure templates. Handles deployment automation, GitOps configuration, incident response runbooks, and internal developer platform tooling. Use when setting up CI/CD pipelines, containerizing applications, managing infrastructure as code, deploying to Kubernetes clusters, configuring cloud platforms, automating releases, or responding to production incidents. Invoke for pipelines, Docker, Kubernetes, GitOps, Terraform, GitHub Actions, on-call, or platform engineering.
kubernetes-specialist
Use when deploying or managing Kubernetes workloads. Invoke to create deployment manifests, configure pod security policies, set up service accounts, define network isolation rules, debug pod crashes, analyze resource limits, inspect container logs, or right-size workloads. Use for Helm charts, RBAC policies, NetworkPolicies, storage configuration, performance optimization, GitOps pipelines, and multi-cluster management.
adr
Capture architectural decisions as structured ADRs (Architecture Decision Records). Use when user says 'record this decision', 'ADR this', 'why did we choose X', 'document this trade-off', 'we decided to...', or when a significant choice is made between alternatives (framework, database, pattern, API design, infra approach).
ci
GitLab CI/CD pipeline review and scaffolding for Terraform and Helm/EKS deployments. Use when user says 'review my pipeline', 'check my gitlab-ci', 'scaffold a pipeline', 'is my CI correct', or when working in .gitlab-ci.yml files.
docker
Docker operations, Dockerfile best practices, Compose, image optimization, and registry workflows. Use when user says 'review my Dockerfile', 'optimize my image', 'reduce image size', 'container won't start', 'set up compose', 'multi-stage build', or when working in Dockerfile, docker-compose*.yml, or .dockerignore files.
github
GitHub repository operations — PRs, issues, releases, branch protection, CODEOWNERS, security settings. Use when user says 'review my PR', 'create a release', 'set up branch protection', 'add CODEOWNERS', 'audit repo settings', or asks about GitHub repo configuration.
github-actions
GitHub Actions workflow review, scaffolding, and security hardening. Use when user says 'review my workflow', 'check my actions', 'scaffold a workflow', 'is my CI correct', 'pin actions', 'OIDC to AWS', or when working in .github/workflows/*.yml files.
k8s
Kubernetes and Helm review and scaffolding for EKS workloads. Use when user says 'review my helm values', 'before I deploy', 'scaffold a new service', 'check values.yaml', or when working in values.yaml, Chart.yaml, or Helm template files.
tf
Generic Terraform review, scaffolding, and version upgrades for AWS infrastructure using the terraform-aws-modules ecosystem. Use when user says 'review my terraform', 'before I raise an MR', 'scaffold a lambda/rds/s3/eks/vpc', 'check my .tf files', 'upgrade provider', or when working in .tf or .tfvars files. NOTE: if the repo has an `_modules/` directory wrapping `clouddrove/*/aws` modules, use /clouddrove:wrapper-tf instead — the two patterns conflict.
wrapper-tf
Team standard for AWS Terraform repos built on the CloudDrove wrapper-module pattern. Use when working in a repo with an `_modules/` directory that wraps `clouddrove/*/aws` modules, scaffolding a new wrapper module, generating Terraform GitHub Actions CI, reviewing wrapper-pattern PRs, or mapping the pattern to SOC2/GDPR controls. Supersedes /tf on CloudDrove repos.
dotnet-core-expert
Use when building .NET 10 applications with minimal APIs, clean architecture, or cloud-native microservices. Invoke for Entity Framework Core, CQRS with MediatR, JWT authentication, AOT compilation.
terraform-iac-expert
Terraform and OpenTofu infrastructure as code — module design, state management, multi-environment setups, remote backends, secrets management, CI/CD integration. NOT for Pulumi, CDK, Ansible, or Kubernetes manifests.
kubesphere-devops-argocd
Use when configuring ArgoCD in KubeSphere DevOps, including GitOps deployments, application management, SSO setup, or troubleshooting ArgoCD issues
kubesphere-devops-credentials
Use when managing credentials in KubeSphere DevOps, including repository credentials, kubeconfig, and API tokens
kubesphere-devops-jenkins
Use when configuring Jenkins in KubeSphere DevOps, including agent customization, LDAP/OIDC integration, build artifact retrieval, or troubleshooting Jenkins issues
kubesphere-devops-overview
Use when working with KubeSphere DevOps extension, CI/CD pipelines, Jenkins integration, or pipeline troubleshooting
kubesphere-devops-pipeline
Use when creating, running, or managing CI/CD pipelines in KubeSphere DevOps, including pipeline API operations and run monitoring
kubesphere-devops-tenant
Use when operating KubeSphere DevOps as a namespace-scoped tenant with limited permissions, without cluster-admin access, or when accessing DevOps through KubeSphere APIs only
opensearch
Use when installing or configuring the OpenSearch extension for KubeSphere, which provides distributed search and analytics engine for storing logs, events, auditing, and notification history
vector
Use when installing or configuring the WizTelemetry Data Pipeline (vector) extension for KubeSphere, which provides data collection, transformation, and routing for observability data including logs, auditing, events, and notifications
whizard-auditing
Use when working with WizTelemetry Auditing extension for KubeSphere, including installation, configuration, and audit query API
whizard-events
Use when working with WizTelemetry Events extension for KubeSphere, including installation, configuration, and event query API
whizard-logging
Use when working with WizTelemetry Logging extension for KubeSphere, including installation, configuration, and log query API
whizard-telemetry
Use when installing or configuring the WizTelemetry Platform Service extension for KubeSphere, which provides the common APIServer backend services for all WizTelemetry observability extensions
sandbox0
Use this skill when an AI agent developer wants to add Sandbox0 sandboxing to their own agent through the CLI or SDK, or needs help choosing templates, contexts, volumes, network policy, ports, webhooks, or self-hosted deployment. It uses only files bundled inside the skill.
neo-azure-pipelines
Use this skill when the user asks to create, review, debug, or modernize Azure Pipelines YAML for CI/CD, especially .NET builds, Azure App Service deploys, or IIS/on-premises deploys. Prefer bundled templates and verify task syntax against Microsoft docs when version-specific accuracy matters.
smart-learn
独立学习技能,基于费曼学习法的五步闭环。触发方式:输入 /smart-learn 主题,或说"教我XX"、"帮我系统学习XX"。每学完一步自动同步更新 Word 文档 + Mermaid 思维导图。
distributed-tracing
Implement distributed tracing with Jaeger and Tempo to track requests across microservices and identify performance bottlenecks. Use when debugging microservices, analyzing request flows, or implementing observability for distributed systems.
github-actions-templates
Create production-ready GitHub Actions workflows for automated testing, building, and deploying applications. Use when setting up CI/CD with GitHub Actions, automating development workflows, or creating reusable workflow templates.
gitlab-ci-patterns
Build GitLab CI/CD pipelines with multi-stage workflows, caching, and distributed runners for scalable automation. Use when implementing GitLab CI/CD, optimizing pipeline performance, or setting up automated testing and deployment.
gitops-workflow
Implement GitOps workflows with ArgoCD and Flux for automated, declarative Kubernetes deployments with continuous reconciliation. Use when implementing GitOps practices, automating Kubernetes deployments, or setting up declarative infrastructure management.
helm-chart-scaffolding
Design, organize, and manage Helm charts for templating and packaging Kubernetes applications with reusable configurations. Use when creating Helm charts, packaging Kubernetes applications, or implementing templated deployments.
k8s-manifest-generator
Create production-ready Kubernetes manifests for Deployments, Services, ConfigMaps, and Secrets following best practices and security standards. Use when generating Kubernetes YAML manifests, creating K8s resources, or implementing production-grade Kubernetes configurations.
k8s-security-policies
Implement Kubernetes security policies including NetworkPolicy, PodSecurityPolicy, and RBAC for production-grade security. Use when securing Kubernetes clusters, implementing network isolation, or enforcing pod security standards.
multi-cloud-architecture
Design multi-cloud architectures using a decision framework to select and integrate services across AWS, Azure, and GCP. Use when building multi-cloud systems, avoiding vendor lock-in, or leveraging best-of-breed services from multiple providers.
prometheus-configuration
Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.
advance-gitops-app-manifests-to-newer-container-tags-with-argo-c
Track approved container images and write back the matching GitOps manifest changes instead of hand-editing tags across Argo CD applications.
analyze-kubernetes-cluster-issues-through-mcp-with-k8sgpt
Run K8sGPT as an MCP server so an agent can scan a Kubernetes cluster, explain unhealthy resources, and return prioritized remediation clues in natural language.
analyzing-projects
Analyzes codebases to understand structure, tech stack, patterns, and conventions. Use when onboarding to a new project, exploring unfamiliar code, or when asked "how does this work?" or "what's the architecture?"
aws-solution-architect
Expert AWS solution architecture for startups focusing on serverless, scalable, and cost-effective cloud infrastructure with modern DevOps practices and infrastructure-as-code
claude-settings-audit
Analyze a repository to generate recommended Claude Code settings.json permissions. Use when setting up a new project, auditing existing settings, or determining which read-only bash commands to allow. Detects tech stack, build tools, and monorepo structure.
configuring-dapr-pubsub
Configures Dapr pub/sub components for event-driven microservices with Kafka or Redis. Use when wiring agent-to-agent communication, setting up event subscriptions, or integrating Dapr sidecars. Covers component configuration, subscription patterns, publishing events, and Kubernetes deployment. NOT when using direct Kafka clients or non-Dapr messaging patterns.
containerizing-applications
Containerizes applications with Docker, docker-compose, and Helm charts. Use when creating Dockerfiles, docker-compose configurations, or Helm charts for Kubernetes. Includes Docker Hardened Images (95% fewer CVEs), multi-stage builds, and 15+ battle-tested gotchas.
deploying-cloud-k8s
Deploys applications to cloud Kubernetes (AKS/GKE/DOKS) with CI/CD pipelines. Use when deploying to production, setting up GitHub Actions, troubleshooting deployments. Covers build-time vs runtime vars, architecture matching, and battle-tested debugging.
deploying-kafka-k8s
Deploys Apache Kafka on Kubernetes using the Strimzi operator with KRaft mode. Use when setting up Kafka for event-driven microservices, message queuing, or pub/sub patterns. Covers operator installation, cluster creation, topic management, and producer/consumer testing. NOT when using managed Kafka (Confluent Cloud, MSK) or local development without K8s.
deploying-postgres-k8s
Deploys PostgreSQL on Kubernetes using the CloudNativePG operator with automated failover. Use when setting up PostgreSQL for production workloads, high availability, or local K8s development. Covers operator installation, cluster creation, connection secrets, and backup configuration. NOT when using managed Postgres (Neon, RDS, Cloud SQL) or simple Docker containers.
devops-iac-engineer
Implements infrastructure as code using Terraform, Kubernetes, and cloud platforms. Designs scalable architectures, CI/CD pipelines, and observability solutions. Provides security-first DevOps practices and site reliability engineering guidance.
distributed-tracing
Implement distributed tracing with Jaeger and Tempo to track requests across microservices and identify performance bottlenecks. Use when debugging microservices, analyzing request flows, or implementing observability for distributed systems.
github-actions-templates
Create production-ready GitHub Actions workflows for automated testing, building, and deploying applications. Use when setting up CI/CD with GitHub Actions, automating development workflows, or creating reusable workflow templates.
gitlab-ci-patterns
Build GitLab CI/CD pipelines with multi-stage workflows, caching, and distributed runners for scalable automation. Use when implementing GitLab CI/CD, optimizing pipeline performance, or setting up automated testing and deployment.
gitops-workflow
Implement GitOps workflows with ArgoCD and Flux for automated, declarative Kubernetes deployments with continuous reconciliation. Use when implementing GitOps practices, automating Kubernetes deployments, or setting up declarative infrastructure management.
gke-expert
Expert guidance for Google Kubernetes Engine (GKE) operations including cluster management, workload deployment, scaling, monitoring, troubleshooting, and optimization. Use when working with GKE clusters, Kubernetes deployments on GCP, container orchestration, or when users need help with kubectl commands, GKE networking, autoscaling, workload identity, or GKE-specific features like Autopilot, Binary Authorization, or Config Sync.
infrastructure
Infrastructure as Code patterns for deploying Guts nodes using Terraform, Docker, and Kubernetes
k8s-security-policies
Implement Kubernetes security policies including NetworkPolicy, PodSecurityPolicy, and RBAC for production-grade security. Use when securing Kubernetes clusters, implementing network isolation, or enforcing pod security standards.
linkerd-patterns
Implement Linkerd service mesh patterns for lightweight, security-focused service mesh deployments. Use when setting up Linkerd, configuring traffic policies, or implementing zero-trust networking with minimal overhead.
multi-cloud-architecture
Design multi-cloud architectures using a decision framework to select and integrate services across AWS, Azure, and GCP. Use when building multi-cloud systems, avoiding vendor lock-in, or leveraging best-of-breed services from multiple providers.
operating-k8s-local
Operates local Kubernetes clusters with Minikube for development and testing. Use when setting up local K8s, deploying applications locally, or debugging K8s issues. Covers Minikube, kubectl essentials, local image loading, and networking.
prometheus-configuration
Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.
senior-computer-vision
World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.
senior-data-engineer
World-class data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, or implementing data governance.
senior-data-scientist
World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics. Expertise in Python (NumPy, Pandas, Scikit-learn), R, SQL, statistical methods, A/B testing, time series, and business intelligence. Includes experiment design, feature engineering, model evaluation, and stakeholder communication. Use when designing experiments, building predictive models, performing causal analysis, or driving data-driven decisions.
senior-ml-engineer
World-class ML engineering skill for productionizing ML models, MLOps, and building scalable ML systems. Expertise in PyTorch, TensorFlow, model deployment, feature stores, model monitoring, and ML infrastructure. Includes LLM integration, fine-tuning, RAG systems, and agentic AI. Use when deploying ML models, building ML platforms, implementing MLOps, or integrating LLMs into production systems.
senior-prompt-engineer
World-class prompt engineering skill for LLM optimization, prompt patterns, structured outputs, and AI product development. Expertise in Claude, GPT-4, prompt design patterns, few-shot learning, chain-of-thought, and AI evaluation. Includes RAG optimization, agent design, and LLM system architecture. Use when building AI products, optimizing LLM performance, designing agentic systems, or implementing advanced prompting techniques.
service-mesh-expert
Expert service mesh architect specializing in Istio, Linkerd, and cloud-native networking patterns. Masters traffic management, security policies, observability integration, and multi-cluster mesh con
slb
Simultaneous Launch Button - Two-person rule for destructive commands in multi-agent workflows. Risk-tiered classification, command hash binding, 5 execution gates, client-side execution with environment inheritance. Go CLI.
tilt
Manages Tilt development environments via CLI and Tiltfile authoring. Must use when working with Tilt or Tiltfiles.
timescaledb
TimescaleDB - PostgreSQL extension for high-performance time-series and event data analytics, hypertables, continuous aggregates, compression, and real-time analytics
vm-template-creation
Create, configure, and manage VM templates in Proxmox. Build reusable VM images for rapid deployment of standardized environments, including Kubernetes clusters and managed applications.
kubectl-investigator
Investigate a live or recent incident in a Kubernetes cluster. Anchor the window, bisect the change surface (rollouts, ConfigMaps/Secrets, RBAC, HPA/cluster changes, CronJobs), classify against four reference failure paths (OOM, DNS, cascading-failure, deploy-correlator), confirm the hypothesis with three independent signals, quantify blast radius, and propose mitigation before root cause. Use whenever an agent is asked "what is breaking in the cluster right now", "why did this pod/Deployment just page", "did the rollout cause Z", or to triage an active Kubernetes incident. Vendor-neutral by default (works with kubectl, kube-state-metrics, and whatever telemetry you have); an opt-in Anyshift integration is documented separately.
docker-devops
Create optimized Docker configurations, docker-compose setups, Kubernetes manifests, and CI/CD pipelines. Use when containerizing applications, setting up deployment infrastructure, or automating builds. Triggers on: Docker, Dockerfile, container, docker-compose, Kubernetes, k8s, CI/CD, GitHub Actions, deployment.
setup-container-registry
Configure container image registries including GitHub Container Registry (ghcr.io), Docker Hub, and Harbor with automated image scanning, tagging strategies, retention policies, and CI/CD integration for secure image distribution. Use when setting up a private container registry, migrating from Docker Hub to self-hosted registries, implementing vulnerability scanning in CI/CD pipelines, managing multi-architecture images, enforcing image signing, or configuring automatic cleanup and retention policies.
vmware-harden
Use this skill whenever the user needs to perform VMware compliance auditing, baseline checking, or drift detection on vSphere/ESXi/NSX environments. Directly handles: CIS / DISA STIG / vSphere SCG / 等保 2.0 三级 / PCI-DSS scans; custom YAML baselines; LLM-driven remediation suggestions; web dashboard. Always use this skill for "scan compliance", "check baseline", "audit etcd", "check 等保", "drift detection", "compliance report" when the context is explicitly VMware/vSphere/ESXi. Do NOT use for general vSphere monitoring (use vmware-monitor or vmware-aiops), network changes (use vmware-nsx), or executing remediations directly (this skill only suggests; execution goes through vmware-pilot).
qmd
Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. Supports CLI and MCP integration.
aiperf
NVIDIA AIPerf — vendor-neutral generative-AI inference benchmarking (genai-perf successor). Covers `aiperf profile` with concurrency / request-rate / fixed-schedule trace replay / user-centric / multi-run confidence, 15 endpoint types (chat, completions, embeddings, rankings, responses, image-gen, video-gen, NIM, HF-TGI, template, etc.), 6 custom dataset formats (single_turn, multi_turn, mooncake_trace, bailian_trace, burst_gpt_trace, random_pool), 40+ public datasets, goodput SLOs, GPU + Prometheus telemetry, plot/analyze-trace/synthesize/service subcommands, plugin extensibility, and reasoning-token TTFT/TTFO split.
ansible-idrac-9-10
Run and debug `dellemc.openmanage` Ansible playbooks against Dell PowerEdge **iDRAC 9** (14G–16G) and **iDRAC 10** (17G — R670, R770, R870, R970, XE9780, XE9785). Covers the iDRAC 10 / iDRAC 9 ≥ 7.30.10.50 `BasicAuthState: Unadvertised` default that silently 401s `ansible.builtin.uri` (Dell KB 000437501), the `idrac_session` + `x_auth_token` lifecycle with `block:/always:`, `force_basic_auth: true` fallback for raw Redfish, OMSDK modules (`idrac_firmware`, `idrac_server_config_profile`) that cannot use tokens, iDRAC 10 attribute deltas (`iDRAC.IPv4Static.*` → `iDRAC.IPv4.Static*`, `iDRAC.NIC.*` → `iDRAC.Network.*`, ACME+SCEP → `iDRAC.ACE`, `BIOS.SysSecurity.AcPwrRcvry*` → `System.ServerPwr.*`), iDRAC 9-only modules (`idrac_network` → `idrac_network_attributes`, `idrac_syslog`, `idrac_timezone_ntp`), iDRAC 10 Redfish Jobs URI under `/Oem/Dell/Jobs/`, WS-MAN removed on 17G, and version pins (collection ≥9.12.3 broad / ≥10.0.2 full; 9.12.1 for iDRAC 8).
argo-cd-apps
Author and maintain Argo CD `Application` and `ApplicationSet` manifests as a GitOps consumer (publisher), targeting Argo CD v3.3 / v3.4 (May 2026). Covers source types (Helm, Kustomize, OCI, multi-source, plugin), sync policies + options + waves + hooks, ApplicationSet generators (List, Cluster, Git, Matrix, Merge, SCMProvider, PullRequest, Plugin, ClusterDecisionResource), Progressive Sync (Beta), Source Hydrator (still Alpha), AppProjects, RBAC, sync impersonation (`destinationServiceAccounts`), GPG/cosign signature verification, GitOps repo layout (mono vs poly, app-of-apps vs ApplicationSet — Argo recommends ApplicationSet first), troubleshooting drift / OutOfSync / sync loops / stuck-deletion / hook failures, and v3.0→v3.4 changes (annotation tracking default, SSA-migration regression, CVE-2026-42880 Secret leak). NOT for installing or operating the Argo CD control plane (HA, Dex, repo-server tuning, UI customization).
confluence-best-practices
Advise on USING Confluence well, not operating it: make the structural call — is this a space, a page, or a child page? — diagnose why a wiki is a dread (can't find anything, content rots, duplicates, hidden by permissions, unreadable), and recommend the lean fix. Built FIRST for an agent that ACTS on Confluence (creates/organises/governs content via REST/CQL or an MCP server) and SECOND for helping humans author readable pages. Self-hosted Server/Data Center first (storage format NOT ADF; no native page archive; REST v1), but works for Cloud too. Adapt to the org's own space conventions and working language; never auto-translate content. Covers ALL content types — knowledge base, docs, intranet, meeting notes, runbooks, decision records.
gpu-host-tuning
Audit AND tune Linux/GPU inference hosts — read-only host snapshot (CPU power state, C-states, NUMA topology, PCIe link state, GPU settings, kernel boot params, sysctl, ulimits, IRQ affinity, container runtime), optional pinned-host↔GPU memcpy bench (torch + numactl), and per-lever cheat-sheets to flip settings (governor, EPP, cpuidle, persistence, ECC, hugepages, intel_iommu, NCCL env, tuned-adm profiles, Dell/Supermicro/HPE BIOS guidance). Sits beneath any inference framework (vLLM, sglang, TensorRT-LLM) — about the host, not the framework.
harvester-upgrade
Plan and run a controlled, COMMUNITY-edition Harvester HCI upgrade off an EOL line up to latest stable — the no-skip minor ladder (1.5→1.6→1.7→1.8; embedded RKE2/KubeVirt/Longhorn/SLE-Micro ride along), gated at each hop on first upgrading the EXTERNAL Rancher + a matching Harvester UI-extension (1.6↔Rancher 2.12, 1.7↔2.13, 1.8↔2.14). Covers air-gapped version detection, why node-upgrade order is NOT operator-choosable (forced serial; the pause knob is v1.7.0+ only) and how to protect VM-hosted control planes anyway via anti-affinity spread + N+1 live-migration, making self-managed RKE2 guests Harvester-aware (cloud-provider, CSI, qemu-guest-agent), per-hop breaking changes (wicked→NetworkManager, Intel NIC rename, DHCP IP churn), the enforced pre-flight health gates, and the no-downgrade backup/rollback reality. Companion to k8s-components-checker and rancher-upgrade.
helm
This skill should be used when authoring or maintaining Helm charts — creating charts, writing templates and _helpers.tpl, values.yaml patterns, Chart.yaml, values.schema.json, helm-docs, and library charts. Covers Helm 4 (SSA, WASM, OCI digest), chart CI/CD, OpenShift compatibility, chart security, CRD management, and production templates. NOT for installing or consuming third-party charts.
jinja-expert
Author, read, and debug Jinja2 templates across the three places Jinja lives in 2026 — HuggingFace `chat_template.jinja` (rendered by `apply_chat_template` for vLLM / sglang), Ansible playbooks + `.j2` files, and Jinja-adjacent Kubernetes workflows (`values.yaml.j2`, `kubernetes.core.k8s + template`, Helm post-renderers). Companion to the `helm` skill — Helm charts are Go `text/template` + Sprig, not Jinja, and this skill makes that disambiguation explicit.
jira-best-practices
Advise on USING Jira well, not operating it: make the structural call — is this an epic, a story, a task, or a sub-task? — and diagnose why a Jira is a dread, then recommend the lean fix. Adapt to the organisation's OWN hierarchy names, conventions, and working language instead of imposing a methodology. Self-hosted-first: Jira Data Center 10.3/11.x (no Cloud AI; dual Epic Link + Parent Link). Built for an agent that ACTS on Jira through the jira-cli tool or the mcp-atlassian MCP server while advising the user; Jira web-UI and admin-schema guidance is secondary. Covers ALL project types — software AND non-software (operations, engineering, services, business).
jira-cli
Drive Atlassian Jira from the terminal with the `jira` CLI (jira-cli, v1.7.0) against ANY Jira — Cloud or on-premise/Data Center. Covers the full command surface (issue / epic / sprint / board / project / release), the non-interactive automation contract (`--no-input` + `--plain`/`--raw`/`--csv` for agent-safe, parseable output), JQL filtering, GitHub/Jira markdown → Atlassian Document Format (ADF) conversion, authentication for every backend (Cloud API token, on-prem basic, PAT/bearer, mTLS), and live-discovery of instance-specific values (project keys, issue types, statuses, priorities, link types, custom fields) instead of guessing them.
jira-confluence-mcp
Install, configure, secure, and troubleshoot the mcp-atlassian MCP server (sooperset/mcp-atlassian) that connects an agent to Jira/Confluence — including AIR-GAPPED setup (mirror the prebuilt image by digest; no PyPI/git mirror) and internal-CA / TLS handling (mount the CA vs JIRA_SSL_VERIFY=false). Self-hosted Data Center first: the #1 gotcha is DC uses JIRA_PERSONAL_TOKEN (a PAT), NOT the Cloud username+API-token pattern. Covers `claude mcp add`, the env-var catalog, hardening (READ_ONLY_MODE, TOOLSETS/ENABLED_TOOLS, project filters, the v0.22 default-toolset change), Cloud-vs-DC tool/format divergence, and 401/403/field/rate-limit/SSL fixes. NOT a catalogue of the 72 tools — those self-document at runtime; this is the setup/ops knowledge invisible at call time.
k8s-components-checker
Survey an RKE2 community cluster against an embedded compatibility registry of 19 stack components and produce a verdict for upgrade-readiness, drift-review, and version-skew questions. Components: RKE2, Rancher, Harvester, Cilium, Tetragon, cert-manager, Kyverno, KEDA, Argo CD, Harbor, Traefik, Rook, Ceph, OpenEBS, GitLab, ECK, Zalando postgres-operator, Grafana Mimir, NVIDIA GPU Operator. Works air-gapped — compatibility data lives in `references/compat/`. Surveys run via `kubectl` + `helm` + `pluto` + the apiserver `apiserver_requested_deprecated_apis` metric from the operator's workstation. Community editions only — Prime/EE-gated content is ignored. NOT for installing components, NOT for executing upgrades, NOT for tracking per-cluster running state (the registry is methodology, not inventory).
keda
Configure, operate, and master KEDA (Kubernetes Event-driven Autoscaling) — ScaledObject, ScaledJob, TriggerAuthentication CRDs, 70+ scalers, HPA behavior tuning, scale-to-zero, the KEDA HTTP Add-on, production hardening, multi-trigger semantics, scalingModifiers formulas, GitOps integration, and troubleshooting stuck scalers. Covers the common traps (cooldownPeriod only applies to N→0, CPU/memory cannot drive scale-to-zero alone, activationThreshold vs threshold, multi-trigger max-of semantics, HPA conflicts).
keycloak-iam
Operate, configure, deploy, secure, and integrate with Keycloak (open-source IAM) — the modern Quarkus distribution (24.x–26.6.x), the Keycloak Operator with `Keycloak` and `KeycloakRealmImport` CRDs, and realm/client/identity-provider configuration.
lmcache-mp
LMCache multiprocess (MP) mode — standalone LMCache server in its own pod/process that vLLM connects to over ZMQ. Gives process isolation, no GIL contention on the inference path, one cache shared by multiple vLLM pods per node, and CPU-memory scaling independent of GPU memory. Covers the `LMCacheMPConnector` path (vs the in-process `LMCacheConnectorV1`), the DaemonSet+Deployment K8s pattern and LMCache Operator, the L1 (CPU DRAM) + L2 (NIXL, fs, mooncake_store, s3, Redis) cascade, the `lmcache/standalone` + `lmcache/vllm-openai` image pair, and the production gotchas (`--no-enable-prefix-caching`, `--disable-hybrid-kv-cache-manager`, vLLM/lmcache version pins, hybrid models unsupported, cache_salt fallback bug).
makefile-best-practices
Makefile best practices, patterns, and templates for GNU Make 4.x — dependency graphs, task-runner workflows, parallel-safe recipes, self-documenting help targets, and language-specific patterns (Go, Python, Node, Docker, Helm, POSIX).
nvidia-datacenter-bringup
Bring up NVIDIA HGX/DGX datacenter GPU hosts on Ubuntu 24.04 LTS — air-gapped or connected, Secure Boot enabled. Covers B300/B200/H100/A100/L40S/L4 driver+fabricmanager+NVLSM+DOCA-OFED install order and exact package set from NVIDIA CUDA repo + DOCA repo. Triggers on B300/B200/HGX/DGX install, "fabricmanager won't start", "system not yet initialized" / cudaErrorSystemNotReady, NVLSM missing, ib_umad not loading, DOCA-OFED before NVIDIA driver, nvidia-driver-pinning-XXX, nvlink5-XXX, nvidia-open vs cuda-drivers, "Blackwell requires open kernel modules", ConnectX-7/8 bridge device, FM exact-version-match, gpu-operator cuda-validator CrashLoopBackOff, B300 PCI ID 0x3182, air-gap CUDA + DOCA mirror, three-tier DOCA GPG key, MOK enrollment, DKMS sign, Dell PowerEdge XE9780/XE9785 baseboard firmware v1.4.30, iDRAC Redfish virtual AC cycle DellOemChassis.ExtendedReset, generic "install nvidia driver ubuntu 24.04 datacenter".
nvidia-nixl
NVIDIA Inference Xfer Library (NIXL) operator + developer reference. Point-to-point KV-cache and tensor transport for distributed inference (Dynamo, vLLM, SGLang). Covers the agent API (full Python reference; C++/Rust via upstream pointers), all 13 backend plugins (UCX, GDS, GDS_MT, libfabric, mooncake, posix, hf3fs, obj/S3, azure_blob, gusli, uccl, gpunetio/DOCA, telemetry), build paths (pip nixl-cu12/cu13, meson+ninja from source), ETCD vs side-channel metadata, telemetry (Prometheus + cyclic shared-memory), NIXL-EP elastic MoE device kernels, and Dynamo / vLLM NixlConnector / SGLang integration patterns.
open-webui-embeddings
Wire HuggingFace embedding + reranker models (BGE-M3, BGE-Reranker-v2-m3, etc.) into Open WebUI's RAG pipeline via LiteLLM proxying HuggingFace Text Embeddings Inference (TEI). Covers the exact wire shapes Open WebUI sends (URL auto-append on embed but NOT rerank; payload + response shapes for both modes), the LiteLLM-TEI gotchas (encoding_format=null trap, HF-driver task_type misdetection, openai vs huggingface driver tradeoffs), TEI config cliffs (max-client-batch-size 422 under hybrid search, max-batch-tokens AS the auto-truncate boundary, arch-specific Docker images), and the end-to-end production config. BGE-M3 + BGE-Reranker-v2-m3 are worked examples; patterns generalise to any TEI encoder.
open-webui-valkey-websocket
Deploy Open WebUI multi-pod with WebSockets and Valkey/Redis Sentinel at 1000+ user scale on Kubernetes. Centerpiece is the structural Socket.IO+Redis frame-amplification bug (#23733) that cripples multi-pod streaming, and the maintainer-endorsed mitigation (`CHAT_RESPONSE_STREAM_DELTA_CHUNK_SIZE`). Covers all multi-pod env vars, the custom-model-icon perf history (base64-in-/api/models, fixed late 2025–Apr 2026), the official helm chart's gaps (bundled Redis is unsuitable for production; no HPA/PDB/probes/sticky sessions), and the catalog of known multi-pod issues with current status.
patch
Generate candidate fixes for verified security findings. Consumes TRIAGE.json (preferred), VULN-FINDINGS.json, or an execution-harness results directory. Static-analysis input gets a per-finding patch subagent + an independent reviewer and is written as inert diffs for human review; results-directory input from an external execution harness (the defending-code reference pipeline, if installed) is delegated to its verified build→reproduce→regress→re-attack patch ladder. Writes PATCHES/bug_NN/{patch.diff,patch_result.json}, PATCHES.md, and PATCHES.json. Use when asked to "fix the findings", "patch these vulns", "generate fixes", or "close the loop on triage".
prometheus-mimir-grafana
Query Prometheus and Grafana Mimir, write and debug PromQL, and build or fix Grafana dashboards — for agents solving problems from metrics. Covers the Prometheus HTTP API (`/api/v1/query`, `query_range`, `series`, `labels`, `metadata`), Mimir multi-tenancy (`X-Scope-OrgID`, federation `a|b|c`, per-tenant 422/429 limits), the PromQL surface (selectors, rate family, classic + native histograms, `histogram_quantile`, vector matching `on()`/`group_left`, recording rules), Grafana dashboard JSON (panels, targets, variables + interpolation specifiers, legacy `/api/dashboards/db` vs Grafana-12 `/apis/dashboard.grafana.app/v1beta1/…`), KPI frameworks (RED, USE, Golden Signals, SLO burn-rate), connection recipes, MCP servers vs curl, and the PromQL trap list.
rancher-upgrade
Plan and sequence COMMUNITY-edition Rancher upgrades across air-gapped multi-cluster fleets — a management/"hosting" Rancher cluster plus the downstream RKE2/K3s clusters it provisions. Covers the community release model (2.11→2.14, community-vs-Prime cadence, EOL), the Kontainer Driver Metadata (KDM) downstream- Kubernetes support matrix that decides which downstream k8s minors each Rancher version can manage (and the stranding risk when a host-Rancher bump outruns its sub-clusters), cross-cluster upgrade ordering, the embedded-CAPI→Rancher-Turtles migration, Fleet coupling, cert-manager/Helm/backup prerequisites, backup- restore-operator + etcd rollback, and the air-gapped upgrade procedure (which images/charts/KDM to mirror). Community editions only; Prime-gated content is flagged and excluded. Companion to k8s-components-checker, which owns the management-cluster k8s compatibility verdict; this skill owns the upgrade methodology and downstream coordination.
secure-boot-cert-rotation
Triage and remediate the Microsoft Secure Boot 2011→2023 UEFI certificate rotation (CAs expiring June/October 2026) across Dell PowerEdge / iDRAC9 bare metal, Ubuntu/Linux servers, and Harvester HCI / KubeVirt guest VMs. Establishes the load-bearing fact that UEFI firmware ignores certificate expiry — nothing stops booting on the deadline; the real risk is forward-compat once a 2023-only-signed shim arrives, plus a dbx/revocation freeze — then routes to the cleanest per-platform fix: iDRAC BIOS-staged keys applied on reboot (Dell), fwupd-free manual `db` append that self-authenticates via the existing 2011 KEK (Linux), and the Harvester virt-launcher OVMF floor (v1.6.0) with ephemeral-vs-persistent NVRAM triage (VMs). Covers the PK→KEK→db trust chain, why no generic Microsoft 2023 KEK payload exists, and audit via mokutil / efi-readvar / racadm bioscert / Redfish.
sglang-hicache
SGLang HiCache (hierarchical KV cache) — three-tier prefix cache: GPU HBM (L1) → pinned host DRAM (L2) → distributed L3 (Mooncake / 3FS / NIXL / AIBrix / EIC / SiMM / file / LMCache). Covers `--enable-hierarchical-cache`, all `--hicache-*` flags, write policies, page_first* layouts, prefetch policy (best_effort / wait_complete / timeout), per-rank sizing, MHA / MLA / DSA / Mamba / SWA support matrix (SWA + 3FS hybrid shipped in v0.5.11), runtime attach/detach HTTP admin, and auto-rewrite startup log lines that silently substitute layout × IO × storage combinations.
sglang-model-gateway
SGLang Model Gateway (`sgl-model-gateway`, formerly `sgl-router`) — Rust router fronting vLLM and SGLang inference workers on Kubernetes. Covers first-class vLLM gRPC backend plus HTTP transparent-proxy for vanilla vLLM, the policy set (six `--policy` values, `cache_aware` default), tokenizer-format dispatch (`tokenizer.json` HF-fast vs `tiktoken.model` BPE — including when neither is required because `cache_aware` is text-based), air-gapped recipe (gateway ignores `HF_ENDPOINT`, mount tokenizer files on PVC only when actually needed), K8s manifests with `model_id` labels and per-model RBAC, three HA mitigations (single + PDB, `sessionAffinity: ClientIP`, `--enable-mesh` CRDT sync), and a pitfall catalog covering the Dec 2025 `sgl-router` → `sgl-model-gateway` rename and over-engineered tokenizer init-container traps.
skill-improver
Autoresearch loop for Claude Code skills — greedy keep/discard hill climbing on a 10-dimension quality rubric, with blind subagent validation for self-scoring bias, plus a `freshen` mode that probes external references (release notes, docs, deprecation signals) and applies verified updates, plus a `trigger` mode that measures and tunes the skill's frontmatter description until it reliably fires when it should and stays silent when it shouldn't (60/40 train/test split, 3 runs/query, blinded test scores).
threat-model
Build a threat model for a target codebase. Three modes: "interview" walks an application owner through the four-question framework and produces a threat model from their answers; "bootstrap" derives a threat model from the code plus past vulnerabilities (CVEs, git history, pentest reports) when no owner is available; "bootstrap-then-interview" chains the two when both owner and codebase are present. All write THREAT_MODEL.md in a shared schema. Use when asked to "threat model", "build a threat model", "map the attack surface", or "what should we be worried about in this codebase".
transformers-config-tokenizers-expert
Preflight reference for HuggingFace snapshots — what vLLM, sglang, and transformers.generate see at runtime. Covers config-file precedence (tokenizer.json, tokenizer_config.json, generation_config.json, chat_template.jinja), transformers v5 tokenizer-class taxonomy (TokenizersBackend, PythonBackend, MistralCommonBackend, TikTokenTokenizer), special-token discovery (all_special_ids, added_tokens_decoder, extra_special_tokens, backend_tokenizer.get_added_tokens_decoder), chat-template Jinja contract (ImmutableSandboxedEnvironment, loopcontrols, raise_exception, strftime_now, tojson, add_generation_prompt), and engine knobs (skip_special_tokens, trust_request_chat_template, chat_template_kwargs allowlist, adjust_request, incremental detokenizer, EOS merge). Ships verified 2026 hall-of-shame for Kimi-K2.6, GLM-5.1, Gemma-4, Qwen3, DeepSeek-V3, plus drop-in Python for resolving markers to IDs, detecting turn-primer-as-EOS leaks, and cross-referencing tokenizer.json vs tokenizer_config.json.
triage
Triage a batch of raw security findings. Verify each is real, collapse duplicates, re-rank by derived exploitability, and tag with an owner. Takes a directory or file of scanner output and writes TRIAGE.json + TRIAGE.md sorted by what actually needs engineering attention. Use when asked to "triage findings", "validate scanner output", "prioritize vulns", or "review the backlog". Runs interactively by default; pass --auto to skip the interview.
vllm-benchmarking
Run production vLLM benchmarks — `vllm bench` (serve, throughput, latency, sweep, startup, mm-processor), request-rate vs max-concurrency semantics, TTFT/TPOT/ITL/E2EL percentiles, goodput SLO measurement, prefix-cache workloads, air-gapped operation (HF_ENDPOINT, ModelScope, hf-mirror, offline cache). Methodology split — SLO health checks vs A/B change sweeps — plus pitfalls that produce misleading numbers (no warmup, wrong tokenizer, random-as-prod, `--request-rate inf` alone).
vllm-caching
vLLM tiered KV cache configuration for production H100/H200 clusters. Native CPU offload, LMCache (CPU+NVMe+GDS), NixlConnector (disaggregated prefill), MooncakeConnector (RDMA), MultiConnector composition. Version gates, sizing math (flag total across TP, not per-GPU — opposite of SGLang), KV-vs-weights offload distinction operators most often get wrong.
vllm-chat-templates
vLLM chat-template (prompt-side Jinja) operator reference. Template resolution precedence (`--chat-template` → AutoProcessor → tokenizer default → bundled fallback), `chat_template_kwargs` allowlist silently dropping `add_generation_prompt`/`enable_thinking`/custom kwargs (PR 27622 fix), 27 shipped `tool_chat_template_*.jinja` files, known template-layer bugs for Qwen3/Qwen3-Coder, DeepSeek-R1/V3/V3.1/V3.2, GPT-OSS, Kimi-K2, Llama-4, Mistral (HF vs mistral mode), Gemma-3/4, Phi-4, GLM. Prompt side only — output parsing lives in sibling skills.
vllm-configuration
Configure vLLM completely — YAML config file format, CLI arg precedence, full VLLM_*/HF_*/TRANSFORMERS_* env-var catalog, end-to-end recipe for air-gapped environments (internal HF mirrors, hf-mirror.com, ModelScope, HF_HUB_OFFLINE with pre-seeded cache, gated models offline, trust_remote_code supply-chain implications). VLLM_HOST_IP vs API-host confusion, Kubernetes-service-named-`vllm` env-var poisoning, usage-stats triple opt-out, YAML precedence surprises.
vllm-deployment
Use this skill when authoring, reviewing, or fixing a vLLM Kubernetes manifest, Docker/Podman pod, or OpenShift ServingRuntime — even when the user does not say "vllm". Triggers on: lab cluster performance practices, cache mount + survival across pod restarts (/root/.cache, VLLM_CACHE_ROOT, TORCHINDUCTOR_CACHE_DIR, TRITON_CACHE_DIR, "do we have caches saved"), HF_TOKEN secret in pod env, liveness + readiness probe tuning (initialDelaySeconds, failureThreshold, "pod takes 12 min to boot"), serve_args review, --enforce-eager rationale, MoE deployment ("ep2 dp2", --enable-expert-parallel, expert-parallel sizing), TP/PP sizing, ConfigMap parser-plugin mount, image tag selection, cold-boot reduction, multi-node LWS + Ray, control planes (llm-d, production-stack, AIBrix, NVIDIA Dynamo, KServe), KEDA autoscaling, GAIE routing, disaggregated prefill/decode (Nixl/Mooncake/LMCache/MORI-IO), RHAIIS on OpenShift (SCC, arbitrary UID, Routes 60s, ModelCar, air-gapped). Lead with operator intent, not vendor names.
vllm-gemma-4-31b
Operating-point reference for serving Gemma 4 31B on vLLM — TP sizing, max_model_len, max_num_seqs, gpu_memory_utilization, kv_cache_dtype, EAGLE3 spec-dec, chat_template choice.
vllm-input-modalities
vLLM non-chat inference surfaces — text embeddings (`/v1/embeddings`, `/v2/embed`), reranking/scoring (`/rerank`, `/score`), speech-to-text (`/v1/audio/transcriptions`, `/v1/audio/translations`), document OCR via VLMs. Covers 2026 `--runner pooling` (replacing `--task embed`), v0.20 deprecations (`score`→`classify`, multitask pooling, `encode`→`token_embed`+`token_classify`), Matryoshka/MRL, ColBERT/ColPali/ColQwen late-interaction MaxSim, Cohere v2 `/v2/embed`, Jina v3/v4/v5 quirks, cross-encoder score templates, Whisper large-v3-turbo quants, DeepSeek-OCR recipe (NGramPerReqLogitsProcessor, no prefix cache, GUNDAM mode).
vllm-nvidia-hardware
NVIDIA AI-hardware + vLLM-platform reference covering Hopper (H100/H200), Blackwell (B100/B200/B300) and Blackwell Ultra, Grace-Blackwell superchips and NVL72 racks (GB200, GB300), Vera Rubin (R100/R300) with VR200 NVL144 and Kyber NVL576, Dell PowerEdge XE family and IR5000/IR7000/IR9048 racks. Per-SKU HBM, FP4/FP8/FP16 TFLOPs, NVLink5, TDP, rack power/cooling (135 kW GB300, 180-220 kW NVL144, 600 kW Kyber), DLC vs RDHx, 800 VDC HVDC. Memory-wall roofline, HBM3E→HBM4 supply 2026. vLLM attention-backend × SM matrix, FP4/FP8 paths, KV connectors, Blackwell gotchas (SM103 TRTLLM hang, 270 vs 288 GB B300 bin split).
vllm-observability
Observe production vLLM — `/metrics` Prometheus surface (V1 engine), SLO-driven alerting on TTFT/ITL/queue/KV/preemption/aborts/corrupted-logits, shipping Grafana dashboards in `examples/observability/`, OTLP tracing with `--otlp-traces-endpoint` and `--collect-detailed-traces={model,worker,all}`, diagnostic rules to triage from /metrics alone — queue-grows + TPOT-stable means capacity, queue-stable + TPOT-grows means context/model, DCGM `SM_OCCUPANCY` is the real GPU-saturation signal not `GPU_UTIL`. V1 metric names (kv_cache_usage_perc), gpu_→kv_ rename saga, DCGM-exporter pairing, dashboard-lying pitfalls.
vllm-omni
vLLM-Omni output-side multimodal generation — image (FLUX.1/2, Qwen-Image, GLM-Image, BAGEL, SD3.5, HunyuanImage-3.0), video (Wan2.1/2.2, LTX-2, HunyuanVideo-1.5), TTS (Qwen3-TTS, CosyVoice3, Voxtral-TTS), any-to-any omni (Qwen3-Omni, Qwen2.5-Omni, MiMo-Audio) via `vllm serve --omni`. Stage-based disaggregation (OmniConnector + Mooncake + RDMA), `/v1/images/generations`, async+sync `/v1/videos`, `/v1/audio/speech` with voice-upload, PCM16 WebSocket `/v1/realtime`, Ulysses/Ring SP + CFG-parallel, DiT FP8/INT8/GGUF, CUDA/ROCm/NPU/XPU/MUSA matrix, release pitfalls (v0.19.0rc1 FLUX regression, GLM-Image transformers>=5.0, Qwen3-TTS enforce-eager).
vllm-performance-tuning
vLLM performance-tuning operator reference — tuning workflow (baseline → bottleneck → knob → re-bench), fused-MoE kernel autotune (`benchmark_moe.py` generates `E=N,N=M,device_name=X.json` configs), DeepEP all-to-all + expert parallelism + EPLB, CUDA graph modes (FULL_AND_PIECEWISE default), torch.compile AOT + compile cache, scheduler knobs (`--max-num-batched-tokens`, `--max-num-seqs`, `--async-scheduling`), TP/EP/DP/PP decision tree, NCCL/DCGM on H100/H200/B200/GB200, PD disaggregation (Nixl/Mooncake/LMCache), known regressions + vendor quirks (v0.14→0.15.1 MiniMax, MI300X FP8<BF16, DeepGEMM M<128 TTFT).
vllm-quantization
vLLM datacenter-GPU quantization — picking, configuring, troubleshooting NVFP4, FP8, MXFP4, MXFP8, AWQ, GPTQ, INT8, compressed-tensors, modelopt, quark on H100/H200/B200/B300/GB200/GB300. 29 `--quantization` flag values, KV-cache dtypes (fp8_e4m3, nvfp4, per-token-head, turboquant), MoE backend selection (CUTLASS, TRTLLM, FlashInfer, DeepGEMM, Marlin, Qutlass), producing checkpoints with llm-compressor and NVIDIA ModelOpt (NVFP4_DEFAULT_CFG, FP8_DEFAULT_CFG, W4A16, SmoothQuant+GPTQ), online quantization (`fp8_per_tensor`, `fp8_per_block`), training EAGLE-3/dflash drafters on BF16 targets before PTQ, version gates per vLLM release (v0.14 → v0.21).
vllm-reasoning-parsers
vLLM reasoning-parser operator + developer reference. `--reasoning-parser` CLI wiring, `ReasoningParser` contract (non-streaming `extract_reasoning` + per-delta `extract_reasoning_streaming`), `is_reasoning_end` xgrammar gating, `--structured-outputs-config.enable_in_reasoning` bypass, 25 built-in parsers with per-model quirks, 15 production pitfalls, authoring custom parsers via `@ReasoningParserManager.register_module` or plugin.
vllm-speculative-decoding
Pick, configure, tune, monitor vLLM speculative decoding in production. Eleven SpeculativeMethod options (ngram, ngram_gpu, medusa, mlp_speculator, draft_model, suffix, eagle, eagle3, dflash, mtp, extract_hidden_states), `--speculative-config` JSON schema, which methods pair with which target model family, Prometheus acceptance metric surface, version gates (v0.11.1 EAGLE-3 preamble fix, v0.16 parallel drafting, v0.18 ngram_gpu, v0.19 dflash and zero-bubble), composability with chunked prefill / PP / LoRA / FP8 / structured outputs, Arctic Inference plugin, where spec-dec stops paying at high batch.
vllm-tool-parsers
vLLM tool-calling operator reference — picking `--tool-call-parser` per model family, writing custom parsers via `--tool-parser-plugin`, navigating vLLM source + GitHub tracker to debug any specific tool-call question. Pointer map, not source paraphrase. All 40+ built-in parsers (JSON-sentinel, pythonic, XML, harmony grammars), CLI contract, `prev_tool_call_arr`/`streamed_args_for_tool` flush invariants, diagnostic playbook (isolate template-vs-parser via raw `/v1/completions` + unicodedata codepoint decode).
vuln-scan
Static source-code vulnerability scan. Reads a target directory (and THREAT_MODEL.md if present), spawns parallel review subagents per focus area, and writes VULN-FINDINGS.json + .md for /triage to consume. Read-only — no building, running, or network. For execution-verified crashes (build + run + sanitizer), see HARNESS.md. Use when asked to "scan for vulns", "review this code for security issues", "find bugs in <dir>", "audit this code for vulnerabilities", or as the step between /threat-model and /triage.
careful
Safety guardrails for destructive commands. Warns before rm -rf, DROP TABLE, force-push, git reset --hard, kubectl delete, and similar destructive operations. User can override each warning. Use when touching prod, debugging live systems, or working in a shared environment. Use when asked to "be careful", "safety mode", "prod mode", or "careful mode". (gstack).
kubebolt-copilot
AI copilot skill for KubeBolt — the Kubernetes monitoring platform. This skill provides deep knowledge about Kubernetes clusters, workloads, networking, storage, RBAC, and troubleshooting, combined with real-time awareness of the user's connected cluster data via KubeBolt's REST API. Use this skill whenever the user asks questions about their Kubernetes cluster, wants to troubleshoot pods, deployments, services, nodes, or any K8s resource, asks about cluster health or insights, wants to understand topology or relationships between resources, needs help interpreting metrics (CPU, memory), asks about Gateway API, Ingresses, RBAC, storage, or any Kubernetes concept in the context of their monitored cluster. Also trigger when the user says things like "what's wrong with my cluster", "why is this pod crashing", "show me resource usage", "explain this insight", or any Kubernetes troubleshooting question. This skill powers the KubeBolt in-app chatbot.
devops-engineer
DevOps Engineer (/devops) - Senior DevOps Engineer with 12+ years cloud infrastructure experience. Use when setting up cloud infrastructure, writing Terraform configurations (loads references/terraform.md), creating Kubernetes manifests, building CI/CD pipelines with GitHub Actions, configuring Docker, or managing secrets.
secops-engineer
Soren - Principal Security Engineer with 15+ years application, infrastructure, and cloud security experience. Security review is a safety-override gate, required on security-relevant changes (auth, secrets, PII, external input, etc.) and always in the regulated preset. Use when conducting security reviews, threat modeling (STRIDE/PASTA/LINDDUN), implementing authentication (OAuth 2.1/Passkeys/WebAuthn), supply chain security (SBOM/SLSA), container/K8s hardening, Zero Trust architecture, AI/LLM security, privacy engineering, security scanning pipelines, compliance (GDPR/PCI-DSS/SOC2/ISO27001), or incident response. Primary command: /secops. Alias: /soren.
sre-engineer
SRE / Observability Engineer (/sre) — reliability engineering: SLOs/SLIs & error budgets, monitoring & alerting (Prometheus, Grafana, OpenTelemetry), incident response & runbooks, on-call, capacity & load, chaos/resilience, and post-incident reviews. Use when defining reliability targets, instrumenting observability, setting up alerting, writing runbooks, doing incident response, or reviewing a change for production readiness. Invoke alongside /arch for reliability NFRs and devops-engineer for the underlying infra/CI-CD. NOT for provisioning infra or pipelines (that's devops-engineer) — /sre owns reliability, not the cluster.
ansible
Ansible 自动化运维
azure-cli
Azure CLI 操作
ci-cd
CI/CD 流水线配置
compose
Docker Compose 编排
configmap-secret
Kubernetes ConfigMap 与 Secret
container-ops
Docker 容器操作与管理
deployment
Kubernetes Deployment 管理
dockerfile
Dockerfile 编写最佳实践
elasticsearch
Elasticsearch 集群管理
file-operations
Linux file and directory operations
gcloud
Google Cloud CLI 操作
kubectl-basics
kubectl 基础操作与常用命令
mongodb
MongoDB 数据库管理
mysql
MySQL 数据库管理与运维
network-tools
Linux network tools and diagnostics
networking
Docker 容器网络
postgresql
PostgreSQL 数据库管理
redis
Redis 数据库管理
rsync
rsync 文件同步与备份
service-ingress
Kubernetes Service 与 Ingress
shell-scripting
Bash Shell 脚本编写
terraform
Terraform 基础设施即代码
user-permissions
Linux user and permission management
cicd-pipeline
Generates CI/CD pipeline configurations for GitHub Actions, GitLab CI, and AWS CodePipeline. Covers build, test, lint, security scanning, and deployment stages with caching and parallelism. Triggers on: "create CI/CD pipeline", "GitHub Actions workflow", "deployment pipeline", "automate build".
cicd-pipelines
CI/CD pipeline design and DevOps automation — use when the user mentions GitHub Actions, GitLab CI, Jenkins, Terraform, infrastructure as code, DevSecOps, ArgoCD, Kubernetes deployment automation, or pipeline configuration YAML. NOT for release orchestration or semantic-release workflows (use git-workflow), NOT for Docker containers or Dockerfiles (use docker-containerization), NOT for git branching or commits (use git-workflow).
docker-containerization
Docker and container development — use when the user mentions Dockerfiles, multi-stage builds, Docker Compose, container optimization, image size reduction, DDEV, containerization, or dev environment setup with containers. NOT for CI/CD pipeline YAML or pipeline configuration (use cicd-pipelines), NOT for workflow orchestration or release automation (use workflow-automation), NOT for Kubernetes or container orchestration platforms (use cloud-native tooling).
guard
Use when working near production, sensitive systems, or destructive operations. Activates on-demand safety hooks that block dangerous commands. Supports modes — careful (warn), freeze (block writes outside scope), unfreeze (remove restrictions). Triggers on /guard, /careful, /freeze, /unfreeze.
deploy-preflight
Read-only deployment pre-flight scan — repo state, secrets sync check, deploy-changes diff, migration verify locally. Reports what WOULD break a deploy without committing to one. Use when the user says '/deploy-preflight', 'check the deploy', 'what would deploy break', 'preflight', or wants to validate readiness before running /deploy.
cluster-creator
End-to-end OpenShift cluster creation using Red Hat Assisted Installer. Handles Single-Node OpenShift (SNO) and HA multi-node clusters on baremetal, vsphere, oci, nutanix. Use when: - "Create a new OpenShift cluster" - "Install OpenShift on my servers" - "Set up a single-node cluster for edge deployment" - "Deploy a production HA cluster" Complete workflow: cluster definition, ISO generation, host discovery/validation, role assignment, network configuration (VIPs, static networking), installation monitoring, credential retrieval. NOT for: - Listing existing clusters → Use `/cluster-inventory` skill - Modifying running clusters → Out of scope (Day-2 operations require direct cluster access) - Cluster upgrades (not yet supported)
vanguard-frontier-agentic-install
Install all Vanguard Frontier Agentic Codex agents and companion skills into the current user's ~/.codex home after adding or installing the plugin marketplace.
simplicity-spec
Проход на простоту для ПОСТАНОВОК, ТЗ, АРХИТЕКТУРНЫХ и ИНФРАСТРУКТУРНЫХ РЕШЕНИЙ — ловит и убирает переусложнение СУТИ решения до того, как пользователь увидит результат. Use when drafting or reviewing technical specifications, ТЗ, requirements, architecture decisions, ADR, API design, database schema changes, workflow / state machines, RBAC/ABAC models, microservice boundaries, choosing patterns (CQRS, Event Sourcing, event-driven, saga, rule engine), OR infrastructure / DevOps / deployment decisions (Kubernetes vs docker-compose, cache / queue / replica / sharding, observability stack, CI/CD, новые сервисы / стенды / окружения). Triggers: «постановка», «ТЗ», «спека», «дизайн API», «��зменения в БД», «архитектура», «инфраструктура», «деплой», «devops», «выбрать паттерн», «проверь на простоту», «не переусложни». НЕ для: чистки прозы (это задача стиль-прохода) и гигиены оформления документа. Этот скилл — про сложность РЕШЕНИЯ, не про текст.
helm-chart-scaffolding
Design, organize, and manage Helm charts for templating and packaging Kubernetes applications with reusable configurations. Use when creating Helm charts, packaging Kubernetes applications, or implementing templated deployments.
k8s-manifest-generator
Create production-ready Kubernetes manifests for Deployments, Services, ConfigMaps, and Secrets following best practices and security standards. Use when generating Kubernetes YAML manifests, creating K8s resources, or implementing production-grade Kubernetes configurations.
deploy
Deployment strategy, production-readiness gating, and rollback planning for AWS/EKS services. Use when user says 'how should I deploy this', 'blue-green or canary', 'are we ready to ship', 'production readiness', 'plan a rollback', 'pre-deploy check', or before a first production release. Pairs with /k8s, /ci, /github-actions, /tf which own the per-artifact checks.
finops
AWS cost optimization — waste detection, right-sizing, Savings Plans, RIs, EKS cost, multi-account governance. Use when user says 'reduce AWS bill', 'find waste', 'right-size this', 'should I buy SP or RI', 'gp2 vs gp3', 'EKS is expensive', 'NAT gateway cost', or asks about AWS cost optimization.
owasp
Security review against OWASP Top 10:2025, ASVS 5.0, and Agentic AI risks. Use when user says 'review for security', 'is this secure', 'check for vulnerabilities', 'review auth/authorization', 'check input handling', or when writing cryptography, session management, or AI agent code.
skill-creator
Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
scout
START HERE — Skill discovery and installation assistant. The recommended first skill when you don't know which skills you need. Searches verified-skill.com, recommends plugin bundles, and installs skills. Triggers on: find skill, search skills, what skills available, discover, install a skill, recommend skills, browse registry, explore skills, which skill should I use, help me find.
golang-cli
Golang CLI application development. Use when building, modifying, or reviewing a Go CLI tool — especially for command structure, flag handling, configuration layering, version embedding, exit codes, I/O patterns, signal handling, shell completion, argument validation, and CLI unit testing. Also triggers when code uses cobra, viper, or urfave/cli.
pr-contribution-excellence
Patterns for excellent open-source PR contributions, distilled from analyzing real PRs across repositories
lythoskill-writer
Human-first documentation writer and reviewer. Reviews README, wiki, ADR, daily handoff, showcase, and reference docs for information density, structural rhythm, and anti-template patterns. Ensures human readers get clear prose, not AI-flavored filler.
devops
DevOps practices, CI/CD, and infrastructure management
offensive-osint
Operational arsenal for external red-team and bug-bounty reconnaissance. Concrete wordlists (28 Swagger paths, 13 GraphQL paths, 35 high-risk ports, 6 missing-header findings, 15 always-on HTTP checks, 5 SAML paths, cloud bucket permutations, JS guess-paths, vendor product fingerprints for Citrix/F5/Pulse/Fortinet/Cisco/PaloAlto/VMware/Exchange, cloud-native service fingerprints, container/K8s exposure paths, CI/CD platform paths, documentation/wiki leak paths, WHOIS/RDAP, DNS record catalog, Wayback CDX recipes), 43+-pattern secret-regex catalog (incl. modern AI API keys: Anthropic/OpenAI/HuggingFace/Cloudflare/DigitalOcean/npm/PyPI/Docker Hub/Atlassian/DataDog/Sentry/ngrok), 80+ dork corpus across 9 categories, GitHub code-search dorks, copy-paste curl/httpie probes for every check, post-discovery enumeration workflows (AWS/GitHub/Slack/JWT/PMAK/Anthropic/OpenAI), endpoint interest scoring rubric (0–100), mobile app ownership confidence, identity-fabric endpoints (Entra/Okta/ADFS/Google/SAML/M365 Teams+Shar
gcp-architecture-best-practices-reviewer
Evidence-backed review of Google Cloud Platform architecture against GCP best practices and CIS GCP Foundation Benchmark concepts. Use when reviewing Terraform, Kubernetes/GKE manifests, network topology, IAM, Cloud SQL, KMS, Cloud Storage, Secret Manager, or CI/CD config for security, reliability, cost, and compliance gaps. Read-only — produces findings only.
devops-engineer
Use when setting up CI/CD pipelines, containerizing applications, or managing infrastructure as code. Invoke for pipelines, Docker, Kubernetes, cloud platforms, GitOps.
kubernetes-specialist
Use when deploying or managing Kubernetes workloads requiring cluster configuration, security hardening, or troubleshooting. Invoke for Helm charts, RBAC policies, NetworkPolicies, storage configuration, performance optimization.
kubernetes-ontology-access
Use this skill whenever a user wants to onboard, deploy, install, or operate kubernetes-ontology; set up its Helm chart, release CLI, daemon, or topology viewer; run Kubernetes topology queries; diagnose Pod or Workload failures with AI-agent workflows; or connect human visual troubleshooting to the CLI and HTTP API. This skill should trigger for requests about Kubernetes ontology onboarding, Helm deployment, topology query, diagnostic subgraph, ImagePullBackOff or storage/RBAC/Event graph troubleshooting, viewer usage, and agent integration.
define-deployment
Capture deployment characteristics for both production and development — hosting, IaC, CI/CD, secrets, observability, local dev environment, containerization, hot reload, and seed data. Use when the project-builder agent is gathering deployment information.
analyzing-projects
Analyzes codebases to understand structure, tech stack, patterns, and conventions. Use when onboarding to a new project, exploring unfamiliar code, or when asked "how does this work?" or "what's the architecture?"
node.js-
检查 RCE、SSRF、SQL 注入、路径穿越等安全问题,支持 Express/Koa/NestJS
azure-aks-edge-essentials
Expert knowledge for Azure Kubernetes Service Edge Essentials development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing AKS Edge/Arc clusters, Arc connectivity, IoT/OPC/ONVIF workloads, TPM/AI deployments, or gMSA, and other Azure Kubernetes Service Edge Essentials related development tasks. Not for Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure IoT Edge (use azure-iot-edge), Azure Stack Edge (use azure-stack-edge), Azure Container Apps (use azure-container-apps).
azure-arc
Expert knowledge for Azure Arc development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when managing Arc-enabled Kubernetes, servers, SQL MI, Edge RAG, resource bridge, or SCVMM/VMware integration, and other Azure Arc related development tasks. Not for Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Virtual Machines (use azure-virtual-machines), Azure Policy (use azure-policy), Azure Monitor (use azure-monitor).
azure-cache-redis
Expert knowledge for Azure Cache for Redis development including troubleshooting, best practices, decision making, architecture & design patterns, security, configuration, integrations & coding patterns, and deployment. Use when configuring geo-replication, persistence, VNet/Private Link, CLI/PowerShell automation, or Blob import/export, and other Azure Cache for Redis related development tasks. Not for Azure Managed Redis (use azure-managed-redis), Azure HPC Cache (use azure-hpc-cache), Azure Blob Storage (use azure-blob-storage), Azure Table Storage (use azure-table-storage).
azure-cloud-shell
Expert knowledge for Azure Cloud Shell development including troubleshooting, limits & quotas, and security. Use when handling Cloud Shell storage mounts, session limits, private VNet access, or secure private endpoints, and other Azure Cloud Shell related development tasks. Not for Azure Portal (use azure-portal), Azure Virtual Machines (use azure-virtual-machines), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Functions (use azure-functions).
azure-container-apps
Expert knowledge for Azure Container Apps development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring ingress/scale, Entra/OIDC auth, GitHub Actions CI/CD, Dapr integrations, or Java microservices on ACA, and other Azure Container Apps related development tasks. Not for Azure App Service (use azure-app-service), Azure Functions (use azure-functions), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Spring Apps (use azure-spring-apps).
azure-container-instances
Expert knowledge for Azure Container Instances development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, and deployment. Use when configuring ACI networking, standby pools, GitHub Actions deploys, Spot containers, or GPU workloads, and other Azure Container Instances related development tasks. Not for Azure Container Apps (use azure-container-apps), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure App Service (use azure-app-service), Azure Virtual Machines (use azure-virtual-machines).
azure-container-registry
Expert knowledge for Azure Container Registry development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using ACR Tasks, geo-replication/connected registries, Defender scans, Notation signing, or AKS/ACI pulls, and other Azure Container Registry related development tasks. Not for Azure Container Apps (use azure-container-apps), Azure Container Instances (use azure-container-instances), Azure Kubernetes Service (AKS) (use azure-kubernetes-service), Azure Red Hat OpenShift (use azure-redhat-openshift).
azure-container-storage
Expert knowledge for Azure Container Storage development including troubleshooting, decision making, limits & quotas, security, and configuration. Use when configuring CMK-encrypted Elastic SAN volumes, ACS pools, LRS/ZRS redundancy, volume resize, or v1 installs, and other Azure Container Storage related development tasks. Not for Azure Blob Storage (use azure-blob-storage), Azure Files (use azure-files), Azure Elastic SAN (use azure-elastic-san), Azure NetApp Files (use azure-netapp-files).
abp-deployment
ABP Framework v10.4 deployment quick reference: clustered/stateless, distributed cache (Redis), BLOB provider, distributed lock, SignalR backplane, DataProtection, ForwardedHeaders, SSL, OpenIddict prod certificates, Docker/Helm. Use when you need to deploy an ABP application to production.
careful
Intercept destructive commands before execution — rm -rf, DROP TABLE, force-push, git reset --hard, and similar irreversible operations. Prompts for confirmation and suggests safer alternatives. Inspired by gstack's careful skill.
deployment-rollback
Safe deployment rollback with health checks and database migration reversal
dependency-versions
MUST consult this skill before answering whenever the user's task involves external versioned dependencies — even if you think you can handle it directly. This applies to: checking if packages/tools are up to date, upgrading npm/pip/cargo/go dependencies, planning or writing CI/CD workflows (GitHub Actions, CircleCI, GitLab CI), pinning action versions, reviewing Dockerfiles or base images, checking Terraform providers or modules for drift, reviewing Helm chart versions, verifying Kubernetes/EKS/cloud resource versions, updating pre-commit hooks, writing Dependabot configs, or any task where the user mentions specific version numbers, package names, or config files like package.json, pyproject.toml, Dockerfile, .pre-commit-config.yaml, main.tf, or values.yaml. Even casual requests like "is this still current" or "has anything drifted" require this skill because your training data is unreliable for volatile version facts. Do NOT use for: refactoring code, writing tests, debugging errors, designing APIs, or tas
agenticx-deployer
Guide for deploying AgenticX agents to production including Docker containerization, Kubernetes orchestration, Volcengine AgentKit cloud deployment, and API server setup. Use when the user wants to deploy agents, containerize applications, set up Kubernetes, configure cloud deployment, or run the AgenticX API server in production.
agenticx-deployer
Guide for deploying AgenticX agents to production including Docker containerization, Kubernetes orchestration, Volcengine AgentKit cloud deployment, and API server setup. Use when the user wants to deploy agents, containerize applications, set up Kubernetes, configure cloud deployment, or run the AgenticX API server in production.
agenticx-deployer
Guide for deploying AgenticX agents to production including Docker containerization, Kubernetes orchestration, Volcengine AgentKit cloud deployment, and API server setup. Use when the user wants to deploy agents, containerize applications, set up Kubernetes, configure cloud deployment, or run the AgenticX API server in production.
deployment-procedures
Production deployment principles and decision-making. Safe deployment workflows, rollback strategies, and verification. Teaches thinking, not scripts.
depgen-k8s
Generate a Dockerfile and Kubernetes manifests for an application targeting a single environment. Supports Spring Boot (Java), Laravel (PHP), and Node.js application stacks. Auto-detects the stack from project files (pom.xml, composer.json, package.json), reads CLAUDE.md dependencies, SPECIFICATION.md tech stack, and the application's externalized environment variables. Generates a Dockerfile in the application root folder and Kubernetes manifest YAML files directly in `<app_folder>/k8s/` (no per-environment subfolders — the k8s/ folder is gitignored, each machine maintains its own copy). Standardized input: application name (mandatory), environment (optional). Use this skill whenever the user asks to create deployment artifacts, Dockerfiles, Kubernetes manifests, or containerize an application. Also trigger when the user says things like "deploy this app", "containerize this", "create a Dockerfile", "generate k8s manifests", or any request for deployment-related artifacts.
util-preparek8senv
Prepare Kubernetes environment infrastructure by generating K8s manifests for all 3rd party supporting applications for a single target environment defined in CLAUDE.md. Creates/updates ENVIRONMENT.md with per-environment configs and credentials, then generates persistent StatefulSet-based K8s manifests for each 3rd party application (databases, message queues, caches, SSO, API gateways, etc.) directly in the `environment/` folder. Since the `environment/` folder is gitignored, each machine maintains its own independent copy. Ensures all services are remotely accessible using tools from DEVTOOL.md. Trigger on keywords: "prepare k8s environment", "prepare kubernetes", "setup k8s infra", "generate k8s manifests for 3rd party", "prepare environment", "setup infrastructure", "prepare k8s", "init k8s environment", "scaffold k8s environment". Accepts an optional environment argument to select which Kubernetes environment to generate for.
util-projectinit
Initialize a new CO2 project by analyzing a free-form prompt or markdown brief and generating the three foundational documents at the project root: `CLAUDE.md` (project detail, terminology, supporting 3rd party applications, custom applications, port allocation, system modules, business modules, rules), `DEVTOOL.md` (skeleton of development tools required by the inferred technology stack) and `ENVIRONMENT.md` (skeleton of local environment variables and credentials matching the 3rd party applications and external services declared in CLAUDE.md). Infers required infrastructure (databases, message queues, caches, SSO, file storage, SMTP, etc.), the number and names of custom applications, the adequacy of the standard system modules and the project-specific business modules from the input. This skill seeds the project — it is the first skill to run before any `modelgen-*`, `mockgen-*`, `specgen-*`, `testgen-*`, `util-projectsync` or `util-preparek8senv` invocation. Trigger on keywords: "init project", "initializ
fastify-production
This skill should be used when deploying Fastify to production, configuring Fastify security headers, setting up reverse proxy with Fastify, implementing graceful shutdown, configuring @fastify/helmet, @fastify/cors, @fastify/rate-limit, trustProxy settings, Kubernetes Fastify deployment, Fastify performance tuning, request timeouts, handler timeouts, return503OnClosing, prototype poisoning protection, production Fastify checklist, or hardening Fastify server.
architecture-runtime-topology
Use when code work touches runtime shape: services, app/CLI/background flows, deployment/IaC, observability, resilience, external integrations, ownership, and runtime coupling.
meremoth-devops-craft
How Meremoth builds CI/CD pipelines — GitLab CI / GitHub Actions stages, secret marshalling via SOPS, hash-based config drift detection, SSH-direct deploy patterns, the prepare-not-execute rule, and the "check the CI AND the remote script" diverge-silently rule. Invoke when a pipeline or release-automation change is in scope.
baml-expert
BAML (Boundary ML) expert for projects defining LLM calls as typed functions in .baml files with a generated Python client. Use whenever the repo contains baml_src/, baml_client/, baml-cli commands, or imports from baml_py / baml_client. Covers .baml syntax (function, class, enum, client, test, retry_policy, attributes), Python integration (baml_client sync/async, streaming, ClientRegistry, Collector, TypeBuilder), Schema-Aligned Parsing, ctx.output_format, @@assert / @@check tests, @stream.done / @stream.not_null / @stream.with_state streaming, multimodal (image/audio/pdf), and debugging via BAML_LOG plus Boundary Studio. Triggers even unnamed — "add an LLM function", "fix a failing parse", "add a test for the prompt", "stream the response" in a project with baml_src/. Prefer over raw LLM-SDK guidance here; defer to jinja-expert for standalone chat-template / .j2 work.
netbox-best-practices
NetBox 4.2-4.6 deployment and upgrade knowledge that the official netboxlabs/skills marketplace does NOT cover - use for deploying or upgrading NetBox on Kubernetes with the netbox-community helm chart (netbox-chart), external PostgreSQL/valkey wiring, API token bootstrap on 4.5+ (nbt_ v2 tokens), plugin installation in the official image, version-migration planning between NetBox 4.2 and 4.6, module type profiles, and front/rear port (patch panel) API changes. Trigger on "netbox helm", "netbox chart", "netbox kubernetes", "netbox upgrade", "netbox plugin install", "netbox api token bootstrap", "netbox 4.x breaking changes", or seeding/automation that must survive a NetBox version bump. For general NetBox data modeling, IPAM design, Diode, or validation questions, prefer the official netboxlabs/skills marketplace skills - this skill only covers the gaps.
openshift-app
Package applications for OpenShift deployment: container images (UBI, arbitrary UID, multi-stage builds), packaging formats (Helm, Kustomize, Operators, OLM v1), CI/CD (Tekton, ArgoCD, Shipwright, Conforma), security (SCC, PSA, supply chain, image signing, secrets), operations (Routes, probes, scaling, monitoring, storage), disconnected/air-gapped patterns, and critical gotchas. Also when an app "works on Kubernetes but fails on OpenShift" (SCC denied, random/arbitrary UID, permission errors). Covers OCP 4.14-4.21. NOT for cluster installation or infrastructure management.
tenet-infra-cloud
Audits IaC and cloud risks: exposure, IAM wildcards, encryption, buckets, Kubernetes, and drift.
senior-ml-engineer
ML engineering skill for productionizing models, building MLOps pipelines, and integrating LLMs. Covers model deployment, feature stores, drift monitoring, RAG systems, and cost optimization. Use when the user asks about deploying ML models to production, setting up MLOps infrastructure (MLflow, Kubeflow, Kubernetes, Docker), monitoring model performance or drift, building RAG pipelines, or integrating LLM APIs with retry logic and cost controls. Focused on production and operational concerns rather than model research or initial training.
kubernetes-skills
Kubernetes orchestration patterns, deployments, and best practices
fastify-production
This skill should be used when deploying Fastify to production, configuring Fastify security headers, setting up reverse proxy with Fastify, implementing graceful shutdown, configuring @fastify/helmet, @fastify/cors, @fastify/rate-limit, trustProxy settings, Kubernetes Fastify deployment, Fastify performance tuning, request timeouts, handler timeouts, return503OnClosing, prototype poisoning protection, production Fastify checklist, or hardening Fastify server.
system-admin
Linux system administration and monitoring
tcp-ip
TCP/IP 网络诊断与排查
debug
Debug systematically across software, data pipelines, infrastructure, and analytics. Find root cause before fixing — for bugs, test failures, CI/CD breakage, K8s/Cloud incidents, dbt/Airflow pipeline failures, schema drift, freshness violations, dashboard wrong-numbers, and performance issues. Validates at every layer; verifies with fresh evidence before claiming done.
scout
Fast, parallel codebase scouting across software, data-engineering, devops, and analytics surfaces. Use to locate files, dbt models, dashboards, IaC, pipeline DAGs, K8s manifests, secrets, and CI workflows before changes. Supports internal (Explore subagents) and external (Gemini/OpenCode CLI) modes.
cloud-finops
Expert FinOps guidance covering cloud, AI, SaaS, and adjacent technology spend. Includes AI cost management, GenAI capacity planning, AI-powered FinOps automation, Anthropic billing, AWS (EC2, Bedrock, Savings Plans, CUR, commitment strategy), Azure (reservations, Savings Plans, AHB, OpenAI PTUs, portfolio liquidity), GCP (Vertex AI, Compute Engine, BigQuery), Kubernetes and container FinOps (OpenCost, Kubecost), serverless FinOps (Lambda, Functions, Cloud Run), data platforms (Kafka/MSK, Elasticsearch/OpenSearch, Redis/Valkey), multi-cloud normalization (FOCUS specification), tagging governance, SaaS management (SAM, licence optimisation, SMPs, shadow IT), AI coding tools (Cursor, Claude Code, Copilot, Windsurf, Codex), ITAM, Databricks, Snowflake, OCI, and GreenOps. Use for any query about technology cost, commitment portfolio management, rightsizing, cost allocation, SaaS sprawl, AI dev tool spend, container cost attribution, serverless optimization, multi-cloud strategy, or connecting spend to business va
cloud-infrastructure
Cloud infrastructure design and infrastructure-as-code (IaC) authoring. Use for Terraform module authoring, AWS CDK constructs, cloud architecture design (VPCs, load balancers, managed services, serverless), multi-region and disaster-recovery patterns, cost-optimisation analysis, and IaC code review. Trigger phrases: "write Terraform for", "design the AWS architecture", "set up a VPC", "convert this to CDK", "optimise our cloud costs". NOT for application-layer code — this skill models infrastructure, not the code running on it. NOT for Kubernetes application manifests (Deployments, Services, Ingress) — those belong in a k8s-specific skill. NOT for CI/CD pipeline configuration — that is a deployment concern separate from infrastructure provisioning.
oops
Deploy applications to Kubernetes via OOPS PaaS using the Python CLI script. Use when the user asks to deploy/release/ship an application to OOPS, create a new OOPS app, inspect a pipeline, configure an app's build/service/runtime/env-vars, or mentions oops, ZIP/Git deploys, helloworld deploys, namespace/environment/pipeline/configmap.
github-actions-templates
Create production-ready GitHub Actions workflows for automated testing, building, and deploying applications. Use when setting up CI/CD with GitHub Actions, automating development workflows, or creating reusable workflow templates.
gitlab-ci-patterns
Build GitLab CI/CD pipelines with multi-stage workflows, caching, and distributed runners for scalable automation. Use when implementing GitLab CI/CD, optimizing pipeline performance, or setting up automated testing and deployment.
linkerd-patterns
Implement Linkerd service mesh patterns for lightweight, security-focused service mesh deployments. Use when setting up Linkerd, configuring traffic policies, or implementing zero-trust networking with minimal overhead.
multi-cloud-architecture
Design multi-cloud architectures using a decision framework to select and integrate services across AWS, Azure, and GCP. Use when building multi-cloud systems, avoiding vendor lock-in, or leveraging best-of-breed services from multiple providers.
cli-tool-creator
【CLI工具开发】设计和开发命令行工具,包含参数解析、子命令、交互式提示、输出格式化、自动补全。 触发时机: - 用户要求"开发CLI工具"、"命令行工具" - 需要将脚本封装为可分发的 CLI - 需要添加交互式命令行界面 支持 Python(Click/Typer) 和 Node.js(Commander/Ink)。
deploy-checklist
Generate pre-deployment checklist based on project type. Trigger: user says "部署前检查"、"发版检查"、"deploy checklist"、"预发布检查" before release.
administering-linux
Manage Linux systems covering systemd services, process management, filesystems, networking, performance tuning, and troubleshooting. Use when deploying applications, optimizing server performance, diagnosing production issues, or managing users and security on Linux servers.
building-ci-pipelines
Constructs secure, efficient CI/CD pipelines with supply chain security (SLSA), monorepo optimization, caching strategies, and parallelization patterns for GitHub Actions, GitLab CI, and Argo Workflows. Use when setting up automated testing, building, or deployment workflows.
building-clis
Build professional command-line interfaces in Python, Go, and Rust using modern frameworks like Typer, Cobra, and clap. Use when creating developer tools, automation scripts, or infrastructure management CLIs with robust argument parsing, interactive features, and multi-platform distribution.
configuring-firewalls
Configure host-based firewalls (iptables, nftables, UFW) and cloud security groups (AWS, GCP, Azure) with practical rules for common scenarios like web servers, databases, and bastion hosts. Use when exposing services, hardening servers, or implementing network segmentation with defense-in-depth strategies.
debugging-techniques
Debugging workflows for Python (pdb, debugpy), Go (delve), Rust (lldb), and Node.js, including container debugging (kubectl debug, ephemeral containers) and production-safe debugging techniques with distributed tracing and correlation IDs. Use when setting breakpoints, debugging containers/pods, remote debugging, or production debugging.
deploying-applications
Deployment patterns from Kubernetes to serverless and edge functions. Use when deploying applications, setting up CI/CD, or managing infrastructure. Covers Kubernetes (Helm, ArgoCD), serverless (Vercel, Lambda), edge (Cloudflare Workers, Deno), IaC (Pulumi, OpenTofu, SST), and GitOps patterns.
deploying-on-aws
Selecting and implementing AWS services and architectural patterns. Use when designing AWS cloud architectures, choosing compute/storage/database services, implementing serverless or container patterns, or applying AWS Well-Architected Framework principles.
deploying-on-gcp
Implement applications using Google Cloud Platform (GCP) services. Use when building on GCP infrastructure, selecting compute/storage/database services, designing data analytics pipelines, implementing ML workflows, or architecting cloud-native applications with BigQuery, Cloud Run, GKE, Vertex AI, and other GCP services.
implementing-gitops
Implement GitOps continuous delivery for Kubernetes using ArgoCD or Flux. Use for automated deployments with Git as single source of truth, pull-based delivery, drift detection, multi-cluster management, and progressive rollouts.
implementing-service-mesh
Implement production-ready service mesh deployments with Istio, Linkerd, or Cilium. Configure mTLS, authorization policies, traffic routing, and progressive delivery patterns for secure, observable microservices. Use when setting up service-to-service communication, implementing zero-trust security, or enabling canary deployments.
implementing-tls
Configure TLS certificates and encryption for secure communications. Use when setting up HTTPS, securing service-to-service connections, implementing mutual TLS (mTLS), or debugging certificate issues.
load-balancing-patterns
When distributing traffic across multiple servers or regions, use this skill to select and configure the appropriate load balancing solution (L4/L7, cloud-managed, self-managed, or Kubernetes ingress) with proper health checks and session management.
managing-dns
Manage DNS records, TTL strategies, and DNS-as-code automation for infrastructure. Use when configuring domain resolution, automating DNS from Kubernetes with external-dns, setting up DNS-based load balancing, or troubleshooting propagation issues across cloud providers (Route53, Cloud DNS, Azure DNS, Cloudflare).
managing-secrets
Managing secrets (API keys, database credentials, certificates) with Vault, cloud providers, and Kubernetes. Use when storing sensitive data, rotating credentials, syncing secrets to Kubernetes, implementing dynamic secrets, or scanning code for leaked secrets.
operating-kubernetes
Operating production Kubernetes clusters effectively with resource management, advanced scheduling, networking, storage, security hardening, and autoscaling. Use when deploying workloads to Kubernetes, configuring cluster resources, implementing security policies, or troubleshooting operational issues.
optimizing-costs
Optimize cloud infrastructure costs through FinOps practices, commitment discounts, right-sizing, and automated cost management. Use when reducing cloud spend, implementing budget controls, or establishing cost visibility across AWS, Azure, GCP, and Kubernetes environments.
planning-disaster-recovery
Design and implement disaster recovery strategies with RTO/RPO planning, database backups, Kubernetes DR, cross-region replication, and chaos engineering testing. Use when implementing backup systems, configuring point-in-time recovery, setting up multi-region failover, or validating DR procedures.
resource-tagging
Apply and enforce cloud resource tagging strategies across AWS, Azure, GCP, and Kubernetes for cost allocation, ownership tracking, compliance, and automation. Use when implementing cloud governance, optimizing costs, or automating infrastructure management.
siem-logging
Configure security information and event management (SIEM) systems for threat detection, log aggregation, and compliance. Use when implementing centralized security logging, writing detection rules, or meeting audit requirements across cloud and on-premise infrastructure.
couchbase-kubernetes
Deploy and operate Couchbase on Kubernetes using the Couchbase Autonomous Operator (CAO). Use whenever the user asks about Couchbase Autonomous Operator, CAO, CouchbaseCluster CRD, Couchbase on Kubernetes, Couchbase on EKS, Couchbase on GKE, Couchbase on AKS, Couchbase on OpenShift, Helm chart for Couchbase, CouchbaseBucket CRD, CouchbaseUser CRD, CouchbaseBackup CRD, CouchbaseReplicationRepresentation, server groups in Kubernetes, rack awareness in Kubernetes, persistent volumes for Couchbase, Couchbase pod resources, rolling upgrades via operator, Couchbase operator RBAC, Prometheus with CAO, or 'how do I run Couchbase on Kubernetes.' Distinct from couchbase-capella (managed Capella — no Kubernetes involved) and couchbase-upgrade (server binary upgrades outside Kubernetes). Use proactively when the user has a Kubernetes or OpenShift deployment requirement.
docker-vps-deploy
Use when deploying a Dockerized application to a VPS (Linux server) via SSH without a container registry, generating a GitHub Actions pipeline that uses docker save, gzip compression, and rsync to transfer images. Triggers: "deploy to VPS", "rsync docker image", "docker save and load", "VPS CI/CD", "SSH deploy pipeline", "deploy without registry", "transfer docker image via SSH".
infra-audit
Infrastructure and CI/CD security audit - GitHub Actions workflows (pwn-request, secret logging, missing pinning, permissions overreach), Dockerfile (latest tag, USER root, ADD on URL), Kubernetes manifests (runAsNonRoot, privileged containers, hostNetwork), Terraform (IAM wildcards, state in git, module pinning), GitLab CI equivalent checks. Stack-agnostic.
safety-guard
Enforces Law 3 (One Thing at a Time) of the 7 Laws of AI Agent Discipline by scoping edits to a directory and blocking destructive shell commands. Use this skill to prevent destructive operations when working on production systems or running agents autonomously.
cli-tool-architect
Cross-language CLI standards — subcommand structure, flag/env/config/default precedence, TOML in XDG, stdout-data/stderr-logs split, --output json|yaml, exit codes, NO_COLOR, completions. Go (cobra+pflag+viper) and Python (typer) recipes. Use when designing or reviewing a CLI.
aliyun-cli
阿里云 CLI 操作
audit
安全审计
aws-cli
AWS CLI 操作
backup-strategy
备份策略设计
benchmarking
性能基准测试
cloud-backup
云备份方案
disaster-recovery
灾难恢复
dns
DNS 配置与排查
git-advanced
Git 高级操作
helm
Helm 包管理
load-balancer
负载均衡配置
monitoring
监控与告警
profiling
性能分析
proxy
代理服务器配置
snapshot
快照管理
tar-compression
归档与压缩
traffic-analysis
流量分析与抓包
troubleshooting
性能问题排查
tuning
系统调优
vpn
VPN 配置与管理
argocd-operations
Designs and debugs ArgoCD ApplicationSets, picks generators, templates per-tenant deploys, configures sync waves and hooks, and untangles syncPolicy.automated prune/selfHeal. Use when working with ArgoCD, ApplicationSet, sync wave, GitOps, or per-tenant Application deploys.
aws-codepipeline-codebuild
Authors and debugs AWS CodePipeline + CodeBuild workflows — pipeline v1 vs v2 (triggers, variables), source providers via CodeStar Connections, artifact handoff, buildspec.yml authoring, IAM service roles, ECR pull permissions, VPC build environments, S3/local caching strategies, Lambda invoke action callback pattern, and manual approval setup. Use when working with AWS CodePipeline, AWS CodeBuild, buildspec.yml, CodeStar Connections, pipeline service roles, build VPC config, or "CodeBuild can't pull image" / "Lambda action hangs" debugging.
aws-cost-investigation
Diagnoses AWS cost spikes and audits accounts for ongoing waste. Cost Explorer + Cost & Usage Report query patterns, anomaly detection, the cost-trap inventory (forever log groups, NAT egress, unattached EBS/EIPs, idle ELBs, incomplete S3 multipart uploads, gp2/gp3 migration), commitment decision rules (Compute SP vs EC2 Instance SP vs RI), and the cost-allocation-tag activation trap. Use when working with AWS billing, "bill is up", `aws ce`, Cost Explorer, Cost and Usage Report, Savings Plans, Reserved Instances, NAT vs VPC endpoint trade-offs, or AWS cost optimization.
claude-md-optimizer
Analyzes and optimizes CLAUDE.md files following Anthropic's official best practices. Use when reviewing existing CLAUDE.md for improvements, or when user mentions CLAUDE.md is too long or ineffective.
cloud-storage-identification
Identifies which object-storage provider an S3-compatible target actually hits, from endpoint URLs, env vars, or Terraform provider blocks. Prevents AWS-default assumptions on GCS/DO Spaces/R2/Hetzner/B2/MinIO. Use when working with boto3, `aws_s3_bucket`, rclone, s3cmd, or S3-compatible storage.
cloudflare-access-mcp
Adds OAuth/SSO to a remote MCP server using Cloudflare. Three paths — AI Controls MCP Portal (REST, fastest), self-hosted Access app with Managed OAuth (REST), and the same as Terraform (when IaC already exists) — with a decision matrix, REST recipes per path, Terraform templates for the IaC path, and a stdlib validator that lints a `terraform show -json` plan. Use when the user asks to put an MCP server behind Cloudflare, add OAuth/SSO to a remote MCP server, expose a private MCP server via Cloudflare Tunnel, register MCP servers with the AI Controls portal, enable Managed OAuth or DCR on an Access app, or wire Claude Desktop / claude.ai web / Claude Code to an internal MCP server.
cloudflare-cf-cli
Operates Cloudflare's new unified `cf` CLI (technical preview, April 2026) — install path, flag conventions, the local-vs-remote default trap, coexistence with Wrangler and `wrangler.jsonc`, and agent-mode usage via the Local Explorer OpenAPI. Use when the user mentions `cf`, `npx cf`, "the new Cloudflare CLI", or is choosing between `cf` / `wrangler` / REST / Terraform.
cloudflare-dns-zones
Operates Cloudflare DNS zones and records via the REST API (curl + jq) — token scoping, zone discovery, record CRUD, batch operations, BIND import/export, proxied vs DNS-only decisions, CNAME flattening at apex, DNSSEC, and DNS-01 ACME challenge wiring with cert-manager. Use when working with Cloudflare DNS, `api.cloudflare.com`, `CF_API_TOKEN`, zone records, DNS-01 challenges, mail records (MX/SPF/DKIM/DMARC), or "orange cloud / grey cloud" proxy decisions.
cloudflare-workers
Authors and reviews Cloudflare Workers projects — wrangler config (toml/jsonc), bindings (KV, R2, D1, Queues, Durable Objects, service bindings, Vectorize, Workers AI), env-scoped vs root config and the non-inheritable bindings trap, Durable Object migrations (renames, SQLite backend), compatibility_date semantics, static assets and Pages migration, secrets vs vars, cron triggers, observability, and deploy/CI patterns with `cloudflare/wrangler-action`. Use when working with Cloudflare Workers, wrangler.toml/wrangler.jsonc, Workers bindings, Durable Objects, Workers KV/R2/D1/Queues, Workers Static Assets, migrating from Pages to Workers, service bindings or WorkerEntrypoint RPC, or deploying Workers from CI.
digitalocean-app-platform
Lints DigitalOcean App Platform app specs (app.yaml / doctl apps spec JSON / digitalocean_app Terraform) for security, reliability, correctness, and sizing anti-patterns — plaintext secrets, missing health checks, single-instance services, dev databases in production, port mismatches, overlapping ingress routes, conflicting git/image sources, deprecated routes, unknown instance sizes, and app/database region mismatch. Use when working with DigitalOcean App Platform, app.yaml, .do/app.yaml, doctl apps, the digitalocean_app Terraform resource, or reviewing an App Platform deployment for problems.
digitalocean-dns-zones
Operates DigitalOcean DNS zones and records via doctl, the DigitalOcean API v2, and the digitalocean Terraform provider — domain/record CRUD, the apex CNAME / no-flattening trap when migrating from Cloudflare, account-wide token handling, FQDN trailing-dot semantics, DNS-01 ACME wildcard certs, and nameserver delegation. Use when working with DigitalOcean DNS, doctl compute domain, DIGITALOCEAN_ACCESS_TOKEN, api.digitalocean.com domains, digitalocean_record/digitalocean_domain Terraform, apex CNAME questions, wildcard cert DNS-01, or moving a zone between Cloudflare and DigitalOcean.
docker-workflows
Reviews and hardens Dockerfiles and docker-compose files — multi-stage build conversion, base-image choice, layer caching, secret leakage, root-user containers, missing healthchecks. Use when reviewing a Dockerfile, optimizing image size or build time, writing a compose file, or auditing container security.
drawio-diagramming
Create and open draw.io diagrams. Use when the user wants to generate, edit, or open a diagram in draw.io (architecture/HLA diagrams, infra & Kubernetes topology, flowcharts, network diagrams) — covers the draw.io MCP servers (open_drawio_xml/mermaid/csv) and native .drawio file generation.
gcp-iam
Debugs GCP permission-denied errors, designs IAM bindings, traces org → folder → project inheritance, and untangles service-account impersonation chains. Covers Workload Identity. Use when working with GCP IAM, gcloud, "permission denied" on GCP resources, Workload Identity, or SA impersonation.
github-actions-pipelines
Debugs and authors GitHub Actions workflows — OIDC federation to AWS/GCP/Azure, GITHUB_TOKEN permissions hardening, reusable workflows vs composite actions, deploy concurrency, caching, the path-filter/required-check trap, and pull_request_target security. Use when working with GitHub Actions, `.github/workflows/`, OIDC to cloud providers, `pull_request_target`, branch protection required checks, reusable workflows, or CI/CD pipelines that deploy to AWS/GCP/DigitalOcean.
kubernetes-operations
Debugs Kubernetes pods and controllers — FailedCreate, ImagePullBackOff, init-container failures, probe flapping, missing service endpoints, GKE NEG readiness. Use when a pod is not Running, a Deployment/StatefulSet shows FailedCreate, image pulls fail, or services lack endpoints.
kubernetes-operators
Designs and audits Kubernetes Operators — CRD shape, reconcile-loop correctness, finalizer and status-subresource handling, OperatorHub capability levels, framework choice. Use when building a controller for a CRD, reviewing an operator for capability gaps, or designing the API surface of a Custom Resource. Not for general pod debugging — see kubernetes-operations.
mindfulness-mentor
Guide users through mindfulness exercises, meditation practices, and stress reduction techniques. Use when users ask for help with relaxation, stress management, breathing exercises, or cultivating inner peace.
setup-project-skills
Installs skills from a user-curated manifest (`~/.claude/skill-manifest.json`) into the current project's `.claude/skills/` — symlinks local skills, runs `npx skills add` for third-party ones, and advises `/plugin install` for native Claude plugins. Optionally scans the project for trigger files (Dockerfile, wrangler.jsonc, *.tf, etc.) and pre-selects recommended matches. Use when the user wants to set up skills in a new project, add a skill they curated, see what skills fit the current project, or bootstrap a freshly cloned repo with their toolbox.
terraform-workflows
Reviews Terraform/OpenTofu plans, detects drift, performs state surgery (mv/rm/import), upgrades providers, and traces Terragrunt cache errors. Multi-cloud. Use when working with Terraform, OpenTofu, Terragrunt, terraform plan, drift, or provider upgrades.
terragrunt-workflows
Terragrunt-specific orchestration patterns — CLI redesign migration (run/run --all, --terragrunt-* flag removal, TG_* env vars, strict controls), config composition (include, locals, inputs deep-merge, generate blocks), dependency wiring (mock_outputs semantics), run --all safety, hooks, and the new terragrunt.stack.hcl. Use when working with Terragrunt, `terragrunt.hcl`, `terragrunt.stack.hcl`, the deprecated `run-all`, `--terragrunt-*` flags, `TERRAGRUNT_*` env vars, `include` blocks, `dependency` blocks, or `terragrunt run --all`.
gitops-patterns
Provides GitOps best practices for ArgoCD, Flux, Argo Rollouts, and progressive delivery strategies. Use when setting up GitOps workflows, configuring continuous delivery, managing Kubernetes deployments declaratively, or when user mentions 'gitops', 'argocd', 'flux', 'progressive delivery', 'reconciliation', 'argo rollouts', 'canary', 'sealed secrets'.
kubernetes-patterns
Provides Kubernetes resource management, Helm chart patterns, service mesh configuration, and autoscaling strategies. Covers HPA, VPA, KEDA, operators, security contexts, and namespace isolation. Use when user mentions 'kubernetes', 'k8s', 'helm', 'istio', 'linkerd', 'service mesh', 'HPA', 'VPA', 'KEDA', 'pod security', 'resource quotas', 'operators'.
platform-engineering
Provides platform engineering best practices for Internal Developer Platforms (IDPs), golden paths, service catalogs, and developer experience. Use when building developer platforms, configuring Backstage, designing self-service workflows, or when user mentions 'platform engineering', 'backstage', 'golden path', 'IDP', 'developer portal', 'service catalog', 'DevEx', 'platform team', 'self-service'.
k8s-sidecar-testing
End-to-end testing workflow for nat464-sidecar in IPv6-only Kubernetes clusters. Use when setting up test environments, deploying the sidecar to k3s, verifying IPv6-to-IPv4 translation (inbound and outbound), benchmarking performance, running path validation, or troubleshooting pod networking issues. Triggers: 'test the sidecar', 'set up test cluster', 'verify IPv6 translation', 'deploy to k3s', 'benchmark sidecar', 'validate paths', 'test inbound/outbound', 'IPv6-only cluster setup'.
cli-recorder
Use when authoring a CLI session recording for the cli-recorder skill. Translates a user's intent ("video of installing kubectl") into a valid recipe.toml through interactive Q&A.
lettactl
Manage Letta AI agent fleets with kubectl-style CLI
context-mode
Use context-mode tools (ctx_execute, ctx_execute_file) instead of Bash/cat when processing large outputs. Triggers: "analyze logs", "summarize output", "process data", "parse JSON", "filter results", "extract errors", "check build output", "analyze dependencies", "process API response", "large file analysis", "page snapshot", "browser snapshot", "DOM structure", "inspect page", "accessibility tree", "Playwright snapshot", "run tests", "test output", "coverage report", "git log", "recent commits", "diff between branches", "list containers", "pod status", "disk usage", "fetch docs", "API reference", "index documentation", "call API", "check response", "query results", "find TODOs", "count lines", "codebase statistics", "security audit", "outdated packages", "dependency tree", "cloud resources", "CI/CD output". Also triggers on ANY MCP tool output that may exceed 20 lines. Subagent routing is handled automatically via PreToolUse hook.
context-mode
Use context-mode tools (ctx_execute, ctx_execute_file) instead of Bash/cat when processing large outputs. Triggers: "analyze logs", "summarize output", "process data", "parse JSON", "filter results", "extract errors", "check build output", "analyze dependencies", "process API response", "large file analysis", "page snapshot", "browser snapshot", "DOM structure", "inspect page", "accessibility tree", "Playwright snapshot", "run tests", "test output", "coverage report", "git log", "recent commits", "diff between branches", "list containers", "pod status", "disk usage", "fetch docs", "API reference", "index documentation", "call API", "check response", "query results", "find TODOs", "count lines", "codebase statistics", "security audit", "outdated packages", "dependency tree", "cloud resources", "CI/CD output". Also triggers on ANY MCP tool output that may exceed 20 lines. Subagent routing is handled automatically via PreToolUse hook.
context-mode
Use context-mode tools (ctx_execute, ctx_execute_file) instead of Bash/cat when processing large outputs. Triggers: "analyze logs", "summarize output", "process data", "parse JSON", "filter results", "extract errors", "check build output", "analyze dependencies", "process API response", "large file analysis", "page snapshot", "browser snapshot", "DOM structure", "inspect page", "accessibility tree", "Playwright snapshot", "run tests", "test output", "coverage report", "git log", "recent commits", "diff between branches", "list containers", "pod status", "disk usage", "fetch docs", "API reference", "index documentation", "call API", "check response", "query results", "find TODOs", "count lines", "codebase statistics", "security audit", "outdated packages", "dependency tree", "cloud resources", "CI/CD output". Also triggers on ANY MCP tool output that may exceed 20 lines. Subagent routing is handled automatically via PreToolUse hook.
obs-bootstrap
Step-by-step OpenTelemetry and uFawkesObs setup: SDK init patterns for TypeScript, Python, Go; DORA metric spans; Grafana dashboard spec. Use when adding observability to a service.
colima
Use when Docker commands fail with "Cannot connect to Docker daemon", when starting/stopping container environments on macOS, when managing Docker contexts or profiles, or when running incus (system containers / VMs with nested virtualization) on macOS - provides Colima lifecycle management, profile handling, SSH commands, and troubleshooting
scope
Guide for working with Scope, a developer environment management tool that automates environment checks, detects known errors, and provides automated fixes. Use when creating Scope configurations (ScopeKnownError, ScopeDoctorGroup, ScopeReportLocation), debugging environment issues, or writing rules for error detection and remediation.
working-with-mise
Use when adding, configuring, or troubleshooting mise-managed tools - ensures proper CLI usage, detects existing config files, and diagnoses PATH/activation issues when commands aren't found
aws
AWS infrastructure management — EKS, ECR, VPC, RDS, ElastiCache, S3, Route53, ACM, Secrets Manager, CloudWatch, IAM
clone
Clone a GitHub repo as a starting skeleton — strips its git history, re-inits, generates CLAUDE.md for the detected stack, optionally renames variables/namespaces to your project
kubernetes
Kubernetes management — Helm charts, ArgoCD GitOps, Ingress, ConfigMaps, HPA autoscaling, blue/green deployments, debugging
terraform
Interactive Terraform/Terragrunt wizard — preset full-stack skeletons (AWS EKS, DigitalOcean Kubernetes) or custom AWS component picker, generates production-ready .tf files
choose-boring-technology
Apply the "Choose Boring Technology" principle when evaluating tech stack decisions, adding new tools or frameworks, proposing a rewrite, or debating between a familiar vs. cutting-edge approach. Use this skill whenever someone is considering adopting a new library, language, database, or service—even if they frame it as "should we use X?", "what tech should we pick?", or "is it worth trying Y?". This skill helps teams resist shiny-object syndrome and make grounded technology choices.
k8s-cluster
【K8s 集群】Kubernetes 集群管理配置生成。触发时机:用户说"搭建 K8s 集群"、"写 Helm chart"、"配置 RBAC"、"HPA 扩缩"时。
k8s-gen
【K8s 部署】自然语言描述生成 K8s YAML manifests。触发时机:用户说"生成 K8s 部署配置"、"写 Deployment YAML"、"生成 Service/Ingress"时。
silverblast-radius
This skill should be used to assess the blast radius of a proposed infrastructure or DevOps change before planning. Maps change scope, downstream dependencies, failure scenarios, rollback plan, and change window risk. Required before /devops-quality-gates in the devops-cycle workflow.
silverdevops
This skill should be used for SB-orchestrated infrastructure/CI-CD workflow: intel → silver:blast-radius → devops-skill-router → devops-quality-gates (7 IaC dims) → GSD plan/execute/verify → review → secure → ship
label-system
A minimal, opinionated GitHub label taxonomy for OSS / internal projects covering priority, area, issue status, PR review state, and independent reproduction. Use when setting up labels for a new repo, when triaging a backlog, when asked "how should we label issues", when reviewing whether existing labels are coherent, or when applying labels to a batch of open issues. Five orthogonal axes, ~16 labels total, every label answers a specific filter query — designed against the open-source convention of `S-waiting-on-*` (Rust) and two-stage approval (Kubernetes), but kept small enough for a solo / small-team repo to actually maintain. Includes a bootstrap script (`scripts/bootstrap-labels.sh`) that creates the full label set in a target GitHub repo with one `gh` call per label.
cnpg
Create and operate CloudNativePG (CNPG) Postgres databases on Kubernetes the GitOps/Flux way — on managed cloud (GKE + GCS via Workload Identity) OR self-hosted (K3s/bare-metal + any S3-compatible store via a credentials secret). Covers Cluster + ScheduledBackup manifests, barman WAL archiving, pgvector, PITR, prod→dev clones, and the NetworkPolicies a default-deny cluster needs. Use when provisioning a new app database, cloning prod into dev, enabling pgvector, wiring backups/PITR, writing CNPG NetworkPolicies, or debugging the silent "WAL archiving failed → PVC fills → Postgres CrashLoop → app can't read data" chain on CloudNativePG.
excalidraw
MANDATORY prerequisite for ALL Excalidraw MCP tool usage. Read BEFORE calling any Excalidraw tool (batch_create_elements, create_element, update_element, etc.). Without the sizing formulas, two-batch ordering (shapes-then-arrows), compact legends, domain styling presets, and write-check-review cycle in this skill, diagrams have invisible arrows, truncated text, and inconsistent colors. Use whenever the user asks to draw, sketch, visualize, or diagram anything technical — system architecture, microservices topology, C4 diagrams, data pipelines / ETL flows / lakehouse, sequence diagrams, ER diagrams, deployment / Kubernetes diagrams, network topology, flowcharts, decision trees. Includes ready-to-apply color palettes for software engineering, system architecture, and data solutions.
resume-target
Use when managing target job positions for resume customization. Handles JD parsing, match score calculation, gap analysis, and multi-target comparison for directed resume generation.
kubernetes
Kubernetes manifest generation, review, security hardening, and best practices for production workloads
digitalocean-registry-cleanup
Analyze and clean DigitalOcean Container Registry images. Lists repos with tag counts, deletes old tags (keep last N), finds stale repos, triggers garbage collection. Supports dry-run mode. Use when user says "clean registry", "delete old images", "DO registry", "registry cleanup", "docker images cleanup", "container registry", or "clean up old tags".
pattern-engineer-container
Containerized setups: every Dockerfile is multi-stage (`base`/`build`/`final`), pinned (no `:latest`) and vetted via `docker scout`, non-root with writable paths redirected, no in-image virtualenvs, `.dockerignore` required. Backends `alembic upgrade head` in entrypoint before exec'ing the server; expose fast `/healthz`. Frontend nginx puts API `location` blocks ABOVE the SPA `try_files` fallback. Secrets are runtime env vars. Activate on Dockerfile, compose, `.dockerignore`.
ship-it
Set up or fix a deploy pipeline. Picks a platform that fits the app, writes the config (Dockerfile, vercel.json, railway.toml, fly.toml, GitHub Actions), and ships a first deploy. Knows Vercel, Railway, Fly.io, Render, AWS basics (ECS, Lambda, Amplify), Docker, Kubernetes essentials, and GitHub Actions. Use when the user says "deploy this", "ship it", "set up vercel", "dockerize this", "write the GitHub Actions for deploy", or has working local code that needs to be live.
project-readme
Create, rewrite, update, or validate truthful README.md files for any project archetype. Use for libraries, SDKs, CLIs, web apps, API services, MCP servers, agent skills, monorepos, docs sites, GitHub Actions, extensions, container images, Terraform modules, Helm charts, model cards, dataset cards, research code, templates, demos, specs, desktop/mobile apps, badges, quick starts, setup docs, API or command references, README validation, and README quality checks.
ship-it
Set up or fix a deploy pipeline. Picks a platform that fits the app, writes the config (Dockerfile, vercel.json, railway.toml, fly.toml, GitHub Actions), and ships a first deploy. Knows Vercel, Railway, Fly.io, Render, AWS basics (ECS, Lambda, Amplify), Docker, Kubernetes essentials, and GitHub Actions. Use when the user says "deploy this", "ship it", "set up vercel", "dockerize this", "write the GitHub Actions for deploy", or has working local code that needs to be live.
bitbucket-workflow
Bitbucket best practices for pull requests, Pipelines CI/CD, Jira integration, and Atlassian ecosystem workflows
go-code-review
Quick-reference checklist for Go code review based on the Go Wiki CodeReviewComments. Maps to detailed skills for comprehensive guidance. Use when reviewing Go code or checking code against community style standards.
go-concurrency
Go concurrency patterns including goroutine lifecycle management, channel usage, mutex handling, and sync primitives. Use when writing concurrent Go code, spawning goroutines, working with channels, or documenting thread-safety guarantees. Based on Google and Uber Go Style Guides.
go-context
Go context.Context usage patterns including parameter placement, avoiding struct embedding, and proper propagation. Use when working with context.Context in Go code for cancellation, deadlines, and request-scoped values.
go-control-flow
Go control flow idioms from Effective Go. Covers if with initialization, omitting else for early returns, for loop forms, range, switch without fallthrough, type switch, and blank identifier patterns. Use when writing conditionals, loops, or switch statements in Go.
go-data-structures
Go data structures including allocation with new vs make, arrays, slices, maps, printing with fmt, and constants with iota. Use when working with Go's built-in data structures, memory allocation, or formatted output.
go-defensive
Defensive programming patterns in Go including interface verification, slice/map copying at boundaries, time handling, avoiding globals, and defer for cleanup. Use when writing robust, production-quality Go code.
go-documentation
Guidelines for Go documentation including doc comments, package docs, godoc formatting, runnable examples, and signal boosting. Use when writing or reviewing documentation for Go packages, types, functions, or methods.
go-error-handling
Comprehensive Go error handling patterns from Google and Uber style guides. Covers returning errors, wrapping with %w, sentinel errors, choosing error types, handling errors once, error flow structure, and logging. Use when writing Go code that creates, returns, wraps, or handles errors.
go-functional-options
The functional options pattern for Go constructors and public APIs. Use when designing APIs with optional configuration, especially with 3+ parameters.
go-interfaces
Go interfaces, type assertions, type switches, and embedding from Effective Go. Covers implicit interface satisfaction, comma-ok idiom, generality through interface returns, interface and struct embedding for composition. Use when defining or implementing interfaces, using type assertions/switches, or composing types through embedding.
go-linting
Recommended Go linters and golangci-lint configuration. Use when setting up linting for a Go project or configuring CI/CD.
go-naming
Go naming conventions for packages, functions, methods, variables, constants, and receivers from Google and Uber style guides. Use when naming any identifier in Go code—choosing names for types, functions, methods, variables, constants, or packages—to ensure clarity, consistency, and idiomatic style.
go-packages
Go package organization, imports, and dependency management from Google and Uber style guides. Use when creating packages, organizing imports, managing dependencies, using init(), or deciding how to structure Go code into packages.
go-performance
Go performance patterns including efficient string handling, type conversions, and container capacity hints. Use when optimizing Go code or writing performance-critical sections.
go-style-core
Core Go style principles and formatting guidelines from Google and Uber style guides. Use when writing any Go code to ensure clarity, simplicity, and consistency. This is the foundational skill - other Go style skills build on these principles.
go-testing
Go testing patterns from Google and Uber style guides including test naming, table-driven tests, subtests, parallel tests, test helpers, test doubles, and assertions. Use when writing or reviewing Go test code, creating test helpers, or setting up table-driven tests.
tui-design
Design and build clean, professional, minimal terminal UI (TUI) applications and command-line tools. Use this skill whenever the user is building, designing, refactoring, reviewing, or asking about terminal interfaces — full-screen TUIs (file managers, dashboards, monitors, git/k8s tools, REPLs), interactive CLI prompts, or simple command-line utilities. Use it for library questions ("Bubble Tea vs Ratatui vs Textual vs Ink"), design questions ("how should I lay out this dashboard"), and concrete build requests ("build me a TUI for X"), even when the user doesn't say "TUI" explicitly — phrases like "terminal app", "ncurses-style", "interactive shell tool", "CLI dashboard", "fzf-like picker", or naming a known TUI app (lazygit, k9s, btop, helix, yazi) all qualify.
rcode-ci
CI/CD setup and quality gates for the rcode-default stack — GitHub Actions for Node test matrix,.
rcode-perf
Performance optimisation for the rcode-default stack — Next.js (LCP / TBT / CLS / hydration),.
docker-patterns
Docker patterns covering Dockerfile best practices, multi-stage builds, Compose service configuration, networking, volumes, and security. Use whenever the project contains a Dockerfile, docker-compose.yml, .dockerignore, or compose.yaml, OR the user asks about Docker, containers, docker-compose, multi-stage builds, base images, volumes, healthcheck, depends_on, even if Docker is not mentioned by name.
vps-provisioning
VPS provisioning patterns for Linux servers covering initial setup, firewall, nginx reverse proxy, SSL/TLS with Let's Encrypt, systemd service management, and server hardening. Use whenever the project contains Ansible playbooks, shell provisioning scripts, nginx configs, systemd unit files, or certbot references, OR the user asks about VPS setup, server hardening, ufw, fail2ban, nginx reverse proxy, certbot, Let's Encrypt, systemd services, unattended-upgrades, even if VPS is not mentioned by name.
proxmox-lxc
Deploy and configure Proxmox LXC containers for self-hosted services. Always trigger immediately when Mick asks to deploy, set up, or configure a new service on Proxmox, mentions spinning up a container, or needs a systemd service, Cloudflare Tunnel entry, or UniFi static IP assignment. Generate the full stack including the pct create command, container config, apt setup, systemd unit file, and Cloudflare Tunnel config entry from a service name and IP.
kubeview-debug
Debug and diagnose Kubernetes clusters using KubeView MCP server tools. Use when investigating cluster issues (pod crashes, deployment failures, service connectivity problems, node issues, resource constraints), performing cluster health checks, or troubleshooting any Kubernetes workload. Trigger phrases include "cluster health", "pod won't start", "CrashLoopBackOff", "service unreachable", "deployment stuck", "node pressure", "OOMKilled", "ImagePullBackOff".
deploy-ninja
Handles zero-downtime deployments: blue-green, canary releases, rolling updates, and feature flag rollouts. Covers Kubernetes, Docker, Cloudflare Workers, Terraform, and CI/CD pipeline setup. Use this skill when the user wants to deploy an application, set up a deployment pipeline, implement canary releases, configure rolling updates, manage feature flags, or handle any release automation. Also triggers on "deploy to production," "set up CI/CD," "blue-green deployment," "canary release," "rolling update," "zero-downtime deploy," "rollback," or even casual requests like "push this to prod" or "how do I safely release this."
sast-analysis
Perform codebase analysis and architecture mapping as the first phase of a security assessment. Explores the tech stack, frameworks, entry points, data flows, and trust boundaries. Outputs sast/architecture.md. Run this before any vulnerability detection skill. Use when asked to analyze a codebase for security or when sast/architecture.md does not yet exist.
devops-best-practices
Opinionated production-grade DevOps defaults for Terraform, Kubernetes, CI/CD, Docker, cloud security, observability, cost, and disaster recovery. ALWAYS use when generating, reviewing, or modifying any infrastructure code, Kubernetes manifests (Deployment, Service, StatefulSet, Helm, Kustomize), Terraform (.tf, modules, state), Dockerfiles, docker-compose, CI/CD pipelines (.github/workflows, .gitlab-ci.yml, Jenkinsfile), cloud resources (AWS/GCP/Azure), IAM policies, security groups, observability setup (Prometheus, Grafana, OpenTelemetry), or DNS/TLS/CDN config — even if the user does not explicitly ask for best practices. Prevents the failure modes that hurt production teams most often: missing PDBs, single replicas in prod, latest image tags, public S3 buckets, long-lived credentials, missing observability, and CI/CD supply-chain risks. Apply opinionated defaults by default; surface tradeoffs when the user has reason to deviate.
proofread-subtitles
Use this skill to proofread a `.srt` subtitle file produced by Whisper or another voice-to-text engine.
code-mode
Add a "code mode" tool to an existing MCP server so LLMs can write small processing scripts that run against large API responses in a sandboxed runtime — only the script's compact output enters the LLM context window. Use this skill whenever someone wants to add code mode, context reduction, script execution, sandbox execution, or LLM-generated-code processing to an MCP server. Also trigger when users mention reducing token usage, shrinking API responses, running user-provided code safely, or adding a code execution tool to their MCP server — in any language (TypeScript, Python, Go, Rust, etc.).
vekil-reverse-proxy-deploy
Deploy or update Vekil from github.com/sozercan/vekil as a Kubernetes or local reverse proxy for Anthropic, Gemini, OpenAI Chat Completions, and OpenAI Responses-compatible clients. Use when the user asks to install, redeploy, configure provider routing for, expose, port-forward, validate, or troubleshoot Vekil reverse proxy deployments.
docker-expert
You are an advanced Docker containerization expert with comprehensive, practical knowledge of container optimization, security hardening, multi-stage builds, orchestration patterns, and production deployment strategies based on current industry best practices.
kube-audit-kit
Performs read-only Kubernetes security audits by exporting resources, sanitizing metadata, grouping applications by topology, and generating PSS/NSA-compliant audit reports. Use when the user requests auditing Kubernetes clusters, Namespaces, security reviews, or configuration analysis.
thanos-query
Claude Code skill for querying CNCF Thanos HTTP API and cross-region Prometheus metrics via PromQL. Use when running PromQL/MetricsQL queries against a Thanos Querier or Query Frontend, discovering metrics/labels across regions, inspecting Thanos stores (sidecars/store-gateway), checking alerts/rules, troubleshooting empty query results, or analyzing observability data across multiple Prometheus instances. Triggers on: thanos query, promql, cross-region metrics, p50/p95/p99 latency, sidecar status, store-gateway, query frontend, label discovery across regions, empty results, missing data, data delay, no data found, query returns empty, Kubernetes observability, Prometheus skill, Thanos skill.
fr-init
Initialize a repo for isolated runs: scan it, interview the operator about working patterns, tools, and credentials, then scaffold one or more devcontainer profiles via `fr init scaffold`. Use when a repo has no devcontainer profile, when fr-isolation or fr-brainstorming hard-stops asking for one, when the operator says "init this repo", "set up the devcontainer", or wants separate read-only/admin environments.
health-check
Check the health of the running WealthWise API, web app, and MongoDB services. Triggers when asked to "check if the app is running", "verify the API is up", "is the server healthy", or "show service status".
skill-atlas
Find the right public AI-agent skill for a job — and know whether to trust it. Load when about to start a task type (Upwork freelancing, technical interviews, office documents, MCP/tool building, prompt engineering, web/frontend, data analysis, learning English) and you want to know which existing public skills to pull in, rated by source reputation and freshness. Answers "which skill do I load for X, and can I trust it?"
opentelemetry-skill
Use when working with OpenTelemetry - configuring collectors, designing pipelines, instrumenting applications, implementing sampling strategies, managing cardinality, securing telemetry data, troubleshooting observability issues, writing OTTL transformations, making production observability architecture decisions, or setting up observability for AI coding agents (Claude Code, Codex, Gemini CLI, GitHub Copilot, and others)
automode-config
Author, validate, and migrate Claude Code autoMode blocks at the project level. Models the four official autoMode sections (environment, allow, soft_deny, hard_deny — all arrays of prose rules, with `$defaults` per section). Primary target is .claude/settings.local.json (per-user-per-project, gitignored, classifier-read). Reads ~/.claude/settings.json (user baseline, read-only) and .claude/settings.json (shared, classifier-ignores autoMode) for adoption candidates. Phase 1b is agent-driven: the calling agent reads CLAUDE.md / AGENTS.md / .claude/CLAUDE.md and emits a proposal JSON that flows through the same critique + hash-gate + atomic-write pipeline. Runs `claude auto-mode critique` as the canonical gate. Atomic write under per-file flock with sha256 hash gate. Requires Claude Code 2.1.83+ (auto mode itself; see references/automode_doc_bible.md).
dockerfile-best-practices
Create and optimize Dockerfiles with BuildKit, multi-stage builds, advanced caching, and security. Use this skill whenever you need to create, modify, or optimize a Dockerfile or a Docker Compose file. Also trigger when the user discusses container images, build performance, or Docker security — even if they don't explicitly mention 'Dockerfile'.
helm-bjw-s-chart
Generate production-ready Helm charts using the bjw-s-labs common library (app-template v5, with v4 legacy support). Use when creating a new Helm chart, converting Docker Compose to Helm, configuring controllers with sidecars or init containers, setting up services/ingress/persistence, HorizontalPodAutoscalers, ServiceMonitors/PodMonitors, NetworkPolicies, or handling StatefulSets and multi-controller deployments.
drawio-skill
Use when the user requests diagrams, flowcharts, architecture diagrams, ER diagrams, UML / sequence / class diagrams, network topology, ML/DL model figures (Transformer/CNN/LSTM), mind maps, or any visualization. Also use proactively when explaining systems with 3+ components, complex data flows, or relationships that benefit from visual representation. Best suited when the diagram needs custom styling, rich shape vocabulary, swimlanes, or exportable images (PNG/SVG/PDF/JPG). Generates .drawio XML and exports locally via the native draw.io desktop CLI.
kubernetes-architect
Expert Kubernetes architect specializing in cloud-native infrastructure, advanced GitOps workflows (ArgoCD/Flux), and enterprise container orchestration.
kube-audit-kit
Performs read-only Kubernetes security audits by exporting resources, sanitizing metadata, grouping applications by topology, and generating PSS/NSA-compliant audit reports. Use when the user requests auditing Kubernetes clusters, Namespaces, security reviews, or configuration analysis.
k8s-deploy
Deploy and manage applications on Kubernetes clusters with kubectl and Helm
zero-trust-patterns
Zero-Trust security patterns — mTLS between microservices (Istio/SPIFFE), SPIRE workload identity, OPA/Envoy authorization, NetworkPolicy default-deny-all, short-lived credentials, service mesh security, and Kubernetes RBAC hardening.
orka-kind-deploy
Rebuild and redeploy the Orka controller and all worker images into a local kind cluster from an Orka repository checkout. Use when the user asks to rebuild Orka, redeploy all components, refresh the local cluster after code changes, or roll out the full local stack for verification.
init
Scaffold a project-aware .claude/settings.json deny list with cfgaudit
readme-doc-writer
当需要为代码仓库新建或更新 README.md 时使用;先勘探代码库与部署目标,再按固定骨架产出一份覆盖本地开发/系统原理/生产部署的可复制粘贴 README;不适用于 API 参考、教程长文或设计文档等非 README 产物;触发词:写 readme、生成项目文档、document this project
careful
Enter careful-mode: stricter pre-edit gates and auto-checkpointing for high-stakes work. Use before changes to load-bearing or security-critical code.
helm-version-upgrade
Manages Helm chart version upgrades across Terraform+Helm platforms. Handles atomic 3-file updates with version discovery from ArtifactHub. Use when upgrading Helm charts, checking for outdated versions, or performing version consistency checks.
ingress-controller-install
GitOps-flavored Traefik Ingress Controller bootstrap, env addition, or chart upgrade in a Kustomize + ArgoCD repo. Operates exclusively on files under `common.traefik/` (base, overlays, argocd manifests). Never runs `helm install` or `helm upgrade` — those are ArgoCD's job. Plan-only: edits Kustomize files, emits the `git add` / commit / push commands, and the operator drives git. Validates coexistence with `ingress-nginx` via Kustomize-build inspection (no live cluster required). Use for new-cluster bootstrap, adding a new env overlay, or bumping the Traefik chart version.
kustomize-resource-validation
Auto-trigger skill that activates when any kustomization.yaml file is edited. Validates resource references, patch references, orphaned files, cross-environment consistency, build success, and generator configurations.
traefik-controller-decommission
GitOps-flavored SAFE uninstall of the `ingress-nginx` controller in a Kustomize + ArgoCD repo. Verifies cluster + repo are free of `ingressClassName: nginx` (precedence-aware: spec wins, legacy annotation falls back). After DNS bake confirmation, plans the decommission as: archive the `common.ingress-nginx/` (or equivalent) Kustomize module, disable the ArgoCD Application, wait for ArgoCD prune, then optional LB / IAM cleanup. Never runs `helm uninstall` — ArgoCD handles the actual resource removal via prune. Plan-only: emits a `commands.sh` for the operator to drive manually.
yaml-fix-suggestions
Auto-trigger skill that activates when YAML files in Kustomize module directories are modified. Checks formatting, Kubernetes label compliance, kustomization.yaml references, and build validation. Reports only when issues are found.
zeus
GitOps Engineer for Kustomize + ArgoCD platforms. Activates when the user works with Kustomize overlays, ArgoCD applications, Kubernetes manifests, or asks for YAML validation, environment management, or service scaffolding. Commanding, methodical, thorough approach.
quick-add-permission
Quickly add always-allow permissions to all AI tool permission lists
az-cli
Operate Azure resources from the command line using the az CLI
destroy-stack
Destroy an OCI Resource Manager stack's infrastructure.
careful
Safety guardrails for destructive commands. Warns before rm -rf, DROP TABLE, force-push, git reset --hard, kubectl delete, and similar destructive operations. User can override each warning. Use when touching prod, debugging live systems, or working in a shared environment. Use when asked to "be careful", "safety mode", "prod mode", or "careful mode". (gstack)
engineering-advanced
Advanced engineering patterns for AI-native products. Use when the user mentions agent design, RAG architecture, AI pipelines, MCP servers, API design best practices, CI/CD pipeline architecture, system design interviews, observability, infrastructure as code, or advanced engineering topics. Also triggers on: agent, RAG, retrieval augmented generation, MCP, API design, REST, GraphQL, CI/CD, GitHub Actions, Docker, Kubernetes, microservices architecture, event-driven, message queues, caching strategies, database design, system design.
platform-discovery
Pflicht-Discovery-Skill — fragt Projekt-Natur (concept/technical/hybrid) und bei technical/hybrid die Zielplattform (azure/aws/gcp/on-prem/hybrid-cloud/multi-cloud/claude-code-only). Trigger beim ersten Aufruf des PMO-Agent oder bei /run-harness wenn `plan/project.yaml` noch keine `target_platform` enthält.
kubernetes-operations
Automatically converted skill for kubernetes-operations
aegisops-ai
Autonomous DevSecOps & FinOps Guardrails. Orchestrates Gemini 3 Flash to audit Linux Kernel patches, Terraform cost drifts, and K8s compliance.
nowledge-mem-docker
Install, check on, or upgrade a self-hosted Nowledge Mem server (the headless Docker deployment) using the `nmemctl` lifecycle controller. Use this whenever the user mentions running their own Nowledge Mem instance, self-hosting Mem on a NAS, VPS, homelab, or server, deploying `nowledgelabs/mem` from Docker Hub, troubleshooting their Mem container, or upgrading a Mem server to a newer version. Trigger even when the user says "my Mem server", "self-hosted Mem", "the docker version of Mem", "memory server on my Synology / Proxmox / Raspberry Pi", or just describes a container that's at `docker.io/nowledgelabs/mem` without naming the product. Do NOT trigger for the Mem desktop app, Mem Cloud, or anything that doesn't touch the operator's own server.
horus
IaC Operations Engineer for Terraform + Helm + GKE platforms. Activates when the user works with Terraform modules, Helm charts, GKE infrastructure, or asks for validation, security scanning, or CI/CD improvements. Pipeline-driven, safety-first approach with automated checks.
github-actions-templates
Production-ready GitHub Actions workflow patterns for testing, building, and deploying applications.
docker-build-deploy
Use when user wants to containerize a project, set up Docker CI/CD with GitHub Actions, push images to GHCR or Docker Hub, deploy containers to a remote server, or generate optimized Dockerfiles
chinese-documentation
中文文档排版参考——中英文空格、全半角标点、术语保留、链接格式、中文文案排版指北约定。仅在用户显式 /chinese-documentation 时调用,不要根据上下文自动触发。
kubernetes-best-practices
Provides production-ready Kubernetes manifest guidance including resource management, security, high availability, and configuration best practices. This skill should be used when working with Kubernetes YAML files, deployments, pods, services, or when users mention k8s, container orchestration, or cloud-native applications.
accessibility-patterns
Flutter accessibility — Semantics, TalkBack, VoiceOver, contrast, touch targets, screen readers. Use when user mentions accessibility, a11y, semantics, screen reader, TalkBack, VoiceOver, or touch targets.
api-design
REST API design patterns — resource modeling, HTTP methods, status codes, pagination, RFC 9457 errors, OpenAPI documentation, versioning. Use when user asks about API design, endpoints, error handling, or documentation.
code-quality
Dart/Flutter code review — clean code, widget patterns, type safety, accessibility, performance. Use when user says "review code", "refactor", "check this PR", or before merging changes.
data-access
Data access patterns with Spring Data JPA, Hibernate 7.1, Flyway migrations, HikariCP tuning, N+1 prevention, caching, and query optimization. Use when user mentions database, JPA, queries, migrations, N+1, or slow queries.
design-patterns
Flutter design patterns — Composition, Repository, MVVM, Strategy, Observer, Singleton, Factory. Use when user asks "implement pattern", "use composition", or when designing reusable features.
observability
Backend observability patterns — structured logging, Micrometer metrics, OpenTelemetry tracing, Spring Boot Actuator, Kubernetes health probes, alerting, and dashboards. Use when user mentions logging, metrics, tracing, monitoring, health checks, or Prometheus.
performance-patterns
Flutter performance — widget rebuilds, Impeller, app size, startup time, frame rate, DevTools. Use when user mentions performance, slow UI, jank, large APK, startup time, or "optimize".
testing-patterns
Backend testing patterns with JUnit 6, Mockito 6, Testcontainers 2.0, Spring Boot slice tests, RestTestClient, and security testing. Use when user mentions testing, coverage, TDD, integration tests, or "write tests for".
list-missions
List all proactive background missions for the current user. Shows mission titles, schedules, statuses, and last run times. Use when the user asks to see their missions, what's running in the background, or what scheduled tasks exist. Keywords: missions, list, show, background, scheduled, active, paused.
careful
Global safety hook — active in ALL phases, ALL projects. Before executing rm -rf, DROP TABLE, force-push, terraform destroy, or any destructive command — STOP and ask user for confirmation.
iac-container-security
Audit infrastructure-as-code and container security including Terraform/OpenTofu/Pulumi configurations, Dockerfile hardening, Kubernetes manifests, base image hygiene, container scanning, secrets in IaC, IAM policies, network exposure, and runtime security context. Multi-cloud (AWS, GCP, Azure). Use this skill whenever the user asks about Terraform security, tfsec, Checkov, Trivy, Dockerfile hardening, distroless images, k8s securityContext, network policies, IAM least privilege, IaC secret scanning, or 'audit my infrastructure'. Trigger on phrases like 'scan my Dockerfile', 'review my Terraform', 'audit my k8s manifests', 'harden my containers', 'IaC security', 'base image hygiene', 'container CVEs', 'trivy scan'. Use this even when only one IaC layer is mentioned.
hcs-policy-tier-entry
Draft a proposed YAML tier entry for a new tool or capability. Target file is canonical in system-config, not this repo. Drafts require `hcs-policy-reviewer` subagent objections and human approval before merge.
debug-buttercup
Debugs the Buttercup CRS (Cyber Reasoning System) running on Kubernetes. Use when diagnosing pod crashes, restart loops, Redis failures, resource pressure, disk saturation, DinD issues, or any service misbehavior in the crs namespace. Covers triage, log analysis, queue inspection, and common failure patterns for: redis, fuzzer-bot, coverage-bot, seed-gen, patcher, build-bot, scheduler, task-server, task-downloader, program-model, litellm, dind, tracer-bot, merger-bot, competition-api, pov-reproducer, scratch-cleaner, registry-cache, image-preloader, ui.
prompt-to-drawio-skill
Generate and edit draw.io artifacts from natural-language prompts without a frontend. Use when the user asks for prompt-to-diagram workflows that need `.drawio` output, optional image export (`png`/`svg`/`pdf`/`jpg`), context ingestion (image/PDF/text/URL), shape-library lookup, or visual validation loops.
cicd-hardening
CI/CD pipeline hardening for GitHub Actions and GitLab CI — trust-model (pull_request_target vs pull_request), action pinning to SHA, OIDC-based cloud access, permissions minimization, runner isolation, and supply-chain gates (SLSA provenance, signing).
container-hardening
Docker and OCI image hardening — base-image selection, USER/caps/read-only FS discipline, distroless migration, build-time scanning with trivy/grype, image signing via sigstore, and runtime guardrails (seccomp, AppArmor).
iac-security
IaC misconfig scanning and cloud-aware review for Terraform, CloudFormation, Ansible and Pulumi. Covers tool orchestration (checkov/tfsec/kics/cfn-nag), policy-as-code (OPA/Conftest), CIS benchmark mapping, IAM over-permission detection, drift monitoring.
k8s-security
Kubernetes security review — RBAC discipline, Pod Security Standards (baseline/restricted), NetworkPolicy default-deny, admission controllers (Kyverno/Gatekeeper/VAP), External Secrets Operator, and runtime monitoring via Falco and audit logs.
rails-security
Rails security review — Brakeman integration, mass-assignment via strong_parameters, SQL injection in ActiveRecord, template injection via html_safe/raw, Devise hardening, credentials.yml.enc, force_ssl and CSP config, recent Rails/Rack CVE patterns.
secrets-scanner
Detect and remediate leaked credentials in code and git-history — entropy/regex scanning with gitleaks/trufflehog/detect-secrets, rotate-first incident response, and pre-commit/CI gating to prevent reoccurrence.
secure-coding
Language-agnostic secure-coding patterns — input validation, injection-safe APIs, authN/authZ, crypto, secrets, dependency hygiene. The default lens when no framework-specific skill applies.
spring-security
Spring Boot security review — Spring Security config (SecurityFilterChain), OAuth2/OIDC client and resource-server, method-level @PreAuthorize, JWT validation, actuator endpoint lockdown, CSRF model for web vs API, and recent Spring CVE patterns (Spring4Shell, SpEL injection, authorization bypasses).
supply-chain
Software supply-chain defense — SBOM generation (CycloneDX/SPDX), SLSA build provenance, artifact signing with sigstore/cosign, dependency-confusion and typosquat defense, and consumer-side verification of what you pull in.
deployment-pipeline-design
Design multi-stage CI/CD pipelines with approval gates, security checks, and deployment orchestration. Use this skill when designing zero-downtime deployment pipelines, implementing canary rollout strategies, setting up multi-environment promotion workflows, or debugging failed deployment gates in CI/CD.
github-actions-templates
Create production-ready GitHub Actions workflows for automated testing, building, and deploying applications. Use when setting up CI/CD with GitHub Actions, automating development workflows, or creating reusable workflow templates.
arch-guidance
Consult and apply the repository architecture reference when discussing system architecture, module boundaries, directory layout, service interfaces, Docker Compose, Kubernetes deployment, or operational entry points. Use when a user asks about architecture, refactors, project structure, deployment models, or infra layout.
kube-audit-kit
Performs read-only Kubernetes security audits by exporting resources, sanitizing metadata, grouping applications by topology, and generating PSS/NSA-compliant audit reports. Use when the user requests auditing Kubernetes clusters, Namespaces, security reviews, or configuration analysis.
infisical-ci-integration
This skill activates when configuring CI/CD pipelines, writing GitHub Actions workflows, GitLab CI configs, Dockerfiles, Kubernetes manifests, or serverless deployment configs that need secret injection. It provides patterns for integrating Infisical into build and deployment pipelines.
using-scrim
Route file reads and shell commands through Scrim's safe_read, safe_grep, and safe_shell tools whenever the target may contain secrets or PII (config files, .env*, *.pem, secrets/**, env-dumping commands, kubectl get secret, docker inspect, git remote -v with tokens, connection strings). Scrim returns tokenized content; the PreToolUse hook on Write|Edit|MultiEdit restores real values before bytes hit disk, so the model never sees raw secrets but files stay correct.
cloud-and-infra
Cloud-native service fingerprints, Kubernetes/container exposure, CI/CD platform exposure, TLS deep audit, and favicon hash pivot for authorized infrastructure recon.
managing-kubernetes
Manages Kubernetes clusters via kubectl. Supports pod/deployment/service management, log viewing, port-forwarding, and debugging. Use for "k8s", "kubectl", "파드", cluster management tasks.
launch-checklist
Validates full deployment readiness beyond code, checking infrastructure, Docker configuration, Kubernetes manifests, environment config, monitoring, security headers, and pipeline status. Use when launching, deploying to production, release readiness, go-live, deployment check, pre-launch, shipping to prod, or when preparing for production deployment.
sagemaker-hyperpod
Amazon SageMaker HyperPod expert for ML training clusters with Trainium or GPU. Use when: creating HyperPod clusters, running distributed training, configuring EKS or Slurm orchestration, troubleshooting cluster issues, checking quotas, or when user mentions "hyperpod", "hyp", "ml-cluster", "trainium", "trn1", "distributed training", or "multi-node training".
container-escape
Container escape methodology for Docker and Kubernetes. Covers privileged container breakout, mounted socket exploitation, capabilities abuse, cgroup v1 escape, and K8s node compromise.
context-packet
Design and build context-packet DAG pipelines — graph design, shell orchestration, MCP server integration, and programmatic TypeScript API. Use when creating AI agent workflows that pass context between nodes.
flutter-dart
Activated when building Flutter apps, creating widgets, implementing state management, setting up navigation, configuring themes, writing tests, or asking "how do I set up X" in a Flutter/Dart context. Covers UI, Riverpod, GoRouter, Dio, Freezed, Drift, Material 3 theming, testing, and Clean Architecture patterns.
react-nextjs
React 19.2 + Next.js 16 development - Server Components, Cache Components, proxy.ts, View Transitions, App Router, TypeScript 6, and Tailwind CSS v4. Use when building frontend apps, creating components, or asking "how do I set up X?"
spring-boot-core
Spring Boot 4.0 + Java 25 development - auto-configuration, starters, Actuator, profiles, externalized config, security, and production patterns. Use when building backend apps, creating endpoints, configuring Spring, or asking "how do I set up X?"
indexing-static-context
Provides an index of global static context files in ~/.agents/. Returns appropriate static file paths for natural language queries like "내 정보", "보안 규칙". Use when other skills or agents need to locate reference information.
iac
Config & container security review. Scans Dockerfiles, Kubernetes/Compose manifests, and Terraform/IaC for misconfigurations (privileged containers, root, unpinned images, hardcoded secrets, public network/storage, disabled TLS); the iac-reviewer agent confirms each in context and promotes real ones into .kuzushi/findings.json (source "iac"). Distinct from /sast (source injection) and the insecure-defaults companion (app config values).
container-expert
Container orchestration expert including Docker, Kubernetes, Helm, and service mesh
polaris-local-forge
**[REQUIRED]** Use for **ALL** requests involving local Apache Polaris: setup, API queries, catalog operations, cleanup, teardown. **AUTO-ACTIVATE:** If `.snow-utils/snow-utils-manifest.md` contains `polaris-local-forge:` this skill MUST handle ALL operations including cleanup. **DO NOT** use `polaris` CLI (does not exist), curl to Polaris endpoints (needs OAuth), or docker ps checks - invoke this skill first. Triggers: polaris local, local iceberg catalog, local polaris setup, rustfs setup, create polaris cluster, try polaris locally, get started with polaris, apache polaris quickstart, polaris dev environment, local data lakehouse, replay from manifest, reset polaris catalog, teardown polaris, clean up, cleanup, delete cluster, remove resources, polaris status, list catalogs, show namespaces, list tables, show catalog, describe table, list principals, show principal roles, list views, polaris namespaces, polaris catalogs, query data, query table, query iceberg, query catalog data, show my data, show table d
k8s
Deploy and operate workloads on Kubernetes
spring-microservices-architect
Production-grade governance agent for Spring Boot microservices. Scaffolds projects iteratively using capability-based layering, enforces coding standards, and validates against battle-tested reference patterns. Fully portable — works with any domain. USE FOR: microservice, Spring Boot, scaffold, Docker compose, kubernetes, helm, eureka, gateway, resilience4j, reactive, spring cloud, openapi, persistence, security, oauth, tracing, zipkin, monitoring, prometheus, grafana, native compilation, graalvm, code review, architecture review, quality gate, governance, spring cloud stream, rabbitmq, kafka, testcontainers, mapstruct, service discovery, edge server, config server, circuit breaker, distributed tracing, entity, entities, domain model, generate entity, persistence model, create entity, MongoDB document, JPA entity, MapStruct mapper, repository, test, verify, validate, TDD, test-driven, failing test, integration test, build check, regression test, quality check, security database, MFA, multi-factor, WebAuthn,
preparing-iac-deployment
Prepares IaC project deployment by analyzing the current project and generating K8s manifests, Dockerfiles, CI/CD workflows in standardized structure. Use for "배포 준비", "IaC 설정", "k8s 매니페스트", "deploy prep" requests.
add-cli
Add a new CLI binary (or wire missing auth/persistence for an existing one) to the toolbox image — Dockerfile layer + version ARG + opt-out flag + `internal/config/tools.go` entry + `smoke-test.sh` check + Renovate `customManager` + (when the CLI persists state) `~/.toolbox/<tool>` bind-mount in `internal/mountplan/defaults.go`. Use this whenever the user says things like "add <X> to the toolbox", "install <X> in the container", "put <X> in the image", "add <X> CLI", "wire auth for <X>", "persist <X> credentials", "save <X> authentication", or names a binary they want available inside `toolbox shell`. Also use it when an audit shows a CLI is in the Dockerfile but its credentials don't survive `toolbox stop` — that's the gws-style half-installed case this skill explicitly handles. Always perform the edits autonomously and finish with `/verify`; don't hand the user a checklist to apply themselves.
improve-codebase-architecture
Find deepening opportunities in a codebase, informed by the domain language in CONTEXT.md and the decisions in docs/adr/. Use when the user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make a codebase more testable and AI-navigable.
verify
Run the toolbox repo's pre-push validation — golangci-lint, go tests, and (when the image is built) the bundled-CLI smoke test. Mirrors the PR CI in `.github/workflows/ci.yml`, so green locally means green on CI. Use this before marking any code change "done", before opening a PR, or any time the user says things like "verify", "check it passes", "are we good to push", "è tutto a posto prima del commit". Always prefer this over running `go test` or `golangci-lint` ad-hoc, because Go is not installed on the host and this skill already encodes the containerised pattern.
commit-rules
Conventions for creating git commits in this repo — how to scope, stage, and word a commit. Use whenever about to run `git commit`, when the user asks to "commit", "commit this", or "save changes", or when wording a commit message. Covers subject/body style, atomic scoping, what not to commit, and required trailers. Generic and project-agnostic; no language- or framework-specific rules.
review-tests
Review changes to this TMDB API testing framework against its own conventions and gotchas — the project-specific checks that generic code review misses. Use after writing or editing an API client, endpoint method, test module, fixture, Pydantic schema, test-data YAML, or assertion helper, and before committing/opening a PR. Covers schema dual-registration, module-scope test data, the API-client/test separation, response validation via Pydantic + assert_http_response, and the known flaky-endpoint exemption. Complements (does not replace) the built-in /code-review for generic bugs.
update-claude-md
Update CLAUDE.md so it stays accurate after a significant change to this TMDB API testing framework. Use after adding/removing/renaming an API client, endpoint method, test module, fixture, schema, or helper; changing how tests are run (Poetry/pytest commands, flags, markers); adding/changing config or env vars; introducing a new convention, gotcha, or dependency; or reworking Docker/K8s/CI/MCP wiring. Trigger when finishing such a change, before committing, or when the user asks to "update CLAUDE.md" / "keep the docs in sync". Skip for pure test-data tweaks, formatting, or one-off fixes that don't change structure, commands, conventions, or gotchas.
search
Use when the user wants to search the web via Tavily and index the results, find recent information on a topic and store it, or combine live web search with automatic crawling. Triggers on "search the web for", "find recent articles about", "search and index", "Tavily search", or when the user wants to pull fresh web content into axon. Different from `query` — this searches the live web, not already-indexed content.
devops-engineer
Senior DevOps engineer specializing in Docker, Kubernetes, CI/CD pipelines, cloud infrastructure (AWS/GCP/Azure), and deployment automation. Use when setting up deployment pipelines, containerizing applications, or managing cloud infrastructure.
check-ceph-health
Check Ceph storage health on OpenShift OCS/ODF clusters. Use when PVCs are stuck in Pending, storage provisioning fails, Ceph is degraded, OSDs are full, or cluster storage needs diagnosis.
k8s-security-policies
Implement Kubernetes security policies including NetworkPolicy, PodSecurityPolicy, and RBAC for production-grade security. Use when securing Kubernetes clusters, implementing network isolation, or ...
secrets-vault-manager
Use when the user asks to set up secret management infrastructure, integrate HashiCorp Vault, configure cloud secret stores (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager), implement secret rotation, or audit secret access patterns.
lettactl
Manage Letta AI agent fleets with kubectl-style CLI
ci-cd
Use when the user asks to create, edit, debug, or optimize CI/CD pipelines, workflow YAML, build/test jobs, deployment automation, matrix builds, caches, permissions, or secrets in GitHub Actions, GitLab CI, CircleCI, Jenkins, or similar systems.
setup-container-registry
Configure container image registries including GitHub Container Registry (ghcr.io), Docker Hub, and Harbor with automated image scanning, tagging strategies, retention policies, and CI/CD integration for secure image distribution. Use when setting up a private container registry, migrating from Docker Hub to self-hosted registries, implementing vulnerability scanning in CI/CD pipelines, managing multi-architecture images, enforcing image signing, or configuring automatic cleanup and retention policies.
gitops-repo-audit
Audit and validate Flux CD GitOps repositories by scanning local repo files (not live clusters) — runs Kubernetes schema validation, detects deprecated Flux APIs, reviews RBAC/multi-tenancy/secrets management, and produces a prioritized GitOps report. Use when users ask to audit, analyze, validate, review, or security-check a GitOps repo.
terraform-iac-expert
Terraform and OpenTofu infrastructure as code — module design, state management, multi-environment setups, remote backends, secrets management, CI/CD integration. NOT for Pulumi, CDK, Ansible, or Kubernetes manifests.
k8s-security-policies
Implement Kubernetes security policies including NetworkPolicy, PodSecurityPolicy, and RBAC for production-grade security. Use when securing Kubernetes clusters, implementing network isolation, or ...
eks-troubleshooting
Investigate Kubernetes/EKS issues by running run-investigation.sh with the issue type, resource name, kubectl context, and namespace.
skills-readme-updater
This skill should be used after creating or modifying skills to update the main README.md file. It scans all skills in ~/.claude/skills/, extracts metadata from SKILL.md files, and regenerates the README with categorized skill listings. Triggers on requests mentioning "update skills readme", "refresh skills list", or after adding new skills.
secrets-vault-manager
Use when the user asks to set up secret management infrastructure, integrate HashiCorp Vault, configure cloud secret stores (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager), implement secret rotation, or audit secret access patterns.
validate
Pre-commit validation suite for manifests, scripts, and configs
Integration detected automatically from skill content. Some results may be false positives.