Stability AI
AI
Skills using Stability AI (29)
storyboard-prompting
Generate detailed image prompts for storyboard frames optimized for Midjourney, DALL-E, and Stable Diffusion
stable-diffusion-image-generation
State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting, or building custom diffusion pipelines.
ideogram-migration-deep-dive
Migrate from other image generation APIs to Ideogram, or re-architect existing Ideogram integrations. Use when switching from DALL-E/Midjourney/Stable Diffusion to Ideogram, or performing major integration overhauls. Trigger with phrases like "migrate to ideogram", "switch to ideogram", "replace dall-e with ideogram", "ideogram replatform", "ideogram migration".
comfyui-gateway
REST API gateway for ComfyUI servers. Workflow management, job queuing, webhooks, caching, auth, rate limiting, and image delivery (URL + base64).
stability-ai
Image generation via Stability AI (SD3.5, Ultra, Core). Text-to-image, img2img, inpainting, upscale, remove-bg, search-replace. 15 artistic styles.
cover-art-prompting
Create detailed text-to-image prompts for album and song cover artwork optimized for Midjourney, DALL-E, and other AI image generators
stable-diffusion-image-generation
State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting, or building custom diffusion pipelines.
bedrock
AWS Bedrock foundation models for generative AI. Use when invoking foundation models, building AI applications, creating embeddings, configuring model access, or implementing RAG patterns.
stable-diffusion-image-generation
State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting, or building custom diffusion pipelines.
omniroute-image
Image generation via OmniRoute using OpenAI /v1/images/generations format with auto-fallback across DALL-E, Stable Diffusion, Flux, Imagen providers. Use when the user wants to generate, edit, or vary images.
aws-sdk-java-v2-bedrock
Provides Amazon Bedrock patterns using AWS SDK for Java 2.x. Invokes foundation models (Claude, Llama, Titan), generates text and images, creates embeddings for RAG, streams real-time responses, and configures Spring Boot integration. Use when asking about Bedrock integration, Java SDK for AI models, AWS generative AI, Claude/Llama invocation, embeddings for RAG, or Spring Boot AI setup.
art-director
AI art direction system. Claude directs image generation models (Gemini, DALL-E, Flux) via structured prompts. Generate banners, diagrams, logos, screenshots, and social media visuals without leaving the terminal.
ai-media-generator
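Generates high-quality prompts for AI image, video, and music generation and, when needed, submits them to the target platform through browser automation. Covers OiiOii, Kling 3.0/O-series, Seedance 2.0 pro, Suno v5.5, Seedream 5.0/4.0, Vidu Q3, Midjourney V8.1, Flux 1.1 Pro / Kontext, Runway Gen-4.5 / Aleph, Google Veo 3.1, Ideogram 3, Nano Banana Pro, and Stable Diffusion 3.5 (⚠️ OpenAI Sora 2 was shut down on 2026-04-26, with the API kept alive until 2026-09-24; Runway/Veo/Kling are now recommended by default). Use this skill whenever the user mentions "AI image generation", "AI video", "AI music", "make an MV", "make a storyboard", "write a prompt for XXX", "I want to use Kling/Suno/Midjourney/Runway/Veo...", "operate OiiOii / Jimeng (即夢) / Kling (可靈) for me", "txt2img / img2video / text-to-image / text-to-video / image-to-video", "character consistency", "multi-shot storyboarding", "camera movement", or "the result has flaws / isn't polished enough / how do I fix it", or for any task related to the platforms above or to image, video, or music generation workflows. Even if the user does not name a platform, as long as the task is a prompt to be fed to a generative model, use this skill to pick the right platform and write the prompt in the right format.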
diffusion-engineering
Practical diffusion model engineering: architectures, training, inference, memory optimization. Use for any task involving diffusion models: designing or modifying an architecture (UNet/DiT/Flow/Flux), choosing and tuning schedulers/samplers, fine-tuning (LoRA/DreamBooth/full fine-tune), memory optimization (AMP/checkpointing/ZeRO/FSDP/quantization), replacing or fusing text encoders (CLIP/Qwen), working with Diffusers, debugging diffusion pipelines, quality evaluation (FID/CLIPScore/LPIPS), latent diffusion, VAE, guidance/CFG, rectified flow, Stable Diffusion, SDXL, Flux. Also apply to questions about GPU memory when training generative models, text-to-image pipelines, ControlNet, multi-encoder fusion, and WebDataset.
forensic-prompt-compiler
Forensic image-to-prompt compiler for image generation models. Use this skill whenever the user wants to: convert/describe an existing image into a generation prompt, reconstruct a scene as a prompt, generate prompts from reference images for AI image tools (Midjourney, FLUX, Stable Diffusion, DALL-E, or any diffusion model), write prompts that preserve exact visual properties of a source image, or needs precise control over identity-safe subject description, geometry lock, lighting reconstruction, color anchoring, or handler-based special cases (floating scenes, collages, close-ups, jewelry, garments, surreal elements). Also trigger for requests involving: image editing prompts, reference-driven generation, pose description, camera angle locking, fabric/material description, or any "turn this image into a prompt" task.
ai-image-generation
Generate AI images with FLUX, Gemini, Grok, Seedream, Reve and 50+ models via inference.sh CLI. Models: FLUX Dev LoRA, FLUX.2 Klein LoRA, Gemini 3 Pro Image, Grok Imagine, Seedream 4.5, Reve, ImagineArt. Capabilities: text-to-image, image-to-image, inpainting, LoRA, image editing, upscaling, text rendering. Use for: AI art, product mockups, concept art, social media graphics, marketing visuals, illustrations. Triggers: flux, image generation, ai image, text to image, stable diffusion, generate image, ai art, midjourney alternative, dall-e alternative, text2img, t2i, image generator, ai picture, create image with ai, generative ai, ai illustration, grok image, gemini image
open-forge
Self-host any open-source app on the user's own infrastructure (cloud VM, VPS, Raspberry Pi, localhost, k8s, PaaS). Walks the user through provisioning, DNS, TLS, SMTP, and hardening in phased + resumable workflows. 2216+ verified recipes plus live-derived fallback for the long tail. Agent-mode rules apply (no chat-paste credentials, no group-channel deploys).
logo-brief
Create comprehensive logo design briefs with concept directions, references, constraints, and deliverables — ready for human designers or AI image generators
agency-image-prompt-engineer
Expert photography prompt engineer specializing in crafting detailed, evocative prompts for AI image generation. Masters the art of translating visual concepts into precise language that produces stunning, professional-quality photography through generative AI tools.
gemini-mcp
Use Google Gemini for image generation, text chat, file analysis, URL/YouTube analysis, and multi-turn conversations via MCP. Triggers on requests to generate images with Gemini, chat with Gemini, analyze files/URLs/videos with Gemini, use Gemini models, or when user asks to create/edit images and needs prompting guidance. Do NOT use for DALL-E, Midjourney, Stable Diffusion, or OpenAI image generation.
visual-architect
Transform research papers into professional visual schemas. Analyzes paper logic, selects optimal layout patterns, and generates detailed prompts for AI image generation.
transparent-bg
Produce a truly RGBA-transparent asset from a brief. Handles the
stable-diffusion-comfyui-workflow-runner
Executes ComfyUI workflow JSON files against a local or remote ComfyUI server via its REST API. Supports LoRA loading, ControlNet conditioning, and queue management with progress polling.
ai-image-prompting
Crafts production-grade prompts for AI image generation (Midjourney, Flux, SDXL, Firefly, Imagen, ComfyUI workflows) — subject, composition, lighting, style references, negative prompts, ControlNet hints. Use when the goal is on-brand, repeatable imagery rather than a one-off lucky generation.
agency-image-prompt-engineer
Expert photography prompt engineer specializing in crafting detailed, evocative prompts for AI image generation. Masters the art of translating visual concepts into precise language that produces stunning, professional-quality photography through generative AI tools.
whisk-proxy
Generate images with Google Imagen 4 and Nano Banana for free via Flow API. No API key or paid subscription needed, just a Google account and Chrome extension. Use when the user asks to generate, create, or make an image, picture, illustration, logo, banner, or avatar. Triggers: generate image, create picture, make illustration, and the Russian equivalents нарисуй ("draw"), сгенерируй картинку ("generate a picture"), создай изображение ("create an image"). Do NOT use for image editing, upscaling, background removal, video generation, or non-Google AI models (DALL-E, Midjourney, Stable Diffusion).
nano-banana-ultra
Multi-model AI image generation with Gemini, DALL-E, Stability AI, 30+ templates
paper-visualizer
Transform research papers into professional visual schemas. Analyzes paper logic, selects optimal layout patterns, and generates detailed prompts for AI image generation.
design-image-prompt-engineer
Expert photography prompt engineer specializing in crafting detailed, evocative prompts for AI image generation. Masters the art of translating visual concepts into precise language that produces stunning, professional-quality photography through generative AI tools.
Integration detected automatically from skill content. Some results may be false positives.