azure-openai-patterns

Azure OpenAI API patterns for rate limiting, function calling, error handling, and token optimization
fabioc-aloha/Alex_Skill_Mall · ★ 1 · AI & Automation · score 80

Install: claude install-skill fabioc-aloha/Alex_Skill_Mall

# Azure OpenAI Patterns

> Rate limiting, function calling, error handling, and token optimization for Azure OpenAI API.
>
> **Staleness Watch**: See [EXTERNAL-API-REGISTRY.md](../../EXTERNAL-API-REGISTRY.md) for source URLs and recheck cadence

**Version**: 1.1.0 | **Last validated**: April 2026 (GPT-5.x, Responses API, Structured Outputs)

---

## Rate Limiting: The Dual System

Azure OpenAI uses **dual rate limits**: Tokens Per Minute (TPM) and Requests Per Minute (RPM). The ratio is typically 6 RPM per 1000 TPM.

### TPM vs RPM Relationship

| Model | Tier | TPM | RPM | Notes |
|-------|------|-----|-----|-------|
| gpt-5.4-mini | Default | 2M | 12K | Latest flagship (mini) |
| gpt-5.2 | Default | 1M | 6K | Reasoning model |
| gpt-4.1 | Default | 1M | 6K | 1M context, structured outputs |
| gpt-4.1-mini | Default | 2M | 12K | Cost-efficient 1M context |
| o4-mini | Default | 200K | 1.2K | Reasoning (o-series) |
| o3 | Default | 200K | 1.2K | Advanced reasoning |
| gpt-4o | Default | 450K | 2.7K | Legacy; prefer gpt-4.1+ |
| gpt-4o-mini | Default | 2M | 12K | Legacy; prefer gpt-4.1-mini |

> **Migration**: gpt-4o → gpt-4.1 (same API, larger context, better quality). gpt-4o-mini → gpt-4.1-mini or gpt-4.1-nano (cost savings).

### How TPM is Calculated

TPM is estimated **before processing** based on:

1. Prompt text character count (converted to estimated tokens)
2. `max_tokens` parameter setting
3. `best_of` parameter setting (if used)

The rate limit estimate is NOT the