Install: claude install-skill fabioc-aloha/Alex_Skill_Mall
# Azure OpenAI Patterns
> Rate limiting, function calling, error handling, and token optimization for the Azure OpenAI API.
> **Staleness Watch**: See [EXTERNAL-API-REGISTRY.md](../../EXTERNAL-API-REGISTRY.md) for source URLs and recheck cadence.
**Version**: 1.1.0 | **Last validated**: April 2026 (GPT-5.x, Responses API, Structured Outputs)
---
## Rate Limiting: The Dual System
Azure OpenAI enforces **dual rate limits**: Tokens Per Minute (TPM) and Requests Per Minute (RPM). A request is throttled as soon as either limit is exhausted. The ratio is typically 6 RPM per 1,000 TPM.
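The ratio can be turned into a quick planning check. The sketch below is illustrative only: `rpm_for_tpm` and `binding_limit` are hypothetical helper names, not part of any Azure SDK, and the code assumes the typical 6 RPM per 1,000 TPM ratio holds for the deployment in question.

```python
def rpm_for_tpm(tpm: int, rpm_per_1k_tpm: int = 6) -> int:
    """Derive the RPM quota implied by a TPM quota.

    Assumes the typical ratio of 6 RPM per 1,000 TPM; a specific
    deployment may differ, so treat the result as an estimate.
    """
    return tpm * rpm_per_1k_tpm // 1000


def binding_limit(tpm: int, avg_tokens_per_request: int) -> str:
    """Report which limit a steady workload is likely to hit first."""
    rpm = rpm_for_tpm(tpm)
    # Number of requests per minute the token budget alone would allow.
    requests_allowed_by_tpm = tpm // avg_tokens_per_request
    # If the token budget allows more requests than the RPM quota,
    # the request count runs out first; otherwise the tokens do.
    return "RPM" if requests_allowed_by_tpm > rpm else "TPM"


print(rpm_for_tpm(450_000))          # 2700
print(binding_limit(450_000, 100))   # RPM (many small requests)
print(binding_limit(450_000, 1000))  # TPM (fewer, larger requests)
```

At this ratio the break-even point is roughly 167 tokens per request: workloads with smaller requests tend to exhaust RPM first, while larger requests exhaust TPM first.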
### TPM vs RPM Relationship
| Model | Tier | TPM | RPM | Notes |
|-------|------|-----|-----|-------|
| gpt-5.4-mini | Default | 2M | 12K | Latest flagship (mini) |
| gpt-5.2 | Default | 1M | 6K | Reasoning model |
| gpt-4.1 | Default | 1M | 6K | 1M context, structured outputs |
| gpt-4.1-mini | Default | 2M | 12K | Cost-efficient 1M context |
| o4-mini | Default | 200K | 1.2K | Reasoning (o-series) |
| o3 | Default | 200K | 1.2K | Advanced reasoning |
| gpt-4o | Default | 450K | 2.7K | Legacy — prefer gpt-4.1+ |
| gpt-4o-mini | Default | 2M | 12K | Legacy — prefer gpt-4.1-mini |
> **Migration**: gpt-4o → gpt-4.1 (same API, larger context, better quality). gpt-4o-mini → gpt-4.1-mini or gpt-4.1-nano (cost savings).
### How TPM is Calculated
Each request's token cost is estimated **before processing** and counted against the TPM quota. The estimate is based on:
1. Prompt text character count (converted to estimated tokens)
2. `max_tokens` parameter setting
3. `best_of` parameter setting (if used)
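The three inputs above can be combined into a rough client-side estimate. This is a minimal sketch under stated assumptions, not the service's actual formula: the 4-characters-per-token conversion and the choice to multiply `max_tokens` by `best_of` are both assumptions, and `estimate_rate_limit_tokens` is a hypothetical helper name.

```python
import math

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_rate_limit_tokens(prompt: str, max_tokens: int, best_of: int = 1) -> int:
    """Approximate the tokens a request may be charged against TPM.

    Mirrors the three inputs listed above: prompt character count,
    max_tokens, and best_of. The exact server-side formula is not
    reproduced here; this is an upper-bound style approximation.
    """
    # 1. Prompt characters converted to an estimated token count.
    prompt_tokens = math.ceil(len(prompt) / CHARS_PER_TOKEN)
    # 2 and 3. Assume each of the best_of candidates may use max_tokens.
    completion_tokens = max_tokens * max(best_of, 1)
    return prompt_tokens + completion_tokens


prompt = "a" * 400  # 400 characters, about 100 estimated tokens
print(estimate_rate_limit_tokens(prompt, max_tokens=100))             # 200
print(estimate_rate_limit_tokens(prompt, max_tokens=100, best_of=3))  # 400
```

Under these assumptions, a large `max_tokens` reserves quota even when the actual completion is short, so setting it close to the expected output length keeps the estimate tighter.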
The rate limit estimate is NOT the