From 4751e0c82c350e104e1e25e6da4f4520623c18f1 Mon Sep 17 00:00:00 2001
From: Kanchan-Microsoft
Date: Fri, 26 Sep 2025 12:08:13 +0530
Subject: [PATCH 1/4] Add documentation for deploying with limited Azure OpenAI quota

---
 docs/DeployWithLimitedQuota.md | 99 ++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)
 create mode 100644 docs/DeployWithLimitedQuota.md

diff --git a/docs/DeployWithLimitedQuota.md b/docs/DeployWithLimitedQuota.md
new file mode 100644
index 00000000..870d1158
--- /dev/null
+++ b/docs/DeployWithLimitedQuota.md
@@ -0,0 +1,99 @@
+# Deploying with Limited OpenAI Quota
+
+This document provides guidance on deploying the Document Generation Solution Accelerator when you have limited Azure OpenAI model quota available.
+
+## Overview
+
+By default, the solution requires:
+- **GPT model**: 200,000 Tokens Per Minute (TPM)
+- **Embedding model**: 80,000 TPM
+
+If your Azure OpenAI service has lower quota limits, you can modify the deployment to work with reduced capacity.
+
+## Prerequisites
+
+Before proceeding, ensure you have:
+- Azure Developer CLI (azd) installed
+- Access to your Azure OpenAI service quota settings
+- Knowledge of your current TPM limits
+
+## Deployment Options
+
+You have two approaches to deploy with less quota:
+
+### Option 1: Remove Quota Validation
+
+Remove the metadata section (lines 73-81) from the [`infra/main.bicep`](../infra/main.bicep) file:
+
+```bicep
+@metadata({
+  azd: {
+    type: 'location'
+    usageName: [
+      'OpenAI.GlobalStandard.gpt-4o-mini,200'
+      'OpenAI.GlobalStandard.text-embedding-ada-002,80'
+    ]
+  }
+})
+```
+
+### Option 2: Modify Quota Thresholds (Recommended)
+
+Update the values on lines 77-78 in [`infra/main.bicep`](../infra/main.bicep) to match your available quota:
+
+```bicep
+@metadata({
+  azd: {
+    type: 'location'
+    usageName: [
+      'OpenAI.GlobalStandard.gpt4.1, 50' // Changed from 200
+      'OpenAI.GlobalStandard.text-embedding-ada-002, 50' // Changed from 80
+    ]
+  }
+})
+```
+
+## Configuration Steps
+
+After modifying the Bicep file, configure your deployment capacity:
+
+```powershell
+azd env set AZURE_ENV_MODEL_CAPACITY="50"
+azd env set AZURE_ENV_EMBEDDING_MODEL_CAPACITY="50"
+```
+
+> **Note**: Adjust the values (50) to match your actual available quota.
+
+## Deploy the Solution
+
+Once configured, proceed with deployment:
+
+```powershell
+azd up
+```
+
+## Performance Considerations
+
+⚠️ **Important**: Using reduced TPM limits may impact application performance:
+
+For optimal performance, we recommend maintaining at least 150,000 TPM for GPT models when possible.
+
+## Additional Resources
+
+For more detailed information, refer to:
+
+- [Deployment Guide](DeploymentGuide.md) - Complete deployment instructions
+- [Customizing azd Parameters](CustomizingAzdParameters.md) - Advanced configuration options
+- [Check or update Quota](AzureGPTQuotaSettings.md) - Check or update quota from Azure Portal
+- [Quota Check](QuotaCheck.md) - Script for checking Azure OpenAI quota limits
+
+## Why do we need to do this?
+- The solution uses built-in Azure Developer CLI (azd) quota validation to prevent deployment failures. Specifically, azd performs pre-deployment checks to ensure sufficient quota is available, i.e., 200,000 TPM for the GPT model and 80,000 TPM for the embedding model.
+
+- These quota thresholds are hardcoded in the infrastructure file because azd's quota checking mechanism doesn't currently support parameterized values. If your Azure OpenAI service has quota below these thresholds, the deployment will fail during the validation phase rather than proceeding and failing later in the process.
+
+- By following the steps above, you can either:
+  1. **Bypass quota validation entirely** by removing the metadata block
+  2. **Lower the validation thresholds** to match your available quota (e.g., 50,000 TPM)
+
+- This ensures successful deployment while working within your quota constraints.
\ No newline at end of file

From 01e6c3c009342cd876275372a3ecc0af5be48a8b Mon Sep 17 00:00:00 2001
From: Kanchan-Microsoft
Date: Fri, 26 Sep 2025 12:32:13 +0530
Subject: [PATCH 2/4] update

---
 docs/DeployWithLimitedQuota.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/DeployWithLimitedQuota.md b/docs/DeployWithLimitedQuota.md
index 870d1158..b4a0e22e 100644
--- a/docs/DeployWithLimitedQuota.md
+++ b/docs/DeployWithLimitedQuota.md
@@ -76,7 +76,7 @@ azd up
 
 ⚠️ **Important**: Using reduced TPM limits may impact application performance:
 
-For optimal performance, we recommend maintaining at least 150,000 TPM for GPT models when possible.
+For optimal performance, we recommend maintaining at least 200,000 TPM for GPT models when possible.
 
 ## Additional Resources
 

From 41ad1e87787770862e526802b45505b917dfa7a3 Mon Sep 17 00:00:00 2001
From: Kanchan-Microsoft
Date: Fri, 26 Sep 2025 15:00:54 +0530
Subject: [PATCH 3/4] update

---
 docs/DeployWithLimitedQuota.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/DeployWithLimitedQuota.md b/docs/DeployWithLimitedQuota.md
index b4a0e22e..abbbb5bd 100644
--- a/docs/DeployWithLimitedQuota.md
+++ b/docs/DeployWithLimitedQuota.md
@@ -1,6 +1,6 @@
 # Deploying with Limited OpenAI Quota
 
-This document provides guidance on deploying the Document Generation Solution Accelerator when you have limited Azure OpenAI model quota available.
+This document provides guidance on deploying the Build your own copilot Solution Accelerator when you have limited Azure OpenAI model quota available.
 
 ## Overview
 

From f197c7fed8b063ce98b9791a9c581607262d8d11 Mon Sep 17 00:00:00 2001
From: Kanchan-Microsoft
Date: Mon, 29 Sep 2025 20:24:39 +0530
Subject: [PATCH 4/4] deploy with limited quota

---
 docs/DeployWithLimitedQuota.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/DeployWithLimitedQuota.md b/docs/DeployWithLimitedQuota.md
index abbbb5bd..08bc415b 100644
--- a/docs/DeployWithLimitedQuota.md
+++ b/docs/DeployWithLimitedQuota.md
@@ -1,6 +1,6 @@
 # Deploying with Limited OpenAI Quota
 
-This document provides guidance on deploying the Build your own copilot Solution Accelerator when you have limited Azure OpenAI model quota available.
+This document provides guidance on deploying the Build Your Own Copilot Solution Accelerator when you have limited Azure OpenAI model quota available.
 
 ## Overview
 
@@ -46,7 +46,7 @@ Update the values on lines 77-78 in [`infra/main.bicep`](../infra/main.bicep) to
   azd: {
     type: 'location'
     usageName: [
-      'OpenAI.GlobalStandard.gpt4.1, 50' // Changed from 200
+      'OpenAI.GlobalStandard.gpt-4o-mini, 50' // Changed from 200
       'OpenAI.GlobalStandard.text-embedding-ada-002, 50' // Changed from 80
     ]
   }
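
Note on the capacity values: throughout this patch series the `usageName` thresholds and the `azd env set` capacities appear to be expressed in units of 1,000 TPM (200 for 200,000 TPM, 50 for 50,000 TPM). The following is a minimal sketch, assuming that mapping holds, of how an available quota could be turned into the two `azd env set` commands from the Configuration Steps section. The helper names `tpm_to_capacity` and `azd_env_commands` are hypothetical and are not part of azd or this repository.

```python
def tpm_to_capacity(tpm: int) -> int:
    """Convert tokens-per-minute to capacity units.

    Assumption (inferred from the doc, not from azd itself):
    1 capacity unit corresponds to 1,000 TPM.
    """
    if tpm < 1000:
        raise ValueError("quota below 1,000 TPM cannot be expressed as a whole unit")
    # Round down so the requested capacity never exceeds the available quota.
    return tpm // 1000


def azd_env_commands(gpt_tpm: int, embedding_tpm: int) -> list[str]:
    """Build the azd env set commands shown in the Configuration Steps section."""
    return [
        f'azd env set AZURE_ENV_MODEL_CAPACITY="{tpm_to_capacity(gpt_tpm)}"',
        f'azd env set AZURE_ENV_EMBEDDING_MODEL_CAPACITY="{tpm_to_capacity(embedding_tpm)}"',
    ]


if __name__ == "__main__":
    # Example: 50,000 TPM available for both models, as in Option 2.
    for command in azd_env_commands(50_000, 50_000):
        print(command)
```

The same capacity numbers would also go into the `usageName` entries in `infra/main.bicep` when following Option 2, so that the validation threshold and the deployed capacity agree.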