Complete Azure AI Foundry Terraform Configurations - Deploy AI Foundry with or without capability hosts
This repository provides production-ready Terraform configurations for deploying Azure AI Foundry with private networking and cross-subscription support.
Choose the configuration that best fits your requirements:
Simplified configuration - Cost-optimized deployment without compute infrastructure
- Lower cost (~30% reduction vs capability hosts)
- No Cosmos DB for conversation storage

Start here → Continue with current configuration
Enterprise configuration - Complete AI Foundry with Standard Agent support
- Capability hosts for bring-your-own Azure resources
- Cosmos DB for thread and conversation storage
- Agent subnet injection for network-secured deployments
- Standard Agent runtime support

Warning: higher cost and complexity

Enterprise setup → Switch to `terraform-foundry-caphost/`
This deployment provides a cost-optimized Azure AI Foundry setup with:
- Azure AI Foundry Hub & Project - Core AI platform with GPT-4o model
- Private networking - Secure connectivity with private endpoints
- Cross-subscription support - Separate workload and infrastructure subscriptions
- Storage & Search - Integrated storage account and AI search service
- Monitoring - Application Insights and diagnostic settings
- Security - RBAC, managed identities, and optional Key Vault
- Simplified Architecture - No capability hosts or Cosmos DB for reduced cost

Architecture overview:

    ┌─────────────────────────────────────────────────────────────┐
    │                      Azure AI Foundry                       │
    │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
    │  │  AI Foundry  │  │   Storage    │  │  AI Search   │       │
    │  │              │  │   Account    │  │   Service    │       │
    │  │              │  │              │  │              │       │
    │  └──────────────┘  └──────────────┘  └──────────────┘       │
    │         │                 │                 │               │
    │         ▼                 ▼                 ▼               │
    │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
    │  │  AI Foundry  │  │   Private    │  │   Private    │       │
    │  │   Project    │  │   Endpoint   │  │   Endpoint   │       │
    │  │              │  │  (Storage)   │  │   (Search)   │       │
    │  └──────────────┘  └──────────────┘  └──────────────┘       │
    │         │                                                   │
    │         ▼                                                   │
    │  ┌──────────────┐  ┌──────────────────────────────────┐     │
    │  │    OpenAI    │  │            Monitoring            │     │
    │  │  Deployment  │  │  • Application Insights          │     │
    │  │   (GPT-4o)   │  │  • Log Analytics Workspace       │     │
    │  └──────────────┘  │  • Diagnostic Settings           │     │
    │                    └──────────────────────────────────┘     │
    └─────────────────────────────────────────────────────────────┘

Cross-subscription resource layout (Mermaid source):

    graph TB
        subgraph "Infrastructure Subscription"
            DNS[Private DNS Zones]
            VNET[Virtual Network]
        end
        subgraph "Workload Subscription"
            RG[Resource Group]
            AI[AI Foundry Hub]
            PROJ[AI Foundry Project]
            ST[Storage Account]
            SRCH[AI Search Service]
            MON[App Insights]
        end
        subgraph "Private Endpoints"
            PE1[Storage Endpoint]
            PE2[Search Endpoint]
            PE3[AI Foundry Endpoint]
        end
        AI --> PROJ
        AI --> ST
        AI --> SRCH
        ST --> PE1
        SRCH --> PE2
        AI --> PE3
        PE1 --> DNS
        PE2 --> DNS
        PE3 --> DNS
        PE1 --> VNET
        PE2 --> VNET
        PE3 --> VNET
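
A two-subscription layout like the one above is commonly wired up in Terraform with one provider configuration per subscription. The following is a minimal sketch under stated assumptions, not the repository's actual code: the `infra` alias and the `blob` data source name are illustrative, while the variable names are taken from the `terraform.tfvars` example in this README.

```hcl
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
  }
}

variable "subscription_id_resources" {
  type        = string
  description = "Workload subscription ID"
}

variable "subscription_id_infra" {
  type        = string
  description = "Infrastructure subscription ID (VNet and private DNS zones)"
}

variable "resource_group_name_dns" {
  type        = string
  description = "Resource group that holds the private DNS zones"
}

# Default provider: resources land in the workload subscription.
provider "azurerm" {
  features {}
  subscription_id = var.subscription_id_resources
}

# Aliased provider for the infrastructure subscription.
# The alias name "infra" is an assumption made for this sketch.
provider "azurerm" {
  alias = "infra"
  features {}
  subscription_id = var.subscription_id_infra
}

# Look up an existing private DNS zone through the aliased provider.
data "azurerm_private_dns_zone" "blob" {
  provider            = azurerm.infra
  name                = "privatelink.blob.core.windows.net"
  resource_group_name = var.resource_group_name_dns
}
```

Any resource or data source that lives in the infrastructure subscription would set `provider = azurerm.infra`; everything else uses the default provider implicitly.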
- Azure Subscriptions: Two subscriptions recommended (workload and infrastructure)
- Existing Network: VNet with subnets for private endpoints
- DNS Zones: Private DNS zones created and linked to VNet
- Permissions: Contributor access to both subscriptions
- Tools: Terraform >= 1.5.0, Azure CLI

Before deployment, ensure you have these private DNS zones:

- `privatelink.cognitiveservices.azure.com`
- `privatelink.openai.azure.com`
- `privatelink.services.ai.azure.com`
- `privatelink.blob.core.windows.net`
- `privatelink.search.windows.net`
- `privatelink.vaultcore.azure.net` (optional, for Key Vault)

1. Clone and navigate to the repository:

       git clone <repository-url>
       cd terraform-foundry-nocaphost

2. Initialize Terraform:

       terraform init

3. Set up environment variables:

       # Required for Azure Storage backend authentication
       export ARM_SUBSCRIPTION_ID="your-infrastructure-subscription-id"

   Note: The `ARM_SUBSCRIPTION_ID` should be set to the subscription containing the Terraform state storage account.

4. Create a `terraform.tfvars` file:

       # Copy the example template
       cp terraform.tfvars.example terraform.tfvars

       # Edit with your specific values
       nano terraform.tfvars  # or use your preferred editor

   Required configuration updates:

       # Core settings
       project_name = "your-ai-project"
       environment  = "dev"      # or staging/prod
       location     = "eastus2"  # your preferred region

       # Subscription IDs (REQUIRED - replace with your values)
       subscription_id_resources = "your-workload-subscription-id"
       subscription_id_infra     = "your-infrastructure-subscription-id"

       # Resource groups
       resource_group_name_resources = "rg-ai-workload"
       resource_group_name_dns       = "rg-dns-zones"

       # Networking (REQUIRED if using private endpoints)
       subnet_id_private_endpoint = "/subscriptions/.../subnets/your-endpoint-subnet"

       # Private DNS zones (REQUIRED for private endpoints)
       dns_zone_cognitiveservices = "/subscriptions/.../privatelink.cognitiveservices.azure.com"
       dns_zone_openai            = "/subscriptions/.../privatelink.openai.azure.com"
       dns_zone_ai_services       = "/subscriptions/.../privatelink.services.ai.azure.com"
       storage_blob_dns_zone_id   = "/subscriptions/.../privatelink.blob.core.windows.net"
       search_dns_zone_id         = "/subscriptions/.../privatelink.search.windows.net"

       # Admin access (REQUIRED - replace with your email/group IDs)
       platform_admin_users  = ["your-email@company.com"]
       platform_admin_groups = ["your-azure-ad-group-id"]

5. Plan and deploy:

       terraform plan
       terraform apply
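
The environment variable step above refers to an Azure Storage backend for Terraform state. For orientation, a backend block of that kind usually looks roughly like the sketch below. The resource group, storage account, container, and key values are hypothetical placeholders; the repository's real backend settings are not shown in this README.

```hcl
terraform {
  required_version = ">= 1.5.0"

  # Remote state in an Azure Storage account.
  # All four values below are hypothetical placeholders.
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"
    storage_account_name = "sttfstateexample"
    container_name       = "tfstate"
    key                  = "ai-foundry-nocaphost.tfstate"
  }
}
```

The `azurerm` backend reads `ARM_SUBSCRIPTION_ID` from the environment when no `subscription_id` is set in the block, which is why that variable must point at the subscription holding the state storage account.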

| Variable | Description | Default |
|---|---|---|
| `project_name` | Name of the project | Required |
| `environment` | Environment (dev/staging/prod) | Required |
| `location` | Azure region | Required |
| `resource_group_name_resources` | Resource group name | Required |
| `enable_private_endpoints` | Enable private endpoints | `true` |
| `subnet_id_private_endpoint` | Subnet for private endpoints | `null` |
| `enable_customer_managed_keys` | Use customer-managed encryption | `false` |
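
Because `subnet_id_private_endpoint` defaults to `null` while `enable_private_endpoints` defaults to `true`, a run that relies on defaults alone has no subnet to place endpoints in. One way to surface that early is a precondition on a built-in `terraform_data` resource. This is a sketch of the idea, not something the repository is known to contain; the resource name `private_endpoint_guard` is hypothetical.

```hcl
variable "enable_private_endpoints" {
  type        = bool
  description = "Enable private endpoints"
  default     = true
}

variable "subnet_id_private_endpoint" {
  type        = string
  description = "Subnet for private endpoints"
  default     = null
}

# Fails the plan when private endpoints are enabled but no subnet is given.
# terraform_data is built into Terraform 1.4 and later.
resource "terraform_data" "private_endpoint_guard" {
  lifecycle {
    precondition {
      condition     = !var.enable_private_endpoints || var.subnet_id_private_endpoint != null
      error_message = "Set subnet_id_private_endpoint when enable_private_endpoints is true."
    }
  }
}
```

A precondition is used here instead of a variable `validation` block because validation rules that reference other variables need a newer Terraform release than the 1.5.0 minimum listed in the prerequisites.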
The deployment automatically configures resources based on the value of the `environment` variable:
| Setting | dev | staging | prod |
|---|---|---|---|
| Storage SKU | Standard_LRS | Standard_ZRS | Premium_ZRS |
| Search SKU | basic | standard | standard |
| AI Foundry SKU | S0 | S0 | S0 |
| Backup | disabled | enabled | enabled |
| Monitoring | basic | standard | comprehensive |
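
A table like this maps naturally onto a `locals` lookup keyed by environment. The sketch below shows one possible encoding; the local and attribute names (`environment_settings`, `storage_sku`, and so on) are assumptions made for illustration, and only the values are taken from the table.

```hcl
variable "environment" {
  type        = string
  description = "Environment (dev/staging/prod)"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment value must be one of: dev, staging, prod."
  }
}

locals {
  # One entry per environment; values mirror the table above.
  environment_settings = {
    dev = {
      storage_sku      = "Standard_LRS"
      search_sku       = "basic"
      ai_foundry_sku   = "S0"
      backup_enabled   = false
      monitoring_level = "basic"
    }
    staging = {
      storage_sku      = "Standard_ZRS"
      search_sku       = "standard"
      ai_foundry_sku   = "S0"
      backup_enabled   = true
      monitoring_level = "standard"
    }
    prod = {
      storage_sku      = "Premium_ZRS"
      search_sku       = "standard"
      ai_foundry_sku   = "S0"
      backup_enabled   = true
      monitoring_level = "comprehensive"
    }
  }

  # Settings for the selected environment.
  settings = local.environment_settings[var.environment]
}

output "selected_settings" {
  value = local.settings
}
```

Resources would then read values such as `local.settings.search_sku` instead of branching on the environment name in several places.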
- Private Endpoints - Secure network connectivity
- Managed Identity - Azure AD authentication without secrets
- RBAC Permissions - Least-privilege access control
- Network Isolation - Traffic never leaves Azure backbone
- Optional Key Vault - Customer-managed encryption keys
- Cross-subscription - Separate workload and infrastructure
- Private DNS - Custom domain resolution
- Private Endpoints - All services accessible privately
- Network Security Groups - Layer 4 protection
- Application Insights - Application performance monitoring
- Log Analytics - Centralized logging and queries
- Diagnostic Settings - Resource-level telemetry
- Optional Alerts - Proactive monitoring notifications
After successful deployment, you'll receive:
- AI Foundry Hub - Hub details and endpoint URLs
- AI Foundry Project - Project ID and principal information
- OpenAI Deployment - GPT-4o model endpoint and details
- Storage Account - Blob storage endpoint and connection info
- AI Search Service - Search endpoint for RAG scenarios
- Network Configuration - Private endpoint IPs and DNS details
- RBAC Summary - Role assignments and permissions overview
This configuration is optimized for cost by:
- Simplified Architecture - No capability hosts or Cosmos DB (~30% cost reduction)
- Environment-based SKUs - Appropriate sizing for dev/staging/prod
- Optional Features - Key Vault and encryption only when needed
- Configurable Monitoring - Basic to comprehensive based on environment
- Resource Efficiency - Shared networking across subscriptions
This simplified deployment is ideal for:
- Development environments - Lower cost for testing and experimentation
- Proof of concepts - Quick AI platform setup without complex infrastructure
- Simple AI applications - Basic chat, search, and generation workloads
- Cost-conscious deployments - When capability hosts aren't required
- Learning and experimentation - Understanding AI Foundry fundamentals
Be aware of these architectural limitations:
- No compute infrastructure - Capability hosts not available for custom runtimes
- No document database - Cosmos DB not included for session/conversation storage
- Simplified networking - Basic private endpoint setup without complex topologies
- Limited scalability - No auto-scaling VM infrastructure for heavy workloads
| Feature | NoCapabilityHosts (Current) | WithCapabilityHosts |
|---|---|---|
| Deployment Time | 5-8 minutes | 8-12 minutes |
| Cosmos DB | Not included | Thread storage |
| Capability Hosts | Not available | Account & Project level |
| Agent Subnet | Not required | Network injection |
| Standard Agents | Limited support | Full support |
When you're ready to upgrade to the full configuration:

1. Navigate to the enhanced configuration:

       cd terraform-foundry-caphost/

2. Review the enhanced README:
   - `terraform-foundry-caphost/README.md`
   - Additional requirements: agent subnet, Cosmos DB configuration
   - Enhanced RBAC and networking setup

3. Plan your migration:
   - Use a new resource group to avoid conflicts
   - Account for the Cosmos DB and agent subnet requirements
   - Review the enhanced private endpoint configuration

To migrate to the full deployment with capability hosts:
- Use the `terraform-foundry-caphost/` configuration
- Add Cosmos DB variables to your `terraform.tfvars`
- Configure agent subnet network injection parameters
- Plan migration carefully to avoid resource naming conflicts

Cross-subscription permissions:

    # Ensure you have access to both subscriptions
    az account list --query "[].{Name:name, Id:id, State:state}"

Private endpoint deployment failures:
- Verify DNS zones exist and are linked to the VNet
- Check subnet has sufficient IP addresses
- Ensure proper permissions on target VNet
Resource naming conflicts:
- The configuration uses random suffixes to avoid conflicts
- Check existing resources if deployment fails
OpenAI deployment failures:
- Check regional availability for GPT-4o model
- Verify subscription quotas for OpenAI services
- Validate model versions and SKU availability
RBAC permission errors:
- Verify managed identity roles and assignments
- Check resource-level permissions
- Validate user/group object IDs
Role assignment conflicts:
- If you get "RoleAssignmentExists" errors, set `create_resource_group_reader_assignments = false` in your `terraform.tfvars`
- This happens when resource group Reader roles already exist from previous deployments
- The infrastructure will work correctly either way - this variable just prevents duplicate role creation
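
A flag like `create_resource_group_reader_assignments` is typically implemented by making the role assignments conditional. The sketch below illustrates that pattern under assumptions: the README does not show which principals receive the Reader role or what the variable's default is, so the use of `platform_admin_groups`, the `true` default, and the resource names are all hypothetical.

```hcl
variable "create_resource_group_reader_assignments" {
  type        = bool
  description = "Create Reader role assignments on the resource group"
  default     = true # assumed default; not stated in this README
}

variable "platform_admin_groups" {
  type        = list(string)
  description = "Object IDs of admin groups"
  default     = []
}

variable "resource_group_name_resources" {
  type        = string
  description = "Resource group name"
}

data "azurerm_resource_group" "workload" {
  name = var.resource_group_name_resources
}

# Created only when the flag is true; an empty set means zero instances.
resource "azurerm_role_assignment" "rg_reader" {
  for_each = toset(var.create_resource_group_reader_assignments ? var.platform_admin_groups : [])

  scope                = data.azurerm_resource_group.workload.id
  role_definition_name = "Reader"
  principal_id         = each.value
}
```

With the flag set to `false`, Terraform plans no role assignment resources of this kind, so existing assignments created outside this state are left untouched.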

Repository layout:

    terraform-foundry-nocaphost/          # You are here
    ├── README.md                         # This file - NoCapabilityHosts guide
    ├── main.tf                           # Simplified AI Foundry configuration
    ├── terraform.tfvars.example          # Configuration template
    ├── terraform-foundry-caphost/        # Enhanced configuration directory
    │   ├── README.md                     # WithCapabilityHosts guide
    │   ├── main.tf                       # Full AI Foundry with capability hosts
    │   └── terraform.tfvars.example      # Enhanced configuration template
    └── modules/                          # Shared Terraform modules
        ├── networking/                   # Private endpoint configurations
        ├── security/                     # RBAC and Key Vault modules
        ├── monitoring/                   # Application Insights setup
        └── rbac/                         # Role assignment modules

- Current Configuration: NoCapabilityHosts (you're reading this)
- Enhanced Configuration: `terraform-foundry-caphost/README.md`
- Shared Modules: `modules/` directory
- CI/CD Pipeline: `.github/workflows/`
- Azure AI Foundry Documentation
- Azure OpenAI Service
- Azure AI Search
- Azure Private Endpoints
- Terraform Azure Provider
See CONTRIBUTING.md for development setup and contribution guidelines.
This project is licensed under the MIT License - see LICENSE for details.