Use Azure API Management with Azure OpenAI skills and vectorizers in Azure AI Search

You can place Azure API Management in front of Azure OpenAI in Microsoft Foundry Models and Microsoft Foundry deployments to centralize routing, load balancing, throttling, and observability for Azure AI Search workloads that use Foundry models. This article describes the supported scenarios, the recommended authentication and role-based access control (RBAC) pattern, and how to optionally make the connection from Azure AI Search private.

Supported scenarios

You can call a Microsoft Foundry model deployment through API Management from these Azure AI Search features:

Azure OpenAI embedding skill
Azure OpenAI vectorizer
GenAI prompt skill

For each, set the skill or vectorizer endpoint to the API Management gateway URL, either https://<resource-name>.azure-api.net or an API Management custom ___domain.

Unsupported scenarios

API Management isn't supported as a gateway for:

The large language model (LLM) used by the Azure Content Understanding skill.
The LLM used by a knowledge base in agentic retrieval.

Architecture

There are two common flows. Both use the same API Management configuration on the backend.

Public path

Azure AI Search calls the API Management gateway over its public endpoint. API Management authenticates to the Microsoft Foundry resource and forwards the request.

Azure AI Search (skill or vectorizer)
   └─▶ API Management gateway (azure-api.net or custom ___domain)
          [API Management policies authenticate the caller]
          [API Management's managed identity authenticates to Foundry]
          └─▶ Azure OpenAI in Foundry Models / Microsoft Foundry resource

Private path from Azure AI Search to API Management

When the search service must reach API Management privately, create a shared private link from the search service to the API Management instance, and run indexers in the private execution environment.

Azure AI Search (private indexer execution)
   └─▶ Shared private link (Microsoft.ApiManagement/service, group "Gateway")
          └─▶ API Management private endpoint
                 └─▶ Microsoft Foundry resource (public or private endpoint)

The connection from API Management to Foundry follows the standard API Management backend pattern. For deeper topology guidance, including multi-region, multi-instance, and active-active or active-passive variants, see Use a gateway in front of multiple Azure OpenAI deployments or instances and Overview of generative AI gateway capabilities in Azure API Management.

Configure the endpoint in your skill or vectorizer

Set the skill or vectorizer endpoint to the API Management gateway URL:

Feature	Property	Value
Azure OpenAI embedding skill	`resourceUri`	`https://<apim-name>.azure-api.net` or API Management custom ___domain
Azure OpenAI vectorizer	`resourceUri`	`https://<apim-name>.azure-api.net` or API Management custom ___domain
GenAI prompt skill	`uri`	`https://<apim-name>.azure-api.net` or API Management custom ___domain
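
As a rough illustration of the table above, the following Python sketch builds partial definitions for the three features with the endpoint property pointed at the gateway. The function name `endpoint_fragments` and the `<embedding-deployment>` placeholder are hypothetical, and only endpoint-related properties are shown. A complete skill or vectorizer definition needs additional properties (names, inputs, outputs) that are omitted here.

```python
import json


def endpoint_fragments(gateway_url: str) -> dict:
    """Return partial definitions that point each feature at the gateway.

    Only the endpoint-related properties are included; this is not a
    complete skill or vectorizer definition.
    """
    return {
        # Azure OpenAI embedding skill: gateway URL goes in `resourceUri`.
        "embedding_skill": {
            "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
            "resourceUri": gateway_url,
            "deploymentId": "<embedding-deployment>",  # placeholder
        },
        # Azure OpenAI vectorizer: gateway URL goes in `resourceUri`.
        "vectorizer": {
            "kind": "azureOpenAI",
            "azureOpenAIParameters": {
                "resourceUri": gateway_url,
                "deploymentId": "<embedding-deployment>",  # placeholder
            },
        },
        # GenAI prompt skill: gateway URL goes in `uri`.
        "genai_prompt_skill": {
            "@odata.type": "#Microsoft.Skills.Custom.ChatCompletionSkill",
            "uri": gateway_url,
        },
    }


if __name__ == "__main__":
    fragments = endpoint_fragments("https://<apim-name>.azure-api.net")
    print(json.dumps(fragments, indent=2))
```

Keeping the gateway URL in one variable makes it easier to switch between the default azure-api.net host name and a custom ___domain without touching each definition.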

Authentication and RBAC

Two identities are involved. They have different responsibilities and different role assignments.

Identity	Where credentials are presented	Role or mechanism
Azure AI Search managed identity	At the API Management gateway	Authorized by API Management policies. No role on the Microsoft Foundry resource is required when you use the credential-termination pattern below.
API Management managed identity	At the Microsoft Foundry resource	Cognitive Services OpenAI User on the Microsoft Foundry resource.

The recommended pattern is credential termination and re-establishment: the caller authenticates to API Management, and API Management uses its own managed identity to authenticate to the Microsoft Foundry resource. For more information, see Azure OpenAI authorization in the AOAI gateway architecture guide.

Authentication options at the API Management gateway

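Choose one of the first two options for how Azure AI Search authenticates to API Management. The third item covers how API Management authenticates to the backend and is required for the recommended pattern. For more information and policy examples, see Authenticate and authorize access to LLM APIs by using API Management.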

Subscription key: The search service sends only the API Management subscription key, so no Microsoft Entra ID role is required on the search service managed identity. API Management stores the Microsoft Foundry resource API key in a named value, and a set-header policy passes it to the backend in the api-key header. This is the simplest option to configure.
Managed identity at the gateway with OAuth validation: API Management uses the validate-azure-ad-token policy to validate a Microsoft Entra ID token presented by the search service managed identity. Use this option for defense in depth.
(Required for the recommended pattern) API Management managed identity to the backend: Enable a system-assigned or user-assigned managed identity on the API Management instance, and then assign it the Cognitive Services OpenAI User role on the Microsoft Foundry resource. See Authenticate with managed identity and Role-based access control for Azure OpenAI.
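
The credential termination and re-establishment pattern described above could look roughly like the following. This is a sketch, not a reference policy. It uses Python's standard library to assemble an API Management policy document that validates the caller's Microsoft Entra ID token with `validate-azure-ad-token`, then uses `authentication-managed-identity` to get a token for the backend. The function name, the tenant ID, and the client ID are hypothetical placeholders. Audience checks and other hardening are omitted, and the policy you need depends on which gateway option you choose.

```python
import xml.etree.ElementTree as ET


def build_policy(tenant_id: str, search_identity_client_id: str) -> str:
    """Assemble an illustrative API Management policy document as XML text."""
    policies = ET.Element("policies")

    inbound = ET.SubElement(policies, "inbound")
    ET.SubElement(inbound, "base")

    # Step 1: validate the token presented by the search service identity.
    validate = ET.SubElement(
        inbound, "validate-azure-ad-token", {"tenant-id": tenant_id}
    )
    app_ids = ET.SubElement(validate, "client-application-ids")
    ET.SubElement(app_ids, "application-id").text = search_identity_client_id

    # Step 2: API Management's own managed identity authenticates to the
    # backend. The identity needs the Cognitive Services OpenAI User role.
    ET.SubElement(
        inbound,
        "authentication-managed-identity",
        {"resource": "https://cognitiveservices.azure.com"},
    )

    # Remaining sections only inherit from the parent scope.
    for section in ("backend", "outbound", "on-error"):
        ET.SubElement(ET.SubElement(policies, section), "base")

    return ET.tostring(policies, encoding="unicode")


if __name__ == "__main__":
    # Placeholder GUIDs for illustration only.
    print(
        build_policy(
            tenant_id="00000000-0000-0000-0000-000000000000",
            search_identity_client_id="11111111-1111-1111-1111-111111111111",
        )
    )
```

Generating the XML programmatically is only a convenience for checking that the document is well formed. You can author the same policy directly in the API Management policy editor.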

Tip

When you import a Microsoft Foundry API into API Management, the backend and managed identity wiring is created automatically.
For circuit-breaker, retry, and backend pool guidance, see Backends in API Management.

Private connectivity from Azure AI Search to API Management

Skip this section if Azure AI Search calls API Management over its public endpoint.

To restrict outbound traffic from the search service to a private channel:

Create a shared private link from the search service to the API Management instance. Use the Microsoft.ApiManagement/service resource type and the Gateway group ID. For steps, see Make outbound connections through a shared private link.
Approve the private endpoint connection on the API Management instance.
Configure indexers that use the skill or vectorizer to run in the private execution environment.

The shared private link counts against the shared private link limit for your search service tier.
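
The steps above can be sketched as two payloads: the request body for the shared private link resource and the indexer parameters that select the private execution environment. This is a minimal sketch. The helper names and the request message are hypothetical, and the subscription, resource group, and instance names are placeholders. The payloads are intended for the Azure AI Search management and indexer REST APIs, and the API version that accepts the API Management resource type isn't stated in this article, so check it before you send the request.

```python
import json


def apim_resource_id(subscription_id: str, resource_group: str, apim_name: str) -> str:
    """Build the Azure resource ID of an API Management instance."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ApiManagement/service/{apim_name}"
    )


def shared_private_link_body(target_resource_id: str) -> dict:
    """Request body for a shared private link that targets API Management."""
    return {
        "properties": {
            "privateLinkResourceId": target_resource_id,
            "groupId": "Gateway",  # group ID named in this article
            "requestMessage": "Azure AI Search to API Management",  # illustrative
        }
    }


def private_indexer_parameters() -> dict:
    """Indexer `parameters` fragment that selects private execution."""
    return {"configuration": {"executionEnvironment": "private"}}


if __name__ == "__main__":
    rid = apim_resource_id("<subscription-id>", "<resource-group>", "<apim-name>")
    print(json.dumps(shared_private_link_body(rid), indent=2))
    print(json.dumps({"parameters": private_indexer_parameters()}, indent=2))
```

After the shared private link is created, the connection stays pending until it's approved on the API Management instance, as described in the second step.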

Private connectivity from API Management to the Microsoft Foundry resource

The private channel from API Management to the Microsoft Foundry resource is independent of the search-to-API Management connection and is configured on the API Management side. For options, see:

Last updated on 2026-06-02