Components and Sizing Recommendations
| Component | Options | Sizing Recommendations |
|---|---|---|
| AI Gateway | Deploy in your EKS cluster using Helm charts. | Use Amazon EKS t4g.medium worker nodes, each providing at least 2 vCPUs and 4 GiB of memory. For high availability, deploy them across multiple Availability Zones. |
| Logs Store (optional) | Amazon S3 or S3-compatible storage | Each log document is ~10 KB in size (uncompressed). |
| Cache (Prompts, Configs & Providers) | Built-in Redis, Amazon ElastiCache for Redis OSS, or Valkey | Deployed within the same VPC as the Portkey Gateway. |
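To make the Logs Store sizing concrete, here is a back-of-the-envelope calculation; the request volume is a hypothetical example, not a recommendation:

```sh
requests_per_day=1000000 # hypothetical daily request volume
log_kb=10                # approximate uncompressed size of one log document, in KB

# Daily log-store growth in GiB (integer division, rounds down)
echo "$(( requests_per_day * log_kb / 1024 / 1024 )) GiB/day"
```

At this volume the bucket grows by roughly 9 to 10 GiB per day before compression, which is worth factoring into S3 lifecycle rules.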
Prerequisites
Ensure that the following tools and resources are installed and available:
Create a Portkey Account
- Go to the Portkey website.
- Sign up for a Portkey account.
- Once logged in, locate and save your Organisation ID for future reference. You can find it in the browser URL: https://app.portkey.ai/organisation/<organisation_id>/
- Contact the Portkey AI team and provide your Organisation ID and the email address used during signup.
- The Portkey team will share the following information with you:
- Docker credentials for the Gateway images (username and password).
- License: Client Auth Key.
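If you prefer to capture the Organisation ID from the address bar programmatically, the following is a small sketch; the URL below is a made-up example:

```sh
url="https://app.portkey.ai/organisation/org-abc123/" # hypothetical URL copied from the browser
org_id=$(echo "$url" | sed -n 's~.*/organisation/\([^/]*\)/.*~\1~p')
echo "$org_id" # the value to share with the Portkey team
```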
Setup Project Environment
```sh
cluster_name=<EKS_CLUSTER_NAME> # Specify the name of the EKS cluster where the gateway will be deployed.
namespace=<NAMESPACE> # Specify the namespace where the gateway should be deployed (for example, portkeyai).
service_account_name=<SERVICE_ACCOUNT_NAME> # Provide a name for the Service Account to be associated with the Gateway pod (for example, gateway-sa).

mkdir portkey-gateway
cd portkey-gateway
touch values.yaml
```
Image Credentials Configuration
```yaml
# Update the values.yaml file
imageCredentials:
  - name: portkey-enterprise-registry-credentials
    create: true
    registry: https://index.docker.io/v1/
    username: <PROVIDED BY PORTKEY>
    password: <PROVIDED BY PORTKEY>
images:
  gatewayImage:
    repository: "docker.io/portkeyai/gateway_enterprise"
    pullPolicy: Always
    tag: "latest"
  dataserviceImage:
    repository: "docker.io/portkeyai/data-service"
    pullPolicy: Always
    tag: "latest"
  redisImage:
    repository: "docker.io/redis"
    pullPolicy: IfNotPresent
    tag: "7.2-alpine"
environment:
  create: true
  secret: true
  data:
    ANALYTICS_STORE: control_plane
    SERVICE_NAME: <SERVICE_NAME> # Specify a name for the service
    PORTKEY_CLIENT_AUTH: <PROVIDED BY PORTKEY>
    ORGANISATIONS_TO_SYNC: <ORGANISATION_ID> # Obtained after signing up for a Portkey account.
```
Based on your choice of components and their configuration, update values.yaml as described in the sections below.
MCP Gateway (Optional)
By default, only the AI Gateway is enabled in the deployment. To enable the MCP Gateway, add the following configuration to values.yaml:
```yaml
environment:
  data:
    SERVER_MODE: "all" # Set to "mcp" or "all"; see Server Modes below.
    MCP_PORT: "8788"
    MCP_GATEWAY_BASE_URL: "<MCP LoadBalancer URL or domain pointing to the MCP Service>"
```
Note:
- MCP_GATEWAY_BASE_URL must include the protocol prefix, either http:// or https://.
- This value is not required for the initial deployment. After the first deployment, once the MCP Load Balancer is provisioned and a hostname is mapped to the MCP Service, set this value and redeploy.
Server Modes
- "" (empty or not provided): Deploys only the AI Gateway. This is the default configuration.
- "mcp": Deploys only the MCP Gateway.
- "all": Deploys both the AI Gateway and the MCP Gateway.
Cache Store
The Portkey Gateway deployment includes a Redis instance pre-installed by default. You can either use this built-in Redis or connect to an external cache like Amazon ElastiCache for Redis OSS or Valkey.
Built-in Redis
No additional permissions or network configurations are required.
To use the built-in Redis, add the following configuration to the values.yaml file:

```yaml
environment:
  data:
    CACHE_STORE: redis
    REDIS_URL: "redis://redis:6379"
    REDIS_TLS_ENABLED: "false"
```
Amazon ElastiCache
To enable the gateway to work with an ElastiCache cache, ensure that an inbound rule is configured in ElastiCache’s Security Group allowing access from the EKS cluster on the required port.
No Auth

```yaml
environment:
  data:
    CACHE_STORE: aws-elastic-cache
    REDIS_URL: "redis://<ElastiCache_Endpoint>:<Port>"
    REDIS_TLS_ENABLED: "false"
    REDIS_MODE: cluster # Add this parameter only if cluster mode is enabled
```

Auth Token

```yaml
environment:
  data:
    CACHE_STORE: aws-elastic-cache
    REDIS_URL: "redis://<ElastiCache_Endpoint>:<Port>"
    REDIS_TLS_ENABLED: "true"
    REDIS_MODE: cluster # Add this parameter only if cluster mode is enabled
    REDIS_PASSWORD: <Auth_Token> # Provide the auth token configured on ElastiCache
```

IRSA

```yaml
serviceAccount:
  create: true
  automount: true
  name: <SERVICE_ACCOUNT_NAME>
  annotations:
    eks.amazonaws.com/role-arn: <ROLE_ARN>
environment:
  data:
    CACHE_STORE: aws-elastic-cache
    REDIS_URL: "redis://<ElastiCache_Endpoint>:<Port>"
    REDIS_TLS_ENABLED: "true"
    REDIS_MODE: cluster # Add this parameter only if cluster mode is enabled
    AWS_REDIS_AUTH_MODE: iam
    AWS_REDIS_CLUSTER_NAME: <ELASTICACHE_CLUSTER_NAME> # Name of the ElastiCache replication group or serverless cache
    REDIS_USERNAME: <ELASTICACHE_USER_ID> # ElastiCache user ID configured for IAM authentication
```

Note: The IAM role must have the elasticache:Connect permission. See ElastiCache IAM authentication for details.

EKS Pod Identity

```yaml
serviceAccount:
  create: true
  automount: true
  name: <SERVICE_ACCOUNT_NAME>
environment:
  data:
    CACHE_STORE: aws-elastic-cache
    REDIS_URL: "redis://<ElastiCache_Endpoint>:<Port>"
    REDIS_TLS_ENABLED: "true"
    REDIS_MODE: cluster # Add this parameter only if cluster mode is enabled
    AWS_REDIS_AUTH_MODE: iam
    AWS_REDIS_CLUSTER_NAME: <ELASTICACHE_CLUSTER_NAME> # Name of the ElastiCache replication group or serverless cache
    REDIS_USERNAME: <ELASTICACHE_USER_ID> # ElastiCache user ID configured for IAM authentication
```

Note: The IAM role must have the elasticache:Connect permission, and a Pod Identity association must be created. See ElastiCache IAM authentication and EKS Pod Identity for details.
Note: If cluster mode is enabled in ElastiCache, use the Configuration Endpoint; otherwise, use the Primary Endpoint. For more information on ElastiCache endpoints, refer to the AWS documentation.
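As a quick sketch, REDIS_URL is assembled from the endpoint and port; the endpoint below is a placeholder (the real one can be read from the ElastiCache console or via `aws elasticache describe-replication-groups`):

```sh
endpoint="my-cache.abc123.use1.cache.amazonaws.com" # placeholder: Primary or Configuration endpoint
port="6379"                                         # default Redis port

echo "redis://${endpoint}:${port}" # value for REDIS_URL
```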
Log Store
Amazon S3
- Create an Amazon S3 bucket for storing LLM access logs.
- Set up access to the log store. The Gateway supports the following methods for connecting to the S3 bucket for log storage:
  - IAM Roles for Service Accounts (IRSA)
  - EKS Pod Identity

Depending on the chosen S3 access method, update values.yaml with the following configuration.
To enable IRSA, update values.yaml with the following details:

```yaml
serviceAccount:
  create: true
  automount: true
  name: <SERVICE_ACCOUNT_NAME> # Must match the name provided while creating the IAM role in the last step.
  annotations:
    eks.amazonaws.com/role-arn: <ROLE_ARN> # IAM role ARN obtained in the previous step.
environment:
  data:
    LOG_STORE: s3_assume
    LOG_STORE_REGION: "<AWS_BUCKET_REGION>" # AWS region where the S3 log bucket resides (e.g., us-east-1).
    LOG_STORE_GENERATIONS_BUCKET: "<AWS_BUCKET_NAME>" # Name of the S3 log bucket.
```
To enable EKS Pod Identity, update values.yaml with the following details:

```yaml
serviceAccount:
  create: true
  automount: true
  name: <SERVICE_ACCOUNT_NAME> # Must match the name provided while creating the IAM role in the last step.
environment:
  data:
    LOG_STORE: s3_assume
    LOG_STORE_REGION: "<AWS_BUCKET_REGION>" # AWS region where the S3 log bucket resides (e.g., us-east-1).
    LOG_STORE_GENERATIONS_BUCKET: "<AWS_BUCKET_NAME>" # Name of the S3 log bucket.
```
- (Optional) Configure the log path format using LOG_STORE_FILE_PATH_FORMAT. See Log Object Path Format for details.
Data Service (Optional)
The Data Service is a component of the Portkey deployment responsible for batch processing, fine-tuning, and log exports.
To enable Data Service, add the following configuration to the values.yaml file.
```yaml
dataservice:
  name: "dataservice"
  enabled: true
  env:
    DEBUG_ENABLED: false
    SERVICE_NAME: "portkeyenterprise-dataservice"
  serviceAccount:
    create: false
    name: <SERVICE_ACCOUNT_NAME> # Must match the name provided while creating the IAM role in the last step.
```
Network Configuration
Set Up External Access
To make the Gateway service accessible externally, you can set up either of the following:
- AWS Application Load Balancer with a Kubernetes Ingress
- AWS Network Load Balancer with a Kubernetes Service
Prerequisites
- VPC and subnet tagging requirements
- An installed and running AWS Load Balancer Controller. For installation details, refer to the AWS documentation.
AWS Application Load Balancer
To create an Application Load Balancer Ingress, update the values.yaml file with the following configuration:

```yaml
service:
  type: ClusterIP
  port: 8787
ingress:
  enabled: true
  # hostname: "<AI Gateway Hostname>"
  # hostBased: false
  # mcpHostname: "<MCP Gateway Hostname>"
  ingressClassName: "alb"
  annotations:
    alb.ingress.kubernetes.io/load-balancer-name: portkey-gateway
    alb.ingress.kubernetes.io/scheme: internet-facing # Set to 'internal' for an internal ALB; 'internet-facing' creates an ALB accessible from the internet.
    alb.ingress.kubernetes.io/target-type: ip
    # alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
    alb.ingress.kubernetes.io/healthcheck-path: /v1/health
    alb.ingress.kubernetes.io/inbound-cidrs: 0.0.0.0/0 # Allowed inbound CIDR ranges
    alb.ingress.kubernetes.io/manage-backend-security-group-rules: "true"
```
Note: If SERVER_MODE is set to all (i.e., both AI Gateway and MCP Gateway are enabled), you must enable host-based routing by setting hostBased to true and provide the hostname on which the AI Gateway and MCP Gateway will be accessible.
The Load Balancer Controller provides additional annotations (such as TLS and custom health checks) for managing the ALB. For a comprehensive list of available annotations, refer to the AWS Load Balancer Controller documentation.
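Once the Ingress has been reconciled, the ALB's DNS name can be read from its status. This sketch assumes the chart's Ingress is the only one in the namespace; adjust the selection otherwise:

```sh
# List Ingresses created in the namespace
kubectl get ingress -n $namespace

# Read the provisioned ALB DNS name from the Ingress status
kubectl get ingress -n $namespace \
  -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'
```

This hostname is also the value you would map a domain to, for example when setting MCP_GATEWAY_BASE_URL after the first deployment.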
AWS Network Load Balancer
To create a Network Load Balancer, update values.yaml with the following configuration:

```yaml
service:
  type: LoadBalancer
  port: 80 # NLB listener port
  containerPort: 8787
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-internal: "true" # Set to 'true' for an internal NLB; 'false' for an internet-facing NLB.
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/v1/health"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8787"
    service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules: "true"
```
Note: service.containerPort must be the same as environment.data.PORT.
The Load Balancer Controller provides additional annotations (such as TLS and custom health checks) for managing the NLB. For a comprehensive list of available annotations, refer to the AWS Load Balancer Controller documentation.
Deploying Portkey Gateway
```sh
# Add the Portkey AI Gateway helm repository
helm repo add portkey-ai https://portkey-ai.github.io/helm
helm repo update

# Install the chart
helm upgrade --install portkey-ai portkey-ai/gateway -f ./values.yaml -n $namespace --create-namespace
```
Verify the deployment
To confirm that the deployment was successful, follow these steps:
- Verify that all pods are running correctly.

```sh
kubectl get pods -n $namespace
# You should see all pods with a STATUS of Running.
```
Note: If pods are in a Pending, CrashLoopBackOff, or other error state, inspect the pod logs and events to diagnose potential issues.
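For example, a pod's events and logs can be inspected with the following commands; replace <POD_NAME> with the failing pod's actual name:

```sh
# Show pod details and recent events
kubectl describe pod <POD_NAME> -n $namespace

# Inspect container logs; --previous shows the last crashed container's logs
kubectl logs <POD_NAME> -n $namespace
kubectl logs <POD_NAME> -n $namespace --previous
```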
- Test the Gateway by sending a cURL request.
  - Port-forward the Gateway pod:

```sh
kubectl port-forward <POD_NAME> -n $namespace 9000:8787 # Replace <POD_NAME> with your Gateway pod's actual name.
```

  - Once port forwarding is active, open a new terminal window or tab and send a test request by running:

```sh
# Specify LLM provider and Portkey API keys
OPENAI_API_KEY=<OPENAI_API_KEY> # Replace <OPENAI_API_KEY> with an actual API key.
PORTKEY_API_KEY=<PORTKEY_API_KEY> # Replace <PORTKEY_API_KEY> with a Portkey API key, which can be created at https://app.portkey.ai/api-keys.

# Configure and send the curl request
curl 'http://localhost:9000/v1/chat/completions' \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user","content": "What is a fractal?"}]
  }'
```
- Test the Gateway service integration with the Load Balancer.

```sh
# Replace <LB_IP> and <LISTENER_PORT> with the Load Balancer's IP/DNS name and listener port respectively.
curl 'http://<LB_IP>:<LISTENER_PORT>/v1/chat/completions' \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user","content": "What is a fractal?"}]
  }'
```
Integrating Gateway with Control Plane
Outbound Connectivity (Data Plane to Control Plane)
Portkey supports the following methods for integrating the Data Plane with the Control Plane for outbound connectivity:
- AWS PrivateLink
- Over the Internet
Ensure Outbound Network Access
By default, Kubernetes allows full outbound access, but if your cluster has NetworkPolicies that restrict egress, configure them to allow outbound traffic.
Example NetworkPolicy for outbound access:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-egress
  namespace: portkeyai
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
```
This allows the gateway to access LLMs hosted both within your VPC and externally. This also enables connection for the sync service to the Portkey Control Plane.
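A quick way to verify egress from inside the cluster is a one-off curl pod. The image name here is a common public curl image and an assumption; use whatever image your registry policy allows:

```sh
# Runs curl once inside the namespace and prints the response headers
kubectl run egress-test -n portkeyai --rm -it --restart=Never \
  --image=curlimages/curl -- -sI https://api.portkey.ai
```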
AWS PrivateLink
Establishes a secure, private connection between the Control Plane and Data Plane within the AWS network.
Steps to establish AWS PrivateLink connectivity:
- Contact Portkey and provide your AWS account ARN so it can be whitelisted in Portkey’s Control Plane.
- Once you get confirmation from Portkey that your AWS account is whitelisted, go to the VPC Console.
- Select the AWS Region where the Portkey Gateway is deployed.
- Navigate to the Endpoints section in the VPC console.
- Click on Create endpoint and enter the required details.
- Select the PrivateLink Ready partner services category and, under Service settings, provide the following details:
  - For Service name, enter com.amazonaws.vpce.us-east-1.vpce-svc-0c2c1c323d9f56d95
  - (Optional) If the Gateway is deployed in a region other than us-east-1, select Enable Cross Region endpoint, choose the us-east-1 region, and click the Verify service button.
- Under Network settings
- Select the VPC and subnets (at least two in different AZs for high availability) where the endpoint should be created. Ideally, this should be the same VPC where the Gateway is deployed.
- Select the security group to associate with the endpoint. The security group must allow inbound connections on port 443 from the Gateway.
- After all details are filled in, click on Create endpoint.
- Wait for the Status to change to Available.
- Once the status changes to Available, click Actions > Modify private DNS name > select Enable for this endpoint.
- Update the values.yaml file with the following config:
```yaml
environment:
  create: true
  secret: true
  data:
    ALBUS_BASEPATH: "https://aws-cp.portkey.ai/albus"
    CONTROL_PLANE_BASEPATH: "https://aws-cp.portkey.ai/api/v1"
    SOURCE_SYNC_API_BASEPATH: "https://aws-cp.portkey.ai/api/v1/sync"
    CONFIG_READER_PATH: "https://aws-cp.portkey.ai/api/model-configs"
```
- Re-deploy the gateway.

```sh
helm upgrade --install portkey-ai portkey-ai/gateway -f ./values.yaml -n portkeyai --create-namespace
```
Over the Internet
Ensure the Gateway has access to the following endpoints over the internet:
- https://api.portkey.ai
- https://albus.portkey.ai
Inbound Connectivity (Control Plane to Data Plane)
- AWS PrivateLink
- IP Whitelisting
AWS PrivateLink
Establishes a secure, private connection between the Control Plane and Data Plane within the AWS network.
Steps to establish AWS PrivateLink connectivity:
To use AWS PrivateLink, you must create an AWS Network Load Balancer (NLB), either internal or internet-facing, to expose the Gateway outside the EKS cluster. For detailed instructions on creating and integrating an NLB, refer to the Network Configuration section above.
Create Endpoint Service
- Navigate to the AWS VPC Console.
- In the top-right corner of the AWS Console, select the region where the Portkey Gateway is deployed.
- Navigate to Endpoint services, click Create endpoint service, and provide the following details:
  - A name for the endpoint service.
  - The Network Load Balancer to associate with the endpoint.
  - The region in which the endpoint service will be available.
  - Whether acceptance is required for requested connections.
  - Whether to enable a private DNS name; if enabled, provide the Private DNS Name.
  - IPv4 under Supported IP address types.
- Click Create.
(Optional) Verify ownership of the Private DNS name
This step is required only if you are using a Private DNS Name.
Open the created Endpoint Service > click Actions > select Verify domain ownership for private DNS name > create the recommended record in your DNS server > click Verify.
Authorize Portkey’s Control Plane to initiate connection requests
- Open the Endpoint Service > click Actions > select Allow principals, and enter the Control Plane’s ARN (arn:aws:iam::299329113195:root).
- Reach out to the Portkey team and share the following details:
  - Service name
  - DNS names
  - Private DNS name
  - The region selected while creating the Endpoint Service
  - The port number on which the Load Balancer is listening for connections
- Wait for the Portkey team to initiate a connection request from the control plane’s AWS account to your Gateway AWS account. Navigate to the Endpoint connections section and once the request appears, approve it.
IP Whitelisting
Allows the Control Plane to access the Data Plane over the internet by restricting inbound traffic to the Control Plane's specific IP addresses. This method requires the Data Plane to have a publicly accessible endpoint.
To whitelist, add an inbound rule to the Load Balancer’s security group allowing connections from the Portkey Control Plane’s IPs (54.81.226.149, 34.200.113.35, 44.221.117.129) on the NLB listener port.
To integrate the Control Plane with the Data Plane, contact the Portkey team and provide the Public Endpoint of the Data Plane.
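The three inbound rules can be scripted. The sketch below only prints the aws CLI commands for review rather than executing them; the security group ID and port are placeholders:

```sh
sg_id="sg-0123456789abcdef0" # placeholder: the Load Balancer's security group ID
port=80                      # placeholder: the NLB listener port

for ip in 54.81.226.149 34.200.113.35 44.221.117.129; do
  echo aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" --protocol tcp --port "$port" --cidr "${ip}/32"
done
```

Remove the echo to apply the rules once the printed commands look right.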
Verifying Gateway Integration with the Control Plane
- Send a test request to the Gateway using curl.
- Go to Portkey website -> Logs.
- Verify that the test request appears in the logs and that you can view its full details by selecting the log entry.
Uninstalling Portkey Gateway
```sh
helm uninstall portkey-ai --namespace $namespace
```
Setting up IAM Permission
Follow the steps below to configure permissions based on your chosen access method.
Create IAM Role
IRSA
- Specify the details:
```sh
role_name=<IAM_ROLE_NAME> # Provide a name for the role to be associated with the Service Account

# Retrieve AWS Account ID
aws_account_id=$(aws sts get-caller-identity --query Account --output text)

# Retrieve the EKS cluster's OIDC issuer
oidc_issuer=$(aws eks describe-cluster --name $cluster_name --query "cluster.identity.oidc.issuer" --output text | sed -e "s~https://~~")

# Check if an IAM OIDC provider is already created for the EKS cluster in your account
aws iam list-open-id-connect-providers | grep $oidc_issuer

# If no output is returned, create an IAM OIDC provider for your EKS cluster
eksctl utils associate-iam-oidc-provider --cluster $cluster_name --approve
```
- Create a trust policy and IAM role.
```sh
cat >trust-relationship.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${aws_account_id}:oidc-provider/${oidc_issuer}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${oidc_issuer}:aud": "sts.amazonaws.com",
          "${oidc_issuer}:sub": "system:serviceaccount:${namespace}:${service_account_name}"
        }
      }
    }
  ]
}
EOF

aws iam create-role --role-name $role_name --assume-role-policy-document file://trust-relationship.json

# Fetch the ARN of the IAM role
role_arn=$(aws iam get-role --role-name $role_name --query "Role.Arn" --output text)
echo "$role_arn"
```
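The StringEquals condition in the trust policy matches the pod's service-account identity, which has the shape system:serviceaccount:<namespace>:<service-account>. A quick sanity check of that value with example inputs:

```sh
namespace="portkeyai"             # example namespace
service_account_name="gateway-sa" # example service account name

echo "system:serviceaccount:${namespace}:${service_account_name}"
```

If this string does not match the service account the Gateway pod actually runs under, AssumeRoleWithWebIdentity will be denied.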
Note: Record the IAM role ARN for future reference, as it will be required when configuring the Gateway’s service account in values.yaml.

EKS Pod Identity

Prerequisites: If the EKS Pod Identity Agent is not already installed on your cluster, install it before proceeding. For detailed instructions, refer to the AWS documentation.
- Specify the details:
```sh
role_name=<IAM_ROLE_NAME> # Provide a name for the role to be associated with the Service Account
```
- Create a trust policy and IAM role.
```sh
cat >trust-relationship.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowEksAuthToAssumeRoleForPodIdentity",
      "Effect": "Allow",
      "Principal": {
        "Service": "pods.eks.amazonaws.com"
      },
      "Action": [
        "sts:AssumeRole",
        "sts:TagSession"
      ]
    }
  ]
}
EOF

aws iam create-role --role-name $role_name --assume-role-policy-document file://trust-relationship.json
```
- Create a Pod Identity association.
```sh
aws_account_id=$(aws sts get-caller-identity --query Account --output text)

aws eks create-pod-identity-association \
  --cluster-name $cluster_name \
  --role-arn "arn:aws:iam::${aws_account_id}:role/${role_name}" \
  --namespace $namespace \
  --service-account $service_account_name
```
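To confirm the association was created, it can be listed back with the same AWS CLI credentials used above:

```sh
aws eks list-pod-identity-associations \
  --cluster-name $cluster_name \
  --namespace $namespace
```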
Attach Permissions to IAM Role
Once the IAM role is created using either method above, attach the required policies based on the AWS services your gateway needs to access.
Amazon S3
To allow the Portkey Gateway to access Amazon S3 for log storage, attach the following policy to the IAM role.
```sh
bucket_name=<S3_BUCKET_NAME> # Specify the name of the S3 bucket which will store logs

cat >s3-access-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject"],
      "Resource": ["arn:aws:s3:::${bucket_name}/*"]
    }
  ]
}
EOF

aws iam put-role-policy --role-name $role_name --policy-name s3-access-policy --policy-document file://s3-access-policy.json
```
Amazon ElastiCache (Optional)
To allow the Portkey Gateway to authenticate with Amazon ElastiCache using IAM, attach the following policy to the IAM role.
```sh
elasticache_cluster_arn=<ELASTICACHE_REPLICATION_GROUP_ARN> # e.g., arn:aws:elasticache:<region>:<account-id>:replicationgroup:<cluster-name>
elasticache_user_arn=<ELASTICACHE_USER_ARN> # e.g., arn:aws:elasticache:<region>:<account-id>:user:<user-id>

cat >elasticache-access-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["elasticache:Connect"],
      "Resource": [
        "${elasticache_cluster_arn}",
        "${elasticache_user_arn}"
      ]
    }
  ]
}
EOF

aws iam put-role-policy --role-name $role_name --policy-name elasticache-access-policy --policy-document file://elasticache-access-policy.json
```
Amazon Bedrock (Optional)
To allow the Portkey Gateway to invoke Amazon Bedrock models, attach the following policy to the IAM role.
```sh
cat >bedrock-access-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
      "Resource": ["*"]
    }
  ]
}
EOF

aws iam put-role-policy --role-name $role_name --policy-name bedrock-access-policy --policy-document file://bedrock-access-policy.json
```
Google Vertex AI via Workload Identity Federation (Optional)
To allow the Portkey Gateway running on AWS EKS to invoke Google Vertex AI models, you can use GCP Workload Identity Federation. This enables the Gateway’s AWS IAM role to authenticate directly with Google Cloud without requiring static GCP service account keys.
This section requires the IAM role created in the steps above (via IRSA or EKS Pod Identity). The IAM role ARN will be used as the trusted identity in the GCP Workload Identity Federation pool.
- Log in to Google Cloud and set the target project.

```sh
gcp_project_id=<GCP_PROJECT_ID> # GCP project where Vertex AI is enabled.
pool_name=<POOL_NAME> # Name for the Workload Identity Pool (e.g., eks-id-pool).
provider_name=<PROVIDER_NAME> # Name for the AWS IAM provider (e.g., aws-iam-provider).
aws_account_id=<AWS_ACCOUNT_ID> # Your AWS account ID.
role_name=<AWS_IAM_ROLE> # AWS IAM role associated with the Gateway deployment.

gcloud auth login
gcloud config set project $gcp_project_id
```
- Create a Workload Identity Pool in your GCP project.

```sh
gcloud iam workload-identity-pools create $pool_name \
  --location="global" \
  --display-name="EKS Identity Pool"
```
- Add an AWS IAM provider to the Workload Identity Pool.

```sh
gcloud iam workload-identity-pools providers create-aws $provider_name \
  --location="global" \
  --workload-identity-pool=$pool_name \
  --account-id=$aws_account_id \
  --display-name="AWS IAM Provider"
```
- Configure attribute mapping on the AWS IAM provider.

```sh
gcloud iam workload-identity-pools providers update-aws $provider_name \
  --location="global" \
  --workload-identity-pool=$pool_name \
  --attribute-mapping="\
google.subject=assertion.arn,\
attribute.aws_role=assertion.arn.extract('assumed-role/{role}/'),\
attribute.account=assertion.account"
```
- Restrict access to a specific AWS IAM role by adding an attribute condition.

```sh
gcloud iam workload-identity-pools providers update-aws $provider_name \
  --location="global" \
  --workload-identity-pool=$pool_name \
  --attribute-condition="attribute.aws_role == '${role_name}'"
```
- Retrieve the full Workload Identity Federation audience. Make a note of this value; it will be required when configuring the Gateway's values.yaml.

```sh
WIF_AUDIENCE=$(gcloud iam workload-identity-pools providers describe $provider_name \
  --location="global" \
  --workload-identity-pool=$pool_name \
  --format="value(name)")
echo $WIF_AUDIENCE
```
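The describe command returns the provider's full resource name, which serves as the audience value. It has the following shape; all values below are placeholders:

```sh
project_number="123456789012"    # placeholder GCP project number
pool_name="eks-id-pool"          # example pool name
provider_name="aws-iam-provider" # example provider name

echo "projects/${project_number}/locations/global/workloadIdentityPools/${pool_name}/providers/${provider_name}"
```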
- Grant Vertex AI access using one of the following methods:

Option 1: Grant the Vertex AI User role directly to the federated identity.

```sh
gcloud projects add-iam-policy-binding $gcp_project_id \
  --role="roles/aiplatform.user" \
  --member="principalSet://iam.googleapis.com/projects/$(gcloud projects describe $gcp_project_id --format='value(projectNumber)')/locations/global/workloadIdentityPools/$pool_name/attribute.aws_role/${role_name}"
```

Option 2: Create a GCP service account and allow the federated identity to impersonate it.

```sh
service_account=<GSA_NAME> # Name for the Google Service Account (e.g., portkey-vertex-sa).

# Create the Google Service Account
gcloud iam service-accounts create $service_account \
  --display-name="Portkey Vertex AI Service Account"

# Grant the service account Vertex AI access
gcloud projects add-iam-policy-binding $gcp_project_id \
  --role="roles/aiplatform.user" \
  --member="serviceAccount:${service_account}@${gcp_project_id}.iam.gserviceaccount.com"

# Allow the federated identity to impersonate the service account
gcloud iam service-accounts add-iam-policy-binding ${service_account}@${gcp_project_id}.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="principalSet://iam.googleapis.com/projects/$(gcloud projects describe $gcp_project_id --format='value(projectNumber)')/locations/global/workloadIdentityPools/$pool_name/attribute.aws_role/${role_name}"
```
Examples
Built-in Redis
The following sample values.yaml shows how to configure the built-in Redis cache and an Amazon S3 log store using IRSA.
```yaml
images:
  gatewayImage:
    repository: "docker.io/portkeyai/gateway_enterprise"
    pullPolicy: Always
    tag: "latest"
  dataserviceImage:
    repository: "docker.io/portkeyai/data-service"
    pullPolicy: Always
    tag: "latest"
  redisImage:
    repository: "docker.io/redis"
    pullPolicy: IfNotPresent
    tag: "7.2-alpine"
imageCredentials:
  - name: portkeyenterpriseregistrycredentials
    create: true
    registry: https://index.docker.io/v1/
    username: <DOCKER_USERNAME>
    password: <DOCKER_PASSWORD>
environment:
  create: true
  secret: true
  data:
    ANALYTICS_STORE: control_plane
    SERVICE_NAME: gateway
    PORTKEY_CLIENT_AUTH: <CLIENT_AUTH> # Replace <CLIENT_AUTH> with the client auth key shared by the Portkey team.
    ORGANISATIONS_TO_SYNC: <ORGANIZATION_ID> # Replace <ORGANIZATION_ID> with the organisation_id of your account.
    PORT: "8787"
    # Configuration for using the built-in Redis
    CACHE_STORE: redis
    REDIS_URL: "redis://redis:6379"
    REDIS_TLS_ENABLED: "false"
    # Configuration for enabling IRSA access to Amazon S3
    LOG_STORE: s3_assume
    LOG_STORE_REGION: <S3_BUCKET_REGION> # AWS region where the S3 log bucket resides (e.g., us-east-1).
    LOG_STORE_GENERATIONS_BUCKET: <S3_BUCKET_NAME> # Name of the Amazon S3 bucket (e.g., portkey-log-store).
# Configuration for enabling the Data Service
dataservice:
  name: "dataservice"
  enabled: true
  env:
    DEBUG_ENABLED: false
    SERVICE_NAME: "portkeyenterprise-dataservice"
# Enabling IRSA to provide the Gateway access to Amazon S3 and, optionally, Amazon Bedrock
serviceAccount:
  create: true
  automount: true
  name: gateway-sa
  annotations:
    eks.amazonaws.com/role-arn: <IAM_ROLE_ARN> # IAM role ARN created for IRSA access to the S3 bucket.
# Enabling a Load Balancer to provide access from outside the cluster
service:
  type: LoadBalancer
  port: 80
  containerPort: 8787
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/v1/health"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8787"
    service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules: "true"
```
ElastiCache (IAM Auth) with S3 using IRSA and PrivateLink
The following sample values.yaml shows how to configure Amazon ElastiCache with IAM authentication, Amazon S3 for log storage using IRSA, and AWS PrivateLink for outbound connectivity to the Control Plane.
```yaml
images:
  gatewayImage:
    repository: "docker.io/portkeyai/gateway_enterprise"
    pullPolicy: Always
    tag: "latest"
  dataserviceImage:
    repository: "docker.io/portkeyai/data-service"
    pullPolicy: Always
    tag: "latest"
imageCredentials:
  - name: portkeyenterpriseregistrycredentials
    create: true
    registry: https://index.docker.io/v1/
    username: <DOCKER_USERNAME>
    password: <DOCKER_PASSWORD>
serviceAccount:
  create: true
  automount: true
  name: gateway-sa
  annotations:
    eks.amazonaws.com/role-arn: <IAM_ROLE_ARN>
environment:
  create: true
  secret: true
  data:
    ANALYTICS_STORE: control_plane
    SERVICE_NAME: gateway
    PORTKEY_CLIENT_AUTH: <CLIENT_AUTH>
    ORGANISATIONS_TO_SYNC: <ORGANIZATION_ID>
    PORT: "8787"
    # ElastiCache with IAM auth
    CACHE_STORE: aws-elastic-cache
    REDIS_URL: "redis://<ElastiCache_Endpoint>:<Port>"
    REDIS_TLS_ENABLED: "true"
    AWS_REDIS_AUTH_MODE: iam
    AWS_REDIS_CLUSTER_NAME: <ELASTICACHE_CLUSTER_NAME>
    REDIS_USERNAME: <ELASTICACHE_USER_ID>
    # Amazon S3 for log storage
    LOG_STORE: s3_assume
    LOG_STORE_REGION: <S3_BUCKET_REGION>
    LOG_STORE_GENERATIONS_BUCKET: <S3_BUCKET_NAME>
    # PrivateLink outbound connectivity to the Control Plane
    ALBUS_BASEPATH: "https://aws-cp.portkey.ai/albus"
    CONTROL_PLANE_BASEPATH: "https://aws-cp.portkey.ai/api/v1"
    SOURCE_SYNC_API_BASEPATH: "https://aws-cp.portkey.ai/api/v1/sync"
    CONFIG_READER_PATH: "https://aws-cp.portkey.ai/api/model-configs"
dataservice:
  name: "dataservice"
  enabled: true
  env:
    DEBUG_ENABLED: false
    SERVICE_NAME: "portkeyenterprise-dataservice"
service:
  type: LoadBalancer
  port: 80
  containerPort: 8787
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/v1/health"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8787"
    service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules: "true"
```
ElastiCache (Auth Token) with S3 using EKS Pod Identity
The following sample values.yaml shows how to configure Amazon ElastiCache with auth token, Amazon S3 for log storage using EKS Pod Identity, and an Application Load Balancer.
```yaml
images:
  gatewayImage:
    repository: "docker.io/portkeyai/gateway_enterprise"
    pullPolicy: Always
    tag: "latest"
  dataserviceImage:
    repository: "docker.io/portkeyai/data-service"
    pullPolicy: Always
    tag: "latest"
imageCredentials:
  - name: portkeyenterpriseregistrycredentials
    create: true
    registry: https://index.docker.io/v1/
    username: <DOCKER_USERNAME>
    password: <DOCKER_PASSWORD>
serviceAccount:
  create: true
  automount: true
  name: gateway-sa
environment:
  create: true
  secret: true
  data:
    ANALYTICS_STORE: control_plane
    SERVICE_NAME: gateway
    PORTKEY_CLIENT_AUTH: <CLIENT_AUTH>
    ORGANISATIONS_TO_SYNC: <ORGANIZATION_ID>
    PORT: "8787"
    # ElastiCache with auth token
    CACHE_STORE: aws-elastic-cache
    REDIS_URL: "redis://<ElastiCache_Endpoint>:<Port>"
    REDIS_TLS_ENABLED: "true"
    REDIS_PASSWORD: <Auth_Token>
    # Amazon S3 for log storage
    LOG_STORE: s3_assume
    LOG_STORE_REGION: <S3_BUCKET_REGION>
    LOG_STORE_GENERATIONS_BUCKET: <S3_BUCKET_NAME>
dataservice:
  name: "dataservice"
  enabled: true
  env:
    DEBUG_ENABLED: false
    SERVICE_NAME: "portkeyenterprise-dataservice"
service:
  type: ClusterIP
  port: 8787
ingress:
  enabled: true
  ingressClassName: "alb"
  annotations:
    alb.ingress.kubernetes.io/load-balancer-name: portkey-gateway
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/healthcheck-path: /v1/health
    alb.ingress.kubernetes.io/inbound-cidrs: 0.0.0.0/0
    alb.ingress.kubernetes.io/manage-backend-security-group-rules: "true"
```
AI Gateway + MCP Gateway with ALB and Host-Based Routing
The following sample values.yaml shows how to deploy both AI Gateway and MCP Gateway with host-based routing using an Application Load Balancer, built-in Redis, and IRSA for S3.
```yaml
images:
  gatewayImage:
    repository: "docker.io/portkeyai/gateway_enterprise"
    pullPolicy: Always
    tag: "latest"
  dataserviceImage:
    repository: "docker.io/portkeyai/data-service"
    pullPolicy: Always
    tag: "latest"
  redisImage:
    repository: "docker.io/redis"
    pullPolicy: IfNotPresent
    tag: "7.2-alpine"
imageCredentials:
  - name: portkeyenterpriseregistrycredentials
    create: true
    registry: https://index.docker.io/v1/
    username: <DOCKER_USERNAME>
    password: <DOCKER_PASSWORD>
serviceAccount:
  create: true
  automount: true
  name: gateway-sa
  annotations:
    eks.amazonaws.com/role-arn: <IAM_ROLE_ARN>
environment:
  create: true
  secret: true
  data:
    ANALYTICS_STORE: control_plane
    SERVICE_NAME: gateway
    PORTKEY_CLIENT_AUTH: <CLIENT_AUTH>
    ORGANISATIONS_TO_SYNC: <ORGANIZATION_ID>
    PORT: "8787"
    SERVER_MODE: "all"
    MCP_PORT: "8788"
    MCP_GATEWAY_BASE_URL: "https://mcp.example.com"
    # Built-in Redis
    CACHE_STORE: redis
    REDIS_URL: "redis://redis:6379"
    REDIS_TLS_ENABLED: "false"
    # Amazon S3 for log storage
    LOG_STORE: s3_assume
    LOG_STORE_REGION: <S3_BUCKET_REGION>
    LOG_STORE_GENERATIONS_BUCKET: <S3_BUCKET_NAME>
service:
  type: ClusterIP
  port: 8787
ingress:
  enabled: true
  hostname: "gateway.example.com"
  hostBased: true
  mcpHostname: "mcp.example.com"
  ingressClassName: "alb"
  annotations:
    alb.ingress.kubernetes.io/load-balancer-name: portkey-gateway
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/healthcheck-path: /v1/health
    alb.ingress.kubernetes.io/inbound-cidrs: 0.0.0.0/0
    alb.ingress.kubernetes.io/manage-backend-security-group-rules: "true"
```