Advanced

Google Vertex AI (Claude)

Claude is available through Google Cloud's Vertex AI platform as part of Vertex's Model Garden. Like AWS Bedrock, this lets GCP-native organisations use Claude without a separate Anthropic account — billing, IAM, VPC, and compliance controls all flow through GCP.

Why Use Vertex Instead of Direct API?

Reasons to use Vertex

  • GCP billing consolidation: Charge Claude usage to your GCP project — apply CUD commitments and billing controls
  • Service account auth: Use Google service accounts and ADC — no separate API key management
  • VPC-SC: Route Claude calls through VPC Service Controls — data stays within your GCP perimeter
  • Data residency: Select a GCP region — data stays within that region
  • GCP IAM: Grant Claude access via IAM roles consistent with your other GCP services
  • Integrated with Vertex pipelines: Use Claude as a step in ML Pipelines, Feature Store workflows, or BigQuery ML

Reasons to use direct API

  • Latest Claude models available immediately (Vertex lags on new releases)
  • Simpler setup — API key vs service account + project configuration
  • All Anthropic API features available (some may lag on Vertex)
  • Cloud-agnostic code — not locked to GCP

Enabling Claude on Vertex AI

To use Claude through Vertex AI:

  1. Open Google Cloud Console → Vertex AI → Model Garden
  2. Search for "Claude" — you'll see available Claude models (Haiku, Sonnet, Opus variants)
  3. Click "Enable" on the Claude model card — this enables the API for your project
  4. Accept the Anthropic terms of service (shown during enablement)
  5. Enable the Vertex AI API for your project if not already enabled: gcloud services enable aiplatform.googleapis.com

Once enabled, Claude is available via the Vertex AI Prediction API endpoint for your project and region.

Authentication: Service Accounts and ADC

Vertex AI uses Google authentication, not API keys. There are two approaches:

  • Application Default Credentials (ADC): The recommended approach for code running on GCP infrastructure (GCE, GKE, Cloud Run, Cloud Functions). The runtime environment automatically provides credentials. Run gcloud auth application-default login for local development.
  • Service account key file: For workloads running outside GCP. Create a service account with the Vertex AI User role, download a JSON key, and set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the key path. Avoid this in production if possible — ADC is more secure.

Vertex SDK vs Anthropic SDK

You have two options for calling Claude on Vertex:

Anthropic SDK with Vertex backend

Anthropic's Python and TypeScript SDKs support a Vertex AI backend. Set the project, region, and use AnthropicVertex instead of Anthropic. The request format is identical to the direct API.

Best for: teams already using the Anthropic SDK who want minimal code changes when switching to Vertex.

Google Cloud Vertex AI SDK

Use google-cloud-aiplatform library with the generative_models module. Consistent with other Vertex AI model calls; supports GCP-native features like Vertex Pipelines integration.

Best for: teams building GCP-native workflows where consistency with other Vertex models matters.

Integration with GCP Services

Running Claude on Vertex unlocks native integrations with other GCP services:

  • BigQuery: Use Claude as a SQL generation assistant or data analysis tool within BigQuery workflows — trigger Claude from BigQuery ML or via Cloud Functions
  • Cloud Storage: Reference documents in GCS buckets directly in Vertex pipeline steps — Claude can process files pulled from GCS without manual download
  • Vertex Pipelines (Kubeflow): Include Claude as a pipeline component — useful for ML workflows that involve text processing, labelling, or evaluation steps
  • Cloud Logging / Cloud Monitoring: All Vertex AI API calls log to Cloud Logging automatically — queryable for auditing, cost analysis, and debugging

Cost and Quota

  • Claude on Vertex is priced per token (input + output), similar to Bedrock and direct API — check the Vertex AI pricing page for current rates
  • Quotas are set at the project level — request quota increases via the GCP Console if you hit limits
  • Use GCP budget alerts and cost allocation labels to monitor Claude spend alongside other GCP costs
  • Vertex charges are separate from any Anthropic direct API spend — track each independently if using both

Checklist: Do You Understand This?

  • Vertex AI hosts Claude within GCP — enables GCP billing, service account auth, VPC-SC, and data residency
  • Enable Claude via Model Garden in the GCP Console; accept Anthropic ToS during enablement
  • Use ADC for authentication — avoid service account key files where possible
  • Anthropic SDK supports a Vertex backend (AnthropicVertex) with the same request format as direct API
  • New Claude versions lag on Vertex vs direct API — check Model Garden for current available versions
  • Native integrations with BigQuery, Cloud Storage, and Vertex Pipelines make it useful for GCP-native ML workflows

Page built: 01 Jun 2026