Google Vertex AI (Claude)
Claude is available through Google Cloud's Vertex AI platform as part of Vertex's Model Garden. Like AWS Bedrock, this lets GCP-native organisations use Claude without a separate Anthropic account — billing, IAM, VPC, and compliance controls all flow through GCP.
Why Use Vertex Instead of Direct API?
Reasons to use Vertex
- GCP billing consolidation: Charge Claude usage to your GCP project — apply CUD commitments and billing controls
- Service account auth: Use Google service accounts and ADC — no separate API key management
- VPC-SC: Route Claude calls through VPC Service Controls — data stays within your GCP perimeter
- Data residency: Select a GCP region — data stays within that region
- GCP IAM: Grant Claude access via IAM roles consistent with your other GCP services
- Integrated with Vertex pipelines: Use Claude as a step in ML Pipelines, Feature Store workflows, or BigQuery ML
Reasons to use direct API
- Latest Claude models available immediately (Vertex lags on new releases)
- Simpler setup — API key vs service account + project configuration
- All Anthropic API features available (some may lag on Vertex)
- Cloud-agnostic code — not locked to GCP
Enabling Claude on Vertex AI
To use Claude through Vertex AI:
- Open Google Cloud Console → Vertex AI → Model Garden
- Search for "Claude" — you'll see available Claude models (Haiku, Sonnet, Opus variants)
- Click "Enable" on the Claude model card — this enables the API for your project
- Accept the Anthropic terms of service (shown during enablement)
- Enable the Vertex AI API for your project if not already enabled:
gcloud services enable aiplatform.googleapis.com
Once enabled, Claude is available via the Vertex AI Prediction API endpoint for your project and region.
Authentication: Service Accounts and ADC
Vertex AI uses Google authentication, not API keys. There are two approaches:
- Application Default Credentials (ADC): The recommended approach for code running on GCP infrastructure (GCE, GKE, Cloud Run, Cloud Functions). The runtime environment automatically provides credentials. Run
gcloud auth application-default loginfor local development. - Service account key file: For workloads running outside GCP. Create a service account with the
Vertex AI Userrole, download a JSON key, and set theGOOGLE_APPLICATION_CREDENTIALSenvironment variable to the key path. Avoid this in production if possible — ADC is more secure.
Vertex SDK vs Anthropic SDK
You have two options for calling Claude on Vertex:
Anthropic SDK with Vertex backend
Anthropic's Python and TypeScript SDKs support a Vertex AI backend. Set the project, region, and use AnthropicVertex instead of Anthropic. The request format is identical to the direct API.
Best for: teams already using the Anthropic SDK who want minimal code changes when switching to Vertex.
Google Cloud Vertex AI SDK
Use google-cloud-aiplatform library with the generative_models module. Consistent with other Vertex AI model calls; supports GCP-native features like Vertex Pipelines integration.
Best for: teams building GCP-native workflows where consistency with other Vertex models matters.
Integration with GCP Services
Running Claude on Vertex unlocks native integrations with other GCP services:
- BigQuery: Use Claude as a SQL generation assistant or data analysis tool within BigQuery workflows — trigger Claude from BigQuery ML or via Cloud Functions
- Cloud Storage: Reference documents in GCS buckets directly in Vertex pipeline steps — Claude can process files pulled from GCS without manual download
- Vertex Pipelines (Kubeflow): Include Claude as a pipeline component — useful for ML workflows that involve text processing, labelling, or evaluation steps
- Cloud Logging / Cloud Monitoring: All Vertex AI API calls log to Cloud Logging automatically — queryable for auditing, cost analysis, and debugging
Cost and Quota
- Claude on Vertex is priced per token (input + output), similar to Bedrock and direct API — check the Vertex AI pricing page for current rates
- Quotas are set at the project level — request quota increases via the GCP Console if you hit limits
- Use GCP budget alerts and cost allocation labels to monitor Claude spend alongside other GCP costs
- Vertex charges are separate from any Anthropic direct API spend — track each independently if using both
Checklist: Do You Understand This?
- Vertex AI hosts Claude within GCP — enables GCP billing, service account auth, VPC-SC, and data residency
- Enable Claude via Model Garden in the GCP Console; accept Anthropic ToS during enablement
- Use ADC for authentication — avoid service account key files where possible
- Anthropic SDK supports a Vertex backend (
AnthropicVertex) with the same request format as direct API - New Claude versions lag on Vertex vs direct API — check Model Garden for current available versions
- Native integrations with BigQuery, Cloud Storage, and Vertex Pipelines make it useful for GCP-native ML workflows