
GCP AI Platform Endpoint

A Vertex AI Endpoint (formerly AI Platform Endpoint) is a fully managed network interface that serves online (real-time) predictions for one or more machine-learning models. After a model is uploaded to Vertex AI, it is deployed to an Endpoint, which exposes a stable HTTPS URL and, optionally, a Private Service Connect address. The Endpoint automatically scales the serving infrastructure and supports traffic splitting, logging, and monitoring.
Official documentation: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints#Endpoint
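As an illustration of the stable HTTPS URL mentioned above, the following minimal sketch builds the public REST address used for online prediction. It assumes the regional host pattern `{location}-aiplatform.googleapis.com` and the `v1` API version; the project, location, and endpoint ID values are hypothetical placeholders, and the helper name `predict_url` is not part of any library.

```python
def predict_url(project: str, location: str, endpoint_id: str) -> str:
    """Build the public REST URL for online prediction on a Vertex AI Endpoint.

    Assumes the regional host pattern and the v1 API version.
    """
    host = f"{location}-aiplatform.googleapis.com"
    # Full resource name of the Endpoint, as used by the REST API.
    name = f"projects/{project}/locations/{location}/endpoints/{endpoint_id}"
    return f"https://{host}/v1/{name}:predict"


# Hypothetical identifiers, for illustration only.
url = predict_url("my-project", "us-central1", "1234567890")
print(url)
```

A request to this URL still needs an OAuth 2.0 access token and a JSON body containing an `instances` list; those parts are omitted here.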

Supported Methods

  • GET: Get a gcp-ai-platform-endpoint by its "name"
  • LIST: List all gcp-ai-platform-endpoint items
  • SEARCH

gcp-ai-platform-model-deployment-monitoring-job

A Model Deployment Monitoring Job is configured for an Endpoint to analyse prediction traffic and detect drift or anomalies. An Endpoint can therefore be the parent resource of one or more monitoring jobs.

dns

Each Endpoint is reached through a Google-managed DNS name such as *.aiplatform.googleapis.com. Overmind may link the Endpoint to the corresponding DNS record that resolves that hostname.

gcp-big-query-table

Prediction requests and responses served by an Endpoint can be logged to a BigQuery table. When logging is enabled, the Endpoint is related to the destination gcp-big-query-table.
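A minimal sketch of how the destination table could be read from an Endpoint's JSON representation. It assumes the logging settings live under `predictRequestResponseLoggingConfig` with a `bigqueryDestination.outputUri` of the form `bq://project.dataset.table`, as described in the Vertex AI REST reference; the sample payload and the helper name `logging_table` are hypothetical.

```python
from typing import Optional


def logging_table(endpoint: dict) -> Optional[str]:
    """Return 'project.dataset.table' if request/response logging is enabled."""
    cfg = endpoint.get("predictRequestResponseLoggingConfig", {})
    if not cfg.get("enabled"):
        return None  # Logging is switched off or not configured.
    uri = cfg.get("bigqueryDestination", {}).get("outputUri", "")
    prefix = "bq://"
    if not uri.startswith(prefix):
        return None  # No BigQuery destination recorded.
    return uri[len(prefix):]


# Hypothetical Endpoint payload, trimmed to the relevant fields.
sample_endpoint = {
    "displayName": "demo-endpoint",
    "predictRequestResponseLoggingConfig": {
        "enabled": True,
        "samplingRate": 0.1,
        "bigqueryDestination": {"outputUri": "bq://my-project.logs.predictions"},
    },
}

print(logging_table(sample_endpoint))
```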

gcp-cloud-kms-crypto-key

An Endpoint can be configured with a customer-managed encryption key (CMEK) to encrypt the underlying storage and traffic metadata. In that case, it references a gcp-cloud-kms-crypto-key.
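The key reference can be illustrated with a short sketch that splits a Cloud KMS key resource name into its parts. It assumes the standard format `projects/*/locations/*/keyRings/*/cryptoKeys/*`, which is the shape expected in the Endpoint's `encryptionSpec.kmsKeyName` field; the key name shown and the helper `parse_kms_key` are hypothetical.

```python
import re
from typing import Dict, Optional

# Standard Cloud KMS CryptoKey resource name.
KMS_KEY_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<key_ring>[^/]+)"
    r"/cryptoKeys/(?P<crypto_key>[^/]+)$"
)


def parse_kms_key(name: str) -> Optional[Dict[str, str]]:
    """Split a CryptoKey resource name into components, or return None."""
    match = KMS_KEY_PATTERN.match(name)
    return match.groupdict() if match else None


# Hypothetical key name, for illustration only.
parts = parse_kms_key(
    "projects/my-project/locations/us-central1/keyRings/ml-ring/cryptoKeys/endpoint-key"
)
print(parts)
```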

gcp-compute-network

If Private Service Connect is enabled, the Endpoint is exposed through an internal IP range within a specified VPC. The Endpoint is therefore related to the gcp-compute-network that provides that connectivity.

gcp-ai-platform-model

One or more gcp-ai-platform-model resources are deployed to an Endpoint as DeployedModels. The Endpoint distributes incoming traffic to these models according to the configured traffic split.
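The traffic split can be sketched as a map from DeployedModel ID to an integer percentage. The example below assumes the rule from the Vertex AI REST reference that the percentages must add up to 100 (or the map must be empty if the Endpoint accepts no traffic). The function names and model IDs are hypothetical, and the weighted pick only simulates the effect of a split; it is not a description of how Vertex AI routes requests internally.

```python
import random
from typing import Dict


def validate_traffic_split(split: Dict[str, int]) -> None:
    """Raise ValueError unless the split is empty or sums to exactly 100."""
    if not split:
        return  # An empty map means the Endpoint accepts no traffic.
    if any(pct < 0 for pct in split.values()):
        raise ValueError("traffic percentages must be non-negative")
    total = sum(split.values())
    if total != 100:
        raise ValueError(f"traffic percentages sum to {total}, expected 100")


def pick_deployed_model(split: Dict[str, int], rng: random.Random) -> str:
    """Simulate routing one request by weighted random choice.

    Expects a non-empty, validated split.
    """
    ids = list(split)
    weights = [split[model_id] for model_id in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


# Hypothetical DeployedModel IDs: 90% to the current model, 10% to a canary.
traffic_split = {"1111": 90, "2222": 10}
validate_traffic_split(traffic_split)

rng = random.Random(0)  # Seeded so repeated runs give the same picks.
picks = [pick_deployed_model(traffic_split, rng) for _ in range(1000)]
print({model_id: picks.count(model_id) for model_id in traffic_split})
```

A DeployedModel whose ID is missing from the map receives no traffic, which is why a canary rollout typically lists both the old and the new model explicitly.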