GCP Dataform Repository

A GCP Dataform Repository is a managed, Git-style repository that stores the SQLX files, configuration and workflow definitions that Dataform uses to build, test and deploy BigQuery data pipelines. Each repository has the resource name projects/{project}/locations/{location}/repositories/{repository} and can either be connected to an external Git provider or be managed directly inside Google Cloud. Repositories are version-controlled, can contain multiple branches, and are the primary unit the Dataform API operates on when compiling or running workflows.
Official documentation: https://cloud.google.com/dataform/reference/rest/v1/projects.locations.repositories/get
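
The resource name pattern above can be assembled programmatically. The following is a minimal sketch, not an official client: the helper names are hypothetical, and the API root `https://dataform.googleapis.com/v1` is an assumption inferred from the `v1` REST reference linked above.

```python
from urllib.parse import quote

# Assumption: REST root for the Dataform API, inferred from the v1 reference URL.
API_ROOT = "https://dataform.googleapis.com/v1"


def repository_name(project: str, location: str, repository: str) -> str:
    """Build the resource name in the pattern quoted in the description."""
    return f"projects/{project}/locations/{location}/repositories/{repository}"


def repository_get_url(project: str, location: str, repository: str) -> str:
    """Build the URL for a repositories.get call (hypothetical helper)."""
    name = repository_name(project, location, repository)
    # Percent-encode each path segment but keep the "/" separators intact.
    return f"{API_ROOT}/{quote(name, safe='/')}"


if __name__ == "__main__":
    # Placeholder values, not real resources.
    print(repository_get_url("my-project", "europe-west1", "analytics"))
```

An authenticated request would additionally need an OAuth 2.0 access token, which this sketch leaves out.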

Terraform Mappings:

  • google_dataform_repository.id

Supported Methods

  • GET: Get a gcp-dataform-repository by its "locations|repositories"
  • LIST
  • SEARCH: Search for Dataform repositories in a location. Use either the format "location" or the full path "projects/[project_id]/locations/[location]/repositories/[repository_name]"; the full-path form is supported for Terraform mappings.
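
The two SEARCH formats listed above can be told apart with a small parser. This is an illustrative sketch only: `parse_search_query` is a hypothetical helper, not part of Overmind, and the derived GET query assumes that "locations|repositories" means the location followed by the repository name, separated by a pipe.

```python
import re

# Matches the full resource path accepted by SEARCH.
_FULL_PATH = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/repositories/(?P<repository>[^/]+)$"
)


def parse_search_query(query: str) -> dict:
    """Classify a SEARCH query as a bare location or a full repository path."""
    match = _FULL_PATH.match(query)
    if match:
        parts = match.groupdict()
        # Assumption: the GET query is "<location>|<repository>".
        parts["get_query"] = f"{parts['location']}|{parts['repository']}"
        return parts
    if not query or "/" in query:
        raise ValueError(f"unrecognised SEARCH query: {query!r}")
    # Anything without a slash is treated as a bare location.
    return {"location": query}


if __name__ == "__main__":
    print(parse_search_query("europe-west1"))
    print(parse_search_query(
        "projects/my-project/locations/europe-west1/repositories/analytics"
    ))
```

A bare location yields only the location, whereas a full path also yields the project, the repository and the equivalent GET query.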

gcp-secret-manager-secret

A Dataform repository can reference Secret Manager secrets for environment variables or credentials used by SQLX scripts during workflow execution. Overmind links the repository to any secrets it reads so you can see the blast radius of a leaked or rotated secret.

gcp-iam-service-account

Dataform executes compilation and workflow jobs with either a Google-managed service agent or an explicitly configured customer service account. Overmind links the repository to those IAM service accounts to expose the permissions that govern what the repository’s jobs can do.

gcp-cloud-kms-crypto-key

If Customer-Managed Encryption Keys (CMEK) are enabled, the content of the Dataform repository is encrypted using a Cloud KMS CryptoKey. This link highlights which keys protect the repository so you can assess the impact of rotating or revoking a key.
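
The three link types described above could be derived from a repository's JSON representation roughly as follows. This is a sketch under stated assumptions, not Overmind's implementation: the field names (`gitRemoteSettings.authenticationTokenSecretVersion`, `sshAuthenticationConfig.userPrivateKeySecretVersion`, `npmrcEnvironmentVariablesSecretVersion`, `serviceAccount`, `kmsKeyName`) are my reading of the REST `Repository` resource and should be checked against the official reference, and `linked_resources` is a hypothetical helper.

```python
def linked_resources(repo: dict) -> dict:
    """Collect the secrets, service account and KMS key a repository refers to.

    Assumption: `repo` is the JSON body of a repository as a Python dict,
    using the field names listed in the accompanying text.
    """
    git = repo.get("gitRemoteSettings", {})
    secret_versions = [
        git.get("authenticationTokenSecretVersion"),
        git.get("sshAuthenticationConfig", {}).get("userPrivateKeySecretVersion"),
        repo.get("npmrcEnvironmentVariablesSecretVersion"),
    ]
    # Secret versions look like projects/*/secrets/*/versions/*;
    # drop the version suffix to get the parent secret.
    secrets = {v.split("/versions/")[0] for v in secret_versions if v}
    return {
        "secrets": sorted(secrets),
        "service_account": repo.get("serviceAccount"),
        "kms_key": repo.get("kmsKeyName"),
    }


if __name__ == "__main__":
    # Placeholder data, not a real repository.
    example = {
        "gitRemoteSettings": {
            "authenticationTokenSecretVersion":
                "projects/my-project/secrets/git-token/versions/3",
        },
        "serviceAccount": "dataform-runner@my-project.iam.gserviceaccount.com",
        "kmsKeyName":
            "projects/my-project/locations/europe-west1/keyRings/kr/cryptoKeys/key",
    }
    print(linked_resources(example))
```

Fields that are absent are simply reported as `None` or as an empty list, so the helper also works for repositories without CMEK or a custom service account.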