Using Service Accounts for GKE workloads

According to GCP's documentation, "A service account is a special type of Google account that belongs to your application or a virtual machine (VM), instead of to an individual end user. Your application assumes the identity of the service account to call Google APIs, so that the users aren't directly involved."

Service accounts let an application authenticate as itself, rather than as one of its users, while maintaining proper access control to cloud resources. They come in handy when you have multiple applications running on GCP, each requiring a distinct set of privileges to function. The idea is to grant each service account only the minimum set of permissions it needs to do its job. I'll list the basic steps required to get a Python app up and running on GKE with a service account (SA).

Setup

The setup is a Python app running on GKE that interfaces with Pub/Sub, Cloud Storage, and BigQuery through their client libraries. It's good practice to start writing the app with the SA already set up, as the libraries will raise a Permission Denied exception whenever a required privilege is missing. I prefer this approach to writing the app with a GCP user credential and switching to an SA afterwards, only to face these exceptions all at once.

For this example, we need the app to fetch Pub/Sub messages, read Cloud Storage objects, create BigQuery tables, and start BigQuery load jobs.

So, we'll create a service account with the following roles:

  • Pub/Sub Subscriber
  • BigQuery User
  • BigQuery Data Editor
  • Storage Object Viewer

BigQuery has several predefined roles, which are described in its access control documentation. Broadly, they are distinguished by whether they let you "create tables and datasets" or "use these tables for various tasks".

Project-level IAM policies can be viewed with:

gcloud projects get-iam-policy PROJECT_ID
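To confirm that the service account really holds the roles listed above, the JSON form of that policy can be checked with a few lines of Python. This is a minimal sketch, assuming the policy was saved with `gcloud projects get-iam-policy PROJECT_ID --format=json`. The helper names `missing_roles` and `load_policy`, and the role IDs picked for the four roles, are illustrative assumptions.

```python
import json

# Role IDs assumed to correspond to the four roles listed above.
REQUIRED_ROLES = {
    "roles/pubsub.subscriber",
    "roles/bigquery.user",
    "roles/bigquery.dataEditor",
    "roles/storage.objectViewer",
}


def missing_roles(policy, sa_email, required=REQUIRED_ROLES):
    """Return the required roles that the policy does not grant to sa_email."""
    member = f"serviceAccount:{sa_email}"
    # Collect every role whose binding lists the service account as a member.
    granted = {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }
    return set(required) - granted


def load_policy(path):
    """Load an IAM policy that was saved to disk as JSON."""
    with open(path) as fh:
        return json.load(fh)
```

Calling `missing_roles(load_policy("policy.json"), SA_EMAIL)` returns an empty set when nothing is missing. It only looks at project-level bindings, so roles granted directly on a bucket or dataset will not show up.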

All of these permissions can be tested locally by setting the environment variable GOOGLE_APPLICATION_CREDENTIALS in the app code itself, before any client is created:

import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'sa-credentials.json'

Don't forget to remove these lines when moving to production!
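One way to lower the risk of shipping that override is to make it conditional. The sketch below assumes a hypothetical helper, `use_local_credentials`, built on `os.environ.setdefault`, which only sets the variable when it is not already defined. A value injected by the deployment therefore always wins, and nothing happens when the local key file is absent.

```python
import os


def use_local_credentials(path="sa-credentials.json"):
    """Point the client libraries at a local key file for development.

    Leaves GOOGLE_APPLICATION_CREDENTIALS untouched if it is already set,
    and does nothing if the local key file does not exist.
    Returns the value in effect, or None if the variable remains unset.
    """
    if os.path.isfile(path):
        # setdefault never overwrites a value provided by the environment.
        os.environ.setdefault(
            "GOOGLE_APPLICATION_CREDENTIALS", os.path.abspath(path)
        )
    return os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
```

It would need to be called before the first Pub/Sub, Cloud Storage, or BigQuery client is created.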

GKE configuration

Once the app is ready to be deployed, GKE needs to be configured with the credentials for the SA. This can be done using a Kubernetes secret:

kubectl create secret generic sa-key --from-file=sa-credentials.json

Creating the secret does not by itself expose it to the app. The pod spec in the deployment configuration has to declare a volume backed by the sa-key secret and mount it into the container, here at the path /var/secrets/google. The mount is done by adding volumeMounts to the containers property. The env property creates an environment variable in the container, which the Google Cloud client libraries in the Python application pick up by default.

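# Volume backed by the secret created above; secretName must match sa-key.
volumes:
- name: google-cloud-key
  secret:
    secretName: sa-key
containers:
- name: subscriber
  image: gcr.io/PROJECT_ID/app_image:v1
  volumeMounts:
    # Mount the secret's files under /var/secrets/google.
    - name: google-cloud-key
      mountPath: /var/secrets/google
  env:
    # Picked up by the Google Cloud client libraries by default.
    - name: GOOGLE_APPLICATION_CREDENTIALS
      value: /var/secrets/google/sa-credentials.json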

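The name of the credentials file in the YAML configuration needs to match the one used while creating the secret. Likewise, the secret name referenced by the volume must match the name passed to kubectl create secret (sa-key here).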
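A mismatch between those names usually surfaces only when the first API call fails. A small check at startup can fail earlier and with a clearer message. This is a sketch rather than anything provided by the client libraries: `check_credentials_file` is a hypothetical helper, and it relies on service account key files being JSON documents with `type` set to `service_account` and a `client_email` field.

```python
import json
import os


def check_credentials_file(env=os.environ):
    """Return the service account email from the configured key file.

    Raises RuntimeError with a descriptive message when the variable is
    unset, the file is missing, or the file is not a service account key.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"credentials file not found: {path}")
    with open(path) as fh:
        try:
            info = json.load(fh)
        except json.JSONDecodeError as exc:
            raise RuntimeError(f"{path} is not valid JSON") from exc
    is_sa_key = (
        isinstance(info, dict)
        and info.get("type") == "service_account"
        and "client_email" in info
    )
    if not is_sa_key:
        raise RuntimeError(f"{path} is not a service account key")
    return info["client_email"]
```

Logging the returned email at startup also makes it easy to see which SA a given pod is running as.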