Using Service Accounts for GKE workloads
According to GCP's documentation, "A service account is a special type of Google account that belongs to your application or a virtual machine (VM), instead of to an individual end user. Your application assumes the identity of the service account to call Google APIs, so that the users aren't directly involved."
Service accounts let an application authenticate as itself, rather than as an end user, while maintaining proper access control to cloud resources. They come in handy when you have multiple applications running on GCP, each requiring a distinct set of privileges to function. The idea is to grant each service account only the minimum set of permissions its application needs. I'll list the basic steps required to get a Python app up and running on GKE using a service account (SA).
Setup
The setup is a Python app running on GKE that interfaces with Pub/Sub, Cloud Storage, and BigQuery through the relevant client libraries. It's good practice to start writing the app with the SA already set up, as the libraries will throw a Permission Denied exception whenever the relevant privileges are not configured. I prefer this to the approach where you develop with a GCP user credential, move to a SA after the app is written, and face these exceptions all at once.
For this example, we need the app to fetch Pub/Sub messages, read Cloud Storage objects, create BigQuery tables, and start BigQuery load jobs.
So, we'll create a service account with the following roles:
- Pub/Sub Subscriber
- BigQuery User
- BigQuery Data Editor
- Storage Object Viewer
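As a rough sketch, the roles above could be granted by generating one `gcloud projects add-iam-policy-binding` command per role. The role IDs below are my assumed mapping of the role names listed above (in particular, `roles/storage.objectViewer` for read-only access to Cloud Storage objects), and the project and service account names are hypothetical; check them against your own project before running anything.

```python
import shlex

# Assumed mapping from the role names listed above to IAM role IDs.
ROLE_IDS = {
    "Pub/Sub Subscriber": "roles/pubsub.subscriber",
    "BigQuery User": "roles/bigquery.user",
    "BigQuery Data Editor": "roles/bigquery.dataEditor",
    "Storage Object Viewer": "roles/storage.objectViewer",
}


def binding_commands(project_id, sa_email):
    """Return one gcloud command string per role, granting it to the SA."""
    member = f"serviceAccount:{sa_email}"
    return [
        shlex.join([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}", f"--role={role}",
        ])
        for role in ROLE_IDS.values()
    ]


if __name__ == "__main__":
    # Hypothetical project ID and service account email.
    sa = "my-app@my-project.iam.gserviceaccount.com"
    for cmd in binding_commands("my-project", sa):
        print(cmd)
```

Printing the commands instead of executing them keeps the sketch safe to run; you can review the output and paste it into a shell.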
BigQuery has 5 kinds of predefined roles, which are described in its access control documentation. Broadly, they differ in whether they let you "create tables and datasets" or "use these tables for various tasks".
Project-level IAM policies can be viewed with:
gcloud projects get-iam-policy PROJECT_ID
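The full policy can be long, so it may help to filter it down to a single service account. The sketch below assumes the policy was exported with `--format=json` and parsed into a dict with a `bindings` list of `role`/`members` entries; the function name and the sample data are made up for illustration.

```python
import json


def roles_for_service_account(policy, sa_email):
    """Return the sorted roles bound to sa_email in an IAM policy dict."""
    member = f"serviceAccount:{sa_email}"
    return sorted(
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    )


if __name__ == "__main__":
    # Trimmed, made-up policy in the shape printed by --format=json.
    sample = json.loads("""
    {
      "bindings": [
        {"role": "roles/pubsub.subscriber",
         "members": ["serviceAccount:my-app@my-project.iam.gserviceaccount.com"]},
        {"role": "roles/owner", "members": ["user:someone@example.com"]}
      ]
    }
    """)
    sa = "my-app@my-project.iam.gserviceaccount.com"
    print(roles_for_service_account(sample, sa))
```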
All of these permissions can be tested locally by modifying the environment variable GOOGLE_APPLICATION_CREDENTIALS in the app code itself. This can be done as:
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'sa-credentials.json'
Don't forget to remove these lines when moving to production!
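One way to lower the risk of shipping that override is to set the variable only when it is not already defined, so a value configured by the deployment always wins. This is a minimal sketch of that idea, not part of the original setup; the helper name `use_local_credentials` is hypothetical.

```python
import os

ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


def use_local_credentials(path="sa-credentials.json"):
    """Point ENV_VAR at a local key file unless it is already set.

    Returns the path that ends up in effect, or None if nothing was set.
    """
    # A value provided by the environment (e.g. the deployment) takes priority.
    if ENV_VAR in os.environ:
        return os.environ[ENV_VAR]
    # Only fall back to the local key file if it actually exists.
    if os.path.isfile(path):
        os.environ[ENV_VAR] = path
        return path
    return None
```

Calling `use_local_credentials()` before creating any client then works both on a development machine with the key file present and in a container where the variable is already set.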
GKE configuration
Once the app is ready to be deployed, GKE needs to be configured with the appropriate credentials for the SA. This can be done using a Kubernetes Secret:
kubectl create secret generic sa-key --from-file=sa-credentials.json
Creating the secret does not expose it to the container by itself: the deployment configuration needs to declare a volume backed by the secret and mount it, here at the path /var/secrets/google, which makes the credentials file available to the container. The mount is done by adding volumeMounts to the containers property in the deployment configuration. env creates an environment variable in the container, which the Google Cloud client libraries used by the Python application read by default.
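spec:
  # Volume backed by the sa-key secret created above; volumeMounts refers to it by name
  volumes:
  - name: google-cloud-key
    secret:
      secretName: sa-key
  containers:
  - name: subscriber
    image: gcr.io/PROJECT_ID/app_image:v1
    volumeMounts:
    - name: google-cloud-key
      mountPath: /var/secrets/google
    env:
    - name: GOOGLE_APPLICATION_CREDENTIALS
      value: /var/secrets/google/sa-credentials.json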
The name of the credentials file in the YAML configuration needs to match the one that was used when creating the secret.
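A mismatch here is easy to miss, so a small check can be handy. The sketch below assumes the file name inside the mounted volume is the secret key, which defaults to the base name of the file given to `--from-file` (or the part before `=` when the `key=path` form is used). The function name is hypothetical.

```python
import posixpath


def credentials_path_matches(from_file, mount_path, env_value):
    """Check that env_value points at the file the secret volume will contain.

    from_file:  the argument given to --from-file when creating the secret
    mount_path: the mountPath of the secret volume in the deployment
    env_value:  the value of GOOGLE_APPLICATION_CREDENTIALS in the deployment
    """
    # --from-file accepts either PATH or KEY=PATH; the key becomes the file name.
    key, sep, _source = from_file.partition("=")
    if not sep:
        key = posixpath.basename(from_file)
    expected = posixpath.join(mount_path, key)
    return posixpath.normpath(env_value) == posixpath.normpath(expected)
```

For the values used in this post, `credentials_path_matches("sa-credentials.json", "/var/secrets/google", "/var/secrets/google/sa-credentials.json")` evaluates to `True`.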